Hello,

I'm new to Python and I'm working on a homework assignment to convert an Arabic Wikipedia XML dump file to a text corpus using WikiCorpus from gensim.corpora.

I took this code from this link: https://www.kdnuggets.com/2017/11/building-wikipedia-text-corpus-nlp.html


-----------------------------------------------------------------------------------------------

import sys
from gensim.corpora import WikiCorpus

def make_corpus(in_f, out_f):

    """Convert Wikipedia xml dump file to text corpus"""

    output = open(out_f, 'w')
    wiki = WikiCorpus(in_f)

    i = 0
    for text in wiki.get_texts():
        output.write(bytes(' '.join(text), 'utf-8').decode('utf-8') + '\n')
        i = i + 1
        if (i % 10000 == 0):
            print('Processed ' + str(i) + ' articles')
    output.close()
    print('Processing complete!')


if __name__ == '__main__':

    if len(sys.argv) != 3:
        print('Usage: python make_wiki_corpus.py <wikipedia_dump_file> <processed_text_file>')
        sys.exit(1)
    in_f = sys.argv[1]
    out_f = sys.argv[2]
    make_corpus(in_f, out_f)

--------------------------------------


When I run it, the script prints the usage message and exits, because sys.argv does not contain exactly 3 items (the script name plus the two file names).


Please help me. What can I do?
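
For reference, here is a minimal sketch of how that length check behaves, assuming the script was started without any command-line arguments (for example with an IDE's Run button). The helper name parse_args and the two file names are placeholders for illustration only; they are not part of the original script.

```python
def parse_args(argv):
    """Return (in_f, out_f) when argv holds the script name plus two arguments, else None."""
    # argv[0] is always the script name, so two user-supplied
    # arguments make len(argv) == 3, which is what the script checks.
    if len(argv) != 3:
        return None
    return argv[1], argv[2]


# Started with no arguments: only the script name is present, so the check fails.
print(parse_args(['make_wiki_corpus.py']))

# Started with a dump file and an output file (placeholder names): the check passes.
print(parse_args(['make_wiki_corpus.py', 'arwiki-dump.xml.bz2', 'wiki_ar.txt']))
```

If that assumption holds, running the script from a terminal in the form shown by its own usage message, python make_wiki_corpus.py <wikipedia_dump_file> <processed_text_file>, should get past the check. Most IDEs also have a run-configuration field for script arguments.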
