ASSIGNMENT6 edwisor.docx - ASSIGNMENT – ADVANCE PREDICTIVE MODELLING


ASSIGNMENT – ADVANCE PREDICTIVE MODELLING

QUESTION 1- How will you treat text having shortcut words (like bcz, u, thr, etc.)?

ANSWER 1- If the text contains shortcut words like bcz, u, thr, etc., then stemming can bring the words to their root form, though a stemming group (a mapping from each shortcut to its full word) needs to be defined for these words, because a standard stemmer will not recognize them. Stemming reduces tokens to the root form of words so that morphological variations are recognized as the same word. Correct morphological analysis is language-specific and can be complex.

QUESTION 2- Write R and Python code to replace “bcz” with “because” in the whole text.

ANSWER 2-

PYTHON-

    import re

    # Compiled pattern for the standalone word "bcz" (any letter case)
    compiler_bcz = re.compile(r'\bbcz\b', re.IGNORECASE)

    def remove_words(my_line):
        new_line = ''
        for i in my_line.split():
            if compiler_bcz.search(i):
                # Swap only the matched word so attached punctuation is kept
                new_line = new_line + ' ' + compiler_bcz.sub('because', i)
            else:
                new_line = new_line + ' ' + i
        # Drop the leading space added by the first concatenation
        return new_line.strip()

R-

    # preprocessing
    # convert bcz to because
    # bcz is assumed to be a character vector holding the text
    because = gsub("\\bbcz\\b", "because", bcz, ignore.case = TRUE, perl = TRUE)
    writeLines(as.character(because))
    # because = tm_map(because, PlainTextDocument)   # only applies if because is a tm corpus

QUESTION 3- How do you deal with English text having Hindi words in between?
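As a supplement to Answers 1 and 2, the single "bcz" replacement can be generalized to a lookup table that covers several shortcut words at once. The sketch below is only one possible approach. The table name SHORTCUTS, the function name expand_shortcuts, and the chosen expansions (in particular "thr" as "there") are illustrative assumptions, not part of the original answer.

```python
import re

# Hypothetical lookup table: the entries are illustrative assumptions.
SHORTCUTS = {
    "bcz": "because",
    "u": "you",
    "thr": "there",
}

# One pattern that matches any listed shortcut as a whole word, in any letter case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, SHORTCUTS)) + r")\b",
    flags=re.IGNORECASE,
)

def expand_shortcuts(text):
    """Replace every known shortcut word in text with its full form."""
    return _PATTERN.sub(lambda m: SHORTCUTS[m.group(1).lower()], text)

print(expand_shortcuts("I left bcz u were not thr"))
```

Because the pattern uses word boundaries, a shortcut that appears inside a longer word (the "u" in "button", for example) is left alone. The replacement is always lower case, so capitalization at the start of a sentence would need separate handling.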
