Written Assignment 3: The Levenshtein Distance

Question

# The Levenshtein distance

A spell checker is a word processing program that makes suggestions when it finds a word not in the dictionary. To determine what words to suggest, it tries to find similar words. One measure of word similarity is the Levenshtein distance, which measures the number of substitutions, additions, or deletions that are required to change one word into another.

For example, the words spit and spot are a distance of 1 apart; changing spit to spot requires one substitution (o for i). Likewise, spit is distance 1 from pit since the change requires one deletion (the s). The word spite is also distance 1 from spit since it requires one addition (the e). The word soot is distance 2 from spit since two substitutions would be required (o for i and o for p). This situation can be represented using the graph below, whose vertices are the words and whose edges connect words at distance one.

[Figure: graph on the vertices spite, spit, spot, pit, soot]

Here is another example. There are several words at distance 1 from the misspelled word "aed": aid, and, led, med. These words are included in the following graph, together with the words mad and let, which are at distance 2 from aed. Note that the three words aed, aid, and and differ only by the middle letter, so they are all at distance 1 from each other, forming a 'triangle' in the graph.

[Figure: graph on the vertices led, let, aed, med, mad, aid, and]

a. Create a graph using words as vertices, and edges connecting words with a Levenshtein distance of 1. Use the misspelled word "moke" as the center, and try to find at least 10 connected dictionary words. How might a spell checker use this graph?

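As a rough illustration of the definition above, the distance can be computed with the standard dynamic-programming recurrence, and a distance-1 graph can then be built by comparing every pair of words. This is only a sketch: the function names `levenshtein` and `distance_one_graph` are made up for this example, and the word list is the spit/spot example from the assignment text, not a solution for "moke".

```python
def levenshtein(a: str, b: str) -> int:
    """Count the substitutions, additions, or deletions needed to turn a into b."""
    # prev[j] holds the distance between the first i-1 letters of a
    # and the first j letters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty word takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # add cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep a match)
            ))
        prev = curr
    return prev[-1]


def distance_one_graph(words):
    """Return adjacency sets joining every pair of words at distance exactly 1."""
    graph = {w: set() for w in words}
    for i, u in enumerate(words):
        for v in words[i + 1:]:
            if levenshtein(u, v) == 1:
                graph[u].add(v)
                graph[v].add(u)
    return graph


# Word list taken from the first example in the assignment text.
words = ["spit", "spot", "pit", "spite", "soot"]
graph = distance_one_graph(words)
for word in words:
    print(word, "->", sorted(graph[word]))
```

The pairwise comparison is quadratic in the number of words, which is fine for a hand-sized graph but would need a smarter candidate search for a full dictionary.
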
b. Improve the method from above by assigning a weight to each edge based on the likelihood of making the substitution, addition, or deletion. You can base the weights on any reasonable approach: proximity of keys on a keyboard, common language errors, etc. Include the weights on your graph from part (a) and explain how you assigned the weights.

Note that these weights and Dijkstra's algorithm can be used to find the shortest path from any word to "moke". A word with shortest distance to "moke" is a good candidate for the spell checker.
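The use of Dijkstra's algorithm on a weighted word graph might look like the sketch below. Everything specific here is an assumption made for illustration: the six words, the edges between them, and especially the weights, which are invented numbers in which a lower weight stands for a more likely typing error (for instance, k and l are neighbouring keys, so swapping them is given a low weight). The helper name `dijkstra` is also hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest weighted distance from source to every reachable word."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical edges (each pair is at Levenshtein distance 1) with
# made-up weights: lower weight = error assumed to be more likely.
edges = [
    ("moke", "mole", 0.5),   # k -> l, adjacent keys
    ("moke", "smoke", 0.6),  # dropped leading s
    ("moke", "poke", 0.8),   # m -> p
    ("moke", "make", 1.0),   # o -> a, keys far apart
    ("make", "male", 0.5),   # k -> l, adjacent keys
    ("mole", "male", 1.0),   # o -> a, keys far apart
]

# Build an undirected weighted adjacency map.
weighted = {}
for u, v, w in edges:
    weighted.setdefault(u, {})[v] = w
    weighted.setdefault(v, {})[u] = w

dist = dijkstra(weighted, "moke")
# Rank candidate corrections from closest to farthest.
ranking = sorted((d, word) for word, d in dist.items() if word != "moke")
for d, word in ranking:
    print(f"{word}: {d}")
```

With these invented weights, the words directly adjacent to "moke" rank ahead of "male", which is two edits away; a different weighting scheme would produce a different ordering.
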
