# From the upper right corner to read off the alignment

Bottom rows of the filled-in edit distance table (the rows above are cut off):

|   | # | E | X | E | C | U | T | I | O | N |
|---|---|---|---|---|---|---|---|---|---|---|
| N | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 7 | 8 | 7 |
| I | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 6 | 7 | 8 |
| # | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |

## Minimum Edit Distance: Backtrace for Computing Alignments

Dan Jurafsky

### Computing alignments

- Edit distance isn't sufficient
- We often need to align each character of the two strings to each other
- We do this by keeping a "backtrace"
- Every time we enter a cell, remember where we came from
- When we reach the end, trace back the path from the upper right corner to read off the alignment

### Edit Distance

|   | # | E | X | E | C | U | T | I | O | N |
|---|---|---|---|---|---|---|---|---|---|---|
| N | 9 |   |   |   |   |   |   |   |   |   |
| O | 8 |   |   |   |   |   |   |   |   |   |
| I | 7 |   |   |   |   |   |   |   |   |   |
| T | 6 |   |   |   |   |   |   |   |   |   |
| N | 5 |   |   |   |   |   |   |   |   |   |
| E | 4 |   |   |   |   |   |   |   |   |   |
| T | 3 |   |   |   |   |   |   |   |   |   |
| N | 2 |   |   |   |   |   |   |   |   |   |
| I | 1 |   |   |   |   |   |   |   |   |   |
| # | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |

### MinEdit with Backtrace

[Figure not preserved in the text extraction]

### Adding Backtrace to Minimum Edit Distance

- Base conditions: $D(i,0) = i$ and $D(0,j) = j$
- Termination: $D(N,M)$ is distance
- Recurrence relation: for each $i = 1 \ldots M$, for each $j = 1 \ldots N$

$$
D(i,j) = \min
\begin{cases}
D(i-1,j) + 1 & \text{deletion} \\
D(i,j-1) + 1 & \text{insertion} \\
D(i-1,j-1) +
\begin{cases}
2 & \text{if } X(i) \neq Y(j) \\
0 & \text{if } X(i) = Y(j)
\end{cases} & \text{substitution}
\end{cases}
$$

$$
ptr(i,j) =
\begin{cases}
\text{LEFT} & \text{insertion} \\
\text{DOWN} & \text{deletion} \\
\text{DIAG} & \text{substitution}
\end{cases}
$$

### The Distance Matrix

[Figure: distance matrix with axes labeled $x_0 \ldots x_N$ and $y_0 \ldots y_M$]

- Every non-decreasing path from (0,0) to (M, N) corresponds to an alignment of the two sequences
- An optimal alignment is composed of optimal subalignments

Slide adapted from Serafim Batzoglou

### Result of Backtrace

- Two strings and their alignment: [figure not preserved in the text extraction]

### Performance

- Time: O(nm)
- Space: O(nm)
- Backtrace: O(n+m)

## Minimum Edit Distance: Weighted Minimum Edit Distance

### Weighted Edit Distance

- Why would we add weights to the computation?
- Spell correction: some letters are more likely to be mistyped than others...
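The recurrence and backtrace from the slides can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code. The function name `min_edit_distance_with_backtrace`, the `sub_cost` parameter, the `*` gap symbol, and the tie-breaking order (DIAG, then DOWN, then LEFT) are all assumptions made for the example. Only one pointer is stored per cell, so when several alignments are optimal the sketch returns just one of them.

```python
def min_edit_distance_with_backtrace(source, target, sub_cost=2):
    """Return (distance, aligned_source, aligned_target).

    Insertion and deletion cost 1; substitution costs sub_cost
    (2 in the slides) and matching characters cost 0.
    """
    n, m = len(source), len(target)
    # D[i][j] is the distance between source[:i] and target[:j].
    D = [[0] * (m + 1) for _ in range(n + 1)]
    # ptr[i][j] records which neighbouring cell D[i][j] was computed from.
    ptr = [[None] * (m + 1) for _ in range(n + 1)]

    # Base conditions: D(i,0) = i and D(0,j) = j.
    for i in range(1, n + 1):
        D[i][0] = i
        ptr[i][0] = "DOWN"   # deletion
    for j in range(1, m + 1):
        D[0][j] = j
        ptr[0][j] = "LEFT"   # insertion

    # Recurrence: take the cheapest of the three moves and remember it.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            candidates = [
                (D[i - 1][j - 1] + (0 if same else sub_cost), "DIAG"),
                (D[i - 1][j] + 1, "DOWN"),
                (D[i][j - 1] + 1, "LEFT"),
            ]
            # min() keeps the first minimum, so ties prefer DIAG (assumed).
            D[i][j], ptr[i][j] = min(candidates, key=lambda c: c[0])

    # Backtrace: follow the pointers from the final cell to (0,0).
    aligned_source, aligned_target = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "DIAG":
            aligned_source.append(source[i - 1])
            aligned_target.append(target[j - 1])
            i, j = i - 1, j - 1
        elif move == "DOWN":
            aligned_source.append(source[i - 1])
            aligned_target.append("*")   # gap in the target
            i -= 1
        else:  # LEFT
            aligned_source.append("*")   # gap in the source
            aligned_target.append(target[j - 1])
            j -= 1

    return (D[n][m],
            "".join(reversed(aligned_source)),
            "".join(reversed(aligned_target)))


if __name__ == "__main__":
    dist, top, bottom = min_edit_distance_with_backtrace("intention", "execution")
    print(dist)    # 8
    print(top)
    print(bottom)
```

Filling the table visits every cell once, which matches the O(nm) time and space on the Performance slide. The backtrace loop takes at most n + m steps because each step decreases `i`, `j`, or both.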
This document was uploaded on 02/14/2014.