Ritaitthe storm will arrive at the Texas Louisiana



as if you are among those whom we could not cause to become civilized

Dan Jurafsky

Word Segmentation

Are word boundaries marked in writing?
•  Some writing systems: boundaries between words not marked
    •  Chinese, Japanese, Thai
•  Word segmentation becomes an important part of text normalization for MT
•  Some languages tend to have sentences that are quite long, closer to English paragraphs than sentences:
    •  Modern Standard Arabic, Chinese
•  Sentence segmentation may be necessary for MT between these languages and languages like English

Inferential Load: cold vs. hot languages

Balthasar Bickel. 2003. Referential density in discourse and syntactic typology. Language 79:2, 708-36
•  Hot languages:
    •  Who did what to whom is marked explicitly
    •  English
•  Cold languages:
    •  The hearer has more “figuring out” of who the var...

