IR-part1

How much storage do we need? 26 Introduction to Information


...uence into word tokens
  - Deal with “John’s”, a state-of-the-art solution
- Normalization
  - Map text and query term to same form
  - You want U.S.A. and USA to match
- Stemming
  - We may wish different forms of a root to match
  - authorize, authorization
- Stop words
  - We may omit very common words (or not)
  - the, a, to, of

Introduction to Information Retrieval, Sec. 1.2

Indexer steps: Token sequence
- Sequence of (Modified token, Document ID) pairs.

Doc 1: I did enact Julius Caesar I was killed i’ the Capitol; Brutus killed me.
Doc 2: So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious

Indexer steps: Sort
- Sort by terms
- And then docID
Core indexing step

Indexer steps: Dictionary & Postings
- Multiple term entrie...
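
The token-sequence and sort steps above can be illustrated with a minimal Python sketch. It is not the course's implementation: the function names (`tokenize`, `normalize`, `token_sequence`), the regular expression, and the rule "lowercase and drop periods and apostrophes" are assumptions chosen only so that U.S.A. and USA map to the same form. The stop list contains just the four words named on the slide, and stemming is not attempted.

```python
import re

# The two example documents from the slide.
DOCS = {
    1: "I did enact Julius Caesar I was killed i' the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious",
}

# Assumption: only the four stop words listed on the slide.
STOP_WORDS = {"the", "a", "to", "of"}


def tokenize(text):
    """Cut a character sequence into word tokens.

    Assumption: a token starts with a letter and may contain letters,
    periods and apostrophes (so "U.S.A." and "John's" stay in one piece).
    """
    return re.findall(r"[A-Za-z][A-Za-z.']*", text)


def normalize(token):
    """Map a token to a common form: lowercase, no periods or apostrophes."""
    return token.lower().replace(".", "").replace("'", "")


def token_sequence(docs, drop_stop_words=False):
    """Return the list of (modified token, document ID) pairs in document order."""
    pairs = []
    for doc_id, text in docs.items():
        for token in tokenize(text):
            term = normalize(token)
            if drop_stop_words and term in STOP_WORDS:
                continue
            pairs.append((term, doc_id))
    return pairs


pairs = token_sequence(DOCS)

# Core indexing step: sort by term, then by docID.
# Tuples compare element by element, so a plain sort gives exactly that order.
sorted_pairs = sorted(pairs)

for term, doc_id in sorted_pairs:
    print(term, doc_id)
```

Keeping stop words is the default here because the slide leaves that choice open; passing `drop_stop_words=True` omits them before the sort.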