lecture9-queryexpansion-handout-6-per



Introduction to Information Retrieval    Sec. 9.1.1

Expanded query after relevance feedback

...ianespace
3.004 bundespost
2.806 ss
2.790 rocket
2.053 scientist
2.003 broadcast
1.172 earth
0.836 oil
0.646 measure

Results for expanded query

3. 0.493, 08/07/89, When the Pentagon Launches a Secret Satellite, Space Sleuths Do Some Spy Work of Their Own
4. 0.493, 07/31/89, NASA Uses Warm Superconductors For Fast Circuit
5. 0.492, 12/02/87, Telecommunications Tale of Two Companies
6. 0.491, 07/09/91, Soviets May Adapt Parts of SS-20 Missile For Commercial Use
7. 0.490, 07/12/88, Gaping Gap: Pentagon Lags in Race To Match the Soviets In Rocket Launchers
8. 0.490, 06/14/90, Rescue of Satellite By Space Agency To Cost $90 Million

Key concept: Centroid

• The centroid is the center of mass of a set of points
• Recall that we represent documents as points in a high-dimensional space
• Definition: Centroid

  \mu(C) = \frac{1}{|C|} \sum_{d \in C} d

  where C is a set of documents.

Rocchio Algorithm

• The Rocchio algorithm uses the vector space model to pick a relevance feedback query
• Rocchio seeks the query q_{opt} that maximizes

  q_{opt} = \arg\max_{q} \left[ \cos(q, \mu(C_r)) - \cos(q, \mu(C_{nr})) \right]

• Tries to separate docs marked relevant and non-relevant

  q_{opt} = \frac{1}{|C_r|} \sum_{d_j \in C_r} d_j - \frac{1}{|C_{nr}|} \sum_{d_j \notin C_r} d_j

• Problem: we don't know the truly relevant docs

Th...
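The centroid definition and the closed form for the Rocchio query can be tried out on a toy example. The following is a minimal sketch in plain Python, not the lecture's own code: the function names (centroid, rocchio_optimal_query, cosine) and the three-dimensional document vectors are assumptions made up for illustration, and the sets of relevant and non-relevant documents are taken as fully known, which the slide notes is not the case in practice.

```python
import math


def centroid(docs):
    """Center of mass of a set of document vectors: (1/|C|) * sum of d in C."""
    n = len(docs)
    # zip(*docs) groups the i-th component of every vector together.
    return [sum(component) / n for component in zip(*docs)]


def rocchio_optimal_query(relevant, nonrelevant):
    """Difference between the relevant and the non-relevant centroid."""
    mu_r = centroid(relevant)
    mu_nr = centroid(nonrelevant)
    return [r - nr for r, nr in zip(mu_r, mu_nr)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy data: each document is a vector over three terms.
relevant = [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
nonrelevant = [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]]

q_opt = rocchio_optimal_query(relevant, nonrelevant)
sim_relevant = cosine(q_opt, centroid(relevant))
sim_nonrelevant = cosine(q_opt, centroid(nonrelevant))
print(q_opt, sim_relevant, sim_nonrelevant)
```

On this toy data the resulting query is closer, in cosine terms, to the relevant centroid than to the non-relevant one, which is the separation the slide describes.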

This document was uploaded on 02/26/2014.
