Section6.0 - Module 6 Introduction Using one word at a time...

Module 6 Introduction

Using one word at a time to search for webpages is often inadequate: sometimes the pages returned do not have the information you are seeking, and other times so many pages are returned that you cannot find the ones you need.

The goals of this second unit examining search engines are:
to explore more powerful methods to express what you are seeking;
to learn how to use Boolean operators to express queries;
to learn how search engines can support queries involving phrases and other proximity queries;
to learn how search engines could support queries that restrict search terms to appear within specific HTML elements on a page.

More specifically, by the end of the module, you will be able to:
express queries using the Boolean operators and, or, and not;
interpret Boolean queries using Venn diagrams;
explain how search engines can use postings lists to find pages that satisfy Boolean queries;
explain how search engines can use postings lists to find pages that include multiword phrases, even if the phrases include stop words;
explain how postings lists for HTML tags can be used to limit the elements within which search terms are matched to index terms.

CS 100 Module 6
6.0 Advanced Querying
© 2009, University of Waterloo

When you are searching for information, you express your needs as a query, which could be as simple as one or two words. As we saw in the last module, the search engine maintains postings lists for the index terms that appear on webpages. The task of the engine in servicing your request is to match your query against the collection of index terms to find the pages that meet your needs. In this module we’ll examine ways to express those needs more precisely.

[Figure: Basis for information retrieval. The user query is matched against the index terms.]
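The matching step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the three sample pages, the whitespace tokenizer, and the helper names `build_index` and `lookup` are all hypothetical and are not taken from the course notes.

```python
from collections import defaultdict

def build_index(pages):
    """Map each index term to the sorted list of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Assumption: index terms are lowercase words split on whitespace.
        for word in text.lower().split():
            index[word].add(page_id)
    return {term: sorted(ids) for term, ids in index.items()}

def lookup(index, query_term):
    """Return the postings list for a one-word query (empty if unknown)."""
    return index.get(query_term.lower(), [])

# Hypothetical collection of three tiny "webpages".
pages = {
    1: "boolean algebra basics",
    2: "search engine basics",
    3: "boolean search",
}
index = build_index(pages)
boolean_pages = lookup(index, "boolean")
```

With this sample data, a one-word query simply returns the postings list stored for that term, which is the starting point for the richer queries discussed next.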
6.1 Boolean Queries

As we saw at the end of the last module, searching for individual words often does not provide enough power to express what you’re really seeking. Sometimes webpage authors choose from a variety of words to describe one concept, and other times a single word has a wide variety of meanings, so that searching for it returns too many irrelevant pages for your purposes. The solution is to use several words, and to indicate explicitly how the search engine should relate those words to the index terms extracted from webpages.

6.1.1 Boolean Operators

A postings list represents the set of webpages that include a particular term. When you are searching for certain webpages, you use those same terms to describe the set of pages that you wish to retrieve. In effect, you need to describe your desired set of pages as a combination of the sets known to the search engine. In 1854, George Boole devised a simple system for expressing combinations of sets, and this so-called Boolean algebra is the most common system used today.
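As a rough illustration of describing a desired set of pages as a combination of known sets, the sketch below treats each postings list as a Python set of page numbers. The terms, page numbers, and the helper `pages_for` are invented for this example, and real search engines typically store postings lists in more compact forms.

```python
# Hypothetical postings lists: each term maps to the set of pages containing it.
postings = {
    "jaguar": {1, 2, 5},
    "car": {2, 3, 5},
    "cat": {1, 4},
}
# Assumption: the whole collection consists of pages 1 through 5.
all_pages = {1, 2, 3, 4, 5}

def pages_for(term):
    """Return the set of pages for a term (empty set if the term is unknown)."""
    return postings.get(term, set())

# "jaguar and car": pages in both sets (intersection).
jaguar_and_car = pages_for("jaguar") & pages_for("car")

# "car or cat": pages in at least one of the sets (union).
car_or_cat = pages_for("car") | pages_for("cat")

# "jaguar and not car": pages in the first set but not the second (difference).
jaguar_not_car = pages_for("jaguar") - pages_for("car")

# "not cat": every page in the collection that lacks the term (complement).
not_cat = all_pages - pages_for("cat")
```

Each operator corresponds to a region of a Venn diagram: and is the overlap of two circles, or is everything inside either circle, and not is everything outside a circle.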

This note was uploaded on 02/04/2011 for the course CS 100 taught by Professor Bb during the Spring '11 term at University of Warsaw.
