CS100M Fall 2006
Project 3
due Thursday 10/5 at 6pm

You must work either on your own or with one partner. You may discuss background issues and general solution strategies with others, but the project you submit must be the work of just you (and your partner). If you work with a partner, you and your partner must register as a group in CMS and submit your work as a group.

Objectives

Completing this project will help you learn about 1-d and 2-d arrays, characters, and strings. The objective is for you to learn to manipulate arrays (vectors and matrices); these arrays just happen to store characters. Therefore, we require that you write code to work with arrays cell-by-cell; do not use any built-in string functions. As you work on this project, you also will learn about information retrieval, an important topic in the computer science field of artificial intelligence. Have fun!

Information Retrieval

Information retrieval (IR) is the science of searching for information in documents, searching for documents themselves, searching for metadata which describe documents, or searching within databases. The term “information retrieval” was coined by Calvin Mooers in 1948–50. IR is a broad interdisciplinary field that draws on many other disciplines. It stands at the junction of many established fields and draws upon cognitive psychology, information architecture, information design, human information behavior, linguistics, information science, computer science, librarianship, and statistics.

Automated information retrieval systems were originally used to manage the information explosion in scientific literature in the last few decades. Many universities and public libraries use IR systems to provide access to books, journals, and other documents. You probably have used such systems in libraries (or on the Internet) when you needed to find specific books or books on a specific topic.

IR systems are often related to queries.
Queries are formal statements of information needs that are put to an IR system by the user. User queries are matched to documents stored in a database.

IR has had a tremendous impact in past decades, especially in the Internet age. Web search engines such as Google and Yahoo are amongst the most visible applications of information retrieval research. Can you imagine not having Google?

To better understand IR, consider a collection of documents, say all the “help” information for built-in functions in Matlab. Maybe you want to be able to retrieve relevant information for particular kinds of operations, much like how Google returns relevant web pages. Matlab does have basic search capabilities. For example, recall that you can type lookfor graphics in Matlab’s Command Window and Matlab will show you the names of all the functions whose first header comment line contains the word graphics. Let’s say we want a way to deal with a query that is more complex than just lookfor <a single keyword>. What if someone says, “I want to know about the....
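
To make the single-keyword case concrete, here is a minimal sketch of matching one keyword against a few header comment lines, written cell-by-cell with no built-in string functions, in the style this project asks for. It is an illustration only, not how lookfor is actually implemented; the header lines, the keyword, and the variable names (headers, keyword, found) are all made up for this example, and the match is case-sensitive.

```matlab
% Hypothetical data: each row is one header comment line,
% padded with blanks so that every row has exactly 24 columns.
headers = ['plot: draw 2-d graphics ';
           'sort: sort ascending    ';
           'axis: graphics scaling  '];
keyword = 'graphics';        % hypothetical single-keyword query

[nRows, nCols] = size(headers);
kLen = length(keyword);
found = zeros(1, nRows);     % found(r) becomes 1 if row r contains keyword

for r = 1:nRows
    % Try every starting column where the keyword could still fit
    for startCol = 1:(nCols - kLen + 1)
        k = 1;
        % Compare one cell at a time until a mismatch or the keyword ends
        while k <= kLen && headers(r, startCol + k - 1) == keyword(k)
            k = k + 1;
        end
        if k > kLen
            found(r) = 1;    % every character of keyword matched
        end
    end
end

disp(found)
```

With this sample data, found marks rows 1 and 3, the two lines that contain the word graphics. A query more complex than one keyword would need more than this yes/no test per line.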