CS51 Project 1d: 10 100-
Due: Friday, 13 March 2009 at 11:59 PM
Total Points: 43 (including 5 style points)

This is the fourth and final part of Project 1. Having worked with ADTs, graphs and random processes, we will now put them together to explore solutions to a compelling problem: finding “important” nodes in large graphs such as the World Wide Web. Assigning a measure of importance to nodes is very useful in designing search algorithms, such as those that many popular search engines rely on.

Early search engines often ranked the relevance of pages by the number of times that the search terms appeared in them. However, it was easy for spammers to game this system by repeating popular search terms many times, propelling their pages to the top of the list. When you enter a search query, you really want the important pages: the ones with valuable information, a property often reflected in the quantity and quality of other pages linking to them. Better algorithms were eventually developed that took into account the relationships between web pages, as determined by links.[1] These relationships can be represented nicely by a graph structure, which is what we’ll be using here.

[1] For more about the history of search engines, you can check out this page: http://en.wikipedia.org/wiki/Search_engine.

1 NodeScore ADT

Throughout the assignment, we’ll want to maintain associations of graph nodes to their importance, or “NodeScore”: a value between 0 (completely unimportant) and 1 (the only important node in the graph). In order to assign NodeScores to the nodes in a graph, we’ve provided a module with an implementation of an ADT, nodescore, to hold such associations. nodescore has the following interface:

    [nodescore?    (-> any/c boolean?)]
    [ns-new        nodescore?]
    [ns-empty?     (-> nodescore? boolean?)]
    [ns->string    (-> nodescore? string?)]
    [ns-scale      (-> number? nodescore? nodescore?)]
    [ns-normalize  (-> nodescore? nodescore?)]
    [ns-nodes      (-> nodescore? list?)]
    [ns-get-score  (-> any/c (and/c nodescore? (not/c ns-empty?)) number?)]
    [ns-set-score  (-> any/c number? nodescore? nodescore?)]
    [ns-add-score  (-> any/c number? nodescore? nodescore?)]

The nodescore structure makes it easy to create, modify, normalize (to sum to 1), and display NodeScores. More detailed documentation can be found in the implementation in nodescore.ss.

2 Testing

Testing these algorithms is hard, because most of them involve a lot of randomness. We aren’t giving any explicit points for testing on this problem set. However, you will doubtless want to test your code to make sure that it works! For the deterministic algorithm (in-degree-nodescore), you can work a small example by hand and then check that your algorithm gives the correct result. However, for the rest of them, you can’t do much better than running it a bunch of times for a large number of iterations and checking that it always...
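As an illustration of the hand check suggested for the deterministic algorithm, here is a minimal sketch in Python rather than the assignment's Scheme. It rests on several assumptions not stated in the handout: that in-degree-nodescore scores each node by its number of incoming edges and then normalizes, that a graph can be modeled as a dictionary from each node to the list of nodes it links to, and that ns_normalize behaves like the module's ns-normalize. The Python names below are hypothetical stand-ins, not the interface in nodescore.ss.

```python
# Hypothetical Python model of the idea; the assignment's real code
# lives in the Scheme module nodescore.ss and may behave differently.

def ns_normalize(scores):
    """Return a copy of `scores` scaled so that the values sum to 1."""
    total = sum(scores.values())
    if total == 0:
        # Assumption: with no score mass there is nothing to scale.
        return dict(scores)
    return {node: score / total for node, score in scores.items()}


def in_degree_nodescore(graph):
    """Score each node by its count of incoming edges, then normalize.

    `graph` maps each node to the list of nodes it links to
    (an assumed representation, chosen for this sketch only).
    """
    scores = {node: 0 for node in graph}
    for source, targets in graph.items():
        for target in targets:
            # A target that never appears as a key still gets counted.
            scores[target] = scores.get(target, 0) + 1
    return ns_normalize(scores)


# Small example worked by hand: edges a->b, a->c, b->c, c->a.
# In-degrees are a=1, b=1, c=2 (total 4), so the normalized
# scores should be a=0.25, b=0.25, c=0.5.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
result = in_degree_nodescore(graph)
```

Comparing result against the hand-computed values, and confirming that the scores sum to 1, is the kind of check that works for a deterministic algorithm but not for the randomized ones.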
This note was uploaded on 07/26/2009 for the course COMPUTERSC CS51 taught by Professor Greg Morrisett during the Spring '09 term at Harvard.