4spec - CS32 Winter 2010 Project #4 !cheating Due: March...

CS32 Winter 2010
Project #4
!cheating
Due: March 11, 2010

Make sure to read the entire document (especially Requirements and Other Thoughts) before starting your project.
Table of Contents

Introduction
But I don’t know how to use C++ to access the Internet!
Ok, so what is it I have to do?
    The LinkExtractor Class
    The Searcher Class
    The Crawler Class
    The CheatFinder Class
    Your main() Function
    How do Command Line Parameters/Arguments work?
Requirements and Other Thoughts
What to Turn In
Grading
Introduction

For your fourth and final project, you’ve been hired by the NachenSmall software company, the world’s 352nd largest electronic detective agency, to program a new digital plagiarism detector for use by the UC system.

What’s digital plagiarism? Well, with the advent of the Internet, students of all ages have mastered the art of electronic plagiarism: that is, cutting and pasting the contents of authoritative web pages into term papers, C++ programs, etc., and failing to attribute their content to the original author.

What does a plagiarism detector do? Given a student essay (or C++ program, for that matter), your detector will “crawl” the Internet looking for web pages that contain word sequences similar to those in the essay. All web pages that have enough in common with the student’s essay are flagged for investigation by the professor.

The primary inputs to a plagiarism detector are a student essay, which is provided in the form of a text file, and a specific starting URL (e.g., http://www.wikipedia.org/wiki/plagiarism). The detector works by downloading the specified starting web page off the Internet and comparing its contents to the provided student essay. If the essay and the web page share many words in common, the web page’s information is saved for later presentation to the instructor. Next, the detector identifies all web pages that are
This note was uploaded on 03/08/2010 for the course COM SCI 32 taught by Professor Smallberg during the Spring '07 term at UCLA.
