lecture7 Lehigh IE 170
  • Title: lecture7
  • Type: Notes
  • School: Lehigh
  • Course: IE 170
  • Term: Spring

IE170: Algorithms in Systems Engineering: Lecture 7
Jeff Linderoth
Department of Industrial and Systems Engineering, Lehigh University
January 29, 2007

Taking Stock

Last Time
  • Master Theorem Practice
  • Some Sorting Algs
  • Beginning of Java Collections Interfaces

This Time
  • Hashes
  • Trees and Binary Search Trees
  • More on Java Collections Interfaces
  • lab == fun

The Java Collections Interfaces

In the remainder of the class, we will be using the Java Collections Interface: http://java.sun.com/docs/books/tutorial/collections/TOC.html

Important: Most of what I will say only works if you set the "code level" to Java 5.0 in Eclipse! Under Preferences, Java Compiler: set this to 5.0.

The interfaces form a hierarchy.

(A subset of) the Collections Interface

    public interface Collection<E> extends Iterable<E> {
        // Basic operations
        int size();
        boolean isEmpty();
        boolean contains(Object element);
        boolean add(E element);         // optional
        boolean remove(Object element); // optional
        Iterator<E> iterator();

        // Array operations
        Object[] toArray();
        <T> T[] toArray(T[] a);
    }

Set

A Set is a Collection that cannot contain duplicate elements. It models the mathematical set abstraction. The Set interface contains only methods inherited from Collection and adds the restriction that duplicate elements are prohibited.

Set is still an interface. There are 3 implementations of Set in Java:
  • HashSet
  • TreeSet
  • LinkedHashSet

Hash?

No, Cheech. A hash table is a data structure in which we can "look up" (or search for) an element efficiently.

The expected time to search for an element in a hash table is O(1). (The worst-case time is Θ(n).)
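As a small illustration of the Set behaviour described above, the sketch below adds the same words to each of the three Set implementations and counts how many insertions succeed. The class name SetDemo and the helper addAll are hypothetical names chosen for this example; only the java.util classes themselves come from the Collections framework.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

class SetDemo {
    // Adds every word to the set and returns how many were actually inserted.
    // Set.add() returns false when the element is already present.
    static int addAll(Set<String> set, String[] words) {
        int inserted = 0;
        for (String w : words) {
            if (set.add(w)) {
                inserted++;
            }
        }
        return inserted;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "apple", "fig", "apple", "pear"};

        Set<String> hash = new HashSet<String>();
        Set<String> linked = new LinkedHashSet<String>();
        Set<String> tree = new TreeSet<String>();

        // Each implementation rejects the two duplicates.
        System.out.println(addAll(hash, words));   // 3
        System.out.println(addAll(linked, words)); // 3
        System.out.println(addAll(tree, words));   // 3

        // Lookup in a HashSet is the hash-table search discussed above.
        System.out.println(hash.contains("fig"));  // true
        System.out.println(hash.contains("kiwi")); // false
    }
}
```

Because the helper is written against the Set interface, the same code works for any of the three implementations.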
Think of a hash table as an array. With a regular array, we find the element whose "key" is j in position j of the array:

    j = 17; val = a[j];

This is called direct addressing, and it takes O(1) on your regular ol' random-access computer. This form of direct addressing works when we can afford to have an array with one position for every possible key.

More on Hash

In a hash table, the number of keys stored is small relative to the number of possible keys.

A hash table is an array. Given a key k, we don't use k as the index into the array; rather, we have a hash function h, and we use h(k) as an index into the array.

Given a "universe" of keys K (think of K as all the words in a dictionary, for example), the hash function is a map h : K → {0, 1, ..., m - 1}, so that h(k) gets mapped to an integer between 0 and m - 1 for every k ∈ K.

We say that k hashes to h(k).

Example

This looks great. However, what happens if h(k1) = h(k2) for k1 ≠ k2? Two keys hash to the same value. The elements collide.

This is typically handled by chaining: instead of storing a key k (or, later, a key-value pair (k, v)) at every position in the array, we store a linked list of keys.

A (Fairly) Obvious Point

BAD hash function: h(k) = 3. If all keys hash to the same value, then looking up a key takes Θ(n) (since it is just a list).

We would like a hash function to be "random" in the sense that a key k is equally likely to hash into any of the m slots in the hash table (array).

If we have such a function, then we can show that the average time required to search for a key is Θ(1 + n/m).

When hashing keys that are not numbers, you must convert them to numbers, e.g.: beer = -142 + 2^4 + 5^3 + 5^2 + 18^1 = 42.
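The chaining idea above can be sketched as a small Java class. This is only a minimal illustration under stated assumptions: the class name ChainedHashTable and its methods are hypothetical, keys are Strings, and the hash function is assumed to be Java's String.hashCode() reduced to the range 0 to m - 1, which is not something the lecture prescribes.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedHashTable {
    // One linked list (chain) per slot of the table.
    private final List<LinkedList<String>> slots;
    private final int m;

    ChainedHashTable(int m) {
        this.m = m;
        this.slots = new ArrayList<LinkedList<String>>(m);
        for (int i = 0; i < m; i++) {
            slots.add(new LinkedList<String>());
        }
    }

    // Assumed hash function: clear the sign bit of hashCode(), then take it mod m.
    int h(String key) {
        return (key.hashCode() & 0x7fffffff) % m;
    }

    // Inserts at the front of the chain; returns false if the key is already stored.
    boolean insert(String key) {
        LinkedList<String> chain = slots.get(h(key));
        if (chain.contains(key)) {
            return false;
        }
        chain.addFirst(key);
        return true;
    }

    // Scans only the chain that the key hashes to.
    boolean search(String key) {
        return slots.get(h(key)).contains(key);
    }

    // Removes the key from its chain; returns false if it was not present.
    boolean delete(String key) {
        return slots.get(h(key)).remove(key);
    }
}
```

If h were replaced by the BAD constant function h(k) = 3, every key would land in the same chain and search would degrade to scanning one list of length n.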
Average Hash Search Time

The number of elements to be searched is 1 more than the number of elements that appear before x in x's list.

Assuming we insert items into the list at the beginning, this is the number of elements that were inserted after x.

By definition: P(h(k_i) = h(k_j)) = 1/m.

Let X_ij be the indicator random variable that is equal to one if and only if h(k_i) = h(k_j).

Then just compute:

    $E\left[ \frac{1}{n} \sum_{i=1}^{n} \left( 1 + \sum_{j=i+1}^{n} X_{ij} \right) \right]$

By linearity of expectation with E[X_ij] = 1/m, this evaluates to 1 + (n - 1)/(2m), which is Θ(1 + n/m).

Hash Functions

Modular Hash Function
  • Let m be (roughly) the size of your hash table: h(k) = k mod m
  • Good choice of m: a prime number not too close to an exact power of 2

Multiplicative Hash Function
  • h(k) = ⌊m (kA mod 1)⌋
  • Multiply key k by A, take the fractional part, and multiply by m
  • If m = 2^p, this can be done very fast with bit shifting
  • A = (√5 - 1)/2 seems a good value

Back to the Java Collections

So now you know what a Java HashSet is.

A LinkedHashSet is a HashSet that also keeps track of the order in which elements were inserted. (Think of laying a linked list on top of the hash table.)

A TreeSet stores its elements in a red-black tree. In order to understand red-black trees, we must know about binary search trees.

A hash table is "good" at insert(), search(), delete(). But what if you also want to support (efficiently) minimum(), maximum()?

Trees

A tree is a set of items organized into a hierarchical structure (think of a family tree).

When organized in this way, we call the items nodes.

Each node other than the root has a single designated parent, and each node has zero or more children.

There is a single designated node, called the root, with no parent.

Any node with no children is called a leaf.

Any node with children is called internal.
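The modular and multiplicative hash functions described above can be sketched in Java as follows. The class and method names are hypothetical, keys are assumed to be non-negative integers, and the bit-shifting variant assumes a 32-bit word with the constant 2654435769 = ⌊A · 2^32⌋; none of these choices is fixed by the lecture.

```java
class HashFunctions {
    // A = (sqrt(5) - 1) / 2, the value suggested above.
    static final double A = (Math.sqrt(5.0) - 1.0) / 2.0;

    // Modular hashing: h(k) = k mod m, for k >= 0.
    static int modularHash(long k, int m) {
        return (int) (k % m);
    }

    // Multiplicative hashing: h(k) = floor(m * (k*A mod 1)), for k >= 0.
    static int multiplicativeHash(long k, int m) {
        double frac = (k * A) % 1.0;   // fractional part of k*A
        return (int) (m * frac);       // cast truncates, i.e. floor for values >= 0
    }

    // Bit-shifting variant for m = 2^p with 1 <= p <= 31 (assumed 32-bit word).
    // Java int multiplication wraps modulo 2^32, which keeps the low word of k*s;
    // the top p bits of that word are the hash value.
    static int multiplicativeHashShift(int k, int p) {
        int s = (int) 2654435769L;     // floor(A * 2^32), stored as a 32-bit pattern
        return (k * s) >>> (32 - p);
    }

    public static void main(String[] args) {
        System.out.println(modularHash(42, 13));                // 3
        System.out.println(multiplicativeHash(123456, 16384));  // 67
        System.out.println(multiplicativeHashShift(123456, 14)); // 67
    }
}
```

With m = 2^14 = 16384 the floating-point and bit-shifting versions agree on the key 123456, which is a quick sanity check that the shift really computes the same quantity.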
A tree in which all nodes have 2 or fewer children is called a binary tree.

Storing a list of items in a tree structure allows us to represent additional relationships among the items in the list.

Binary Tree Data Structures

To store a tree of keys k, or maybe (key, value) pairs (k, v), we need a data structure supporting three basic operations:
  • left l: points to the left child
  • right r: points to the right child
  • parent p: points to the parent

This allows us to traverse the tree and perform other operations on it.

The level of a node in the tree is the number of recursive calls to parent() needed to reach the root.

The depth of the tree is the maximum level of any of its nodes.

A balanced tree is one in which all leaves are at levels k or k - 1, where k is the depth of the tree.

Data Structures for Storing Trees: Array
  • The root is stored in position 0.
  • The children of the node in position i are stored in positions 2i + 1 and 2i + 2.
  • This determines a unique storage location for every node in the tree and makes it easy to find a node's parent and children.
  • Using an array, the basic operations can be performed very efficiently.
  • If the tree is unbalanced or dynamic, a linked list may be better.

Data Structures for Storing Trees: Linked List
  • In a linked list, each item is stored along with explicit pointers to its parent and children.
  • This allows for easy addition and deletion of nodes from the tree.

Binary Search Tree

A binary search tree is a data structure that is conceptualized as a binary tree, but has one additional property:

Binary Search Tree Property
  • If y is in the left subtree of x, then k(y) ≤ k(x)
  • If y is in the right subtree of x, then k(y) ≥ k(x)

Binary Search Trees

There are lots of binary trees that can satisfy this property. It is obvious that the number of binary trees on n nodes, b_n, is

    $b_n = \frac{1}{n+1} \binom{2n}{n} = \frac{4^n}{\sqrt{\pi}\, n^{3/2}} \left( 1 + O(1/n) \right)$

And not all of these (exponentially many) are created equal. In fact, we would like to keep our binary search trees "short", because most of the operations we would like to support are a function of the height h of the tree.

Operations

successor(x): How would I know the "next biggest" element?
  • If the right subtree is not empty: minimum(r(x))
  • If the right subtree is empty: walk up the tree until you make the first "right" move

insert(x): Just walk down the tree and put it in. It will go "at the bottom".

Short Is Beautiful
  • search() takes O(h)
  • minimum(), maximum() also take O(h)
  • Slightly less obvious is that insert(), delete() also take O(h)

Thus we would like to keep our binary search trees "short" (h is small).

delete()
  • If 0 or 1 child, deletion is fairly easy
  • If 2 children, deletion is made easier by the following fact:

Binary Search Tree Property
If a node has 2 children, then
  • its successor will not have a left child
  • its predecessor will not have a right child

red-black Trees

Red-black trees are simply a way to keep binary search trees short (or balanced).

Balanced here means that no path on the tree is more than twice as long as another path.
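The binary search tree operations described above (search, minimum, successor, and insert) can be sketched with the linked representation, where each node keeps left, right, and parent pointers. This is a plain, unbalanced tree written for illustration; the class name BinarySearchTree, the int keys, and the method names are assumptions of this sketch, and it is not the red-black tree used by TreeSet.

```java
class BinarySearchTree {
    static class Node {
        int key;
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    Node root;

    // Walk down from the root and attach the new node "at the bottom". O(h).
    Node insert(int key) {
        Node z = new Node(key);
        Node parent = null;
        Node cur = root;
        while (cur != null) {
            parent = cur;
            cur = (key < cur.key) ? cur.left : cur.right;
        }
        z.parent = parent;
        if (parent == null) {
            root = z;
        } else if (key < parent.key) {
            parent.left = z;
        } else {
            parent.right = z;
        }
        return z;
    }

    // Returns the node holding key, or null if it is absent. O(h).
    Node search(int key) {
        Node cur = root;
        while (cur != null && cur.key != key) {
            cur = (key < cur.key) ? cur.left : cur.right;
        }
        return cur;
    }

    // Leftmost node of the subtree rooted at x. O(h).
    static Node minimum(Node x) {
        while (x.left != null) {
            x = x.left;
        }
        return x;
    }

    // "Next biggest" node after x, or null if x holds the maximum. O(h).
    static Node successor(Node x) {
        if (x.right != null) {
            return minimum(x.right);
        }
        // Walk up until we leave a left subtree (the first "right" move).
        Node p = x.parent;
        while (p != null && x == p.right) {
            x = p;
            p = p.parent;
        }
        return p;
    }
}
```

Every method follows a single root-to-leaf (or leaf-to-root) path, which is why the running times depend only on the height h.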
An implication of the red-black balance condition is that the tree's maximum height is 2 lg(n + 1).
  • search(), minimum(), maximum() all take O(lg n)
  • insert(): also runs in O(lg n)
  • delete(): runs in O(lg n) (but it is more complicated to maintain the "red-black" property)

Its implementation is complicated, so we won't cover it.

Back to the Java Collections
  • Red-black trees remain sorted, so a TreeSet keeps its elements in sorted order
  • You don't really have any control over the order in which things will appear in a HashSet
  • If you care about that, you should use a LinkedHashSet, which lays a linked list on top of the HashSet
  • In general, Sets are not for ordered collections of items; for that, you should use a List

Lists

A List is an ordered Collection (sometimes called a sequence). Lists may contain duplicate elements. In addition to the operations inherited from Collection, the List interface includes operations for the following:
  • Positional access: manipulate elements based on their numerical position in the list
  • Search: searches for a specified object in the list and returns its numerical position

(Subset of) List Interface

    public interface List<E> extends Collection<E> {
        // Positional access
        E get(int index);
        E set(int index, E element);      // optional
        boolean add(E element);           // optional
        void add(int index, E element);   // optional
        E remove(int index);              // optional

        // Search
        int indexOf(Object o);
        int lastIndexOf(Object o);

        // Iteration
        ListIterator<E> listIterator();
        ListIterator<E> listIterator(int index);
    }

Java List Implementations

Two List implementations:
  1. ArrayList, which is usually the better-performing
  2. LinkedList, which offers better performance under certain circumstances (e.g. if there are lots of adds/removes in the middle of the list)

Java Lists have extended iterators

    public interface ListIterator<E> extends Iterator<E> {
        boolean hasNext();
        E next();
        boolean hasPrevious();
        E previous();
        int nextIndex();
        int previousIndex();
        void remove();   // optional
        void set(E e);   // optional
        void add(E e);   // optional
    }

List Stuff
  • ListIterator<E> listIterator(): gives an iterator at the beginning
  • ListIterator<E> listIterator(int index): gives an iterator at the specified index
  • The index refers to the element that would be returned by an initial call to next()

The cursor is always between two elements:
  • the one that would be returned by a call to previous()
  • the one that would be returned by a call to next()

The n + 1 valid index values correspond to the n + 1 gaps between elements, from the gap before the first element to the gap after the last one.

Next Time
  • A bit on Java Collection Map Interface
  • Move on to Heaps (Chapter 6)
  • We have covered chapters 1-4, 10-11, and Appendices A and B

News
  • New Homework Posted!
  • Let's have a little quiz on 2/7
  • Homework is due 2/5: No late homework accepted. (I need to hand out solutions and discuss in class on 2/5.)
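As a supplement to the ListIterator discussion in the List Stuff section, the sketch below shows the cursor sitting in the gaps between elements. The class name ListIteratorDemo and the helper reversed are hypothetical names for this example; the iterator calls themselves are the ones listed in the ListIterator interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    // Starts in the gap after the last element (index n) and walks backwards.
    static List<String> reversed(List<String> items) {
        List<String> out = new ArrayList<String>();
        ListIterator<String> it = items.listIterator(items.size());
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> xs = new ArrayList<String>();
        xs.add("a");
        xs.add("b");
        xs.add("c");

        // Cursor in the gap between "a" (index 0) and "b" (index 1).
        ListIterator<String> it = xs.listIterator(1);
        System.out.println(it.previousIndex() + " " + it.nextIndex()); // 0 1

        System.out.println(it.next()); // b
        it.add("x");                   // inserted at the cursor, just after "b"
        System.out.println(xs);        // [a, b, x, c]

        System.out.println(reversed(xs)); // [c, x, b, a]
    }
}
```

A list of n elements accepts the n + 1 starting indices 0 through n, which is why listIterator(items.size()) is valid even though no element has that index.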
