Lecture 3
Classification Trees
1
2
Classification and Regression Trees
If one had to choose a classification technique that performs well across a wide range of situations without requiring much effort from the application developer while being readily un
Network is a collection of objects where some
pairs of objects are connected by links
What is the structure of the network?
9/27/2011
Jure Leskovec, Stanford CS224W: Social and Information Network Analysis
Objects: nodes, vertices
Interactions: links,
Understanding and Managing
Cascades on Large Graphs
B. Aditya Prakash
Virginia Tech.
Christos Faloutsos
Carnegie Mellon University
Sept 28, Tutorial, ECML-PKDD 2012, Bristol
Information Diffusion
Mary McGlohon
CMU 10-802
3/23/10
KDD 2012 Tutorial
Informat
Research Theme
ANALYSIS
Understanding
POLICY/
ACTION
DATA
Large real-world
networks & processes
Managing
Prakash and Faloutsos 2012
Research Theme Public Health
ANALYSIS
Will an epidemic
happen?
POLICY/
ACTION
DATA
Modeling # patient
transfers
Part 1: Cascades
Q1: What do cascades look like?
J. Leskovec, M. McGlohon, C. Faloutsos,
N. Glance, M. Hurst.
Cascading Behavior in Large Blog Graphs.
SDM, 2007.
Q2: How does activity evolve over time?
J. Leskovec, L. Backstrom, J. Kleinberg.
Meme-tracking
Mining Knowledge-Sharing Sites for Viral Marketing
Matthew Richardson and Pedro Domingos
Department of Computer Science and Engineering University of Washington Box 352350 Seattle, WA 98195-2350
{mattr,pedrod}@cs.washington.edu
ABSTRACT
Viral marketing
Dimensionality Reduction: Principal Components Analysis
In data mining one often encounters situations where there are a large number of variables in
the database. In such situations it is very likely that subsets of variables are highly correlated
with
Data Mining: Overview
What is Data Mining?
Recently* coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science, engineering and business. In a state of flux, man
Lecture 6
Artificial Neural Networks
1
1
Artificial Neural Networks
In this note we provide an overview of the key concepts that have led to
the emergence of Artificial Neural Networks as a major paradigm for Data
Mining applications. Neural nets have gone thro
15.062 Data Mining Spring 2003
Nitin R. Patel
Comparison of Data Mining techniques for large data sets: Guidelines (and only guidelines)
H: high, M: medium, L: low.
Neural
Nets
Trees
k-Nearest
Neighbors
Accuracy
Logistic
Discriminant Naive
Multiple
Regression An
Lecture 4
Discriminant Analysis
1
Discriminant analysis uses continuous variable measurements on different groups of
items to highlight aspects that distinguish the groups and to use these measurements to
classify new items. Common uses of the method have
Sales of Handloom Saris
An Application of Logistic
Regression
Objectives
Illustrate importance of interpretation, domain
insights from managers for interpretation and
implementation
Relevance to situations where too many products
(or services) but can d
Lecture 2
Judging the Performance of
Classifiers
1
In this note we will examine the question of how to judge the usefulness of a classifier and how to
compare different classifiers. Not only do we have a wide choice of different types of classifiers
to
1
Cluster Analysis
2
0.1 What is Cluster Analysis?
Cluster analysis is concerned with forming groups of similar objects based on
several measurements of different kinds made on the objects. The key idea is
to identify classifications of the objects that would
Lecture 1
k-Nearest Neighbor Algorithms
for Classification and Prediction
1
1
k-Nearest Neighbor Classification
The idea behind the k-Nearest Neighbor algorithm is to build a classification
method using no assumptions about the form of the function, y = f (x1 ,
LOGISTIC REGRESSION
Nitin R Patel
Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values as 0 and 1). As with multiple linear regression t
Multiple Linear Regression in
Data Mining
Contents
2.1. A Review of Multiple Linear Regression
2.2. Illustration of the Regression Process
2.3. Subset Selection in Linear Regression
Perhaps the most popular mathemati
Multiple Linear Regression Review
Outline
Simple Linear Regression
Multiple Regression
Understanding the Regression Output
Coefficient of Determination R^2
Validating the Regression Model
1
Linear Regression: An Example