# Lecture 09 - Data Mining: Principles and Algorithms (Jianyong Wang)


## Slide 1: Title

Data Mining: Principles and Algorithms. Jianyong Wang, Database Lab, Institute of Software, Department of Computer Science and Technology, Tsinghua University (jianyong@tsinghua.edu.cn). 12/8/2009.

## Slide 2: Chapter 6. Classification and Prediction (Outline)

- What is classification? What is prediction?
- Issues regarding classification and prediction
- Classification by decision tree induction
- Bayesian classification
- Rule-based classification
- Artificial neural networks
- Support vector machines (SVM)
- Associative classification
- Lazy learners (or learning from your neighbors)
- Other classification methods
- Ensemble methods
- Prediction
- Accuracy and error measures
- Summary
## Slide 3: Bayesian Classification: Why?

- **Statistical classifier**: performs probabilistic prediction, i.e., predicts class membership probabilities.
- **Foundation**: based on Bayes' theorem.
- **Performance**: a simple Bayesian classifier, the naïve Bayesian classifier, has performance comparable with decision tree and selected neural network classifiers.
- **Incremental**: each training example can incrementally increase or decrease the probability that a hypothesis is correct; prior knowledge can be combined with observed data.
- **Standard**: even when Bayesian methods are computationally intractable, they provide a standard of optimal decision making against which other methods can be measured.

## Slide 4: Bayes' Theorem: Basics

- Let **X** be a data sample ("evidence") whose class label is unknown.
- Let H be the hypothesis that **X** belongs to class C.
- Classification is to determine P(H|**X**), the probability that the hypothesis holds given the observed data sample **X**.
- P(H) (**prior probability**): the initial probability of the hypothesis. E.g., the probability that a customer will buy a computer, regardless of age or income.
- P(**X**): the probability that the sample data is observed.
- P(**X**|H) (**likelihood**): the probability of observing the sample **X**, given that the hypothesis holds. E.g., given that a customer will buy a computer, the probability that the customer is aged 31..40 with medium income.
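Each of these quantities can be estimated from training data by relative frequency. A minimal sketch, using a hypothetical toy `buys_computer` dataset (all values invented for illustration):

```python
# Hypothetical toy data: each tuple is (age_band, income, buys_computer).
data = [
    ("31..40", "medium", True),
    ("<=30",   "high",   False),
    ("31..40", "medium", True),
    (">40",    "low",    True),
    ("<=30",   "medium", False),
    ("31..40", "high",   True),
]

# Prior P(H): fraction of all tuples where the hypothesis (buys_computer) holds.
buyers = [t for t in data if t[2]]
p_h = len(buyers) / len(data)                       # 4/6

# Likelihood P(X|H): among buyers, fraction matching X = (31..40, medium).
x = ("31..40", "medium")
p_x_given_h = sum(1 for t in buyers if t[:2] == x) / len(buyers)   # 2/4

# Evidence P(X): fraction of all tuples matching X.
p_x = sum(1 for t in data if t[:2] == x) / len(data)               # 2/6

print(p_h, p_x_given_h, p_x)
```

These relative-frequency estimates are exactly the ingredients the next slide plugs into Bayes' theorem.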
## Slide 5: Bayes' Theorem

Given a data sample **X**, the posterior probability of a hypothesis H, P(H|**X**), follows Bayes' theorem:

P(H|**X**) = P(**X**|H) P(H) / P(**X**)

Informally, this can be written as: posterior = likelihood × prior / evidence.

- Predicts that **X** belongs to C₂ iff the probability P(C₂|**X**) is the highest among all the P(Cᵢ|**X**) for the k classes (i.e., 1 ≤ i ≤ k).
- Practical difficulty: requires initial knowledge of many probabilities, and carries significant computational cost.
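The decision rule above can be sketched directly: compute the posterior for every class and pick the largest. A minimal sketch with two classes and hypothetical (invented) priors and likelihoods:

```python
# Hypothetical estimated quantities for an observed sample X.
priors      = {"C1": 0.4, "C2": 0.6}    # P(Ci)
likelihoods = {"C1": 0.10, "C2": 0.30}  # P(X|Ci)

# Evidence P(X) = sum_i P(X|Ci) P(Ci); it only rescales the posteriors,
# so the argmax class is the same with or without dividing by it.
p_x = sum(likelihoods[c] * priors[c] for c in priors)

# Bayes' theorem: P(Ci|X) = P(X|Ci) P(Ci) / P(X).
posteriors = {c: likelihoods[c] * priors[c] / p_x for c in priors}

predicted = max(posteriors, key=posteriors.get)
print(predicted, posteriors)
```

Because P(**X**) is constant across classes, practical implementations usually compare the unnormalized products P(**X**|Cᵢ)P(Cᵢ) and skip the division.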

## Slide 6: Towards the Naïve Bayesian Classifier

- Let D be a training set of tuples and their associated class labels, where each tuple is represented by an n-dimensional attribute vector **X** = (x₁, x₂, …, xₙ).
- Suppose there are …
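The preview cuts off here, but the standard naïve Bayesian classifier this slide introduces factorizes the likelihood as P(**X**|C) = ∏ₖ P(xₖ|C) under a class-conditional independence assumption. A minimal sketch for categorical attributes, using relative-frequency estimates and a hypothetical toy dataset (no smoothing):

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows):
    """Estimate P(C) and each P(x_k|C) by relative frequency (no smoothing)."""
    class_counts = Counter(label for _, label in rows)
    cond_counts = defaultdict(Counter)   # (class, attribute index) -> value counts
    for x, label in rows:
        for k, v in enumerate(x):
            cond_counts[(label, k)][v] += 1
    total = sum(class_counts.values())
    priors = {c: n / total for c, n in class_counts.items()}
    def likelihood(c, k, v):
        return cond_counts[(c, k)][v] / class_counts[c]
    return priors, likelihood

def classify(x, priors, likelihood):
    """Pick argmax_C P(C) * prod_k P(x_k|C); the evidence P(X) cancels."""
    scores = {}
    for c, p in priors.items():
        score = p
        for k, v in enumerate(x):
            score *= likelihood(c, k, v)
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical toy training set: (age_band, income) -> buys_computer.
rows = [
    (("<=30",   "high"),   "no"),
    (("<=30",   "medium"), "no"),
    (("31..40", "high"),   "yes"),
    (("31..40", "medium"), "yes"),
    ((">40",    "medium"), "yes"),
    ((">40",    "low"),    "no"),
]
priors, likelihood = train_naive_bayes(rows)
print(classify(("31..40", "medium"), priors, likelihood))
```

One design caveat: a single zero count zeroes out the whole product, which is why real implementations add Laplace (add-one) smoothing to the conditional estimates.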

## This note was uploaded on 06/02/2010 for the course COMPUTER DM2009F taught by Professor Wangwei during the Fall '09 term at Tsinghua University.
