Naive Bayes
Data Mining
Prof. Dawn Woodard
School of ORIE
Cornell University

Outline
1 Announcements
2 Conditional Distributions
3 Accidents Data
4 Naive Bayes
5 Bayes’ Rule
6 Naive Bayes Training

Announcements
Before lab next week, register at www.dataminingbook.com so that you can get the data sets.
Questions?

Conditional Probability
Recall from the lecture about heart disease:

    > table(heart$sex, heart$cad)
            N   Y
      Fem  67  20
      Mal  83 100

These are the counts for gender and heart disease.
Want to know whether gender is associated with heart disease frequency.

Conditional Probability

    > table(heart$sex, heart$cad)
            N   Y
      Fem  67  20
      Mal  83 100

Taking just the females, what percentage have heart disease? How about for males?
These are conditional probabilities, e.g. Pr(CAD | Sex = Fem) and Pr(CAD | Sex = Mal).

Recall the formula for conditional probability:

$$\Pr(A \mid B) = \frac{\Pr(A, B)}{\Pr(B)}$$

Apply this formula to our table to obtain the conditional probability of CAD given gender.

$$\Pr(\mathrm{CAD} = \mathrm{Yes} \mid \mathrm{Sex} = \mathrm{Fem}) = \frac{\Pr(\mathrm{CAD} = \mathrm{Yes},\ \mathrm{Sex} = \mathrm{Fem})}{\Pr(\mathrm{Sex} = \mathrm{Fem})}$$

Recalling that the comma means “and”...
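One way to carry out this calculation in R is sketched below. The `heart` data frame is not reproduced here, so the counts from the table are typed into a matrix by hand. The names `counts`, `joint`, `pr_sex`, `pr_cad_given_sex` and `row_props` are made up for this illustration and are not from the lecture. The sketch applies the conditional probability formula directly, then cross-checks the result with `prop.table()`.

```r
# Counts copied from the gender by heart disease table.
# Assumption: this hand-built matrix stands in for table(heart$sex, heart$cad).
counts <- matrix(c(67, 20,
                   83, 100),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(sex = c("Fem", "Mal"), cad = c("N", "Y")))

# Joint probabilities Pr(CAD, Sex): each cell divided by the grand total.
joint <- counts / sum(counts)

# Marginal probabilities Pr(Sex): row sums of the joint table.
pr_sex <- rowSums(joint)

# Conditional probability Pr(CAD = Y | Sex) = Pr(CAD = Y, Sex) / Pr(Sex).
pr_cad_given_sex <- joint[, "Y"] / pr_sex
print(round(pr_cad_given_sex, 3))

# Cross-check: prop.table() with margin = 1 rescales each row to sum to 1,
# which gives Pr(CAD | Sex) for every cell at once.
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 3))
```

With these counts, the female rate works out to 20/87 (about 23%) and the male rate to 100/183 (about 55%).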
