lecture6


Data Mining CS57300 Purdue University September 14, 2010

Decision making
Heuristics and biases
• Tversky & Kahneman, psychologists, propose that people often do not follow rules of probability
• Instead, decision making may be based on heuristics
  • Lowers cognitive load but may lead to systematic errors and biases
• Examples:
  • Availability heuristic
  • Representativeness heuristic
  • Confirmation bias
  • Conjunction fallacy
  • Numerosity heuristic

Estimating probabilities (Tversky & Kahneman ’73/’74)
• Question: Is the letter R more likely to be the 1st or 3rd letter in English words?
• Results: Most said R more probable as 1st letter
• Reality: R appears much more often as the 3rd letter, but easier to think of words where R is the 1st letter

Estimating probabilities (cont)
• Question: Which causes more deaths in developed countries? (a) traffic accidents or (b) stomach cancer
• Typical guess: traffic accidents cause 4 times as many deaths as stomach cancer
• Actual: 45,000 traffic deaths vs. 95,000 stomach cancer deaths in the US
• Ratio of newspaper reports on each subject: 137 (traffic fatality) to 1 (stomach cancer death)
• Availability heuristic: tendency for people to make judgments of frequency on the basis of how easily examples come to mind
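
To put the gap between guess and reality in numbers, the short sketch below recomputes the ratios from the figures on the slide. The variable names are illustrative only, and it reads the "typical guess" as a 4-to-1 ratio of traffic deaths to stomach cancer deaths.

```python
# Figures quoted on the slide (US deaths).
traffic_deaths = 45_000
stomach_cancer_deaths = 95_000

# Typical guess: traffic deaths are 4 times stomach cancer deaths.
guessed_ratio = 4.0

# Actual ratio of traffic deaths to stomach cancer deaths.
actual_ratio = traffic_deaths / stomach_cancer_deaths

# Factor by which the typical guess overstates the true ratio.
overestimate = guessed_ratio / actual_ratio

print(f"actual traffic/stomach-cancer ratio: {actual_ratio:.2f}")
print(f"guess overstates the ratio by a factor of about {overestimate:.1f}")
```

Under this reading, the guess has the direction wrong (traffic deaths are roughly half of stomach cancer deaths, not four times as many) and is off by a factor of more than eight.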

Base Rate Study (Kahneman & Tversky '73)
• Participants were told that a set of 100 people consisted of either:
  • 30% engineers and 70% lawyers, or
  • 70% engineers and 30% lawyers
• Given: A description of a person, Jack, that is representative of a prototypical engineer (e.g., likes carpentry and mathematical puzzles, careful, conservative)
• Question: Is Jack more likely to be a lawyer or an engineer?
• Results: Participants in the 30% condition judged Jack just as likely to be an engineer as participants in the 70% condition.

Base rate study (cont)
• People use the representativeness heuristic to make inferences...
  • Inferences are based solely on similarity of target to category members
  • Base rates (70%-30%) are ignored
• ...rather than using formal statistical rules to make inferences
  • Inferences should be based on similarity of target to category members AND base rates (70%-30%)
• Representativeness heuristic: categorizations made on the basis of similarity between instance and category members
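
As an illustration of what combining similarity AND base rates could look like, here is a minimal sketch using Bayes' rule in odds form. The function name `posterior_engineer` and the likelihood ratio of 4 (the description is assumed to be four times as likely for an engineer as for a lawyer) are hypothetical values chosen for the example, not figures from the study.

```python
def posterior_engineer(prior_engineer, likelihood_ratio):
    """Return P(engineer | description) via Bayes' rule in odds form.

    prior_engineer:   base rate of engineers in the group (0 < p < 1)
    likelihood_ratio: P(description | engineer) / P(description | lawyer)
    """
    prior_odds = prior_engineer / (1.0 - prior_engineer)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical assumption: Jack's description is 4x as likely for an engineer.
LIKELIHOOD_RATIO = 4.0

for base_rate in (0.30, 0.70):
    p = posterior_engineer(base_rate, LIKELIHOOD_RATIO)
    print(f"base rate {base_rate:.0%} engineers -> P(engineer | description) = {p:.2f}")
```

With the same description, the posterior is about 0.63 in the 30% condition and about 0.90 in the 70% condition. A judgment that follows the rules of probability should therefore differ between the two groups, whereas participants gave essentially the same answer in both.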

