DSCI4520_DecisionTrees_5

Lecture 5 - 1
DSCI 4520/5240 DATA MINING

Some slide material taken from: SAS Education

DSCI 4520/5240
Lecture 5
Decision Trees Overview
DSCI 4520/5240 DBDSS (DATA MINING)

Lecture 5 - 2
On the News: The Rise of the Numerati

BusinessWeek, Sep 8, 2008: With the explosion of data from the Internet, cell phones, and credit cards, the people who can make sense of it all are changing our world.

An excerpt from the introduction of the book The Numerati by Stephen Baker:

Imagine you're in a café, perhaps the noisy one I'm sitting in at this moment. A young woman at a table to your right is typing on her laptop. You turn your head and look at her screen. She surfs the Internet. You watch. Hours pass. She reads an online newspaper. You notice that she reads three articles about China. She scouts movies for Friday night and watches the trailer for Kung Fu Panda. She clicks on an ad that promises to connect her to old high school classmates. You sit there taking notes. With each passing minute, you're learning more about her. Now imagine that you could watch 150 million people surfing at the same time. That's what is happening today at the business place.

Lecture 5 - 3
On the News: The Rise of the Numerati

By building mathematical models of its own employees, IBM aims to improve productivity and automate management. In 2005, IBM embarked on research to harvest massive data on employees, and to build mathematical models of 50,000 of the company’s consultants. The goal was to optimize them, using operations research, so that they can be deployed with ever more efficiency.

Data on IBM employees include:
- Time spent in meetings
- Social network participation
- Time spent surfing the Web
- Response time to e-mails
- Amount of sales
- Marital status
- Ratio of personal to work e-mails
- Allergies
- Number of interns managed
- Client visits
- Computer languages
- Number of words per e-mail
- Amount spent entertaining clients
- Number of weekends worked

Lecture 5 - 4
Decision Trees: Objectives
- Introduce the concept of “Curse of Dimensionality”
- Benefits and Pitfalls in Decision Tree modeling