STERN SCHOOL OF BUSINESS
NEW YORK UNIVERSITY
COURSE SUPPLEMENT, PART II, REGRESSION
STATISTICS AND DATA ANALYSIS
COR1-GB.1305
Professor Peter Lakner
Office: Kaufman Management Center 8-61
Phone: (212) 998-0476
Email: [email protected]
Contents
1.

A Box Plot Describes the Distribution
of Values in a Set of Data
Box and Whisker Plot for House Price Listings
[Figure: Average House Listing Price by State. Vertical axis: Listing, 100000 to 900000. Labeled point: Hawaii.]
1. Data Presentation

Distribution of House Price Listings
The distribution shows up in the box and whisker plot. Note the long
whisker at the top of the figure.
[Figure: Histogram of Listing, with the box plot of Average House Listing Price by State.]
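The numbers that a box and whisker plot draws can be sketched as follows. This is a minimal illustration, not the method used to produce the slide's figure: it assumes the common convention that whiskers stop at the last observation within 1.5 interquartile ranges of the box, and the listing prices are made-up values, not the state data. The function name box_plot_summary is hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the quartiles, whisker ends, and outliers for a box plot.

    Assumes the 1.5 * IQR fence convention, which is common but not universal.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers reach the most extreme observations still inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up listing prices for illustration only (NOT the data behind the slide).
listings = [150000, 180000, 200000, 220000, 250000, 300000, 400000, 900000]
summary = box_plot_summary(listings)
print(summary)
```

In this made-up sample the upper whisker is much longer than the lower one and a single very high value is flagged separately, which is the kind of right skew the slide points to.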

Ordered Qualitative Outcomes
Bond Ratings
Movie Ratings
Arithmetic Mean may not be
meaningful.
(a) Ordinal measure rankings
(b) Look at that distribution!
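A short sketch of why the distribution, rather than the arithmetic mean, is the natural summary for ordered qualitative outcomes. The rating scale and the sample of ratings below are hypothetical, chosen only for illustration; they are not taken from the slide.

```python
from collections import Counter
from statistics import median_low

# Hypothetical, simplified bond rating scale, ordered from worst to best.
RATING_ORDER = ["B", "BB", "BBB", "A", "AA", "AAA"]

# Made-up sample of ratings (illustration only).
ratings = ["AAA", "AA", "AA", "A", "BBB", "BBB", "BBB", "BB", "B"]

# Frequency table: the distribution across the ordered categories.
freq = Counter(ratings)
for rating in RATING_ORDER:
    print(f"{rating:>3}: {freq[rating]}")

# The median only uses the ordering, so it is meaningful for ordinal data.
ranks = [RATING_ORDER.index(r) for r in ratings]
median_rating = RATING_ORDER[median_low(ranks)]
print("median rating:", median_rating)

# An arithmetic mean of the ranks would treat the gap between B and BB as
# equal to the gap between AA and AAA, which the ordering does not justify.
```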

House Price Listings and
Per Capita Incomes. States.
Regression and Correlation. Are
these two variables correlated?
r = .48
How to describe/summarize them.
How to explain the variation across
states
How to determine if there is any
correlation between th
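As a hedged illustration of how a correlation coefficient such as r is computed, here is a minimal sketch of the Pearson formula. The helper name pearson_r and the five data pairs are hypothetical; they are not the state data, so the result will not match r = .48.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up values in thousands of dollars (illustration only).
income = [30, 35, 40, 45, 50]
listing = [200, 180, 260, 240, 300]

r = pearson_r(income, listing)
print(round(r, 3))
```

The coefficient always lies between -1 and 1; values near 0 indicate little linear association.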

Data = A Set of Facts
A picture of some aspect of the world
Pizza Sales by Type
What do the data tell
you?
How can you use the
information?
What additional
information would
make these data
(more) informative?

Data Types and Measurement
Quantitative
    Discrete = count: Number of car accidents by city by time
    Continuous = quantitative measurement: Housing prices
Qualitative
    Categorical: Shopping mall, car brand, trip mode
    Ordinal: Survey data on attitudes; How

Ordered Qualitative Data
German Health Satisfaction Survey; 27,326 individuals. On a
scale from 0 to 10, how do you feel about your health?

What Does it Mean?
Slightly more than one-third of Americans have a favorable opinion of
the Democratic-led Congress, a poll said Wednesday.
The Pew Research Center for the People & the Press said the 37%
expressing a positive opinion represents a decline

Probability of Survival to Age 50, Female at Birth
U.S. and 20 Other Wealthy Countries
It is possible to be
misled by a
presentation such
as this one. Note
the vertical axis.
What does this graph tell
you? What do the
probabilities mean? Are the
differenc

Data Presentation Agenda
Data Types: Cross Section and Time Series
Summarizing Data Graphically
    Pie chart, bar chart
    Box plot, histogram
Summarizing Data with Descriptive Statistics
    Central tendency
    Spread
    Distribution (shape)

Does the Picture Tell the Story?
This is the only graphic in
the article. The article
compares default rates on
VA vs. FHA mortgages. Is
there anything wrong with
this picture? The very
technical looking
graph/table is unrelated to
the article.
New York T

Pie Chart vs. Frequency Table
Pizza Pies Sold, by Type
Pie Chart of Percent vs Type
Category: Pepperoni, Plain, Mushroom, Sausage, Pepper and Onion, Mushroom and Onion, Garlic, Meatball
Slice labels: Meatball, Garlic 5.0%, 2.3%; Mushroom and Onion 9.2%; Pepperoni 21.8%; Pepper and

EXERCISE 4, SOLUTIONS
STATISTICS AND DATA ANALYSIS
Peter Lakner
John, Kathy, Len, and Marta are the final quality inspectors for computer monitors. If a
computer monitor functions properly, then
John will say OK with probability 0.92
Kathy will say OK wit

Sample Final Exam
Statistics and Data Analysis
Professor Giloni
Problem 1 has 3 parts
Problem 2 has 2 parts
Problem 3 has 3 parts
Problem 4 has 2 parts
Problem 5 has 4 parts
1) (20 points) A government regulating agency claims that 15 percent of
microchip

HW 3
1) A soda machine can be regulated so that it dispenses an average of μ ounces per cup. If the amount
dispensed is normally distributed with standard deviation 0.2 ounces, what should be the setting for μ so
that 12 ounce cups will overflow only 1% of th
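One way to set up this calculation is sketched below, writing the machine's mean setting as mu and assuming the cut-off sentence asks for overflow only 1% of the time. The condition is P(X > 12) = 0.01 with X normal with mean mu and standard deviation 0.2, so 12 must sit at the 99th percentile of the dispensed amount. The variable names are illustrative.

```python
from statistics import NormalDist

cup_size = 12.0       # ounces, from the problem statement
sigma = 0.2           # standard deviation in ounces, from the problem statement
overflow_prob = 0.01  # assumed reading of the truncated "1% of th..."

# z such that P(Z <= z) = 0.99 for a standard normal Z (about 2.326).
z = NormalDist().inv_cdf(1 - overflow_prob)

# Solve cup_size = mu + z * sigma for the mean setting mu.
mu = cup_size - z * sigma
print(round(mu, 3))

# Check: with this setting, the probability of exceeding 12 ounces is 1%.
check = 1 - NormalDist(mu, sigma).cdf(cup_size)
print(round(check, 4))
```

Under these assumptions the setting comes out a little above 11.5 ounces.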

Brian Hernandez
Summer Session Stats
HW5
1) The file Diamond.MTW contains data on pricing of ladies' diamond rings, based on
the weights of the diamonds. The data were originally given in a full-page advertisement
placed in the Straits Times ne

Brian Hernandez
1a) The time series plot below shows some variance in prices, then a big drop
on the 24th, followed by a big increase the next day and then more price fluctuation
than was seen earlier in the time series:
1b) -5.1901 standard deviations

Homework #7
17. To help your restaurant marketing campaign target the right age levels, you want to find
out if there is a statistically significant difference, on the average, between the age of your
customers and the age of the general population in tow