Chapter00 - Initial Data Exploration STAT 563 Spring 2007...

Initial Data Exploration STAT 563 Spring 2007 Mani Lakshminarayanan
Inheritance of Height
Original data collected by Karl Pearson during 1893-1898
n = 1375 heights of mothers under the age of 65 and one of their adult daughters over the age of 18
Questions:
Do taller mothers tend to have taller daughters?
Do shorter mothers tend to have shorter daughters?
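# Read the mother-daughter heights (columns Mheight and Dheight)
dat1 <- read.table("C:/My Documents/heights.txt", header = TRUE)
plot(dat1$Mheight, dat1$Dheight)
# Round mothers' heights to the nearest inch and keep three slices
mhround <- round(dat1$Mheight, 0)
dat3 <- data.frame(mhround, Dheight = dat1$Dheight)
xsub <- dat3[dat3$mhround %in% c(58, 64, 68), ]
plot(xsub$mhround, xsub$Dheight)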
Height Data
Observations
The range of the data appears to be about the same for mothers and daughters
To avoid overplotting, one could use jittering (adding a small random perturbation to each data point)
The response variable (daughter's height) seems to depend on the predictor variable (mother's height)
The next plot shows that there is an increasing trend
Variability is approximately the same within each slice of mother's height
The scatter appears to be elliptically shaped with an increasing trend
None of the points appear to be separated from the main cluster (no outliers)
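The jittering and equal-variability observations can be sketched in R. This is a minimal sketch on simulated heights, since heights.txt is not reproduced here; the sample size, means, slope, noise level, and the three slice values are made-up choices for illustration, not estimates from the Pearson data.

```r
# Simulated stand-in for the heights data (all numbers here are assumptions).
set.seed(563)
n  <- 500
mh <- round(rnorm(n, mean = 62, sd = 2.5))        # mothers, nearest inch
dh <- round(30 + 0.55 * mh + rnorm(n, sd = 2.2))  # daughters, nearest inch

# Rounded values pile up on a grid; jitter() adds uniform noise in
# [-amount, amount] so that overlapping points become visible.
mh_j <- jitter(mh, amount = 0.5)
dh_j <- jitter(dh, amount = 0.5)

op <- par(mfrow = c(1, 2))
plot(mh, dh, main = "Rounded", xlab = "Mheight", ylab = "Dheight")
plot(mh_j, dh_j, main = "Jittered", xlab = "Mheight", ylab = "Dheight")
par(op)

# Spread of daughters' heights within three slices of mothers' height.
slices   <- c(58, 62, 66)
keep     <- mh %in% slices
slice_sd <- tapply(dh[keep], mh[keep], sd)
print(slice_sd)
```

If the within-slice standard deviations are of similar size, that supports the "approximately the same variability" reading; with real data the same tapply() call could be applied to the rounded Mheight column.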
Height Data
Forbes Data
Observation
The number of observations is small
All points appear to fall on a smooth curve (i.e., the variability in pressure for a given temperature is small)
Though most points appear to fall close to the line, there is a systematic trend (systematic error): points in the middle fall below the line, and the highest and lowest fall above the line (except for one point)
This is much easier to see in the adjacent plot of residuals, where residual = pressure - point on the line
To get the same resolution, the first plot would need to be 10/0.8 = 12.5 times as big as the second one
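The residual plot described above can be reproduced approximately as follows. This sketch assumes that the `forbes` data frame shipped with the MASS package (columns `bp` for boiling point and `pres` for pressure) matches the data on the slides, which is an assumption. The ratio of ranges is computed from the data rather than quoted, so it need not equal the 10/0.8 read off the slide axes.

```r
library(MASS)  # assumption: MASS::forbes is the same Forbes data

fit <- lm(pres ~ bp, data = forbes)  # straight-line fit of pressure on temperature
res <- resid(fit)                    # residual = pressure - point on the line

op <- par(mfrow = c(1, 2))
plot(forbes$bp, forbes$pres, xlab = "Temperature", ylab = "Pressure")
abline(fit)
plot(forbes$bp, res, xlab = "Temperature", ylab = "Residual")
abline(h = 0, lty = 2)
par(op)

# How many times taller the first plot would need to be to show the
# residuals at the same resolution as the second plot.
scale_ratio <- diff(range(forbes$pres)) / diff(range(res))
print(scale_ratio)
```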
Observation
Though there is nothing wrong with curvature (a non-linear relationship), most of the theory discussed in this course works best when the fit is a straight line
Transforming one or both variables could help to achieve linearity
Forbes used a logarithmic scale
On that scale a straight line appears to be reasonable
There appears to be no systematic deviation from the straight line
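A transformed fit along these lines might look like the sketch below. It again assumes MASS::forbes as a stand-in for the slide data; the slides say only "a logarithmic scale", so the base-10 logarithm used here is an illustrative choice.

```r
library(MASS)  # assumption: MASS::forbes stands in for the slide data

# Fit a straight line to log10(pressure); base 10 is an illustrative choice.
fit_log <- lm(log10(pres) ~ bp, data = forbes)
res_log <- resid(fit_log)

op <- par(mfrow = c(1, 2))
plot(forbes$bp, log10(forbes$pres),
     xlab = "Temperature", ylab = "log10(Pressure)")
abline(fit_log)
plot(forbes$bp, res_log, xlab = "Temperature", ylab = "Residual")
abline(h = 0, lty = 2)
par(op)

print(summary(fit_log)$r.squared)
```

If the transformation works, the residual plot should no longer show the below-in-the-middle, above-at-the-ends pattern seen on the original scale.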
Forbes Data
Old Faithful Geyser, Yellowstone Park
Characteristics
The data look bimodal, with two clusters
A single overall mean is not a good measure of location, as there are two peaks, around 2 and 4.5
The standard deviation may not be a good measure of spread
No obvious outliers
Finer granularity (values not rounded to one decimal place) might give a better picture
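The point about the mean can be checked with a short sketch. It assumes that R's built-in `faithful` data set (eruption length and waiting time, in minutes) is a reasonable stand-in for the Old Faithful data on the slides, which may differ; the cutoff of 3 used to separate the clusters is an eyeballed value, not something stated in the slides.

```r
# Assumption: the built-in `faithful` data stands in for the slide data.
x <- faithful$eruptions

hist(x, breaks = 20, main = "Old Faithful", xlab = "Eruption length")

overall_mean <- mean(x)
# The cutoff of 3 is an eyeballed split between the two clusters.
group       <- ifelse(x < 3, "short", "long")
group_means <- tapply(x, group, mean)

print(overall_mean)
print(group_means)
print(c(overall_sd = sd(x), tapply(x, group, sd)))
```

The overall mean lands between the two clusters, where few observations lie, and the overall standard deviation mostly reflects the gap between clusters rather than the spread within either one.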
Some Typical Plots
No relationship
Strong Linear (Positive Correlation)
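The two patterns named so far can be simulated to see what such plots look like. This is a sketch on made-up data; the sample size, slope, and noise level are arbitrary choices, not values from the slides.

```r
set.seed(2007)
n <- 200
x <- rnorm(n)

y_none   <- rnorm(n)                    # generated independently of x
y_linear <- 2 * x + rnorm(n, sd = 0.3)  # straight line plus small noise

op <- par(mfrow = c(1, 2))
plot(x, y_none, main = "No relationship")
plot(x, y_linear, main = "Strong linear (positive)")
par(op)

# Sample correlations for the two simulated patterns.
print(c(none = cor(x, y_none), linear = cor(x, y_linear)))
```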