Cynthia Rush, Lecture 3: Regression and Graphics


Cynthia Rush, Lecture 3: Regression and Graphics, September 23, 2016
Section V
Exploratory Data Analysis and R Graphics
Diamonds Dataset
Download diamonds.csv from the Canvas page (Homepage, Week 3, Slides).
Save to your computer and set your working directory to match that location.
Run diamonds <- read.csv("diamonds.csv", as.is = TRUE).
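The as.is = TRUE argument in the read.csv call keeps text columns as character vectors instead of converting them to factors. A minimal sketch of the difference, assuming a tiny made-up CSV written to a temporary file (the three rows and the names tmp, as_chr, and as_fct are illustrative only, not part of the course data):

```r
# Write a tiny, made-up CSV to a temporary file so the sketch runs anywhere.
# These rows are illustrative only; they are NOT the course's diamonds.csv.
tmp <- tempfile(fileext = ".csv")
writeLines(c("carat,cut,price",
             "0.30,Ideal,500",
             "0.50,Premium,1500",
             "1.00,Good,5000"), tmp)

# as.is = TRUE: text columns stay character.
as_chr <- read.csv(tmp, as.is = TRUE)

# stringsAsFactors = TRUE: text columns become factors
# (this was the default behavior before R 4.0).
as_fct <- read.csv(tmp, stringsAsFactors = TRUE)

class(as_chr$cut)  # "character"
class(as_fct$cut)  # "factor"

unlink(tmp)  # remove the temporary file
```

Numeric columns such as carat and price are read as numbers either way; only the text columns are affected.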
Diamonds Dataset
Info on 54000 diamonds from [source not preserved].
Variables
Carat – Weight of the diamond (0.2 - 5.01).
Color – Diamond color from J (worst) to D (best).
Clarity – A measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best)).
Cut – Quality of the cut (Fair, Good, Very Good, Premium, Ideal).
Price – Price in US dollars.
Diamonds Dataset
Code example.
Check Yourself
Questions:
1. Think by yourself for a few minutes: what are some interesting questions we could answer using this dataset?
2. Use what we learned about multiple linear regression to model diamond price as a linear function of the diamond's weight (carat).
Some question ideas:
What does the distribution of diamond prices look like? Symmetric? Skewed?
How does a diamond's price relate to its weight?
Does the relationship between the price and the weight change depending on the quality of the diamond's cut?
Check Yourself
Linear Regression
    > lm_diamonds <- lm(diamonds$price ~ diamonds$carat)
    > lm_diamonds <- lm(price ~ carat, data = diamonds)
    > lm_diamonds
    Call:
    lm(formula = price ~ carat, data = diamonds)
    Coefficients:
    (Intercept)        carat
          -2256         7756
Fitted model: $\widehat{\text{price}} = -2256 + 7756 \cdot \text{carat}$.
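To see what the fitted line implies, the rounded coefficients printed on the slide can be plugged into a small helper. This is only a sketch: predict_price is a hypothetical name introduced here, and it uses the rounded values -2256 and 7756 rather than the full-precision fit.

```r
# Rounded coefficients as printed on the slide.
intercept <- -2256
slope <- 7756

# Hypothetical helper: fitted price (US dollars) for a given weight in carats.
predict_price <- function(carat) {
  intercept + slope * carat
}

# Fitted prices for a few weights inside the observed range (0.2 to 5.01 carats).
predict_price(c(0.5, 1, 2))  # 1622 5500 13256

# At the low end of the range the straight line gives a negative price.
predict_price(0.2)           # about -704.8
```

The negative fitted price for the smallest diamonds suggests that a straight line in carat alone is a rough description of the data, which is one motivation for exploring it graphically.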
Exploratory Data Analysis [1]
Exploratory Data Analysis, or EDA for short, is exploring data in a systematic way.
It's an iterative process:
1. Generate questions about your data.
2. Search for answers by visualising, transforming, and modelling your data.
3. Use what you learn to refine your questions and/or generate new questions.
[1] EDA slides developed from G. Grolemund and H. Wickham.
Exploratory Data Analysis
EDA is a way for you to learn about and better understand your data.
Asking Questions
1. What type of variation occurs within my variables?
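One way to look at variation within a single variable is a numeric summary plus a histogram for a continuous variable, and a table of counts for a categorical one. The sketch below uses base R on simulated stand-in data; the toy data frame and its values are assumptions for illustration, not the real diamonds file.

```r
set.seed(1)

# Simulated stand-in data, NOT the real diamonds file.
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")
toy <- data.frame(
  price = round(rlnorm(200, meanlog = 8, sdlog = 0.8)),  # right-skewed values
  cut   = sample(cut_levels, 200, replace = TRUE)
)

# Continuous variable: numeric summary and histogram.
summary(toy$price)
hist(toy$price, breaks = 30, main = "Simulated prices", xlab = "Price")

# Categorical variable: counts per level.
cut_counts <- table(toy$cut)
cut_counts
```

With the real data, the same calls would be applied to diamonds$price and diamonds$cut.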
