Lab 4: Foundations for statistical inference - Sampling distributions

In this lab, we investigate the ways in which the statistics from a random sample of data can serve as point estimates for population parameters. We’re interested in formulating a sampling distribution of our estimate in order to learn about the properties of the estimate, such as its distribution.

Template for lab report

    download.file("", destfile = "lab4.Rmd")

Fill in your team information and use the allotted spaces to enter your responses. For questions that require R code or a plot, space has been provided for you to enter the relevant code. If you encounter any errors when “knitting” the document, check out the FAQ page on the course website for troubleshooting help.

The data

We consider real estate data from the city of Ames, Iowa. The details of every real estate transaction in Ames are recorded by the City Assessor’s office. Our particular focus for this lab will be all residential home sales in Ames between 2006 and 2010. This collection represents our population of interest. In this lab we would like to learn about these home sales by taking smaller samples from the full population. Let’s load the data.

    download.file("", destfile = "ames.RData")
    load("ames.RData")

We see that there are quite a few variables in the data set, enough to do a very in-depth analysis. For this lab, we’ll restrict our attention to just two of the variables: the above ground living area of the house in square feet (Gr.Liv.Area) and the sale price (SalePrice). To save some effort throughout the lab, create two objects with short names that hold these two variables.

    area <- ames$Gr.Liv.Area
    price <- ames$SalePrice

Let’s look at the distribution of area in our population of home sales by calculating a few summary statistics and making a histogram.

    summary(area)
    hist(area)

Exercise 1

Describe this population distribution. Be sure to include a visualization in your answer.
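When describing a distribution, a few extra numbers beyond summary() can help, such as the standard deviation, the interquartile range, and a comparison of the mean with the median as a rough check for skew. The sketch below is only an illustration: it does not use the Ames data. The vector area_sim is a hypothetical, simulated stand-in for a population of living areas, and its size and parameters are arbitrary assumptions.

```r
# Simulated stand-in for a population of living areas in square feet.
# NOT the Ames data: the size (3000) and the log-normal parameters
# are arbitrary assumptions chosen for illustration only.
set.seed(1)
area_sim <- round(rlnorm(3000, meanlog = 7.25, sdlog = 0.33))

# Measures of centre
pop_mean   <- mean(area_sim)
pop_median <- median(area_sim)

# Measures of spread
pop_sd  <- sd(area_sim)
pop_iqr <- IQR(area_sim)

# Quartiles, for describing where the middle half of the values lies
pop_quartiles <- quantile(area_sim, probs = c(0.25, 0.50, 0.75))

# A mean noticeably above the median is a rough sign of right skew
right_skew_hint <- pop_mean > pop_median

print(c(mean = pop_mean, median = pop_median, sd = pop_sd, IQR = pop_iqr))
print(pop_quartiles)
print(right_skew_hint)
```

With the real data, the same calls would be applied to area instead of area_sim, alongside the histogram.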
This is a product of OpenIntro that is released under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported license. This lab was written for OpenIntro by Andrew Bray and Mine Çetinkaya-Rundel.
The unknown sampling distribution

In this lab we have access to the entire population, but this is rarely the case in real life. Gathering information on an entire population is often extremely costly or impossible. Because of this, we often take a sample of the population and use that to understand the properties of the population. If we are interested in estimating the mean living area in Ames based on a sample, we can use the sample function to draw that sample from the population. But before we do that, let’s “set a seed” in R so that each time we knit the lab document the random sample we obtain is the same. Otherwise the sample we obtain will change each time we run the code, and hence the sample statistics will change as well, making it somewhat frustrating to write up accurate answers to the questions in the rest of the lab.
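The effect of setting a seed can be sketched as follows. This is a minimal illustration under stated assumptions, not the lab's own code: area_sim is a hypothetical simulated population rather than the Ames data, and the seed value (42) and sample size (50) are arbitrary choices.

```r
# Simulated stand-in population (NOT the Ames data); size and
# log-normal parameters are arbitrary assumptions.
set.seed(42)
area_sim <- round(rlnorm(3000, meanlog = 7.25, sdlog = 0.33))

# Setting the same seed before each call to sample() reproduces
# exactly the same random sample.
set.seed(42)
samp1 <- sample(area_sim, 50)

set.seed(42)
samp1_again <- sample(area_sim, 50)

same_sample <- identical(samp1, samp1_again)
print(same_sample)

# Without resetting the seed, the next call draws a new sample,
# so its mean will generally differ.
samp2 <- sample(area_sim, 50)

# Each sample mean is a point estimate of the population mean.
print(c(sample1 = mean(samp1), sample2 = mean(samp2), population = mean(area_sim)))
```

In a knitted report, a single set.seed() call near the top of the document is usually enough to make every later call to sample() reproducible from one knit to the next.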