36-226 Summer 2010
Homework 2
Due July 6

1. Descriptive statistics.

For this problem you will need the data file gdpdata.Rdata, available on the website. To load the data file, use load('/path/to/datafile/gdpdata.Rdata'). Once loaded, say attach(data). It contains 181 rows and four variables:

• country: the name of each of the 181 countries
• gdp: total GDP for 2008 or 2009 (depending on the most recent estimate)
• gdppc: GDP per capita
• labs: a code number classifying each country into one of 7 groups:
    1 = Eastern Europe
    2 = South America, Central America, and Mexico
    3 = Western Europe, US, Canada, Japan, Australia, New Zealand, and Israel
    4 = Middle East
    5 = Asia
    6 = Africa
    7 = Various small island nations

(a) Make a histogram of GDP and supplement it with descriptive statistics using the functions hist() and summary().

(b) Describe the shape of the distribution. What is the center and spread? Are there any outliers?

(c) It seems like statisticians are obsessed with the normal distribution. One of the reasons for that is that many inferential methods assume that the data follow a normal distribution. When data are heavily skewed, it is therefore sometimes desirable to "fix" them so that they are more "normal like" using a transformation. In particular, when data are right skewed, this can be done using transformations of the form x^(1/2), x^(1/3), x^(1/4), ..., log(x). Find a transformation that will "make the data approximately normal", create a histogram of the transformed data, and fit a normal distribution to it. To fit a normal distribution, find the mean and sd of the transformed data....
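
The workflow in parts (a) and (c) can be sketched in R as follows. This is only an illustration under stated assumptions: gdp_sim is a hypothetical, simulated right-skewed variable standing in for gdp (it is not the contents of gdpdata.Rdata), and the log is just one of the transformations listed above, so the choice of transformation for the real data is still yours to make.

```r
# Hypothetical stand-in for the gdp variable: simulated right-skewed values,
# NOT the real contents of gdpdata.Rdata.
set.seed(1)
gdp_sim <- rlnorm(181, meanlog = 10, sdlog = 2)

# Part (a) style: histogram plus numerical summary of the raw values.
hist(gdp_sim, main = "Simulated GDP", xlab = "GDP")
print(summary(gdp_sim))

# Part (c) style: transform, then estimate the mean and sd of the result.
log_gdp <- log(gdp_sim)
mu_hat <- mean(log_gdp)
sd_hat <- sd(log_gdp)

# Draw the histogram on the density scale (freq = FALSE) so that the
# fitted normal density can be overlaid on the same axes.
hist(log_gdp, freq = FALSE, main = "log(simulated GDP)", xlab = "log(GDP)")
curve(dnorm(x, mean = mu_hat, sd = sd_hat), add = TRUE)
```

With the real data, replacing gdp_sim by gdp after load() and attach(data) should give the corresponding plots. Comparing the overlaid curve to the histogram is one informal way to judge whether a given transformation is "normal like" enough.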
