hw2sol - 36-226 Summer 2010, Homework 2 Solutions

1. The code for this problem is given at the end.

(a) The histogram is shown in Figure 1 below. The summary statistics are

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.1     6.0    24.2   321.8   168.2 14260.0

(b) This distribution is highly right skewed. Therefore good measures of the center and spread are the median and IQR, which are 24.2 and 162.2 respectively (the IQR is 168.2 - 6.0 = 162.2). Note that the mean and SD are NOT good estimates of the center and spread. There is a huge outlier around 15000 (the United States), among others.

(c) This data is very right skewed and measures income. The appropriate transformation in cases like this is to take the log of the data. Figure 2 shows the histogram of the transformed data with the fitted normal distribution. The right side of the figure shows what is called a qq-plot (you were not required to make one). This plot is good for checking whether the data are well approximated by the normal distribution (or any distribution, really). If the data lie along the straight line, then you have a good fit.

(d) The boxplot is shown in Figure 3. We can see easily that wealth is not evenly distributed. In particular, the median GDP per capita in group 3 is around $43000, which is about 5 times the median of group 1, 10 times the median in groups 2, 4, and 7, 30 times that in group 5, and 50 times that in group 6. Looking at the plot, group 4 seems to have the largest IQR, which is probably to be expected since there are a number of highly developed Middle Eastern countries as well as a number that belong in the stone age. Africa is of course the most concentrated in poverty, with the lowest median and smallest IQR. The outliers in group 3 are Luxembourg and Norway on the high side and Poland on the low side. The US? Right in the middle.

Code for Question 1
---------------------------------------------------------------------
load("/path/to/data/set/gdpdata.Rdata")
attach(data)
# (a) histogram of raw GDP
hist(gdp, breaks=40)
# (c) log transform; observation 168 is dropped
z = log(gdp)
z = z[-168]
hist(z, breaks=20, freq=FALSE)
# overlay the fitted normal density on a grid of x values
x = seq(min(z), max(z), length=100)
y = dnorm(x, mean(z), sd(z))
lines(x, y, col=2)
# (d) side-by-side boxplots and per-group summaries
boxplot(gdppc~labs, main="Side-by-Side Boxplots", ylab="GDP per capita", xlab="Country group")
by(gdppc, labs, summary)

2. Order Statistics....
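Going back to parts (b) and (c) of Question 1, the following is a minimal sketch of why the median and IQR are preferred for skewed data, and of how a qq-plot can be drawn after a log transform. It uses simulated log-normal values as a stand-in, not the homework's gdp data; the names x, x_out and z, the seed, and the tenfold inflation of the largest value are all assumptions made for illustration.

```r
# Simulated right-skewed sample; an assumption, NOT the homework's gdp data.
set.seed(1)
x <- rlnorm(200, meanlog = 3, sdlog = 2)

# Mimic a single huge outlier by inflating the largest value tenfold.
x_out <- x
x_out[which.max(x_out)] <- 10 * max(x_out)

# Change in the robust summaries: the median and IQR do not move.
c(median = median(x_out) - median(x), IQR = IQR(x_out) - IQR(x))

# Change in the non-robust summaries: the mean and SD both increase.
c(mean = mean(x_out) - mean(x), sd = sd(x_out) - sd(x))

# Log transform, then compare against the normal distribution with a qq-plot.
z <- log(x)
qqnorm(z)           # sample quantiles against theoretical normal quantiles
qqline(z, col = 2)  # reference line through the first and third quartiles
```

Points that stay close to the reference line suggest that the log-transformed values are reasonably well approximated by a normal distribution, which is the check described in part (c).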