Midterm Exam 1: Urban Scaling, Continued
36-402, Advanced Data Analysis
Due at 5 pm on Tuesday, 1 March 2011
Instructions
Please read the background section, and all of the questions, carefully before
beginning to work.
You will be sent a data set (CSV fo
Homework 4
Abby Smith (als1)
Question 1
data<-read.csv('mobility.csv')
attach(data)
mob.1<-data.frame(Name, Mobility, Population, State)
mobility.new<-na.omit(mob.1)
se.prop=function(p,n){
SE = sqrt(p*(1-p)/n)
return(SE)
}
a. The below should print 0.5.
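As a quick check on se.prop, here is a minimal sketch. The arguments p = 0.5 and n = 1 are chosen for illustration and are not necessarily the call the original used; with these values, sqrt(0.5 * 0.5 / 1) is exactly 0.5.

```r
# Standard error of a sample proportion p based on n observations
se.prop <- function(p, n) {
  sqrt(p * (1 - p) / n)
}

# Illustrative arguments (an assumption, not taken from the original):
# sqrt(0.5 * 0.5 / 1) = 0.5
check <- se.prop(0.5, 1)
print(check)
```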
Homework Assignment 10: Use and Abuse of
Conditioning
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
1. (a) Answer: Observe that occupation blocks all back door paths between smoking and cancer, and that occupation is not a descendant
of smoking. L
Homework Assignment 10: Estimating with
DAGs
36-402, Advanced Data Analysis, Spring 2011
Solutions
1. (a) Answer:
Variable
cancer
cellular damage
tar
teeth
dental care
smoking
asbestos
occupation
(b) Answer:
Variable
cancer
cellular
tar
teeth
dental
smoki
Homework Assignment 9: Patterns of Exchange
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
1. (a) Answer:
# Load fx.csv as "fx"
fx = read.csv("http://www.stat.cmu.edu/~cshalizi/402/hw/09/fx.csv",
header = TRUE, row.names = 1)
# verify that the matrix i
Homework Assignment 8: Fair's Affairs
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
library(AER)
data(Affairs)
1. Answer:
(a) When dealing with a counting variable Y with a known (not estimated) upper limit m, we can try to model it as having a bino
Homework Assignment 7: Diabetes
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
1. Answer:
# Look what variables are in the data:
colnames(pima)
# Consider maximums and minimums to check that the variable values are feasible
summary(pima)
help(pima)
Homework Assignment 6: Nice Demo City, But
Will It Scale?
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
1. Answer: Taking the log of both sides gives
\[
\log y = \log\frac{Y}{N} = \log Y - \log N = \log(cN^{b}) - \log N = \log c + \log N^{b} - \log N = \log c + b\log N - \log N =
\]
Homework Assignment 5: Bootstrapping Will
Continue Until Morale Improves
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
1. Answer:
library(MASS)
cats.lm1 <- lm(Hwt ~ 0+Bwt,data=cats) # "0+" sets intercept to zero
summary(cats.lm1)
# Quick view of r
Homework 4: An Insufficiently Random Walk
Down Wall Street
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
1. Answer: Following the notes for lecture 7,
# You can download the data directly from the web like this
sp <- read.csv("SPhistory.short.csv")
spdat
Homework Assignment 2: The Advantages of
Backwardness
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
This problem set was based on the preliminary analysis in the paper
E. Maasoumi, J. S. Racine and T. Stengos, Growth and convergence: a profile of di
Homework Assignment 1: What's That Got to
Do with the Price of Condos in California?
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
The easiest way to load the data is with read.table, but you have to tell
R that the first line names the variables:
>
Midterm Exam 2: Mystery Multivariate Data
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
General note: The data came from a ten-dimensional Gaussian. Each
variable had an expected value of 100 and a standard deviation of 15. The
correlation matrix
Midterm Exam 1: Urban Scaling, Continued
36-402, Advanced Data Analysis, Spring 2011
SOLUTIONS
General set-up:
gmp = read.csv(file = "gmp-2006.csv")
Your data file was derived from this data file, plus or minus 4% noise for each
observation.
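To make the perturbation concrete, here is a minimal sketch of one way such noise could be applied. The source says only "plus or minus 4% noise", so the uniform multiplicative form, the seed, and the vector x (a hypothetical stand-in for one numeric column) are all assumptions made for illustration.

```r
# Hypothetical stand-in for one numeric column of the data file
set.seed(402)
x <- c(1200, 35000, 480000)

# Assumption: multiplicative noise drawn uniformly from [0.96, 1.04];
# the source does not specify the noise distribution
noise <- runif(length(x), min = 0.96, max = 1.04)
x.noisy <- x * noise

# Relative deviation of each perturbed value from its original
rel.dev <- abs(x.noisy - x) / x
print(rel.dev)
```

Under this assumption, every perturbed value stays within 4% of its original.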
1. Answer: The ba
Homework 1, 36-402
Abby Smith (als1)
1. Comments on Mobility plot:
There are more data points in the Eastern US. There are fewer data points for Hawaii/Alaska, as well as
in the Utah/Nevada regions. In terms of higher mobility, which is indicated by the dark