69 sample documents related to STAT 600
-
Statistics 600 Problem set 4 Due in class on November 18th. 1. Suppose we fit a regression model using OLS and get a parameter estimate $\hat\beta$ with covariance matrix $\mathrm{cov}(\hat\beta|X) = V$. Derive an expression for the expected value of $\hat\beta_k$ given that the Z-score fo
-
Statistics 600 Capstone Project (methodology component) For this part of the project you will define and answer a question related to the methodology covered in this course, and produce a 2-3 page typed writeup of your findings (any and all figures and tab
-
Statistics 600 Problem Set 5 Due in class on October 24th 1. Suppose we have a linear model $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon$ where the design matrix satisfies $X'X = \begin{pmatrix} n & 0 & 0 \\ 0 & n & nr \\ 0 & nr & n \end{pmatrix}$. The goal of this problem is to understand the circumstances in which enh
-
Statistics 600 Problem set 3 Due in class on November 4th. 1. Suppose the data generating model is a simple linear model of the form $Y = \beta X + \epsilon$, where the usual linear model assumptions hold and $\mathrm{var}(\epsilon|X) = \sigma^2 I$. Suppose we use least squares to fit a line
-
Statistics 600 Capstone Project Due Thursday, December 14th The project has two parts: You will carry out a data analysis, focusing on the techniques learned in this class. You will be given several data sets to choose from, each having one or more
-
Statistics 600 Problem set 2 Due in class on October 2nd. 1. Suppose $A$ and $B$ are symmetric matrices, and $A$, $B$, and $A + B$ are all idempotent. Show that $AB = 0$. Hint: use the Schur decomposition for a symmetric matrix $S$, $S = VDV'$, where $D$ is diagonal a
-
Statistics 600 Applied statistics and data analysis I. Instructor: Kerby Shedden. Office: 461 West Hall. E-mail: kshedden@umich.edu. Office hours: Monday 3-4, Wednesday 11-12. Course web page: www.stat.lsa.umich.edu/kshedden/Courses/Stat600. Course description:
-
Statistics 600 Applied statistics and data analysis I. Instructor: Kerby Shedden. Office: 461 West Hall. E-mail: kshedden@umich.edu. Office hours: Wednesday 2-3, Friday 4-5. Course web page: www.stat.lsa.umich.edu/kshedden/Courses/Stat600. GSI: Office: E-mail: Office
-
Least Squares Fitting and Inference. Kerby Shedden, August 2008. Definitions and Motivation. Independent variables (predictors, regressors, covariates): $X = (X_1, \ldots, X_p)$. Dependent variable (response, outcome): $Y$. The goal is to learn about an unkno
-
Statistics 600 Problem Set 7 Due in class on November 14th 1. What is the relationship between measurement error and confounding? In the notation of the confounding course notes, let Z be the true covariate level, and let X be the covariate as observ
-
Statistics 600 Problem Set 2 Due in class on September 26 1. In this problem you will use simulation to better understand the sampling variation of $\hat\beta$ when two covariates are strongly dependent. Suppose we can partition the design matrix as X = [1 U V
-
Statistics 600 Problem Set 5 (for practice only) 1. In probit regression, the probability distribution of $Y$ is modeled as $P(Y = 1|X) = P(Z \le X'\beta)$, where $Z$ is an unobserved standard normal random value that is independent of all observed data. (a) Wha
-
Diagnostics. Motivation. When working with a linear model with design matrix $X$, we may optimistically suppose that $EY \in \mathrm{col}(X)$ and $\mathrm{var}(Y|X) = \sigma^2 I$. Point estimates and inferences depend on these assumptions approximately holding. Inferences for sma
-
Model mis-specification and confounding. Suppose we have a data generating model of the form $Y = \alpha + \beta X + \gamma Z + \epsilon$. The usual assumptions $E(\epsilon|X, Z) = 0$ and $\mathrm{var}(\epsilon|X, Z) = \sigma^2$ hold. The covariate $X$ is observed, but $Z$ is not observable. If we regress Y on X,
-
Ridge regression. Ridge regression uses the minimizer of a penalized squared error loss function to estimate the regression coefficients: $\hat\beta \equiv \mathrm{argmin}_\beta \|Y - X\beta\|^2 + \lambda\beta' D\beta$. The usual specification of $D$ is a diagonal matrix with 0 in the 1,1 position and ones
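The penalized least squares criterion in this preview can be sketched numerically. The block below is a minimal illustration rather than the course's own code: it assumes the penalty has the form lam * b' D b with a scalar tuning parameter (called `lam` here, a name not taken from the source), takes D to be diagonal with 0 in the 1,1 position and ones elsewhere, and uses simulated data with made-up coefficients. Under those assumptions the minimizer solves the linear system (X'X + lam D) b = X'y.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimize ||y - X b||^2 + lam * b' D b with D = diag(0, 1, ..., 1).

    Assumes the first column of X is the intercept, which is left unpenalized.
    """
    p = X.shape[1]
    D = np.eye(p)
    D[0, 0] = 0.0  # 0 in the 1,1 position: no penalty on the intercept
    # Setting the gradient to zero gives (X'X + lam * D) b = X'y
    return np.linalg.solve(X.T @ X + lam * D, X.T @ y)

# Simulated data (hypothetical coefficients, for illustration only)
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=n)

beta_ols = ridge_fit(X, y, 0.0)     # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_fit(X, y, 50.0)  # a larger lam shrinks the slope estimates

print("OLS:  ", beta_ols)
print("ridge:", beta_ridge)
```

With `lam = 0` the fit coincides with ordinary least squares, and increasing `lam` shrinks the norm of the penalized (non-intercept) coefficients.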
-
Statistics 600 Problem Set 4 Due in class on October 10th 1. This problem concerns the situation where a confidence interval for a regression parameter $\beta_k$ is constructed after first confirming that the test of the null hypothesis $\beta_k = 0$ rejects. For simplici
-
Statistics 600 Problem set 1 Due in class on September 18th. 1. Suppose $Y$, $X$, and $Z$ are centered unit vectors in $R^n$ (that is, $1'Y = 1'X = 1'Z = 0$, and $\|Y\| = \|X\| = \|Z\| = 1$), and let $\rho = X'Z$. We regress $Y$ on $X$ to get the fit $\hat\alpha_1 + \hat\beta_1 X$, and we regress Y on Z
-
Statistics 600 Problem Set 3 Due in class on Tuesday, October 23rd 1. Suppose we have a bivariate regression model $Y = \alpha + \beta X + \gamma Z + \epsilon$ where the usual assumptions $E(\epsilon|X, Z) = 0$ and $\mathrm{cov}(\epsilon|X, Z) \propto I$ hold. In addition, for simplicity, assume that EX = EZ = 0
-
Generalized linear models. The key properties of a linear model are that $E(Y|X) = \beta'X$ and $\mathrm{var}(Y|X) \propto 1$. In some cases where these conditions are not met, we can transform $Y$ to rectify things. However it is often difficult to find a transform that simu
-
Statistics 600 Problem Set 3 Due in class on October 3rd 1. This problem asks about two situations where it may be possible to beat a linear unbiased estimator in terms of mean squared error (MSE) for estimating a regression coefficient. Recall that the
-
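Decomposing variance. Pearson correlation. The population Pearson correlation coefficient of two jointly distributed random variables $X$ and $Y$ is $\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$. It is estimated by $\hat\rho_{XY} = \frac{\widehat{\mathrm{cov}}(X, Y)}{\hat\sigma_X \hat\sigma_Y} = \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum_i (X_i - \bar X)^2 \sum_i (Y_i - \bar Y)^2}} = \frac{(X - \bar X)'(Y - \bar Y)}{\|X - \bar X\| \cdot \|Y - \bar Y\|}$.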
-
Statistics 600 Problem Set 1 Due in class on September 19 A "dataset" for this problem will consist of 200 points simulated from the following model: $X_i \sim N(0, 1)$, $\epsilon_i \sim N(0, 1)$, $Y_i = X_i + \epsilon_i$. 1. Conduct a simulation study in which the least squares fit of
-
Prediction. Ridge regression uses the minimizer of a penalized squared error loss function to estimate the regression coefficients: $\hat\beta \equiv \mathrm{argmin}_\beta \|Y - X\beta\|^2 + \lambda\beta' D\beta$. Typically $D$ is a diagonal matrix with 0 in the 1,1 position and ones on the rest of the di
-
Statistics 600 Capstone Project (methodology component) Due December 18th. For this part of the project you will produce a 1-2 page typed writeup of your findings. It is very important to begin by defining a clear and specific aim. You must limit your disc
-
Statistics 600 Exam 1 October 14, 2008 1. (a) Suppose we fit a simple linear regression between $Y$ and $X$. Our goal is to estimate the coefficient $\beta$ in the relationship $E(Y|X) = \alpha + \beta X$ with the least possible variance, where $\mathrm{cov}(Y|X) = \sigma^2 I$ can be assumed to
-
Least Squares Fitting and Inference. Kerby Shedden, August 2007. Definitions and Motivation. Independent variables (predictors, regressors, covariates): $X = (X_1, \ldots, X_p)$. Dependent variable (response, outcome): $Y$. The goal is to learn about an unk
-
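Decomposing variance. Pearson correlation. The population Pearson correlation coefficient of two jointly distributed random variables $X$ and $Y$ is $\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$. It is estimated by $\hat\rho_{XY} = \frac{\widehat{\mathrm{cov}}(X, Y)}{\hat\sigma_X \hat\sigma_Y} = \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum_i (X_i - \bar X)^2 \sum_i (Y_i - \bar Y)^2}}$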
-
Model mis-specification and confounding. Suppose we have a data generating model of the form $Y = \alpha + \beta X + \gamma Z + \epsilon$. The usual assumptions $E(\epsilon|X, Z) = 0$ and $\mathrm{var}(\epsilon|X, Z) = \sigma^2$ hold. The covariate $X$ is observed, but $Z$ is not observable. If we regress Y on X,
-
Diagnostics. Motivation. When working with a linear model with design matrix $X$, we usually assume that $EY \in \mathrm{col}(X)$ and $\mathrm{var}(Y|X) = \sigma^2 I$. Point estimates and inferences may depend on these assumptions approximately holding. Inferences for small sampl
-
Generalized linear models. The key properties of a linear model are that $E(Y|X) = \beta'X$ and $\mathrm{var}(Y|X) \propto 1$. In some cases where these conditions are not met, we can transform $Y$ to rectify things. However it is often difficult to find a transform that
-
Statistics 600 Problem Set 1 Due in class on September 25 1. Prove that the "horizontal residuals" sum to zero based on a least squares fit of $Y$ (as the dependent variable) on $X$ (as the independent variable). The horizontal residuals are the lengths
-
Statistics 600 Problem Set 2 Due in class on Tuesday, October 9th 1. Suppose you are given a $p \times p$ projection matrix of rank $q$, where $\mathrm{col}(P) = S_1$. Given another subspace $S_2$ of dimension $q$, we wish to find a square orthogonal matrix $Q$ such that Q P Q
-
Statistics 600 Exam December 5, 2006 1. Suppose the data generating model is a simple linear model $Y = \alpha + \beta X + \epsilon$ with $E(\epsilon|X) = 0$ and $\mathrm{cov}(\epsilon|X) = \sigma^2 I$. Suppose the sample size is $n$, an even number, and we divide the cases into non-overlapping pairs (i1 ,
-
Statistics 600 Practice Exam 1. For two random variables $A$ and $B$, what is the relationship between the following two situations? (i) $E(A|B) = 0$ for all $B$. (ii) $\mathrm{cov}(A, B) = 0$ and $EA = 0$. For each of the implications (i) $\Rightarrow$ (ii) and (ii) $\Rightarrow$ (i), state whether the impl
-
Statistics 600 Problem Set 6 Due in class on November 2nd 1. Prove that as long as there is an intercept in the model, $P_{ii} \ge 1/n$, where $n$ is the number of cases. As a hint, consider centering each column of the design matrix (except the intercept). So
-
Sampling variance and covariances of residuals 1. Use simulation to demonstrate the formula $\mathrm{cov}(Y - \hat Y) = \sigma^2 (I - P)$. Extensions: Since this formula does not require the errors to be Gaussian, we can try some other mean zero, constant variance error
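One way such a simulation might look is sketched below. It is only an illustration under stated assumptions: the design matrix, the coefficient vector, the error standard deviation `sigma`, and the number of replications `nrep` are all made-up values, and Gaussian errors are used for convenience. The design is held fixed, many response vectors are simulated, and the empirical covariance matrix of the residuals is compared with sigma^2 (I - P), where P is the projection onto the column space of the design matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma, nrep = 8, 3, 2.0, 50_000  # hypothetical settings

# Fixed design matrix with an intercept column
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
P = X @ np.linalg.solve(X.T @ X, X.T)  # projection ("hat") matrix
EY = X @ np.array([1.0, -2.0, 0.5])    # made-up mean structure

# Each row of Y is one simulated response vector for the same design
Y = EY + sigma * rng.normal(size=(nrep, n))

# Residuals Y - Y_hat = (I - P) Y; I - P is symmetric, so right-multiply the rows
resid = Y @ (np.eye(n) - P)

emp_cov = np.cov(resid, rowvar=False)  # empirical n x n covariance of residuals
theory = sigma**2 * (np.eye(n) - P)
max_err = np.abs(emp_cov - theory).max()

print("largest absolute discrepancy:", max_err)
```

Swapping the Gaussian draw for another mean zero, constant variance error distribution (rescaled to the same standard deviation) gives a check of the extension mentioned in the preview.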
-
[Figure: scatterplot; axis label: beta_hat_1]
-
[Figure: profile likelihood plotted against Lambda]
-
[Figure: histogram of the estimated Box-Cox parameter (true value is 1/2); axis labels: Lambda, Frequency]
-
[Figure: Student test score versus Parent education; legend: School 1, School 2, School 3]
-
[Figure: Adjusted student test score versus Adjusted parent income; legend: School 1, School 2, School 3]
-
[Figure: Student test score versus Parent education; legend: School 1, School 2, School 3]
-
[Figure: Adjusted student test score versus Adjusted parent education; legend: School 1, School 2, School 3]
-
[Figure: Student test score versus Parent education; legend: School 1, School 2, School 3]
-
[Figure: Adjusted student test score versus Adjusted parent education; legend: School 1, School 2, School 3]
-
[Figure: Student test score versus Parent education; legend: School 1, School 2, School 3]
-
[Figure: Adjusted student test score versus Adjusted parent education; legend: School 1, School 2, School 3]
-
[Figure: Student test score versus Parent education; legend: School 1, School 2, School 3]
-
[Figure: Adjusted student test score versus Adjusted parent education; legend: School 1, School 2, School 3]
-
[Figure: Student test score versus Parent income; legend: School 1, School 2, School 3]
-
[Figure: scatterplot; axis labels: Fitted values, Studentized residuals]
-
[Figure: scatterplot; axis labels: Fitted values, Studentized residuals]
-
[Figure: scatterplot; axis labels: X2, Studentized residuals]
-
[Figure: quantile-quantile plot; axis labels: Sample Quantiles, Theoretical Quantiles]
-
[Figure: convexity illustration comparing Q at a convex combination of v and w with the same convex combination of Q(v) and Q(w)]
-
[PostScript figure: fig-convexity.ps, created with matplotlib 0.90.1, Wed Sep 5 12:26:58 2007]
-
[Figure: Y versus X; annotations: "Underestimate slope, overestimate intercept" and "Overestimate slope, underestimate intercept"]
-
[PostScript figure: fig-cov_int_slope.ps, created with matplotlib 0.98.0, Thu Sep 4 12:11:17 2008]
-
[Figure: Y versus X; legend: Fitted line, Population line]
-
[PostScript figure: fig-masking-2.ps, created with matplotlib 0.90.1, Sat Oct 20 11:10:16 2007]
-
[Figure: Y versus X; legend: Fitted line, Population line]
-
[PostScript figure: fig-masking.ps, created with matplotlib 0.90.1, Sat Oct 20 11:13:50 2007]
-
[Figure: scatterplot of Y against X]
-
[Figure: probability curves P(Y=1|X) and P(Y=0|X) plotted against X]
-
[PostScript figure: glim-1.ps, created with matplotlib 0.90.1, Tue Dec 4 00:15:06 2007]
-
[Figure: histogram; axis labels: Frequency, beta_1]
-
Random effects models and mixed effects models The random intercepts model for clustered data Suppose observations are made in known clusters, where any pair of values within a cluster are similar beyond what can be explained by covariates. If we as