stat331-tutorial1_f15 - STAT 331 Applied Linear Models Tutorial 1 Solutions version 07:53:12 Q1 Install R on your computer which can be downloaded from

# stat331-tutorial1_f15 - STAT 331 Applied Linear Models...

This preview shows page 1 - 4 out of 9 pages.

STAT 331: Applied Linear Models Tutorial 1: Solutions version: 2015-09-24 · 07:53:12 Q1. Install R on your computer, which can be downloaded from here . Note: the default interface to R is extremely cumbersome and you should never use it. Instead, use RStudio as your interface. Solution: In addition to downloading R and RStudio , here are a couple of other pointers to get started: The official Intro to R manual. A reference sheet of commonly used functions. Typing ? before any function, e.g., ?cov , pulls up the help page on that function. Q2. Use R to generate N = 1000 random variables x 1 , . . . , x N iid ∼ N (0 , 5 2 ). Then, with parameters β 0 = 1, β 1 = 2, σ = 3 . 5, generate data from the linear model y i = β 0 + β 1 x i + ǫ i , ǫ i iid ∼ N (0 , σ 2 ) . (1) Solution: The function to simulate Normals is called rnorm : args (rnorm) function (n, mean = 0, sd = 1) NULL That is, the function generates n iid Normals with mean mean and standard deviation sd , which by default take the values 0 and 1 respectively. The code to generate x 1 , . . . , x 1000 iid N (0 , 25) is thus N <- 1000 x <- rnorm(N, sd = 5) Note that R functions accept arguments in many different ways. If you don’t name the argument, the default order is assumed (i.e., the first argument was n = N ). If you don’t specify an argument, the default value is assumed (i.e., mean = 0 ). Finally, you can spec- ify arguments by name and then change the order. For example an equivalent way of generating the Normals is: x <- rnorm(mean = 0, sd = 5, n = N) 1 Tutorial 1: Solutions The vector x now contains 1000 iid Normals: length(x) # length of vector  1000 mean(x) # sample mean  -0.00191465 sd(x) # sample standard deviation  5.08595 head (x) # first 6 elements  1.992412 2.458632 2.366446 -1.733301 -3.962583 2.719608 Next, let’s simulate the response variables y i ind N ( β 0 + β 1 x i , σ 2 ), with ( β 0 , β 1 , σ ) = (1 , 2 , 3 . 5). In many programming languages, you would do this using a for-loop: # parameter values beta0 <- 1 beta1 <- 2 sigma <- 3.5 # generate response y <- rep( NA , len = N) # declare a vector of length N for (ii in 1:N) { y[ii] <- rnorm(n = 1, mean = beta0 + beta1 * x[ii], sd = sigma) } In R , however, many of the functions have built-in vectorization. That is, you could simply do # parameter values beta0 <- 1 beta1 <- 2 sigma <- 3.5 # generate response y <- rnorm(N, mean = beta0 + beta1*x, sd = sigma) The space for y is automatically allocated, an R knows to use each value of beta0 + beta1*x as the mean and the common value of sigma to generate each individual y i . The latter method is both shorter to code and actually computes faster than the former. Use vector- ization as much as possible instead of for-loops in R for all-around better programming. a. Calculate the MLE’s ˆ β 0 and ˆ β 1 . Solution: The MLE’s of β 0 and β 1 can be calculated as follows: # MLE of beta1 xbar <- mean(x) ybar <- mean(y) Sxx <- sum((x-xbar)^2) 2 STAT 331: Applied Linear Models Sxy <- sum((y-ybar)*(x-xbar)) beta1.hat <- Sxy/Sxx # or equivalently beta1.hat <- cov(x,y)/var(x) # each of these just divides by N-1 beta1.hat - Sxy/Sxx # check that they’re the same  -4.440892e-16 # MLE of beta0 beta0.hat <- mean(y) - beta1.hat * mean(x) Here’s a trick to display the MLEs nicely which might come in handy for assignments.  #### You've reached the end of your free preview.

Want to read all 9 pages?

• • •  