COS424/SML 302 Linear Regression February 25, 2019 27 / 46

Evaluating the fit of our model: example

On our shoe size, height data:

            linear model    no intercept    remove outlier
  RSS          815.3           1188.3           453.9
  MSE            6.42             9.36             3.60
  RMSE           2.53             3.06             1.90
  r^2            0.60             0.42             0.75
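As a minimal sketch of how these four quantities relate, the function below computes them from observed responses and fitted values. It assumes MSE = RSS / n (consistent with the table, where RSS / MSE is roughly 127 in each column) and r^2 = 1 - RSS / TSS, a common definition that these slides do not state here and that is debatable for a model without an intercept. The function name `fit_metrics` and the toy numbers are hypothetical, not the shoe size data.

```python
import math


def fit_metrics(y, y_hat):
    """Return RSS, MSE, RMSE and r^2 for observed y and fitted y_hat."""
    n = len(y)
    # Residual sum of squares: squared gaps between data and fit.
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    mse = rss / n              # assumption: divide by n, not n - p
    rmse = math.sqrt(mse)      # same units as the response
    y_bar = sum(y) / n
    # Total sum of squares around the mean response.
    tss = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - rss / tss       # assumption: r^2 = 1 - RSS / TSS
    return {"RSS": rss, "MSE": mse, "RMSE": rmse, "r2": r2}


# Made-up toy data, only to exercise the function.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
metrics = fit_metrics(y, y_hat)
print(metrics)
```

Because RMSE is the square root of MSE, it can be read in the units of the response, which is why the table reports both.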

Multiple predictors: multivariate linear regression

What happens if we have more than one predictor, or feature, that we can use to predict response y?

Examples of multivariate linear regression models:

  • x = age, gender and y = height
  • x = distance from the shore, depth of water and y = weight of clams
  • x = gestational age, mother's age and y = birthweight
  • x = cigarette smoker, BMI and y = lifespan
  • x = presence of a dam, water temperature and y = fish weight
  • x = disposable income, education and y = total consumption
  • x = genotype at 20 million genomic loci and y = hip-to-waist ratio

We are increasing the number of features, not the number of samples.

Definitions: Multivariate regression

For samples $i = 1, \dots, n$ and $p$ predictors:

  • $y_i \in \mathbb{R}$: response (observed)
  • $x_i \in \mathbb{R}^p$: predictors, covariates, or explanatory variables (observed)
  • $\beta \in \mathbb{R}^p$: coefficients, effects (parameter)
  • $\epsilon_i \in \mathbb{R}$: residual error, noise

[Figure: graphical model with nodes X and Y.]

Univariate versus multivariate regression

  • Univariate regression: p = 1, a single covariate
  • Multivariate regression: p > 1, multiple covariates

Linear model specification

For p predictors and n samples, we define multivariate linear regression.

Gaussian multivariate linear regression

A Gaussian linear regression model has the form:

$$y = x^T \beta + \epsilon = \beta_0 + x_1 \beta_1 + \cdots + x_p \beta_p + \epsilon$$

where $\epsilon \sim \mathcal{N}(0, \sigma^2)$.

[Figure: generative versus discriminative graphical models over nodes Z, X, and Y, with plates p and n.]

This is equivalent to $y \mid x^T, \beta, \sigma^2 \sim \mathcal{N}(x^T \beta, \sigma^2)$.

Note the v-structure in the model. What makes this tractable for large p?
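To make the model concrete, here is a small sketch that draws responses from it: the mean is the linear predictor and the noise is Gaussian with standard deviation sigma. The helper names `linear_predictor` and `simulate_response`, the coefficient values, and the seed are all hypothetical choices for illustration. The intercept is passed separately here, which is one way to read the expanded form of the equation.

```python
import random


def linear_predictor(x, beta, beta0):
    """Mean of y given x: beta0 + x_1*beta_1 + ... + x_p*beta_p."""
    return beta0 + sum(xj * bj for xj, bj in zip(x, beta))


def simulate_response(x, beta, beta0, sigma, rng):
    """Draw one y from N(linear predictor, sigma^2)."""
    return linear_predictor(x, beta, beta0) + rng.gauss(0.0, sigma)


# Hypothetical values with p = 2 predictors.
rng = random.Random(0)      # fixed seed so the run is repeatable
x = [0.5, 3.0]
beta = [2.0, -1.0]
beta0 = 1.0

draws = [simulate_response(x, beta, beta0, 0.5, rng) for _ in range(10000)]
sample_mean = sum(draws) / len(draws)
print(linear_predictor(x, beta, beta0), sample_mean)
```

With many draws at a fixed x, the sample mean should sit close to the linear predictor, which is what the conditional form of the model says.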

Multivariate regression assumptions

These models assume over-simplified data:

  • predictors x are treated as fixed value RVs. We do not care about their distribution.
  • y is a weighted linear combination of the x values
  • the variance term is not a function of x (homoskedasticity)
  • the residual errors are independent
  • the predictors are independent

Linear regression, even with these assumptions, is one of our most important data analysis tools.

Parameter estimation in linear regression

Let's discuss how to estimate the coefficients $\beta$, $\beta_0$ in the univariate model, with data set $D = \{(x_1, y_1), \dots, (x_n, y_n)\}$.

First, we will try to derive the maximum likelihood estimate (MLE); recall:

  • write the log likelihood
  • differentiate with respect to $\beta$
  • set equal to 0 and solve for $\beta$.

Recall: why is the parameter value at the 0 point of the derivative the parameter MLE?

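MLE parameter estimation in linear regression

The likelihood is written as a Gaussian conditional distribution:

$$y \mid x, \beta, \sigma^2 \sim \prod_{i=1}^{n} \mathcal{N}(x_i \beta, \sigma^2).$$

Log likelihood for univariate linear regression

$$
\begin{aligned}
\ell(\beta; D) &= \log \prod_{i=1}^{n} \left[ \left( \frac{1}{2\pi\sigma^2} \right)^{1/2} \exp\left( -\frac{1}{2\sigma^2} (y_i - \beta x_i)^2 \right) \right] \\
&= \sum_{i=1}^{n} \log \left[ \left( \frac{1}{2\pi\sigma^2} \right)^{1/2} \exp\left( -\frac{1}{2\sigma^2} (y_i - \beta x_i)^2 \right) \right] \\
&= -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta x_i)^2 \\
&= -\frac{1}{2\sigma^2} \mathrm{RSS}(\beta, D) + c
\end{aligned}
$$

Here $\mathrm{RSS}(\beta, D) = \sum_{i=1}^{n} (y_i - \beta x_i)^2$, and $c = -\frac{n}{2} \log(2\pi\sigma^2)$ does not depend on $\beta$.

We maximize the log likelihood by minimizing the residual sum of squares (RSS).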
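Carrying the recipe one step further for the slope-only model written above: differentiating gives $\partial \ell / \partial \beta = \frac{1}{\sigma^2} \sum_{i=1}^{n} x_i (y_i - \beta x_i)$, and setting this to 0 yields $\hat{\beta} = \sum_i x_i y_i / \sum_i x_i^2$. The second derivative is $-\frac{1}{\sigma^2} \sum_i x_i^2 < 0$ whenever some $x_i \neq 0$, so this stationary point is a maximum. The sketch below checks this numerically. It is an illustration under those assumptions, not the course's own code, and the function names and toy data are hypothetical.

```python
def rss(beta, xs, ys):
    """Residual sum of squares for the slope-only model y = beta * x."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys))


def mle_slope_no_intercept(xs, ys):
    """Closed-form maximizer of the log likelihood: sum(x*y) / sum(x^2)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x are zero, so the slope is not identifiable")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


# Made-up toy data, roughly y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.1, 5.9]
beta_hat = mle_slope_no_intercept(xs, ys)

# Nudging beta in either direction should not lower the RSS.
print(beta_hat, rss(beta_hat, xs, ys))
print(rss(beta_hat - 0.01, xs, ys), rss(beta_hat + 0.01, xs, ys))
```

Note that the estimate does not depend on $\sigma^2$: the noise variance scales the log likelihood but does not move its maximizer in $\beta$.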