STAT 100B Note 2

1 Topics

(1) Linear regression
(2) Maximum likelihood estimation
(3) Bayesian decision rule and likelihood ratio test

2 Mathematical preparation

(1) Let $x_i = (x_{i1}, \dots, x_{ip})^T$ and $\beta = (\beta_1, \dots, \beta_p)^T$ be two column vectors. Then
$$x_{i1}\beta_1 + \dots + x_{ip}\beta_p = \sum_{j=1}^{p} x_{ij}\beta_j = \langle x_i, \beta \rangle = x_i^T \beta.$$

(2) Let $f(\beta) = x_i^T \beta$; then $\partial f(\beta)/\partial \beta_j = x_{ij}$ for $j = 1, \dots, p$. Define $f'(\beta) = (\partial f(\beta)/\partial \beta_1, \dots, \partial f(\beta)/\partial \beta_p)^T$ to be a column vector; then $f'(\beta) = (x_{i1}, \dots, x_{ip})^T = x_i$.

(3) Let $g(\beta) = h(x_i^T \beta)$; then according to the chain rule,
$$\partial g(\beta)/\partial \beta_j = h'(x_i^T \beta)\, \partial (x_i^T \beta)/\partial \beta_j = h'(x_i^T \beta)\, x_{ij},$$
$$g'(\beta) = (\partial g(\beta)/\partial \beta_1, \dots, \partial g(\beta)/\partial \beta_p)^T = (h'(x_i^T \beta)\, x_{i1}, \dots, h'(x_i^T \beta)\, x_{ip})^T = h'(x_i^T \beta)\, x_i.$$

(4) $\langle x_i, \beta \rangle = \|x_i\| \|\beta\| \cos\theta$, where $\theta$ is the angle between $x_i$ and $\beta$. So $x_i^T \beta / \|\beta\| = \|x_i\| \cos\theta$ can be considered the projection of the vector $x_i$ onto the vector $\beta$.

3 Linear regression

(1) Simplest regression model: $y_i = \beta x_i + \epsilon_i$, $i = 1, \dots, n$. The residual sum of squares is
$$R(\beta) = \sum_i (y_i - \beta x_i)^2.$$
According to the least squares principle, we want to estimate $\beta$ by minimizing $R(\beta)$. Taking the derivative,
$$R'(\beta) = -2 \sum_i (y_i - \beta x_i)\, x_i.$$
By setting $R'(\beta) = 0$, we solve
$$\hat\beta = \frac{\sum_i x_i y_i}{\sum_i x_i^2}.$$

(2) Simple regression model: $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, $i = 1, \dots, n$. The residual sum of squares is
$$R(\beta_0, \beta_1) = \sum_i [y_i - (\beta_0 + \beta_1 x_i)]^2.$$
We want to estimate $\beta_0$ and $\beta_1$ by minimizing $R(\beta_0, \beta_1)$. The partial derivative with respect to $\beta_0$ is
$$\frac{\partial R}{\partial \beta_0} = -2 \sum_i [y_i - (\beta_0 + \beta_1 x_i)].$$
By setting $\partial R/\partial \beta_0 = 0$, we have, at the minimum,
$$\sum_i y_i = \sum_i (\beta_0 + \beta_1 x_i), \quad \text{so} \quad \bar y = \beta_0 + \beta_1 \bar x.$$
So $(\bar x, \bar y)$ lies on the least squares regression line. Thus $\beta_0 = \bar y - \beta_1 \bar x$ at the minimum of $R$. Substituting $\beta_0$ in $R$, we have
$$R = \sum_i [y_i - (\bar y - \beta_1 \bar x + \beta_1 x_i)]^2 = \sum_i [(y_i - \bar y) - \beta_1 (x_i - \bar x)]^2 = \sum_i (\tilde y_i - \beta_1 \tilde x_i)^2,$$
where $\tilde y_i = y_i - \bar y$ and $\tilde x_i = x_i - \bar x$.
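So the least squares problem for the simple regression is transformed into the least squares problem for the simplest regression, applied to the centered data:
$$\hat\beta_1 = \frac{\sum_i \tilde x_i \tilde y_i}{\sum_i \tilde x_i^2}, \quad \text{and} \quad \hat\beta_0 = \bar y - \hat\beta_1 \bar x.$$

(3) Geometry and correlation: Let $\mathbf{X}$ be a column vector whose $i$th component is $x_i - \bar x$, and let $\mathbf{Y}$ be a column vector whose $i$th component is $y_i - \bar y$, for $i = 1, \dots, n$. The sample correlation between $(x_i, i = 1, \dots, n)$ and $(y_i, i = 1, \dots, n)$ is
$$\frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum_i (x_i - \bar x)^2}\, \sqrt{\sum_i (y_i - \bar y)^2}} = \frac{\sum_i \tilde x_i \tilde y_i}{\sqrt{\sum_i \tilde x_i^2}\, \sqrt{\sum_i \tilde y_i^2}} = \frac{\langle \mathbf{X}, \mathbf{Y} \rangle}{\|\mathbf{X}\| \|\mathbf{Y}\|} = \cos\theta \; \dots$$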
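The closed-form least squares estimates and the cosine form of the sample correlation can be checked numerically. The following is a minimal Python sketch and is not part of the original note: the function names `fit_simplest`, `fit_simple`, and `sample_correlation`, as well as the toy data, are illustrative assumptions. It fits the no-intercept model directly, fits the model with an intercept by centering the data and reusing the no-intercept formula, and computes the correlation as the cosine of the angle between the two centered vectors.

```python
import math


def fit_simplest(x, y):
    """Least squares slope for y_i = beta * x_i + eps_i (no intercept):
    beta_hat = sum(x_i * y_i) / sum(x_i ** 2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def center(v):
    """Subtract the sample mean from every component of v."""
    mean = sum(v) / len(v)
    return [vi - mean for vi in v]


def fit_simple(x, y):
    """Least squares (beta0_hat, beta1_hat) for y_i = beta0 + beta1 * x_i + eps_i.
    The slope comes from the no-intercept formula applied to centered data;
    the intercept follows from y_bar = beta0 + beta1 * x_bar."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    beta1 = fit_simplest(center(x), center(y))
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


def sample_correlation(x, y):
    """Cosine of the angle between the centered vectors of x and y."""
    xc, yc = center(x), center(y)
    dot = sum(a * b for a, b in zip(xc, yc))
    norm_x = math.sqrt(sum(a * a for a in xc))
    norm_y = math.sqrt(sum(b * b for b in yc))
    return dot / (norm_x * norm_y)


# Toy data (hypothetical) lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

beta0, beta1 = fit_simple(x, y)
r = sample_correlation(x, y)
print(beta0, beta1, r)
```

Because the toy points lie exactly on a line with positive slope, the fitted intercept and slope should recover 1 and 2 up to floating-point rounding, and the correlation should be close to 1, meaning the two centered vectors point in the same direction.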
This note was uploaded on 03/30/2011 for the course STAT 100B taught by Professor Wu during the Spring '11 term at UCLA.