

STAT 331 notes: Applied Linear Models
Johnew Zhang
November 10, 2012

Contents

1 Introduction
2 Review of Simple Linear Regression Model
    2.1 The Model
    2.2 Least Square Estimation (LSE)
    2.3 The Properties of $\hat{\beta}_0$ and $\hat{\beta}_1$
    2.4 The Estimation of $\sigma^2$
    2.5 Confidence Interval and Hypothesis Testing
        2.5.1 The t-test Statistic
    2.6 Prediction for Future Values
    2.7 Analysis of Variance (ANOVA)
        2.7.1 Some distribution result (when $H_0$ is true)
        2.7.2 Terminology of ANOVA
3 Review of Random Vectors and Matrix Algebra
    3.1 Definition
    3.2 Basic Properties
    3.3 Differentiating Over Linear and Quadratic Forms
    3.4 Some Useful Results on Matrix
4 Multiple Linear Regression
    4.1 Multivariate Normal Distribution
    4.2 The Regression Model
    4.3 LSE of $\beta$
    4.4 An estimation of $\sigma^2$
    4.5 Sampling Distribution of $\hat{\beta}$, $\hat{\sigma}^2$ under Normality
    4.6 Prediction
    4.7 ANOVA Table
        4.7.1 ANOVA Table
5 Model evaluation and residual analysis
    5.1 Model and Model Assumptions
    5.2 Relationship between residuals and random errors
        5.2.1 Statistical Properties of $r$ ($r = (I - H)\epsilon$)
    5.3 Residual plots for checking $E(\epsilon_i) = 0$
        5.3.1 Residual versus $x_j$
        5.3.2 Partial residuals versus $x_j$
        5.3.3 Added-variable plots
    5.4 Residual plots for checking constant variance $V(\epsilon_i) = \sigma^2$
    5.5 Residual plots for detecting correlation in $\epsilon_i$'s
6 Model Evaluation: Data Transformations
    6.1 The Box-Cox Transformation
    6.2 Logarithmic Transformation
        6.2.1 Logarithmic Transformation of y only
        6.2.2 Logarithmic transformation of all variables
        6.2.3 Logarithmic transformation of y and some x's
        6.2.4 95% CI for transformed estimate
    6.3 Transformation for Stabilizing Variance
    6.4 Some Remedies for non-linearity: Polynomial Regression
7 Model Evaluation - Outliers and Influential Case
    7.1 Outlier
    7.2 Hat Matrix and Leverage
    7.3 Cook's Distance
8 Model Building and Selection
    8.1 More Hypothesis Testing
        8.1.1 Testing some but not all $\beta$'s
    8.2 Extra Sum of Square Principle
        8.2.1 ANOVA Table for Testing Some $\beta$'s
        8.2.2 The general linear hypothesis
    8.3 Categorical Predictors and Interaction Terms
        8.3.1 Binary predictor
        8.3.2 Hypothesis Testing of Interaction Term
        8.3.3 Categorical predictor with more than 2 levels

1 Introduction

Regression deals with the functional relationship between a response (or outcome) variable $y$ and one or more explanatory variables (or predictor variables) $x_1, \dots, x_p$. A general expression for a regression model is

$$y = f(x_1, \dots, x_p) + \epsilon,$$

where the function $f(x_1, \dots, x_p)$ represents the deterministic relationship between $y$ and $x_1, \dots, x_p$, and the extra term $\epsilon$ is an error term, also called noise. It represents unexplained variation in $y$ due to other factors.

Applications      y                      x's
global climate    surface temperature    GHG
finance           stock price index      unemployment rate, CPI, etc.
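
To make the decomposition of $y$ into a deterministic part and an error term concrete, the following is a minimal simulation sketch in Python. It is not taken from the notes. The linear form of f, its coefficients, the range of the x values, and the choice of normally distributed errors with mean zero are all assumptions made purely for illustration.

```python
import random


def f(x1, x2):
    """Deterministic part of the model.

    Hypothetical choice: a linear function with made-up coefficients.
    """
    return 2.0 + 0.5 * x1 - 1.5 * x2


def simulate(n, sigma, seed=0):
    """Draw n observations from y = f(x1, x2) + eps.

    Assumption for this sketch: eps is normal with mean 0 and
    standard deviation sigma, independent across observations.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.uniform(0.0, 10.0)   # explanatory variable 1 (arbitrary range)
        x2 = rng.uniform(0.0, 10.0)   # explanatory variable 2 (arbitrary range)
        eps = rng.gauss(0.0, sigma)   # unexplained variation (noise)
        y = f(x1, x2) + eps           # response = deterministic part + error
        rows.append((x1, x2, y, eps))
    return rows


data = simulate(n=1000, sigma=1.0)
mean_eps = sum(row[3] for row in data) / len(data)
print(f"average error over {len(data)} draws: {mean_eps:.3f}")
```

In real data only $y$ and the $x$'s are observed; the error term is recorded here only because the data are simulated. Under the stated assumptions the average error should be close to zero, which is the kind of condition the later sections on residual analysis set out to check.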
