Stat 108: Linear Regression Lecture 1 January 06, 2014

Overview of Regression Analysis

Regression analysis is a statistical methodology used to (i) describe the relationship between a response variable Y and a predictor variable X, or a set of predictor variables, and (ii) predict the former from the latter.

Simple regression: only one predictor variable (Part I).
Multiple regression: more than one predictor variable (Part II).

History and Origin

Regression analysis was first developed by Francis Galton (1822-1911) in the 19th century, in his study of family resemblances. He noted that children’s heights tend to be more moderate (closer to the average) than their parents’ heights, an effect he called “regression to mediocrity”.

Galton’s 1885 study: The variables are the height of the adult child and the midparent height, defined as the average of the height of the father and the (adjusted) height of the mother. The number of cases is 928, representing 928 children and their 205 pairs of parents. Heights of women were adjusted by multiplying by 1.08 so that men’s and women’s heights would have the same mean.
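
The midparent definition above can be written out as a short computation. This is only a sketch: the function name and the example heights are hypothetical, and the only value taken from the lecture is the factor 1.08.

```python
# Adjustment factor for women's heights, as stated in the lecture.
FEMALE_HEIGHT_FACTOR = 1.08


def midparent_height(father_height, mother_height):
    """Average of the father's height and the adjusted mother's height.

    Both heights must be in the same unit (e.g. inches).
    """
    adjusted_mother = FEMALE_HEIGHT_FACTOR * mother_height
    return (father_height + adjusted_mother) / 2.0


# Hypothetical example heights in inches (not taken from Galton's data).
print(f"{midparent_height(70.0, 65.0):.2f}")  # prints 70.10
```

With these made-up inputs the mother's 65 inches is rescaled to 70.2 inches before averaging, which is why the midparent height lands so close to the father's height.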

     Child      Midparent
1    61.57220   70.07404
2    61.24382   68.22505
3    61.90968   65.12639
4    61.85769   64.23529
5    61.44986   63.88177
6    62.00005   67.02702
......
Figure: Scatter plot of child’s height against parent’s height
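
To make the two goals of regression (describe, then predict) concrete, here is a minimal sketch that fits a straight line to the six rows displayed above. It assumes the ordinary least squares criterion, which this part of the lecture has not yet introduced, and all function names are hypothetical. Six rows out of 928 are far too few to say anything about Galton's result, so the fitted numbers only illustrate the mechanics.

```python
# The six rows displayed in the lecture (heights in inches).
child = [61.57220, 61.24382, 61.90968, 61.85769, 61.44986, 62.00005]
midparent = [70.07404, 68.22505, 65.12639, 64.23529, 63.88177, 67.02702]


def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least squares line y = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Response Y: child's height. Predictor X: midparent height.
intercept, slope = fit_simple_regression(midparent, child)


def predict(midparent_value):
    """Predicted child's height for a given midparent height."""
    return intercept + slope * midparent_value


print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
print(f"predicted child height at midparent 68.0: {predict(68.0):.4f}")
```

Because there is a single predictor, this is a simple regression in the sense of Part I. A least squares line always passes through the point of means, which gives a quick sanity check on any fit.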

