Linear Regression Analysis
• Correlation
• Simple Linear Regression
• The Multiple Linear Regression Model
• Least Squares Estimates
• R² and Adjusted R²
• Overall Validity of the Model (F test)
• Testing for Individual Regressors (t test)
• Problem of Multicollinearity
Gaurav Garg (IIM Lucknow)
Smoking and Lung Capacity
• Suppose, for example, we want to investigate the relationship between cigarette smoking and lung capacity.
• We might ask a group of people about their smoking habits and measure their lung capacities.

Cigarettes (X)    Lung Capacity (Y)
0                 45
5                 42
10                33
15                31
20                29
• Scatter plot of the data
[Figure: scatter plot of Lung Capacity (Y) against Cigarettes (X)]
• We can see that as smoking goes up, lung capacity tends to go down.
• The two variables change in opposite directions.
Height and Weight
• Consider the following data on the heights and weights of 5 women swimmers:

Height (inches):  62    64    65    66    68
Weight (pounds):  102   108   115   128   132

• We can observe that weight tends to increase with height.
[Figure: scatter plot of Weight against Height]
• Sometimes two variables are related to each other.
• The values of the two variables are paired.
• A change in the value of one affects the value of the other.
• Usually these two variables are two attributes of each member of the population.
• For example:

Height                     Weight
Advertising Expenditure    Sales Volume
Unemployment               Crime Rate
Rainfall                   Food Production
Expenditure                Savings
• Properties of Covariance:
    Cov(X+a, Y+b) = Cov(X, Y)  [not affected by a change in location]
    Cov(aX, bY) = ab Cov(X, Y)  [affected by a change in scale]
    Covariance can take any value from -∞ to +∞.
    Cov(X,Y) > 0 means X and Y change in the same direction.
    Cov(X,Y) < 0 means X and Y change in opposite directions.
    If X and Y are independent, Cov(X,Y) = 0  [the converse may not be true]
• Covariance is not unit free.
• So it is not a good measure of the relationship between two variables.
• A better measure is the correlation coefficient.
• The correlation coefficient is unit free and takes values in [-1,+1].
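The covariance properties listed above can be checked numerically. The following is a minimal sketch, assuming the sample covariance with a 1/n divisor; the helper names `mean` and `covariance` and the constants `a`, `b`, `p`, `q` are illustrative choices, not part of the slides. It uses the smoking data and a small constructed example (Y = X² on symmetric X values) in which the covariance is zero although Y is fully determined by X.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    # Sample covariance with a 1/n divisor (an assumption of this sketch).
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

cigarettes = [0, 5, 10, 15, 20]
lung_capacity = [45, 42, 33, 31, 29]

base = covariance(cigarettes, lung_capacity)

# Change of location: adding constants leaves the covariance unchanged.
a, b = 7, -3
shifted = covariance([x + a for x in cigarettes],
                     [y + b for y in lung_capacity])

# Change of scale: multiplying by p and q multiplies the covariance by p * q.
p, q = 2, -3
scaled = covariance([p * x for x in cigarettes],
                    [q * y for y in lung_capacity])

# Zero covariance without independence: Y = X^2 on symmetric X values.
xs = [-1, 0, 1]
ys = [x * x for x in xs]
zero_cov_but_dependent = covariance(xs, ys)

print(base, shifted, scaled, zero_cov_but_dependent)
```

The negative value of `base` matches the observation that smoking and lung capacity move in opposite directions, and `scaled` changes sign because p and q have opposite signs.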
Correlation
• Karl Pearson's correlation coefficient is given by
$$ r_{XY} = \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} $$
• When the joint distribution of X and Y is known:
$$ \mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y), \quad \mathrm{Var}(X) = E(X^2) - [E(X)]^2, \quad \mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 $$
• When observations on X and Y are available:
$$ \mathrm{Cov}(X, Y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}), \quad \mathrm{Var}(X) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2, \quad \mathrm{Var}(Y) = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2 $$
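As an illustration of the observation-based formulas, the sketch below computes Pearson's correlation coefficient for the two small data sets introduced earlier (cigarettes vs. lung capacity, and height vs. weight of the five swimmers). The function names are illustrative, and the 1/n divisor is used for both covariance and variance.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    # (1/n) * sum of (x_i - x_bar) * (y_i - y_bar)
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    # Var(X) is the covariance of X with itself.
    return covariance(xs, xs)

def correlation(xs, ys):
    # r_XY = Cov(X, Y) / sqrt(Var(X) * Var(Y))
    return covariance(xs, ys) / sqrt(variance(xs) * variance(ys))

cigarettes = [0, 5, 10, 15, 20]
lung_capacity = [45, 42, 33, 31, 29]

heights = [62, 64, 65, 66, 68]            # inches
weights = [102, 108, 115, 128, 132]       # pounds

r_smoking = correlation(cigarettes, lung_capacity)
r_swimmers = correlation(heights, weights)

print(f"smoking vs. lung capacity: r = {r_smoking:.4f}")
print(f"height vs. weight:         r = {r_swimmers:.4f}")
```

For the smoking data the intermediate values are Cov(X, Y) = -43, Var(X) = 50 and Var(Y) = 40, so r = -43 / √2000, roughly -0.96. The swimmers' data gives a value of roughly +0.96.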
Properties of Correlation Coefficient
• Corr(aX+b, cY+d) = Corr(X, Y), provided a and c have the same sign (if their signs differ, the sign of the correlation flips).
• It is unit free.
• It measures the strength of the relationship on a scale of -1 to +1.
• So, it can be used to compare the relationships of various pairs of variables.
• Values close to 0 indicate little or no correlation.
• Values close to +1 indicate very strong positive correlation.
• Values close to -1 indicate very strong negative correlation.
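The unit-free property can be seen by converting the swimmers' data to other units. The sketch below is illustrative only: the function names are hypothetical, and the conversions (inches to centimetres, pounds to kilograms) are one example of a transformation aX+b, cY+d with a and c both positive. It also shows the sign flip when a is negative.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def correlation(xs, ys):
    # Pearson's r; the 1/n factors in Cov and Var cancel, so sums suffice.
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights_in = [62, 64, 65, 66, 68]
weights_lb = [102, 108, 115, 128, 132]

# Same data in metric units: a = 2.54, c = 0.45359237, b = d = 0.
heights_cm = [2.54 * h for h in heights_in]
weights_kg = [0.45359237 * w for w in weights_lb]

r_imperial = correlation(heights_in, weights_lb)
r_metric = correlation(heights_cm, weights_kg)

# A negative multiplier on one variable reverses the sign of r.
r_flipped = correlation([-h for h in heights_in], weights_lb)

print(r_imperial, r_metric, r_flipped)
```

Up to floating-point rounding, `r_imperial` and `r_metric` agree, while `r_flipped` has the same magnitude with the opposite sign.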
- Spring '15
- Regression Analysis, Gaurav Garg