PS18_310_SOLN

Course: ECO 310, Fall 2009
School: Toledo
Word Count: 1267

ECO310Y: Problem Set 18 SOLUTIONS

1. Please ask any questions you have about reading STATA regression output during tutorials.

2. Solutions are not relevant for this self-directed activity.

3. (a)
i. 100 observations
ii. 4 variables
iii. y has the largest standard deviation, but this comparison depends on the units of measurement. Since we do not know that all four variables are measured in the same units, we can look at the coefficient of variation (cv): standard deviation / mean. Based on a comparison of the cv, x3 is the most variable. One could also look at the range of each variable (max - min); according to that statistic, y would be the most variable.
iv. Absolutely not.

(b)
i. x1 and x2 are strongly positively correlated with each other. The coefficient of correlation ranges from -1 (perfect negative correlation) through 0 (no correlation) to 1 (perfect positive correlation). A value of 0.8704 indicates that x1 and x2 are strongly positively correlated. There is a weak positive correlation between x3 and y. There are very weak positive correlations between x1 and x3 and between x2 and x3. In fact, we would say that x1 and x3 are not correlated and that x2 and x3 are not correlated.
ii. y and x2 are strongly negatively correlated, with a coefficient of correlation equal to -0.7225. y and x1 are moderately negatively correlated.

(c)
i. $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \varepsilon_i$
ii. It is 3 because there are 3 independent (explanatory) variables: x1, x2, and x3.
iii. R-squared is 0.5994, which means that 59.94% of the variation in y is explained by variation in x1, x2, and x3.
iv. To test the overall statistical significance of the model, do an F-test. The F statistic is 47.88 with numerator degrees of freedom of 3 and denominator degrees of freedom of 96, which we could compare to the critical value found in a statistical table. However, there is no need because Stata also reports the p-value of this test: Prob > F = 0.0000.
This means that the probability of making a Type I error (rejecting a true null hypothesis) is basically zero. Hence, we would clearly reject the null hypothesis that the model has no explanatory power (i.e. that all of the slope coefficients are zero) and conclude that this linear regression model is statistically significant.
v. The Root MSE is 3.7989, which means that the estimate of $\sigma^2$ is 14.43 ($= 3.7989^2$); you could also have read this directly off the ANOVA table. This is a measure of the variance of the error term: the unobservables ($\varepsilon$). The larger the variance of the error term relative to the variance of the dependent variable, the more the dependent variable is being determined by the unobservables. Hence, a relatively large error variance means that our model is failing to explain much about the dependent variable: most of the explanation lies with the error. We see that relative to the variance of y, which is 34.93 (this can be read directly off the ANOVA table, or computed from the summary of the data above as $5.91038^2$), the variance of the error is quite large. Of course, this fact is conveniently summarized in the R-squared statistic already discussed.
vi. All three. The t-stats in each case are greater than 2 in absolute value (simply use the rule of thumb).
vii. No. The t-stat would be -1.27 ($= (2.158 - 3)/0.664$), which does not pass the rule of thumb. Hence we have insufficient evidence in these data to reject the hypothesis that the true slope parameter on x1 is 3.
viii. There is no bias. The correlations in the table above the regression are raw correlations: this means that we look at the correlation between two variables without holding the other variables constant. The advantage of multiple regression analysis is that it allows estimation of the marginal effect of each variable while holding the other variables constant.
In this particular case, once we control for the separate effects of x2 and x3, the marginal effect of an increase in x1 on y is actually positive and not negative.
ix. A one unit increase in x1 on average results in a 2.2 unit increase in y. A one unit increase in x2 on average results in a 5.7 unit decrease in y. A one unit increase in x3 on average results in a 0.9 unit increase in y.
x. Plugging in, we get $y = 103.6967 + 2.158004 x_1 - 5.669571(7.881578) + 0.9049954(4.119784)$. Hence there is a simple straight-line relationship between y and x1: $y = 62.73992 + 2.158004 x_1$.
xi. This just shifts the relationship: $y = 50.531639 + 2.158004 x_1$.

4. (a)
i. firm A - firm H and yr 1990 - yr 1999 are the dummy variables.
ii. It can take two values: 0 or 1. It will be 1 for all observations related to firm A and 0 otherwise. Given that the question stated that each firm was followed for 10 years, there will be 10 ones and 70 zeros for this variable.
iii. It would be 1 for all observations in the data. Each observation is associated with one, and only one, firm. Hence only one firm dummy variable will be turned on for each row of the data (each observation).
iv. It can take two values: 0 or 1. It will be 1 for all observations from year 1993 and 0 otherwise. Given that the question stated that there are 8 firms, there will be 8 ones and 72 zeros for this variable.

(b)
i. Firm A
ii. The dependent variable y tends to be 79.5159 units bigger for Firm B compared to Firm A.
iii. The dependent variable y tends to be 159.9315 units smaller for Firm C compared to Firm A.
iv. $y = 118.4732 + 1.831333\,\text{var1} + 10.33615\,\text{var2}$
v. Plugging in, $y = 118.4732 + 11.45358 + 1.831333\,\text{var1} + 10.33615\,\text{var2}$, so the relationship is $y = 129.93 + 1.83\,\text{var1} + 10.34\,\text{var2}$.
vi. Plugging in, $y = 118.4732 + 79.5159 + 11.45358 + 1.831333\,\text{var1} + 10.33615\,\text{var2}$, so the relationship is $y = 209.44 + 1.83\,\text{var1} + 10.34\,\text{var2}$.
vii. There are clearly large firm fixed effects, which means that it is important to control for these. There do not appear to be su...
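
The arithmetic in question 3(c), parts v, vii and x, can be re-checked with a few lines of Python. This is only a sketch: the numbers are copied from the solutions above, the variable names are made up for illustration, and the minus sign on the x2 coefficient is assumed from part ix, which says y decreases as x2 rises.

```python
# Values copied from the solutions text for question 3(c).
root_mse = 3.7989           # Root MSE reported in the regression output
sd_y = 5.91038              # standard deviation of y from the data summary
b1, se_b1 = 2.158, 0.664    # slope on x1 and its standard error, as rounded in part vii

# Part v: the estimated error variance is the square of the Root MSE,
# and the variance of y is the square of its standard deviation.
sigma2_hat = root_mse ** 2
var_y = sd_y ** 2

# Part vii: t-statistic for the null hypothesis that the slope on x1 equals 3.
t_stat = (b1 - 3) / se_b1

# Part x: intercept of the y-on-x1 line once x2 and x3 are fixed at the
# plugged-in values (sign of the x2 coefficient assumed negative, see lead-in).
b0, b2, b3 = 103.6967, -5.669571, 0.9049954
x2_value, x3_value = 7.881578, 4.119784
intercept = b0 + b2 * x2_value + b3 * x3_value

print(round(sigma2_hat, 2))   # estimate of the error variance
print(round(var_y, 2))        # variance of y
print(round(t_stat, 2))       # t-statistic for H0: slope on x1 = 3
print(round(intercept, 5))    # intercept of the line in part x
```

The rule-of-thumb comparison in part vii then amounts to checking whether `abs(t_stat)` exceeds 2.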

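
The dummy-variable counts in question 4(a) and the intercept sums in 4(b) can be illustrated the same way. This sketch builds a hypothetical 8-firm, 10-year panel matching the description in the solutions (the actual data set is not shown in the text). The name `extra_dummy_coef` is a placeholder: the solutions add 11.45358 in parts v and vi without saying which dummy it belongs to.

```python
from itertools import product

# Hypothetical balanced panel: firms A-H, each observed in 1990-1999.
firms = list("ABCDEFGH")
years = range(1990, 2000)
panel = list(product(firms, years))   # one (firm, year) pair per observation

# 4(a) ii and iv: a dummy is 1 when its condition holds and 0 otherwise.
firm_a = [1 if firm == "A" else 0 for firm, _ in panel]
yr_1993 = [1 if year == 1993 else 0 for _, year in panel]

# 4(a) iii: summing all firm dummies within a row always gives 1,
# because each observation belongs to exactly one firm.
firm_dummy_sum = [sum(1 if firm == g else 0 for g in firms) for firm, _ in panel]

# 4(b) v and vi: intercepts obtained by adding the relevant dummy coefficients.
const = 118.4732
firm_b_coef = 79.5159
extra_dummy_coef = 11.45358   # coefficient added in the text; its dummy is not named there
intercept_v = const + extra_dummy_coef
intercept_vi = const + firm_b_coef + extra_dummy_coef

print(len(panel), sum(firm_a), sum(yr_1993))
print(round(intercept_v, 2), round(intercept_vi, 2))
```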
