### Review_Final_key

Course: ECON 102, Fall 2006
School: Iowa State
The data used to create the JMP output are for 224 students who entered a university in a particular year and were planning to major in computer science. The GPA variable is the student's cumulative grade point average after three semesters. The HSM variable is the student's high school math grade, with 10 representing an "A", 9 an "A-", and so on. The SATM variable is the student's SAT math score.

Data analysis: individual variables.

Describe the distribution of each variable:

- GPA: Mound shaped and approximately symmetric (slightly skewed to the left) with a mean of 2.64 and a standard deviation of 0.78. [Range from 0.12 to 4.0]
- HSM: Skewed to the left with MIN = 2, Q1 = 7, M = 9, Q3 = 10, and MAX = 10.
- SATM: Mound shaped and symmetric with a mean of 595.3 and a standard deviation of 86.4. [Range from 300 to 800]

Data analysis: pairs of variables.

Describe the relationship between each pair of variables:

- GPA and HSM: FORM: linear, DIRECTION: positive, STRENGTH: r = 0.44 (moderate)
- GPA and SATM: FORM: linear, DIRECTION: positive, STRENGTH: r = 0.25 (weak)
- SATM and HSM: FORM: linear, DIRECTION: positive, STRENGTH: r = 0.45 (moderate)

Calculate the slope and intercept of the least-squares regression line for GPA vs. SATM:

$$b = r\,\frac{s_y}{s_x} = 0.25 \cdot \frac{0.7794}{86.4014} = 0.0023 \quad\text{and}\quad a = \bar{y} - b\bar{x} = 2.6352 - (0.0023)(595.2857) = 1.2608$$

Interpret the meaning of the slope in the context of the data:

Every one-point increase in SATM score is associated with a 0.0023 increase in GPA.

Predict the expected GPA for a student with a SATM score of 500:

$$\hat{y} = a + bx = 1.2608 + 0.0023(500) = 2.41$$

Explain why this is or is not a case of extrapolation:

It is not extrapolation because a score of 500 falls inside the observed range of SATM scores (300 to 800).

Create a 95% confidence interval for the mean GPA from this sample:

$$\bar{x} \pm t^*(223)\,\frac{s}{\sqrt{n}} = 2.6352 \pm 1.984 \cdot \frac{0.7794}{\sqrt{224}} = 2.6352 \pm 1.984(0.0521) = (2.5319,\ 2.7385)$$

If I wanted this interval to be half as wide, how many students would I need in my sample?
$$n = \left(\frac{z^*\sigma}{m}\right)^2 = \left(\frac{1.96 \times 0.7794}{0.1033/2}\right)^2 = 874.8,$$

so you would need 875 students.

The Dean is excited about this particular class of students because their average SATM score is higher than that of the previous year, which had a mean SATM score of 580. Perform a hypothesis test to evaluate whether this year's mean score is significantly higher. Use an alpha level of .01.

$$H_0: \mu = 580 \qquad H_a: \mu > 580$$

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{595.2857 - 580}{86.4014/\sqrt{224}} = 2.648$$

0.005 > p-value > 0.0025, so reject $H_0$.

Interpret the p-value from the test conducted above:

The probability of seeing a sample mean greater than or equal to 595.3 from a random sample of 224 students from a population with mean 580 and standard deviation 86.4 is between 0.0025 and 0.005.

Looking at the data, I know that 56 of these 224 students have a GPA above 3.2. Create a 99% confidence interval for the proportion of computer science students having a GPA above 3.2 after three semesters at this university. [Use the Wilson statistic]

$$\tilde{p} = \frac{X + 2}{n + 4} = \frac{58}{228} = 0.2544$$

$$\tilde{p} \pm z^*\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}} = 0.2544 \pm 2.576\sqrt{\frac{0.2544(1 - 0.2544)}{228}} = (0.1801,\ 0.3287)$$

If I were OK with this interval being twice as wide and lowering my confidence level to 95%, how many students would I need to collect data on next year?

$$n = \left(\frac{z^*}{m}\right)^2 p^*(1 - p^*) - 4 = \left(\frac{1.96}{2 \times 0.0743}\right)^2 0.2544(1 - 0.2544) - 4 = 28.9988,$$

so you need 29 students.

Nationally, computer science programs have 26.5% of their students with GPAs above 3.2. Using the data from above, is there evidence that this university differs significantly from the national statistic?

26.5% is well inside the 99% confidence interval above, so we have no evidence that this university differs.

Simple linear regression:

Is there a linear relationship between GPA and HSM? Perform the appropriate hypothesis test.

$$H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0$$

$$t = \frac{b_1}{SE_{b_1}} = \frac{0.207602}{0.028719} = 7.229$$

p-value < 0.001, so reject $H_0$; there is a relationship.

Looking at diagnostic plots, do we have:

- (1) constant variance? Residuals vs. Predicted looks pretty good.
- (2) a Normal distribution for the residuals? Normal quantile plot looks OK.
- (3) influential points or outliers? There are a couple of students with low HSMs (influential?).

Multiple regression:

What percent of the variation in the GPA variable can be explained by the least-squares regression of GPA on SATM and HSM?

About 19.4% of the variation in GPA is explained by the LS regression on SATM and HSM.

Predict the expected GPA for a student with a SATM score of 750 and HSM score of an "A-":

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 = 0.6657425 + 0.0006105(750) + 0.1930048(9) = 2.86$$

What distribution is used for inference about the betas in this model?

A t distribution with 224 - 2 - 1 = 221 degrees of freedom.

Test whether there is a linear relationship between SATM and GPA, after accounting for HSM scores:

$$t = \frac{0.0006105}{0.000611} = 0.9992$$

0.40 > p-value > 0.30, so we fail to reject $H_0: \beta_1 = 0$. There is no evidence of a relationship between SATM and GPA after accounting for HSM.
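
The arithmetic in this key can be double-checked with a few lines of code. The sketch below is only an illustration, not part of the original key. It uses the Python standard library, and the function names (`mean_ci`, `n_for_mean_margin`, `t_statistic`, `plus_four_ci`, `n_for_prop_margin`) are hypothetical helpers introduced here. The critical values (1.984, 1.96, 2.576) are copied from the key as table lookups rather than computed.

```python
import math


def mean_ci(xbar, s, n, t_star):
    """t interval for a mean: xbar +/- t* * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin


def n_for_mean_margin(z_star, sigma, m):
    """Smallest n giving margin of error m for a mean: (z* sigma / m)^2, rounded up."""
    return math.ceil((z_star * sigma / m) ** 2)


def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


def plus_four_ci(x, n, z_star):
    """Plus-four (Wilson-style) estimate and margin: add 2 successes and 4 trials."""
    p_tilde = (x + 2) / (n + 4)
    margin = z_star * math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde, margin


def n_for_prop_margin(z_star, p_guess, m):
    """Sample size for a plus-four interval with margin m, rounded up."""
    return math.ceil((z_star / m) ** 2 * p_guess * (1 - p_guess) - 4)


# Summary statistics as quoted in the key.
n = 224
gpa_mean, gpa_sd = 2.6352, 0.7794
satm_mean, satm_sd = 595.2857, 86.4014

# 95% confidence interval for mean GPA (t* = 1.984 taken from the key).
lo, hi = mean_ci(gpa_mean, gpa_sd, n, 1.984)
print(f"95% CI for mean GPA: ({lo:.4f}, {hi:.4f})")

# Students needed to halve the margin of error 0.1033.
print("n for half-width interval:", n_for_mean_margin(1.96, gpa_sd, 0.1033 / 2))

# One-sided test of H0: mu = 580 against Ha: mu > 580.
print(f"t for SATM test: {t_statistic(satm_mean, 580, satm_sd, n):.3f}")

# 99% plus-four interval for the proportion with GPA above 3.2.
p_tilde, margin = plus_four_ci(56, n, 2.576)
print(f"99% CI for proportion: ({p_tilde - margin:.4f}, {p_tilde + margin:.4f})")

# Twice the margin at 95% confidence.
print("n for wider 95% interval:", n_for_prop_margin(1.96, p_tilde, 2 * margin))

# Slope test for GPA on HSM, and the multiple regression prediction.
print(f"t for HSM slope: {0.207602 / 0.028719:.3f}")
print(f"Predicted GPA: {0.6657425 + 0.0006105 * 750 + 0.1930048 * 9:.2f}")
```

The p-value ranges quoted in the key come from a t table, so they are not recomputed here; a statistics library would be needed to get exact tail probabilities.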
