Assignment #3
(Due: February 21, 2011)
1. Use the forlang SAS data in Problem 2 of Assignment #2.

   a. Using PROC UNIVARIATE, obtain a histogram for the 12th grade GPA variable. Comment on the shape of the distribution.

   b. Plot English score (on y-axis) vs 8th grade GPA (on x-axis). Does it look like there is a positive or negative relationship between English score and 8th grade GPA?

   c. Plot English score versus 8th grade GPA, using whether or not the student has taken a foreign language as the plot character. Does it look like there is a relationship between English score and 8th grade GPA? Does it look like students taking a foreign language do better on the English exam?

   d. A difference in sample means could be due to random variation. Draw an appropriate plot or chart for English score versus whether or not the student has taken a foreign language. Does it look like students who have taken a foreign language have scored substantially higher on the English exam? (Soon we will be able to determine if this difference is statistically significant; for now, just make a guess based on the plot.)

   e. Create a pie chart on the variable that distinguished whether or not the student elected to take a foreign language class at the high school. Did more people take a foreign language class or not take a foreign language class at high school?

   f. Plot English score (on y-axis) vs 8th grade GPA (on x-axis) and Math score (on y-axis) vs 8th grade GPA (on x-axis) on the same graph. For the English score data, use a red plus to signify each data value. For the math score data, use a green star to signify each data value. (These colors won't show up on your print-outs, but that is okay.) From this graph, which data values appear to be higher on average, Math scores or English scores?

2. You have an Excel file called beer.xls. This file is on Quarterly U.S. Beer Production from the First Quarter of 1975 to the Fourth Quarter of 1982 (Millions of barrels). Use PROC IMPORT to read this data to SAS and create a temporary SAS data set.

   a. Plot beer production (on y-axis) vs year (on x-axis) for each quarter so that you will have four different lines (for 4 different quarters) on your plot. Make sure that you use different symbols for different quarters and provide a legend for it. Do you expect an interaction between year and quarter?

   b. Add a new variable Time_Q to your dataset by using
      Time_Q = Quarter + (Year - 1975) * 4

   c. Plot the time series of beer production (Hint: beer production on y-axis vs Time_Q on x-axis). Please connect the points using a black solid line.

3. A study was performed to test the efficiency of a new drug developed to increase high density lipoprotein (HDL) cholesterol levels in patients. 27 volunteers were split into 3 groups (Placebo/5 mg/10 mg) and the difference in HDL levels before and after the treatment (HDL_DIFF) was measured. The 27 volunteers were also divided into three age groups (18-39 and 40-60 and ≥ 60). The full data set is shown below:

   The difference in HDL levels before and after the treatment (HDL_DIFF)

                               Age
   Treatment   18-39 years   40-60 years   ≥ 60 years
   Placebo     4, 3, -1      3, 2, 0       1, 4, 5
   5 mg        9, 5, 6       5, 6, 7       9, 9, 6
   10 mg       14, 12, 10    10, 8, 7      8, 10, 7

   Create an interaction plot for the HDL data. (i.e., display the levels of the treatment on the x-axis and the mean of HDL_DIFF for each treatment on the y-axis and show a separate line connecting the HDL_DIFF corresponding to 3 AGE groups.) Do you see any interaction between Treatment and Age?

...
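As a quick check of the Time_Q formula in Problem 2(b), the sketch below generates every (Year, Quarter) pair from 1975 to 1982 and applies the formula, so the quarters are numbered consecutively from 1 to 32. It is only an illustration under stated assumptions: the data set name time_q_demo is hypothetical, and Year and Quarter are generated in a loop rather than read from beer.xls, whose column layout is not shown here.

```sas
/* Illustration only: time_q_demo is a hypothetical data set name.     */
/* Year and Quarter are generated here, not imported from beer.xls.    */
data time_q_demo;
  do Year = 1975 to 1982;
    do Quarter = 1 to 4;
      /* Formula from Problem 2(b): consecutive quarter index */
      Time_Q = Quarter + (Year - 1975) * 4;
      output;  /* write one observation per (Year, Quarter) pair */
    end;
  end;
run;

/* List the 32 observations to inspect the mapping */
proc print data=time_q_demo noobs;
  var Year Quarter Time_Q;
run;
```

For example, the first quarter of 1975 gives Time_Q = 1 + (1975 - 1975) * 4 = 1, and the fourth quarter of 1982 gives Time_Q = 4 + (1982 - 1975) * 4 = 32.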