Assignment #2
(Due: February 11, 2011)
1. The luxury ocean liner Titanic sank spectacularly in 1912. The SAS data set titanic in the folder (copy it to your library folder) has information about the people on board: whether they were crew or first-, second-, or third-class passengers, whether they were male (1) or female (0), adult (1) or child (0), and whether they survived the disaster (yes=1, no=0).
   (a) What proportion of the voyagers survived?
   (b) What are the survival proportions for each passenger class?
   (c) What proportion of first-class female passengers survived? What about third-class female passengers?

2. A foreign language teacher at the local high school has a data set representing the current senior class; you can find the SAS data set called forlang (copy it to your library folder) in the folder. Each observation represents a student. The variables listed (in order) are student id number, gender, score on the standardized English exam, score on the standardized math exam, whether or not the student elected to take a foreign language class at the high school, whether or not the student participated in varsity sports, the student's 8th grade GPA, and the student's 12th grade GPA. Create a SAS data set and answer the following questions.
   (a) Compute the mean, median, standard deviation, and 90% CI for the mean of the combined score on the standardized exams (math + English). (Create a new variable.)
   (b) The teacher thinks that foreign language should be a requirement rather than an elective because kids taking foreign language do better on the English exam. Is she right? Do kids taking a foreign language score higher on the English exam? Compute the mean and standard deviation for each group and suggest an answer.
   (c) There is concern that kids participating in varsity sports become less concerned with academics. Compute the mean and standard deviation for the drop in GPA (from 8th to 12th grade) for kids participating in varsity sports vs. kids not participating in varsity sports. (Create a new variable.) What does the result suggest?
   (d) Test the hypothesis H0: μ_math = 70 vs. Ha: μ_math ≠ 70 with α = 0.001. Explain your decision. Also test the hypothesis H0: μ_math = 70 vs. Ha: μ_math < 70 with α = 0.001. Do you reach the same conclusion for the two tests?
   (e) Produce a stem-and-leaf plot and a boxplot for the variables Math and English and interpret the plots.
   (f) Kids with a combined score higher than 152 get an award from the state. What proportion of seniors will get this award?
       Hint: You can create a new variable award as follows in a data step:
           combine=english+math;
           if combine > 152 then award='Y';
           else award='N';

3. The field goal data set (FG_data.xls) contains a random sample of field goals kicked during the 1995 NFL regular season. Bilder and Loughin (Chance, 1998) performed an analysis of the data. The variables are described below:
   • Date - Date of the field goal attempt (month, day, year)
   • Stadium - Type of stadium; O=Outside, D=Dome
   • Surface - Type of playing surface; T=Artificial Turf, G=Grass
   • Kicker - Name of field goal kicker
   • Success - Success of the field goal; Y=Yes, N=No
   • Distance - Distance in yards of the field goal
   (a) Import the data using PROC IMPORT and create a temporary SAS data set (called FieldGoal). Print the first 10 observations of your data set. Note: If you print out the data set you will find that SAS did not import the data perfectly. Change the variable name _stadium to stadium using an extra data step.
   (b) Produce a contingency table of field goal result vs. surface, and perform Fisher's exact test. How do you interpret the test result? Also, create an output data set (call it out_set1) that includes the cell frequencies for the last table specified in the TABLES statement.
   (c) Produce a contingency table of field goal result vs. stadium, and perform the Pearson chi-square test for independence. How do you interpret the test result? Also, print each cell's contribution to the Pearson chi-square test statistic and do not show any percentages in each cell. Create a data set (call it out_set2) that includes the result of the chi-square test.
...
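For reference on problems 2(a) and 2(d), the standard one-sample t formulas those parts rely on can be sketched as follows. The symbols \bar{x}, s, and n (sample mean, sample standard deviation, and number of observations) and T_{n-1} (a t-distributed random variable with n-1 degrees of freedom) are notation assumed here, not taken from the assignment.

```latex
% One-sample t statistic for H0: mu_math = 70 (problem 2(d));
% \bar{x}, s, n are computed from the math scores.
t = \frac{\bar{x} - 70}{s / \sqrt{n}}, \qquad \text{df} = n - 1

% Two-sided p-value (Ha: mu_math \neq 70)
p_{\text{two-sided}} = 2 \, P\left(T_{n-1} \ge |t|\right)

% Lower-tailed p-value (Ha: mu_math < 70)
p_{\text{lower}} = P\left(T_{n-1} \le t\right)

% 90% confidence interval for a mean (problem 2(a));
% here \bar{x}, s, n are computed from the combined score.
\bar{x} \pm t_{0.95,\, n-1} \, \frac{s}{\sqrt{n}}
```

In each test, H0 is rejected when the p-value falls below α = 0.001. When t is negative, the lower-tailed p-value is exactly half the two-sided one, which is why the two tests in 2(d) can lead to different decisions.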
This note was uploaded on 06/06/2011 for the course STAT 4360 taught by Professor Park during the Spring '11 term at University of Georgia Athens.