PracticeExam3Answers - Stat 3011 Spring 2011 Introduction...


Stat 3011 Spring 2011
Introduction to Statistics
Exam 3 Answers

Problem 1 (10 points)

Below is a stem and leaf plot that illustrates the number of wins, as of April 18th, for all 16 teams in the National League. Does the data meet either of the two assumptions for performing a confidence interval for the average number of wins by a National League team? Explain why or why not for each assumption.

    > stem(winsn, scale=2)

      The decimal point is at the |

       5 | 00
       6 | 0
       7 | 000000
       8 | 0000
       9 | 0
      10 | 0
      11 |
      12 | 0

There are two assumptions we need to check in order to do a confidence interval for the average:

(1) The data is from an independent, random sample (2 points)

The data does not meet this assumption because it is not a random sample. In fact, it is not a sample. The problem asks for the average number of wins for all the National League teams, and the plot shows the data for every team in the National League. Therefore, this is a graph of the population. The average of all of these data points is the average number of wins for all the National League teams, so we don't need a confidence interval; we can calculate the parameter directly. (3 points)

(2) The data is normally distributed (2 points)

This is somewhat of a judgment call. I'd say no, this data is a bit too right skewed to be seen as normal. If you need something more concrete, note that Q1 is at 7 and Q3 is at 8, so the IQR is 1 and 1.5*IQR is 1.5. Anything below 7 - 1.5 = 5.5 or above 8 + 1.5 = 9.5 is a potential outlier, which means 5, 10, and 12 are all outliers. But you don't have to calculate potential outliers to get full credit. (3 points)

Problem 2 (10 points)

A Pew Research Poll of 1004 adults from March 17-20 found that 39% favored increasing the use of nuclear power. A poll of 2251 adults from last October found that 47% favored increasing the use of nuclear power. Is this an instance of dependent samples (i.e. matched pairs) or independent samples? Explain.

This is an example of independent samples. (3 points) The reason is worth 7 points.
Some possible reasons include: The two samples have different sample sizes, so they can't be directly linked. There seem to be two separate polls taken, which means two random samples. While it is possible that some people were included in both polls, not all of them were, so the samples are independent.

Problem 3

A) (30 points) The standard tip percentage for a good server is 20%. For a waitress at a restaurant, a random sample of 23 checks had an average tip percentage of 27.8% and a standard deviation of 7.8%. A histogram of the tip percentage looks like the normal curve. Is there evidence that the tip percentage differs from 20%?

Start with the last sentence: "is there evidence" means that we need to do a hypothesis test. And "the tip percentage differs from 20%" tells us both the null value and the alternative hypothesis. The tricky part of this problem is that the information is in terms of percentages, but the percentage is a measurement (% of the total bill) as opposed to an indication of the relative frequency of a category. Note that the problem gives a standard deviation, which we only need for a quantitative variable.

1) Hypotheses (6 points)

Ho: µ = 20% (or µ = 0.20, but be careful: you need to convert all the data into proportions to perform the testing, which adds more potential for arithmetical mistakes.)
Ha: µ ≠ 20%

2) Assumptions (5 points)

We need to check that the data is a random sample for that waitress and that the data looks normal. Both these criteria are met.
(Test taking tip: if I'm going to assign 30 points to a hypothesis testing problem, it will satisfy the assumptions.)

3) Test statistic (9 points)

The standard error estimate: $se = \frac{s}{\sqrt{n}} = \frac{7.8\%}{\sqrt{23}} = 1.626\% \approx 1.63\%$ (2 pts formula, 1 for calculation)

The test statistic: $t = \frac{\bar{x} - \mu_0}{se} = \frac{27.8 - 20}{1.626} = 4.796$ (3 pts formula, 1 pt calculation)

df = n - 1 = 23 - 1 = 22 (1 pt formula, 1 calculation)

4) p-value (5 points)

Using the R output on the last page, we see the closest output is:

    > pt(-4.79, 22)
    [1] 4.385543e-05

(2 points) Since the alternative hypothesis is two-sided, we need to double this number, so the p-value is 2*0.000043855 = 0.00008771 (3 points)

5) Conclusions (5 points)

There is no rejection level given, so we have to assume the rejection (significance) level is 0.05. Since the p-value of 0.00008771 is far below 0.05, we have evidence to reject Ho. (3 points) We have evidence that the true tip percentage for this waitress differs from 20%. (2 points)

B) (5 points) Based on your conclusion in part A, what type of error could be present: Type I or Type II error? Explain.

There is potential for Type I error because we rejected the null hypothesis (2 points). Type I error occurs when you reject Ho when it is in fact true. We have rejected the null, so we may have rejected it when in fact the waitress's true tip percentage was 20%. (3 points)

Problem 4 (20 points)

We want to survey people with little experience with computers to estimate how many difficulties arise over the course of a month when they use computers. We'd like a 95% confidence interval to be accurate to within ±3 hassles. A similar survey done on people with a lot of computer experience had a standard deviation of 10 hassles. How many people with little computer experience do we need to survey?

Again, if we go to the last sentence, we see what the problem is asking for: a sample size for a study. Looking at the rest of the problem, we can figure out what type of data we have and what info we're given. Note that units are given.
That is a sure sign that the variable is quantitative. Also, we're given a standard deviation. So we need to use the following formula:

$n = \frac{z_{\alpha/2}^2 \, \sigma^2}{m^2}$ (8 points)

where m = ±3 hassles (2 pts), σ = 10 hassles (2 pts), and $z_{\alpha/2} = z_{0.025} = 1.96$ (4 pts)

Plugging into the formula: $n = \frac{1.96^2 \cdot 10^2}{3^2} = 42.68$ (2 pts), which we round up to 43 people (2 points)

Problem 5

A) (20 points) A survey about homelessness offered the randomly selected participants $10 to complete the survey. At the end of the survey, they had the choice to either donate the $10 to a homeless shelter or keep it. Of the 275 who chose to keep the $10, 103 of them had not seen any homeless people in the past month. Of the 1228 who chose to donate the $10, 466 of them had not seen any homeless people in the past month. For the event "did not see any homeless people", find the 99% confidence interval for the difference between those who chose to keep the money and those who chose to donate it.

The first thing we need to do is see if we have a large enough sample (5 points):

keep group:
n*pk = # of people who haven't seen any homeless people = 103
n*(1 - pk) = # of people who have seen homeless people = 275 - 103 = 172

donate group:
n*pd = # of people who haven't seen any homeless people = 466
n*(1 - pd) = # of people who have seen homeless people = 1228 - 466 = 762

All are bigger than 10, so we can do a confidence interval. The formula is:

$(p_k - p_d) \pm z_{\alpha/2} \sqrt{\frac{p_k (1 - p_k)}{n_k} + \frac{p_d (1 - p_d)}{n_d}}$ where k = keep and d = donate (5 pts)

where $p_k = \frac{103}{275} = 0.3745$ (1 pt), $p_d = \frac{466}{1228} = 0.3795$ (1 pt), $n_k = 275$ (1 pt), and $n_d = 1228$ (1 pt)

And, for the 99% confidence interval, we need $z_{0.005} = 2.575$ (2 points)

Plugging the values in:

$(0.3745 - 0.3795) \pm 2.575 \sqrt{\frac{0.3745 (1 - 0.3745)}{275} + \frac{0.3795 (1 - 0.3795)}{1228}}$
= -0.0050 ± 2.575 * 0.0323 = -0.0050 ± 0.0832 (2 points)

Which gives us a confidence interval of (-0.0882, 0.0782)

B) (5 points) Interpret this confidence interval.
What does it tell you about the difference between those who donated the money and those that chose to keep it?

Since 0 is in the interval, there's no significant difference between those who chose to keep the money and those who donated it with respect to the proportion that had not seen any homeless people in the past month.
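
If you want to double-check the arithmetic in Problems 3 through 5, the following is a minimal Python sketch (the exam itself uses R; Python's standard library is used here only as a convenient calculator). The helper names `z_critical`, `one_sample_t`, `sample_size_for_mean`, and `two_proportion_ci` are made up for illustration and are not part of the course materials. The sketch uses exact normal critical values instead of the rounded table values 1.96 and 2.575, so results may differ from hand calculations in the last decimal place. It does not compute the Problem 3 p-value, because the standard library has no t distribution.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value (about 1.96 for 0.95)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def one_sample_t(xbar, s, n, mu0):
    """Return (standard error, t statistic, degrees of freedom)."""
    se = s / sqrt(n)
    return se, (xbar - mu0) / se, n - 1


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n for which z * sigma / sqrt(n) is at most `margin`."""
    z = z_critical(confidence)
    return ceil((z * sigma / margin) ** 2)


def two_proportion_ci(x1, n1, x2, n2, confidence):
    """Unpooled large-sample confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half_width = z_critical(confidence) * se
    diff = p1 - p2
    return diff - half_width, diff + half_width


# Problem 3: tip percentages, measured in percentage points
se, t_stat, df = one_sample_t(xbar=27.8, s=7.8, n=23, mu0=20.0)
print(f"Problem 3: se = {se:.3f}, t = {t_stat:.3f}, df = {df}")

# Problem 4: sigma = 10 hassles, margin = 3 hassles, 95% confidence
n_needed = sample_size_for_mean(sigma=10, margin=3, confidence=0.95)
print(f"Problem 4: n = {n_needed}")

# Problem 5: keep group (103 of 275) minus donate group (466 of 1228)
lower, upper = two_proportion_ci(103, 275, 466, 1228, confidence=0.99)
print(f"Problem 5: 99% CI = ({lower:.4f}, {upper:.4f})")
```

Run as written, this should give a standard error of about 1.63 and t of about 4.80 on 22 degrees of freedom for Problem 3, a sample size of 43 for Problem 4, and an interval of roughly (-0.088, 0.078) for Problem 5, which contains 0.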

This note was uploaded on 02/28/2012 for the course ECON 111 taught by Professor Aaa during the Summer '11 term at UIBE.
