Previous Midterms
These two exams are indicative of the style and length of what you will see. The only topics not covered on these example exams are:
o Regression
o Conditional expectation
o Confidence intervals (a little, not a lot)
For extra regression and confidence interval questions, see the class study guide. I will post extra questions on conditional expectation.

Business 550: Data and Decision Analytics
Summer 2008
Practice Exam 1
Midterm Examination

Directions
The exam will end 120 minutes after it begins. The exam is divided into three parts. The first part is true/false and the second part is multiple choice. Please answer the multiple choice questions on the exam by circling the best answer (some rounding occurs in several places). There will be no partial credit for these questions. The third part of the exam consists of several problems. Please answer these problems in the space provided on the exam (you may use the backs of the sheets if necessary). You will get partial credit for these problems provided that your answers are organized and legible so that your train of thought can be easily followed.

Good Luck
DON'T EVEN THINK ABOUT PANICKING
By printing my name below I acknowledge that the GBS has an honor code and that I will adhere to it. Failure to abide by the honor code could result in failing this course and having to wash Professor Parzen's car with my toothbrush.

NAME: ____________________________________________________________ (50 if not printed)

Question   T/F   Multiple Choice   Long 1   Long 2   Long 3   Total
Points     30    190               40       40       50       350
Obtained

True or False (3 points each)

1. T F If X~N(0,1) then P(X>0)=.5
2. T F If X is a binomial random variable with n=1 and p=.5 then P(X=0)=0
3. T F If the sample covariance is negative then the sample correlation must also be negative.
4. T F As we collect more and more (x,y) pairs of data, the value of r (the correlation) should go to one.
5. T F The sample mean is more affected by outliers than the sample median.
6. T F The sample correlation is always a value between -1 and 1 (inclusive).
7. T F If you add a constant amount to each value in a dataset, the value of the variance will change.
8. T F If Z~N(0,1), then P(-2 < Z < 0.44) = .6472
9. T F 50% of a dataset are within the interval of (Q1, Q3).
10. T F Var ( X ) E ( X 2 ) 2

Multiple Choice (10 points each)

1) Suppose a loaded die has the following probability distribution:

Face   Probability
1      0.3
2      0.1
3      0.1
4      0.1
5      0.1
6      0.3

The die is thrown and the top face shows an odd number. Given this fact, what is the probability that the die shows a 1?
a. 0.1
b. 0.3
c. 0.5
d. 0.6
e. none of the above

2) A small company has 7 employees. The numbers of years these employees have worked for this company are shown as follows:
4 14 3 16 9 8 16
Based upon this information, the median number of years that employees have been with this company is:
a. 9 years
b. 10 years
c. 11.5 years
d. 16 years
e. None of the above

3) Let A be the event that a student is enrolled in an accounting course, and let S be the event that a student is enrolled in a statistics course.
It is known that 30% of all students are enrolled in an accounting course and 40% of all students are enrolled in statistics. We also know that 15% of the students are enrolled in both statistics and accounting. A student is randomly selected. Given that the student is enrolled in accounting, what is the probability that this student is also enrolled in statistics?
a) 0.15
b) 0.75
c) 0.375
d) 0.50
e) 0.80
f) None of the above

4) Manuel Banales, Marketing Director of Plano Power Plants, Inc.'s Electrical Division, is leading a study to assess the relative importance of product features. Two items on a survey questionnaire distributed to 100 of Plano's customers asked them to rate the importance of "ease of maintenance" and "efficiency of operation" on a scale of 1 to 10 (with 1 meaning "not important" and 10 meaning "highly important"). His staff assembled the following statistics on these two items.

                          Mean   Median   Standard Deviation
Ease of Maintenance       7.5    8.5      1.5
Efficiency of Operation   6.0    5.5      2.5

What can Manuel conclude from these statistics?
A. Ease of Maintenance is more important than Efficiency of Operation
B. Efficiency of Operation is more important than Ease of Maintenance
C. Efficiency of Operation and Ease of Maintenance are equally important
D. The Efficiency of Operation distribution has less dispersion

5) There are three children in a room, ages 3, 4, and 5. If a four-year-old child enters the room, the
a. mean age will stay the same but the variance will increase.
b. mean age will stay the same but the variance will decrease.
c. mean age and variance will stay the same.
d. mean age and variance will both increase.

6) Which of the following statements is false?
a. The width of a confidence interval estimate of the population mean narrows when the sample size increases
b. The width of a confidence interval estimate of the population mean narrows when the variance increases
c. The width of a confidence interval estimate of the population mean widens when the confidence level increases

7) In developing an interval estimate for a population mean, s was 10. The interval estimate was 40 ± 6. Had s equaled 5, and everything else remained the same, the interval estimate would be
(a) 40 ± 12
(b) 20 ± 12
(c) 20 ± 3
(d) 40 ± 3

The next two questions refer to the following setting: For several years Dominos Pizza advertised that if it takes them longer than 30 min to deliver your pizza, then you can have it for free. Experience has shown that the time it takes to deliver a pizza in Athens, Georgia is approximately normally distributed with a mean of 25 minutes and a standard deviation of 2.5 minutes.

8) Considering a random sample of 50 customers in Athens, what is the probability that no one will receive a free pizza?
(A) 0.053
(B) 0.152
(C) 0.257
(D) 0.316
(E) None of the above

9) What should the "get a free pizza" determining time of 30 minutes be changed to if the company wishes to provide only 0.1% of its customers with rebates?
(A) 30
(B) 32.75
(C) 34.3
(D) 36.81
(E) None of the above

10) The average gas mileage of a certain model car is 26 miles per gallon. If the gas mileages are normally distributed with a standard deviation of 1.3, find the probability that a car has a gas mileage of between 25.8 and 26.3 miles per gallon.
A) 0.18
B) 0.20
C) 0.15
D) 0.26
E) None of the above

11) A survey of 800 women shoppers found that 17% of them shop on impulse. What is the 95% confidence interval for the true proportion of women shoppers who shop on impulse?
A) 0.144 < p < 0.196
B) 0.148 < p < 0.192
C) 0.136 < p < 0.204
D) 0.139 < p < 0.201
E) None of the above

12) A television station estimates that 60% of college students watch the Super Bowl. For a sample of 240 students selected at random, what is the mean and variance of the number of students who watch this game?
A) Mean = 144.0, Variance = 7.59
B) Mean = 57.6, Variance = 57.60
C) Mean = 57.6, Variance = 7.59
D) Mean = 144.0, Variance = 57.60
E) None of the above

13) A researcher calculated the values and probabilities for a random variable X as shown below. Unfortunately, he erased the last value and needs to figure out what it was. If the mean of X was 3.4, then what was the last value?
A) 12
B) 11
C) 10
D) 9
E) None of the above

14) The employees of EAB Tree Removal Company have a mean salary of $25,000 with a standard deviation of $2500. Due to a good business year, the owner of the company gives every employee a $2000 raise in their salary. The new mean and standard deviation of the salary
a) will be $25,000 and $2500.
b) will be $27,000 and $4500.
c) will be $27,000 and $2500.
d) cannot be determined without knowing the number of employees.

15) Bob is a high school basketball player. He is a 70% free throw shooter. That means his probability of making a free throw is 0.70. What is the probability that Bob makes his first free throw on his fifth shot?
(A) 0.0024
(B) 0.0057
(C) 0.0081
(D) 0.0720
(E) 0.1681

16) A sample of n = 168 students was asked, "Do you believe in love at first sight?" Shown below are four confidence intervals, in scrambled order, for 90%, 95%, 98%, and 99% confidence levels for the population proportion who would answer "yes". Which is the 98% confidence interval (circle your answer)?

17) For some positive value of x, the probability that a standard normal variable is between 0 and +2x is 0.1255. The value of x is
a) 0.99
b) 0.40
c) 0.32
d) 0.16
e) None of the above

18) Increasing the sample size causes the distribution of X̄ to ________.
(a) shift to the right
(b) shift to the left
(c) have more dispersion
(d) have less dispersion

Health care issues are receiving much attention in both academic and political arenas. A sociologist recently conducted a survey of citizens over 60 years of age whose net worth is too high to qualify for Medicaid and have no private health insurance. The descriptive statistics for the ages of 25 uninsured senior citizens were as follows:

19) Which of the following is the best correct statement?
a) One fourth of the senior citizens sampled are below 66 years of age.
b) The middle 50% of the senior citizens sampled are between 66 and 73.0 years of age.
c) The average age of senior citizens sampled is 73.5 years of age.
d) All of the above are correct.

Short answer: TOTAL 130 points

Question 1: TOTAL 40 points
Do not perform any detailed calculations to answer the following questions. Provide a one-sentence explanation of your reasoning in each case.

(a) Two researchers, Alex and Bob, independently select random samples from the same population. The sample sizes are 4000 for Alex and 1000 for Bob. Each researcher constructs a 95% confidence interval for μ from his data. The widths of the two intervals are .062 and .03. Match each interval width with its researcher.

(b) Alex took another sample from the population and found a 95% confidence interval for μ to be (.23, .24). Later Alex learned that the true value of μ is .227.
(i) What is the probability that μ is within Alex's confidence interval?
(ii) Should we conclude that Alex is not good at Statistics and made a mistake when computing his confidence interval?

(c) Two researchers, George and Henry, work together to study a simple random sample of subjects from a population, and they find that the sample mean is .60.
When they construct a confidence interval based on this sample mean, George comes up with (.532,.668) while Henry gets (.552,.688). Indicate which interval has to be wrong.

Question 2: 5 points each, TOTAL 40 points
A car salesman estimates the following probabilities for the number of cars that he will sell in the next week
a) Find the expected number of cars that will be sold in the week.
b) Find the standard deviation of the number of cars that will be sold in the week.
c) The salesman receives for the week a salary of $300, plus an additional $375 for each car sold. Find the mean and standard deviation of his total salary for the week.
d) What is the probability that the salesman's salary for the week will be more than $1000?

Question 3: 5 points each, TOTAL 50 points
A group of forty people at a health club were classified by their gender and by their smoking habits as shown in the table below. A person is selected at random from the group of forty.
a) What is the probability the person is male and smokes?
b) What is the probability the person does not smoke?
c) What is the probability the person is either female or a smoker?
d) If you know the person selected is female, what is the probability that the person is a smoker?
e) Are the events male and smoker independent? Why or why not?

Business 550: Data and Decision Analytics
Fall 2008
Practice Exam 2
Midterm Examination

Directions
The exam will end 90 minutes after it begins. The exam is divided into three parts. The first part is true/false and the second part is multiple choice. Please answer the multiple choice questions on the exam by circling the best answer (some rounding occurs in several places). There will be no partial credit for these questions. The third part of the exam consists of several problems. Please answer these problems in the space provided on the exam (you may use the backs of the sheets if necessary). You will get partial credit for these problems provided that your answers are organized and legible so that your train of thought can be easily followed.

Good Luck
DON'T EVEN THINK ABOUT PANICKING
By printing my name below I acknowledge that the GBS has an honor code and that I will adhere to it. Failure to abide by the honor code could result in failing this course and having to wash Professor Parzen's car with my toothbrush.

NAME: ______________________________________________ (50 if not printed)

Question   T/F   Multiple Choice   Long 1   Long 2   Total
Points     10    57                16       16       99
Obtained

True or False (1 point each)
1. T F If the average of a list of numbers is 0, then its standard deviation must also be 0.
2. T F From the joint distribution of two discrete random variables we can always recover the two marginal distributions.
3. T F The average height in a class is 60 inches. There are 30 people in the class. One can conclude that 15 of the students are less than 60 inches tall.
4. T F If two data sets have the same average and standard deviation, their histograms must look the same.
5. T F The standard deviation is never negative.
6. T F If the standard deviation of a list of numbers is zero, then its average must also be zero.
7. T F If you change the sign of each entry on a list, that changes the sign of the average.
8. T F The mean is more affected by outliers than the median.
9. T F Suppose that X~N(3,4). Then 2X~N(6,8).
10. T F I am tired and just want to go home and sleep.

Multiple Choice (3 points each)

1) The probability that a tennis set will go to a tiebreaker is 13%. In 120 randomly selected tennis sets, what is the mean and the standard deviation of the number of tiebreakers?
a) mean: 15.5; standard deviation: 3.95
b) mean: 14.4; standard deviation: 3.95
c) mean: 14.4; standard deviation: 3.68
d) mean: 15.6; standard deviation: 3.68
e) None of the above

2) Suppose a brewery has a filling machine that fills 12 ounce bottles of beer. It is known that the amount of beer poured by this filling machine follows a normal distribution with a mean of 12.49 ounces and a standard deviation of 0.04 ounce. Find the probability that the bottle contains between 12.39 and 12.45 ounces.
a) .2674
b) .3085
c) .1581
d) .1915
e) None of the above

3) A deck of cards is shuffled (Note: there are 52 cards in a deck). What is the chance that the top card is the jack of clubs and the bottom card is not the queen of clubs?
a) (1/52)(1/51)
b) (1/51)(51/52)
c) [1 - (1/52)](1/51)
d) (1/52)(50/51)
e) None of the above

4) Which of the following statements is not true about a normal curve?
(a) Every normal curve is bell-shaped
(b) Every normal curve is centered at its mean
(c) Every normal curve is symmetric about 0
(d) Every normal curve has its under-curve area equal to 1

5) A study attempted to examine the relationship between diabetes and alcohol consumption among older people. Refer to the table of counts below to answer the following question. Given an older person who does NOT drink at all, what's the probability that he has diabetes?

                           Diabetes   No diabetes
Drink 1 or more per week   765        85
No alcohol drink           293        3565

a) 293/(765+293)
b) 293/(293+3565)
c) 293/(765+86+293+3565)
d) (765+293)/(765+86+293+3565)
e) None of the above.

6) You decide to invest in four independently moving risky stocks. You guess that each stock has an independent 40% chance of becoming a total loss. What is the chance that at least one of your stocks will tank?
a) .0256
b) .4000
c) .8704
d) .9744
e) None of the above

7) When computing probabilities for standard normal distributions, which of the following probability statements is CORRECT?
a) P ( Z a ) = 1  P ( Z a )
b) P (a Z b) = P ( Z a )  P ( Z b)
c) If a > 0, P ( a Z a ) = 1  2 * P ( Z a )
d) If a < 0, P ( Z a ) = P ( Z  a )
e) None of the above statements are correct

8) A fair six-sided die is rolled n times. If the standard deviation of the number of times a 5 comes up is 10, determine the value of n.
a) 400
b) 625
c) 720
d) 800
e) None of the above

9) The following 2-way table is a survey from students at an unknown university regarding whether or not they know of Google and whether or not they know what a `googol' is. If one person is selected at random, what is the approximate probability that they know about Google or do NOT know what a `googol' is?
a) 0.725
b) 0.376
c) 0.975
d) 0.924
e) None of the above

10) Ten students take a test and their results are listed below. Determine the value of x if the mean score is 76.

Score                45   65   75   x
Number of Students   1    2    3    4

a) 80
b) 85
c) 90
d) 95
e) None of the above

11) Which frequency distribution shows the set of outcomes with the smallest standard deviation?

12) Suppose you take a fair six-sided die and mark the faces with the numbers 1, 1, 2, 2, 3, 3, respectively. Let X be the random variable of the number on the die observed after a toss. Then the mean and the variance of X are

13) Let the random variable X denote the length of cell phone calls in minutes for a particular population. Assume E(X) = 10 and Var(X) = 4. Suppose that the Virgin Mobile prepaid cell phone plan has a connection charge of $0.60 and a further charge of $0.40 per minute. What is the expected cost and standard deviation of a cell phone call?
a) 4.5, 2.40
b) 6, 2.56
c) 4.6, 2.56
d) 6.4, 3.16
e) 4.6, 5.76
f) None of the above

14) Which of the following is true about the data in the histogram below?
a) The median is about 60 and the mean is something less than 60.
b) The median is about 65 and the mean is something less than 65.
c) The mean is about 60 and the median is something less than 60.
d) The mean is about 65 and the median is something less than 65.
e) We cannot conclude anything about the mean from a histogram.

15) Referring to the dataset in the last question, the histogram, what would happen if the smallest point, say 50, was changed to 55?
a) Both the mean and median would increase by 5.
b) Both the mean and the standard deviation would increase by 5.
c) Only the mean would increase by 5.
d) Only the mean would increase.
e) None of the statements above are completely true.

16) Due to recent stock market jitters, Janice has decided to invest in ducks. She has bought five ducks and the lengths of their bills (in cm) are: 14, 39, 10, 12, 10. Which of the following is true?
a) mean = 14 cm and median = 12 cm.
b) mean = 14 cm and median = 10 cm.
c) mean = 17 cm and median = 10 cm.
d) mean = 17 cm and median = 12 cm.
e) mean = 16 cm and median = 12 cm.
f) None of the above

17) The Central Limit Theorem states that:
a) if n is large then the distribution of the sample can be approximated closely by a normal curve
b) if n is large, and if the population is normal, then the variance of the sample mean must be small.
c) if n is large, then the sampling distribution of the sample mean can be approximated closely by a normal curve
e) if n is large, then the variance of the sample must be small.

18) Which of the following statements is correct?
a. Changing the units of measurements of x or y does not change the value of the correlation r.
b. A negative value for the correlation r indicates the data are strongly unassociated.
c. The correlation always has the same units as the x variable, but not the y variable.
d. The correlation always has the same units as the y variable, but not the x variable.

19) Estimate the standard deviation (σ) of the following normal distribution curve.
a) 7.2
b) 2.5
c) 0
d) 1.25
e) 11.5

Short Answer
1) (16 points) Consider the following distribution for the random variable X:

x        1     2
P(X=x)   0.2   0.8

Compute the probability distribution for each of the following new random variables: X+1, 3X, X+X, and XX (that is, make a table like the one above). Determine the expected value and variance for each new random variable.

Work for X+1:
Work for 3X:
Work for X+X:
Work for XX:

2) (16 points) A corporation has 15,000 employees. Sixty-two percent of the employees are male. Twenty-three percent of the employees earn more than $30,000 a year. Eighteen percent of the employees are male and earn more than $30,000 a year.
Let M be the event that an employee is male
Let Y be the event that an employee earns more than $30,000 a year
a) Construct the joint probability table for events M, not M, Y, not Y
b) What is the probability that the employee is female and earns less than $30,000 a year?
c) Given that a randomly selected employee is male, what is the probability that he makes more than $30,000 a year?
d) Are M and Y independent events? Explain using probabilities.

Business 550: Data and Decision Analytics
Summer 2008
Practice Exam 1 Solutions
Midterm Examination

Directions
The exam will end 120 minutes after it begins. The exam is divided into three parts. The first part is true/false and the second part is multiple choice. Please answer the multiple choice questions on the exam by circling the best answer (some rounding occurs in several places). There will be no partial credit for these questions. The third part of the exam consists of several problems. Please answer these problems in the space provided on the exam (you may use the backs of the sheets if necessary). You will get partial credit for these problems provided that your answers are organized and legible so that your train of thought can be easily followed.

Good Luck
DON'T EVEN THINK ABOUT PANICKING
By printing my name below I acknowledge that the GBS has an honor code and that I will adhere to it. Failure to abide by the honor code could result in failing this course and having to wash Professor Parzen's car with my toothbrush.

NAME: ____________SOLUTIONS__________________________________ (50 if not printed)

Question   T/F   Multiple Choice   Long 1   Long 2   Long 3   Total
Points     30    190               40       40       50       350
Obtained

True or False (3 points each)

1. T F If X~N(0,1) then P(X>0)=.5
2. T F If X is a binomial random variable with n=1 and p=.5 then P(X=0)=0
3. T F If the sample covariance is negative then the sample correlation must also be negative.
4. T F As we collect more and more (x,y) pairs of data, the value of r (the correlation) should go to one.
5. T F The sample mean is more affected by outliers than the sample median.
6. T F The sample correlation is always a value between -1 and 1 (inclusive).
7. T F If you add a constant amount to each value in a dataset, the value of the variance will change.
8. T F If Z~N(0,1), then P(-2 < Z < 0.44) = .6472
9. T F 50% of a dataset are within the interval of (Q1, Q3).
10. T F Var ( X ) E ( X 2 ) 2

Multiple Choice (10 points each)

1) Suppose a loaded die has the following probability distribution:

Face   Probability
1      0.3
2      0.1
3      0.1
4      0.1
5      0.1
6      0.3

The die is thrown and the top face shows an odd number. Given this fact, what is the probability that the die shows a 1?
a. 0.1
b. 0.3
c. 0.5
d. 0.6
e. none of the above

2) A small company has 7 employees. The numbers of years these employees have worked for this company are shown as follows:
4 14 3 16 9 8 16
Based upon this information, the median number of years that employees have been with this company is:
a. 9 years
b. 10 years
c. 11.5 years
d. 16 years
e. None of the above

3) Let A be the event that a student is enrolled in an accounting course, and let S be the event that a student is enrolled in a statistics course.
It is known that 30% of all students are enrolled in an accounting course and 40% of all students are enrolled in statistics. We also know that 15% of the students are enrolled in both statistics and accounting. A student is randomly selected. Given that the student is enrolled in accounting, what is the probability that this student is also enrolled in statistics?
a) 0.15
b) 0.75
c) 0.375
d) 0.50
e) 0.80
f) None of the above

4) Manuel Banales, Marketing Director of Plano Power Plants, Inc.'s Electrical Division, is leading a study to assess the relative importance of product features. Two items on a survey questionnaire distributed to 100 of Plano's customers asked them to rate the importance of "ease of maintenance" and "efficiency of operation" on a scale of 1 to 10 (with 1 meaning "not important" and 10 meaning "highly important"). His staff assembled the following statistics on these two items.

                          Mean   Median   Standard Deviation
Ease of Maintenance       7.5    8.5      1.5
Efficiency of Operation   6.0    5.5      2.5

What can Manuel conclude from these statistics?
A. Ease of Maintenance is more important than Efficiency of Operation
B. Efficiency of Operation is more important than Ease of Maintenance
C. Efficiency of Operation and Ease of Maintenance are equally important
D. The Efficiency of Operation distribution has less dispersion

5) There are three children in a room, ages 3, 4, and 5. If a four-year-old child enters the room, the
a. mean age will stay the same but the variance will increase.
b. mean age will stay the same but the variance will decrease.
c. mean age and variance will stay the same.
d. mean age and variance will both increase.

6) Which of the following statements is false?
a. The width of a confidence interval estimate of the population mean narrows when the sample size increases
b. The width of a confidence interval estimate of the population mean narrows when the variance increases
c. The width of a confidence interval estimate of the population mean widens when the confidence level increases

7) In developing an interval estimate for a population mean, s was 10. The interval estimate was 40 ± 6. Had s equaled 5, and everything else remained the same, the interval estimate would be
(a) 40 ± 12
(b) 20 ± 12
(c) 20 ± 3
(d) 40 ± 3

The next two questions refer to the following setting: For several years Dominos Pizza advertised that if it takes them longer than 30 min to deliver your pizza, then you can have it for free. Experience has shown that the time it takes to deliver a pizza in Athens, Georgia is approximately normally distributed with a mean of 25 minutes and a standard deviation of 2.5 minutes.

8) Considering a random sample of 50 customers in Athens, what is the probability that no one will receive a free pizza?
(A) 0.053
(B) 0.152
(C) 0.257
(D) 0.316
(E) None of the above

9) What should the "get a free pizza" determining time of 30 minutes be changed to if the company wishes to provide only 0.1% of its customers with rebates?
(A) 30
(B) 32.75
(C) 34.3
(D) 36.81
(E) None of the above

10) The average gas mileage of a certain model car is 26 miles per gallon. If the gas mileages are normally distributed with a standard deviation of 1.3, find the probability that a car has a gas mileage of between 25.8 and 26.3 miles per gallon.
A) 0.18
B) 0.20
C) 0.15
D) 0.26
E) None of the above

11) A survey of 800 women shoppers found that 17% of them shop on impulse. What is the 95% confidence interval for the true proportion of women shoppers who shop on impulse?
A) 0.144 < p < 0.196
B) 0.148 < p < 0.192
C) 0.136 < p < 0.204
D) 0.139 < p < 0.201
E) None of the above

12) A television station estimates that 60% of college students watch the Super Bowl. For a sample of 240 students selected at random, what is the mean and variance of the number of students who watch this game?
A) Mean = 144.0, Variance = 7.59
B) Mean = 57.6, Variance = 57.60
C) Mean = 57.6, Variance = 7.59
D) Mean = 144.0, Variance = 57.60
E) None of the above

13) A researcher calculated the values and probabilities for a random variable X as shown below. Unfortunately, he erased the last value and needs to figure out what it was. If the mean of X was 3.4, then what was the last value?
A) 12
B) 11
C) 10
D) 9
E) None of the above

14) The employees of EAB Tree Removal Company have a mean salary of $25,000 with a standard deviation of $2500. Due to a good business year, the owner of the company gives every employee a $2000 raise in their salary. The new mean and standard deviation of the salary
a) will be $25,000 and $2500.
b) will be $27,000 and $4500.
c) will be $27,000 and $2500.
d) cannot be determined without knowing the number of employees.

15) Bob is a high school basketball player. He is a 70% free throw shooter. That means his probability of making a free throw is 0.70. What is the probability that Bob makes his first free throw on his fifth shot?
(A) 0.0024
(B) 0.0057
(C) 0.0081
(D) 0.0720
(E) 0.1681

16) A sample of n = 168 students was asked, "Do you believe in love at first sight?" Shown below are four confidence intervals, in scrambled order, for 90%, 95%, 98%, and 99% confidence levels for the population proportion who would answer "yes". Which is the 98% confidence interval (circle your answer)?

17) For some positive value of x, the probability that a standard normal variable is between 0 and +2x is 0.1255. The value of x is
a) 0.99
b) 0.40
c) 0.32
d) 0.16
e) None of the above

18) Increasing the sample size causes the distribution of X̄ to ________.
(a) shift to the right
(b) shift to the left
(c) have more dispersion
(d) have less dispersion

Health care issues are receiving much attention in both academic and political arenas. A sociologist recently conducted a survey of citizens over 60 years of age whose net worth is too high to qualify for Medicaid and have no private health insurance. The descriptive statistics for the ages of 25 uninsured senior citizens were as follows:

19) Which of the following is the best correct statement?
a) One fourth of the senior citizens sampled are below 66 years of age.
b) The middle 50% of the senior citizens sampled are between 66 and 73.0 years of age.
c) The average age of senior citizens sampled is 73.5 years of age.
d) All of the above are correct.

Short answer: TOTAL 130 points

Question 1: TOTAL 40 points
Do not perform any detailed calculations to answer the following questions. Provide a one-sentence explanation of your reasoning in each case.

(a) Two researchers, Alex and Bob, independently select random samples from the same population. The sample sizes are 4000 for Alex and 1000 for Bob. Each researcher constructs a 95% confidence interval for μ from his data. The widths of the two intervals are .062 and .03. Match each interval width with its researcher.
A larger n implies a smaller width, so Bob is .062 and Alex is .03 (since the samples are from the same population it is reasonable to assume that s is the same or nearly so for Bob and Alex).

(b) Alex took another sample from the population and found a 95% confidence interval for μ to be (.23, .24). Later Alex learned that the true value of μ is .227.
(i) What is the probability that μ is within Alex's confidence interval?
Before we know the true population mean, there is a 95% chance it is in the interval.
Once we know the true population mean's value, it is either in the interval or not, and since 0.227 is not in the interval, the answer is 0.
(ii) Should we conclude that Alex is not good at Statistics and made a mistake when computing his confidence interval?
No, there is always a 5% chance that the true mean will be outside the CI.

(c) Two researchers, George and Henry, work together to study a simple random sample of subjects from a population, and they find that the sample mean is .60. When they construct a confidence interval based on this sample mean, George comes up with (.532,.668) while Henry gets (.552,.688). Indicate which interval has to be wrong.
Henry is wrong. The center of each interval should be the sample mean 0.6, which is not the case for Henry's interval.

Question 2: 5 points each, TOTAL 40 points
A car salesman estimates the following probabilities for the number of cars that he will sell in the next week
a) Find the expected number of cars that will be sold in the week.
b) Find the standard deviation of the number of cars that will be sold in the week.
c) The salesman receives for the week a salary of $300, plus an additional $375 for each car sold. Find the mean and standard deviation of his total salary for the week.
d) What is the probability that the salesman's salary for the week will be more than $1000?

Question 3: 5 points each, TOTAL 50 points
A group of forty people at a health club were classified by their gender and by their smoking habits as shown in the table below. A person is selected at random from the group of forty.
a) What is the probability the person is male and smokes?
P(male and smoker) = 2/40 = 0.05
b) What is the probability the person does not smoke?
P(nonsmoker) = 32/40 = 0.8
c) What is the probability the person is either female or a smoker?
P(female or smoker) = P(female) + P(smoker) - P(female and smoker) = 14/40 + 8/40 - 6/40 = 16/40 = 0.40

d) If you know the person selected is female, what is the probability that the person is a smoker?
P(smoker | female) = P(smoker and female)/P(female) = (6/40)/(14/40) = 0.4286

e) Are the events male and smoker independent? Why or why not?
Not independent. P(smoker | female) = 0.4286 does not equal P(smoker) = 8/40 = 0.2. Equivalently, P(smoker | male) = 2/26 = 0.077 also differs from P(smoker).

Business 550: Data and Decision Analytics
Fall 2008
Practice Exam 2 Solutions
Midterm Examination

Directions
The exam will end 90 minutes after it begins. The exam is divided into three parts. The first part is true/false and the second part is multiple choice. Please answer the multiple choice questions on the exam by circling the best answer (some rounding occurs in several places). There will be no partial credit for these questions. The third part of the exam consists of several problems. Please answer these problems in the space provided on the exam (you may use the backs of the sheets if necessary). You will get partial credit for these problems provided that your answers are organized and legible so that your train of thought can be easily followed.

Good Luck
DON'T EVEN THINK ABOUT PANICKING
By printing my name below I acknowledge that the GBS has an honor code and that I will adhere to it. Failure to abide by the honor code could result in failing this course and having to wash Professor Parzen's car with my toothbrush.

NAME: ____________SOLUTIONS___________________ (50 if not printed)

Question   T/F   Multiple Choice   Long 1   Long 2   Total
Points     10    57                16       16       99
Obtained

True or False (1 point each)
1. T F If the average of a list of numbers is 0, then its standard deviation must also be 0.
2. T F From the joint distribution of two discrete random variables we can always recover the two marginal distributions.
3. T F The average height in a class is 60 inches. There are 30 people in the class. One can conclude that 15 of the students are less than 60 inches tall.
4. T F If two data sets have the same average and standard deviation, their histograms must look the same.
5. T F The standard deviation is never negative.
6. T F If the standard deviation of a list of numbers is zero, then its average must also be zero.
7. T F If you change the sign of each entry on a list, that changes the sign of the average.
8. T F The mean is more affected by outliers than the median.
9. T F Suppose that X~N(3,4). Then 2X~N(6,8).
10. T F I am tired and just want to go home and sleep.

Multiple Choice (3 points each)

1) The probability that a tennis set will go to a tiebreaker is 13%. In 120 randomly selected tennis sets, what are the mean and the standard deviation of the number of tiebreakers?
a) mean: 15.5; standard deviation: 3.95
b) mean: 14.4; standard deviation: 3.95
c) mean: 14.4; standard deviation: 3.68
d) mean: 15.6; standard deviation: 3.68
e) None of the above

Binomial: X~Bin(120, .13)
E(X) = np = 120*.13 = 15.6
Var(X) = np(1-p) = 15.6*.87 = 13.572
Std Dev(X) = 3.684, so the answer is (d).

2) Suppose a brewery has a filling machine that fills 12 ounce bottles of beer. It is known that the amount of beer poured by this filling machine follows a normal distribution with a mean of 12.49 ounces and a standard deviation of 0.04 ounce. Find the probability that the bottle contains between 12.39 and 12.45 ounces.
a) .2674
b) .3085
c) .1581
d) .1915
e) None of the above
$$P(12.39 < X < 12.45) = P\left(\frac{12.39 - 12.49}{0.04} < Z < \frac{12.45 - 12.49}{0.04}\right) = P(-2.5 < Z < -1)$$

$$= F(-1) - F(-2.5) = (0.5 - 0.3413) - (0.5 - 0.4938) = 0.4938 - 0.3413 = 0.1525$$

3) A deck of cards is shuffled (Note: there are 52 cards in a deck). What is the chance that the top card is the jack of clubs and the bottom card is not the queen of clubs?
a) (1/52)(1/51)
b) (1/51)(51/52)
c) [1-(1/52)](1/51)
d) (1/52)(50/51)
e) None of the above

This problem uses the rule P(A and B) = P(A|B)P(B)
B = top card is the jack of clubs
A = bottom card is not the queen of clubs
P(B) = 1/52
P(A|B) = 50/51
so P(A and B) = (1/52)(50/51), which is answer (d).
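The arithmetic in questions 1 through 3 above can be double-checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the exam. The normal probability is computed from the exact CDF, so it agrees with the table-based value only to about three decimal places.

```python
from fractions import Fraction
from math import sqrt
from statistics import NormalDist

# Question 1: X ~ Bin(120, 0.13), mean np and standard deviation sqrt(np(1-p))
n, p = 120, 0.13
binom_mean = n * p
binom_sd = sqrt(n * p * (1 - p))

# Question 2: fill amount X ~ N(mean 12.49, sd 0.04)
fill = NormalDist(mu=12.49, sigma=0.04)
prob_between = fill.cdf(12.45) - fill.cdf(12.39)

# Question 3: P(A and B) = P(A|B) P(B), kept as exact fractions
p_top_is_jack = Fraction(1, 52)
p_bottom_not_queen_given_top = Fraction(50, 51)
p_cards = p_bottom_not_queen_given_top * p_top_is_jack

print(f"binomial mean = {binom_mean:.1f}, sd = {binom_sd:.3f}")
print(f"P(12.39 < X < 12.45) = {prob_between:.4f}")
print(f"P(top jack of clubs, bottom not queen of clubs) = {p_cards}")
```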
4) Which of the following statements is not true about a normal curve?
(a) Every normal curve is bell-shaped
(b) Every normal curve is centered at its mean
(c) Every normal curve is symmetric about 0
(d) Every normal curve has its under-curve area equal to 1

The standard normal N(0,1) curve is centered at zero, but not every normal curve has to be centered at 0.
5) A study attempted to examine the relationship between diabetes and alcohol consumption among older people. Refer to the table of counts below to answer the following question. Given an older person who does NOT drink at all, what's the probability that he has diabetes?
                          diabetes   No diabetes
Drink 1 or more per week  765        85
No alcohol drink          293        3565

a) 293/(765+293)
b) 293/(293+3565)
c) 293/(765+86+293+3565)
d) (765+293)/(765+86+293+3565)
e) None of the above.

P(diabetes | not drink) = P(diabetes and not drink)/P(not drink) = 293/(293+3565)

6) You decide to invest in four independently moving risky stocks. You guess that each stock has an independent 40% chance of becoming a total loss. What is the chance that at least one of your stocks will tank?
a) .0256
b) .4000
c) .8704
d) .9744
e) None of the above

P(at least one tanks) = 1 - P(none tank) = 1 - .6^4 = 0.8704
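The two solutions above (a conditional probability read from a table of counts, and the complement rule for "at least one") can be sketched in a few lines of Python. The variable names are hypothetical; the counts and the 40% loss probability are taken from questions 5 and 6.

```python
# Question 5: restrict attention to the "no alcohol" row of the table
diabetes_no_alcohol = 293
no_diabetes_no_alcohol = 3565
p_diabetes_given_no_alcohol = diabetes_no_alcohol / (
    diabetes_no_alcohol + no_diabetes_no_alcohol
)

# Question 6: P(at least one loss) = 1 - P(no losses), using independence
p_loss = 0.40
n_stocks = 4
p_at_least_one_loss = 1 - (1 - p_loss) ** n_stocks

print(f"P(diabetes | no alcohol) = {p_diabetes_given_no_alcohol:.4f}")
print(f"P(at least one stock tanks) = {p_at_least_one_loss:.4f}")
```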
7) When computing probabilities for standard normal distributions, which of the following probability statements is CORRECT? a) P ( Z a ) = 1  P ( Z a ) b) P (a Z b) = P ( Z a )  P ( Z b) c) If a > 0, P ( a Z a ) = 1  2 * P ( Z a ) d) If a < 0, P ( Z a ) = P ( Z  a ) e) None of the above statements are correct You need to draw a picture to see this is true.
8) A fair six-sided die is rolled n times. If the standard deviation of the number of times a 5 comes up is 10, determine the value of n.
a) 400
b) 625
c) 720
d) 800
e) None of the above

Let X = number of times a 5 comes up in n tosses. Then X is a binomial random variable with parameters n and p = 1/6. The variance is the square of the standard deviation, so we need to solve n*(1/6)*(5/6) = 100, or 5n/36 = 100, or n = 720.

9) The following two-way table is a survey from students at an unknown university regarding whether or not they know of Google and whether or not they know what a `googol' is. If one person is selected at random, what is the approximate probability that they know about Google or do NOT know what a `googol' is?
a) 0.725
b) 0.376
c) 0.975
d) 0.924
e) None of the above

Use the rule P(A or B) = P(A) + P(B) - P(A and B).

10) Ten students take a test and their results are listed below. Determine the value of x if the mean score is 76.

Score               45   65   75   x
Number of Students  1    2    3    4

a) 80
b) 85
c) 90
d) 95
e) None of the above

45 + 2(65) + 3(75) + 4(x) = 76(10), or x = 90

11) Which frequency distribution shows the set of outcomes with the smallest standard deviation?

12) Suppose you take a fair six-sided die and mark the faces with the numbers 1, 1, 2, 2, 3, 3, respectively. Let X be the random variable of the number on the die observed after a toss. Then the mean and the variance of X are

It is easier to first construct a new probability table:

x     1    2    3
p(x)  1/3  1/3  1/3

Then E(X) = 1(1/3) + 2(1/3) + 3(1/3) = 6/3 = 2, so the answer HAS to be (a).

13) Let the random variable X denote the length of cell phone calls in minutes for a particular population. Assume E(X) = 10 and Var(X) = 4. Suppose that the Virgin Mobile prepaid cell phone plan has a connection charge of $0.60 and a further charge of $0.40 per minute. What is the expected cost and standard deviation of a cell phone call?
a) 4.5, 2.40
b) 6, 2.56
c) 4.6, 2.56
d) 6.4, 3.16
e) 4.6, 5.76
f) None of the above

Y = .6 + .4X
E(Y) = .6 + .4E(X) = 4.6
Var(Y) = .4^2 * Var(X) = (.4^2)(4) = 0.64
Std Dev(Y) = .4*2 = .8, which matches none of (a) through (e).

14) Which of the following is true about the data in the histogram below?
a) The median is about 60 and the mean is something less than 60.
b) The median is about 65 and the mean is something less than 65.
c) The mean is about 60 and the median is something less than 60.
d) The mean is about 65 and the median is something less than 65.
e) We cannot conclude anything about the mean from a histogram.

The y axis is relative frequency (counts in each bin divided by total number of observations). If you (approximately) add up the frequencies past the x value of 65, you obtain roughly .24+.05+.1+.13+.04 = 0.57, which is a little over half of the data values. So the median would be somewhere around 65, since about 12% of the values are just below 65. Since the data set is skewed to the left, the mean would be less than the median, so the mean would be less than 65.

15) Referring to the histogram data in the last question, what would happen if the smallest point, say 50, were changed to 55?
a) Both the mean and median would increase by 5.
b) Both the mean and the standard deviation would increase by 5.
c) Only the mean would increase by 5.
d) Only the mean would increase.
e) None of the statements above are completely true.

Unfortunately, since we introduced the standard deviation as a possible answer, both (d) and (e) are logical solutions, since not only would the mean increase, but the standard deviation would also change.

16) Due to recent stock market jitters, Janice has decided to invest in ducks. She has bought five ducks and the lengths of their bills (in cm) are: 14, 39, 10, 12, 10. Which of the following is true?
a) mean = 14cm and median = 12cm.
b) mean = 14cm and median = 10cm.
c) mean = 17cm and median = 10cm.
d) mean = 17cm and median = 12cm.
e) mean = 16cm and median = 12cm.
f) None of the above

17) The Central Limit Theorem states that:
a) if n is large then the distribution of the sample can be approximated closely by a normal curve
b) if n is large, and if the population is normal, then the variance of the sample mean must be small.
c) if n is large, then the sampling distribution of the sample mean can be approximated closely by a normal curve
e) if n is large, then the variance of the sample must be small.

Slide 405

18) Which of the following statements is correct?
a. Changing the units of measurements of x or y does not change the value of the correlation r.
b. A negative value for the correlation r indicates the data are strongly unassociated.
c. The correlation always has the same units as the x variable, but not the y variable.
d. The correlation always has the same units as the y variable, but not the x variable.

Correlation is unitless so changing the units of x and y will not change the value of the correlation.

19) Estimate the standard deviation (σ) of the following normal distribution curve.
a) 7.2
b) 2.5
c) 0
d) 1.25
e) 11.5

You know that 95% of the area is within +/-2 standard deviations of the mean. Eyeballing, it looks like the mean is around 7.5 and the 95% range is roughly 5 to 10 or so. This would mean that 2*standard deviation = 2.5 or that the standard deviation = 1.25.

Short Answer
1) (16 points) Consider the following distribution for the random variable X:

x       1    2
P(X=x)  0.2  0.8

Compute the probability distribution for each of the following new random variables: X+1, 3X, X+X, and X-X (that is, make a table like the one above). Determine the expected value and variance for each new random variable.

Note that
E(X) = 1(.2) + 2(.8) = 1.8
E(X^2) = 1(.2) + 4(.8) = 3.4
Var(X) = 3.4 - 1.8^2 = 0.16

Work for X+1:
x     2    3
p(x)  0.2  0.8

Let Y = X+1. Then E(Y) = E(X)+1 = 2.8 and Var(Y) = Var(X) = 0.16

Work for 3X:
x     3    6
p(x)  0.2  0.8

Let Y = 3X. Then E(Y) = 3E(X) = 5.4 and Var(Y) = 9Var(X) = 1.44

Work for X+X:
x     2    4
p(x)  0.2  0.8

Let Y = X+X = 2X. Then E(Y) = 2E(X) = 3.6 and Var(Y) = 4Var(X) = 0.64

Work for X-X:
x     0
p(x)  1

Let Y = X-X, which is always 0. Then E(Y) = 0 and Var(Y) = 0

2) (16 points) A corporation has 15,000 employees. Sixty-two percent of the employees are male. Twenty-three percent of the employees earn more than $30,000 a year. Eighteen percent of the employees are male and earn more than $30,000 a year.
Let M be the event that an employee is male.
Let Y be the event that an employee earns more than $30,000 a year.

a) Construct the joint probability table for the events M, not M, Y, not Y.
        M     not M  Total
Y       0.18  0.05   0.23
not Y   0.44  0.33   0.77
Total   0.62  0.38   1.0

b) What is the probability that the employee is female and earns less than $30,000 a year?
0.33 c) Given that a randomly selected employee is male, what is the probability that he makes more than $30,000 a year?
0.18/0.62 = 0.2903 d) Are M and Y independent events? Explain using probabilities.
No, M and Y are not independent. P(Y) = 0.23 but P(Y|M) = 0.18/0.62 = 0.2903, so since P(Y|M) does not equal P(Y), M and Y are not independent.

z     0.0000 0.0100 0.0200 0.0300 0.0400 0.0500 0.0600 0.0700 0.0800 0.0900
0.00  0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.10  0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.20  0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.30  0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.40  0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.50  0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.60  0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.70  0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.80  0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.90  0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.00  0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.10  0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.20  0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.30  0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.40  0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.50  0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.60  0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.70  0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.80  0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.90  0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.00  0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.10  0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.20  0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.30  0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.40  0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.50  0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.60  0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.70  0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.80  0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.90  0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.00  0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.10  0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.20  0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.30  0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.40  0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
3.50  0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998
3.60  0.4998 0.4998 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.70  0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.80  0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.90  0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000 0.5000

[Figure: standard normal density curve; horizontal axis x]
...
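The table entries appear to be the area under the standard normal curve between 0 and z (the 0.00 entry is 0.0000 and the entries approach 0.5). Under that reading, a minimal Python sketch can reproduce any entry; the function name `table_area` is made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def table_area(z: float) -> float:
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return STANDARD_NORMAL.cdf(z) - 0.5


# Row label gives the first decimal of z, column label the second decimal.
for z in (1.00, 1.96, 2.50):
    print(f"z = {z:.2f}  area = {table_area(z):.4f}")
```

For a negative z the same entry applies by symmetry, which is how the solution to the bottle-filling question turns F(-1) into 0.5 - 0.3413.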
This note was uploaded on 03/27/2012 for the course STATS 104 taught by Professor Michael Parzen during the Fall '11 term at Harvard.