w07stat200mtprobs_soln


1. Harris recently installed spam filter software, but he still saw spam emails in his inbox. He made a daily record of the number of spam emails that were delivered to his inbox over the past 20 days. The following is a frequency histogram for his data. The frequency refers to the number of days.

[Figure: histogram of spam email data; x-axis: # spam emails (20 to 120); y-axis: frequency]

a) Harris also plotted a stemplot for the data. Which of the following is a correct stemplot for his data? Check only one answer. [2 marks]

✓ Stemplot A    _ Stemplot B    _ Stemplot C

   Stemplot A        Stemplot B        Stemplot C
    2 | 011355        2 | 011355        1 | 05
    3 | 01467         3 | 01467         2 | 011355
    4 | 12479         4 | 12479         3 | 01467
    5 | 56            5 | 56            4 | 12479
    6 |               8 | 0             5 | 56
    7 |              10 | 5             6 |
    8 | 0                               7 |
    9 |                                 8 | 0
   10 | 5

b) Which of the following is a correct statement about the distribution of the spam email data? Check only one answer. [2 marks]

_ The distribution is roughly symmetric, and the mean is about the same as the median.
✓ The distribution is skewed, and the mean is larger than the median.
_ The distribution is skewed, and the mean is smaller than the median.

c) What is the percentage of days over the past 20 days that Harris received fewer than 50 spam emails? Check only one answer. [2 marks]

_ 16%    _ 19%    ✓ 80%    _ 50%

16/20 × 100% = 80%

d) What is the third quartile of the number of spam emails? Use the stemplot you have chosen in part (a) to answer this question. Check only one answer. [3 marks]

_ 23    [remaining options illegible]

Upper half of the data set, above the median: 37, 41, 42, 44, 47, 49, 55, 56, 80, 105.
Q3 = median of the above list = (47 + 49)/2 = 48.

e) Identify any outliers in the data set using the stemplot you have chosen in part (a). It is given to you that the IQR is 23. Show your work here. [4 marks]

No outliers in the lower end (the distribution does not have a long left tail).
Q3 + 1.5 × IQR = 48 + 1.5 × 23 = 82.5.
105 > 82.5, so 105 is an outlier. It is the only outlier in the data set.

f) Which of the following pairs of summary statistics best describe the center and the spread of the number of spam emails received daily? Check only one answer and explain briefly. [4 marks]

_ mean and standard deviation
_ mean and IQR
✓ median and IQR
_ median and variance

Explain: Both the median and the IQR are insensitive to outliers, so in the presence of outliers (105), we should report these 2 summary statistics.

2. An M&M's chocolate fan is interested in studying the color distribution of the sugar coating of the chocolate candies. He opens a bag of M&M's chocolate candies, and classifies the candies according to the color of the coating. Check all statements that are correct. [3 marks]

✓ The color of the sugar coating is a categorical variable.
✓ A bar chart can be used to display the distribution of the color variable.
_ A side-by-side boxplot can be used to compare the number of yellow coated candies and the number of red coated candies.

3. The hourly rates for high school private tutoring follow the normal distribution with mean μ and standard deviation σ. It is given that the middle 99.7% of all the hourly rates fall between $13 and $43. Then (check your answers)

a) the mean μ is [1 mark]
✓ equal to $28.    _ greater than $28.    _ less than $28.

b) the standard deviation σ is roughly equal to [2 marks]
✓ $5.    _ $10.    _ $15.

c) an hourly rate of $12 has [2 marks]
_ a z-score of 0.    ✓ a negative z-score.    _ a positive z-score.

d) the IQR of the hourly rates is [2 marks]
_ equal to $30.    _ greater than $30.    ✓ less than $30.

4. Does how long children remain at the lunch table help predict how much they eat? Twenty toddlers at a nursery school were observed. For each toddler, the number of minutes he/she spent at the table when lunch was served and the number of calories consumed during lunch were measured. The two variables show a reasonably linear trend with a correlation coefficient r = -0.65. The summary statistics are given as follows:

# minutes spent at the lunch table: mean = 34, SD = 6.0
# calories consumed: mean = 456, SD = 30

a) Give a rough sketch of the scatterplot for the data. The axes have been set up for you. Remember to label the axes. Also indicate the mean-mean point on the scatterplot. [5 marks]

[Sketch: scatterplot with a negative linear trend; x-axis: time spent at the lunch table (minutes); y-axis: calories consumed; mean-mean point (34, 456) marked]

b) Find the least-squares regression line that predicts the amount of calories consumed from the time stayed at the table during lunch. [6 marks]

x = # minutes spent at the lunch table
y = # calories consumed
ŷ = a + bx
b = r × s_y / s_x = (-0.65)(30)/6 = -3.25
a = ȳ - b·x̄ = 456 - (-3.25)(34) = 566.5
ŷ = 566.5 - 3.25x

c) Predict the number of calories consumed for a child who spends 25 minutes at the table during lunch. [2 marks]

x = 25: ŷ = 566.5 - 3.25(25) = 485.25 is the predicted # calories consumed.

d) For the following statements, check all that are correct. [4 marks]

_ 65% of the variation in the number of calories consumed is explained by the regression line.
✓ The residual plot for this data set plots the residuals from the regression line against the number of minutes spent at the lunch table for the twenty kids.
_ One standard deviation (SD) increase in the number of minutes spent at the lunch table is associated with 0.65 × SD increase in the number of calories consumed.
✓ If one changes the unit of the amount of time spent from minutes to hours, the value of the correlation coefficient r will remain unchanged.

5. The length of trout in a lake is normally distributed with mean μ = 0.95 feet and an unknown standard deviation σ. If 60% of all trout are longer than 0.8 feet, what is the value of σ? [6 marks]

X: length of trout
0.8 is the 40th percentile.
z-score for the 40th percentile = -0.25 (or -0.2533)
So (0.8 - 0.95)/σ = -0.25, which gives σ = 0.6 feet (about 0.59 feet with z = -0.2533).

6. A survey was conducted in 11 countries to determine the percentage of teenagers who had smoked cigarettes and used marijuana. The scatterplot for the two variables is shown below:

[Figure: scatterplot of the two percentages; x-axis: cigarette smoking (%), 35 to 70; the point (68, 15) is added by hand]

a) The scatterplot shows a very strong positive correlation between the two variables. Does this imply cigarette smoking leads to marijuana use? Justify your answer. [5 marks]

_ Yes    ✓ No

Explain: Association does not imply causation. Teenagers who smoke cigarettes are more likely to hang out with friends who smoke both cigarettes and marijuana. They may be influenced by their friends to smoke marijuana. Peer influence is a confounding variable that explains the association between cigarette and marijuana smoking.

b) One more country participated in the survey, and the percentages of teenagers who have smoked cigarettes and used marijuana were found to be 68% and 15%, respectively. The correlation coefficient r is then recalculated. How do the values of r before and after the inclusion of the new observation compare? Check only one answer and explain briefly. [5 marks]

_ r(before) < r(after) < 0
✓ 0 < r(after) < r(before)
_ r(before) < r(after) < 1

Explain: (68, 15) is an influential observation. Without it, the correlation is very strong. After including it, the pattern of the points becomes more scattered. The correlation becomes weaker but is still positive.

7. You need to drive past two traffic lights on the way from your house to the nearest grocery store. The probability that you hit a red light is 0.5 at the first intersection and 0.4 at the second intersection. The probability that you run into a red light at both intersections is 0.25. On a random day you drive from home to that grocery store. Define the following events:

E1 = you run into a red light at the first intersection
E2 = you run into a red light at the second intersection
E3 = you run into a green light at both intersections
E4 = you run into a red light at both intersections

Which of the following statements is (are) true about the above events? Check all that are correct. [4 marks]

_ E1 and E2 are independent events.
_ E1 and E2 are disjoint events.
✓ E3 and E4 are disjoint events.
_ E3 is the complement of E4.

8. In a university parking database with 5600 registered vehicles, records show that 43% of the registered vehicles are Asian makes, 23% are European makes and the remaining are American makes. Among all the 5600 cars, 20% have ever received a parking ticket. You randomly pick three vehicles with replacement from the database. What is the probability that at most two of the three are American makes? [6 marks]

Let X = # American cars out of the 3 chosen cars.
X ~ Bin(n = 3, p = 1 - 0.43 - 0.23 = 0.34)
P(at most 2 cars are American makes) = P(X ≤ 2)
= P(X = 0) + P(X = 1) + P(X = 2)
= C(3,0)(0.34)^0(1 - 0.34)^3 + C(3,1)(0.34)^1(1 - 0.34)^2 + C(3,2)(0.34)^2(1 - 0.34)^1
= 0.2875 + 0.4443 + 0.2289 = 0.9607

OR

P(at most 2 cars are American makes) = 1 - P(all 3 are American makes) = 1 - (0.34)^3 = 1 - 0.0393 = 0.9607

Note: the draws are independent because of drawing with replacement.

9. Two stores sell watermelons. At the first store the melons weigh an average of 20 pounds with a standard deviation of 2.2 pounds. The melons are sold for 36 cents a pound. At the second store the melons are smaller, with a mean of 17 pounds and a standard deviation of 2 pounds. The store is having a sale on watermelons: only 25 cents a pound. Assume that the weights are normally distributed. Jenny selects a melon at random at each store. Find the mean and the variance of the difference in the prices Jenny pays for the two melons. [6 marks]

Let X = weight of a melon at the first store, X ~ N(20, 2.2)
    Y = weight of a melon at the second store, Y ~ N(17, 2)
Difference in prices: D = 36X - 25Y (measured in cents)
E(D) = E(36X - 25Y) = 36 E(X) - 25 E(Y) = 36(20) - 25(17) = 295 cents
Var(D) = Var(36X - 25Y) = 36² Var(X) + 25² Var(Y) = 1296(2.2²) + 625(2²) = 8772.64 cents², assuming X and Y are independent.
...
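For question 1, parts (c) to (e), the quartile and outlier work can be checked with a short Python sketch. The data list is an assumption: it is the set of 20 values read off Stemplot A, and the quartiles are taken as medians of the lower and upper halves, which is the convention the handwritten solution appears to use.

```python
from statistics import median

# Assumption: the 20 daily counts as read off Stemplot A.
data = [20, 21, 21, 23, 25, 25, 30, 31, 34, 36,
        37, 41, 42, 44, 47, 49, 55, 56, 80, 105]

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (even n)."""
    s = sorted(values)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])

q1, q3 = quartiles(data)
iqr = q3 - q1

# 1.5 x IQR rule for flagging outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in data if v < lower_fence or v > upper_fence]

# Part (c): share of days with fewer than 50 spam emails.
pct_below_50 = 100 * sum(v < 50 for v in data) / len(data)

print(q3, iqr, upper_fence, outliers, pct_below_50)
```

Other quartile conventions (for example, interpolation-based ones) can give slightly different values of Q1 and Q3, so the fence may shift a little under a different rule.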
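For question 4, the regression line can be rebuilt from the summary statistics alone. The sketch below uses the standard formulas slope = r × s_y / s_x and intercept = ȳ - slope × x̄; the function and variable names are illustrative and not part of the exam.

```python
# Summary statistics given in question 4.
r = -0.65
x_mean, x_sd = 34.0, 6.0    # minutes at the lunch table
y_mean, y_sd = 456.0, 30.0  # calories consumed

def least_squares_line(r, x_mean, x_sd, y_mean, y_sd):
    """Return (intercept, slope) of the least-squares line of y on x."""
    slope = r * y_sd / x_sd
    intercept = y_mean - slope * x_mean
    return intercept, slope

intercept, slope = least_squares_line(r, x_mean, x_sd, y_mean, y_sd)

def predict(minutes):
    """Predicted calories for a given number of minutes at the table."""
    return intercept + slope * minutes

predicted_25 = predict(25)

# Fraction of variation in calories explained by the line (part d).
r_squared = r ** 2

print(intercept, slope, predicted_25, r_squared)
```

The value of r_squared is about 0.42, which is why the "65% of the variation" statement in part (d) is left unchecked.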
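Questions 3 and 5 both work backwards from normal percentiles. As a rough check, Python's standard-library `statistics.NormalDist` can supply exact percentiles in place of a printed z-table. The variable names are illustrative, and the 99.7% rule in question 3 is treated as mean ± 3 standard deviations.

```python
from statistics import NormalDist

# Question 3: the middle 99.7% spans roughly mean +/- 3 SD.
low, high = 13.0, 43.0
mu_rate = (low + high) / 2     # midpoint of the interval
sigma_rate = (high - low) / 6  # interval is about 6 SD wide
rates = NormalDist(mu_rate, sigma_rate)
z_of_12 = (12 - mu_rate) / sigma_rate
iqr_rate = rates.inv_cdf(0.75) - rates.inv_cdf(0.25)

# Question 5: 60% of trout are longer than 0.8 ft,
# so 0.8 ft is the 40th percentile.
mu_trout = 0.95
z40 = NormalDist().inv_cdf(0.40)
sigma_trout = (0.8 - mu_trout) / z40

# Sanity check: share of trout longer than 0.8 ft under this sigma.
share_longer = 1 - NormalDist(mu_trout, sigma_trout).cdf(0.8)

print(mu_rate, sigma_rate, z_of_12, iqr_rate)
print(z40, sigma_trout, share_longer)
```

With the exact percentile the trout standard deviation comes out near 0.59 feet; rounding the z-score to -0.25, as a printed table would, gives 0.6 feet.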
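The four statements in question 7 can be verified with exact arithmetic. This sketch uses `fractions.Fraction` to avoid floating-point noise; the only step not stated in the exam is the inclusion-exclusion formula used to get P(E3), the probability of no red light at either intersection.

```python
from fractions import Fraction

p_e1 = Fraction(1, 2)   # red at the first light
p_e2 = Fraction(2, 5)   # red at the second light
p_e4 = Fraction(1, 4)   # red at both lights (E1 and E2)

# Independence would require P(E1 and E2) = P(E1) * P(E2).
independent = (p_e4 == p_e1 * p_e2)

# Disjoint events would have P(E1 and E2) = 0.
e1_e2_disjoint = (p_e4 == 0)

# E3 = green at both = neither red, by inclusion-exclusion.
p_e3 = 1 - (p_e1 + p_e2 - p_e4)

# Complements would need probabilities that sum to 1.
could_be_complements = (p_e3 + p_e4 == 1)

print(independent, e1_e2_disjoint, p_e3, could_be_complements)
```

E3 and E4 are disjoint by definition (both lights cannot be green and red on the same trip), but their probabilities sum to 3/5, so neither is the complement of the other.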
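The two routes to the answer in question 8 can be compared numerically. The sketch below is a minimal check using `math.comb` for the binomial coefficients; `binom_pmf` is a helper written for this example.

```python
from math import comb

n = 3
p_american = 1 - 0.43 - 0.23  # share of American makes

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Route 1: add up P(X = 0), P(X = 1), P(X = 2).
terms = [binom_pmf(k, n, p_american) for k in range(3)]
p_at_most_two = sum(terms)

# Route 2: complement of "all three are American makes".
p_complement = 1 - p_american ** 3

print([round(t, 4) for t in terms])
print(round(p_at_most_two, 4), round(p_complement, 4))
```

The binomial model applies here because sampling with replacement keeps the three draws independent with the same success probability.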
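For question 9, the mean and variance of D = 36X - 25Y follow from the rules E(aX - bY) = aE(X) - bE(Y) and, for independent X and Y, Var(aX - bY) = a²Var(X) + b²Var(Y). The sketch below applies them directly. Independence of the two weights is an assumption, as in the worked solution.

```python
from math import sqrt

price_1, price_2 = 36, 25  # cents per pound at each store
mean_x, sd_x = 20, 2.2     # first store melon weight (pounds)
mean_y, sd_y = 17, 2       # second store melon weight (pounds)

# Mean of the price difference, in cents.
mean_d = price_1 * mean_x - price_2 * mean_y

# Variance of the price difference, in cents squared.
# Variances add even though the prices are subtracted
# (this step relies on X and Y being independent).
var_d = price_1 ** 2 * sd_x ** 2 + price_2 ** 2 * sd_y ** 2
sd_d = sqrt(var_d)

print(mean_d, round(var_d, 2), round(sd_d, 2))
```

The standard deviation is computed only for interpretation; the question itself asks for the mean and the variance.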