Hank Ibser
Statistics 21
Spring 2010

Study Guide for the Midterm

The midterm (Tuesday March 9 in lecture) will cover the following chapters with certain parts omitted:

Ch 1-2: Concepts and terminology in these chapters will be included as relevant to Ch 12. Read section 1 of each chapter.
Ch 3: Omit sections 5-7.
Ch 4: Omit stuff on longitudinal/cross-sectional data in section 2. Omit section 7 unless you are using a calculator to find SDs.
Ch 5: Omit interquartile range (p89).
Ch 6, 7: Omit. If you are looking for extra problems to do, the Special Review Exercises at the end of Ch 6 are good because they incorporate ideas from different chapters. Try 1, 2, 3, 4, 6, 7, 9, 11.
Ch 8: Omit Review Ex #12.
Ch 9: Omit technical note on p146-147, although I think it's interesting to compare this to the rms error for regression.
Ch 10: Omit technical note on p169.
Ch 11: Omit section 3 and technical note on p197.
Ch 12: Omit, in section 2, everything from "Now, an example." to the end of the section (p208-211). Omit section 3. Omit Review Ex #12.
Ch 13-15: You are responsible for everything in these chapters. More extra problems: Special Review Ex after Ch 15: 8-9, 11-15, 17-20.

Handouts: "Summation, Average, and SD," "Summation and Correlation,"
and "Probability."

The following 11 problems are from my old midterms; they tend to focus on
the handouts since you don't have practice from the text. After those are
27 problems from old midterms of Roger Purves (one of the authors of your
book). His problems tend to be shorter and there are more probability
problems (all on the last page). The midterm will have 5-6 problems.

I also encourage you to do extra problems from the book as well as making
up problems and exchanging them with people. Look on bspace for updates
on review times and SLC services.

1) I have two bags of fruit. The first contains 4 apples and 2 oranges,
the second contains 5 oranges and 3 bananas. I choose a bag at random
and then take 4 pieces of fruit from that bag to eat for lunch.
a) What is the chance that the 4 pieces of fruit are the same?
b) What is the chance that I get at least one orange?

2) I record daily high temperatures in Berkeley for 90 days. The average
of the first 30 days is 70 degrees Fahrenheit with an SD of 8 degrees.
In the next 60 days, the average is 60 degrees with an SD of 6 degrees.
a) What is the average temperature for all 90 days?
b) What is the SD of temperature for all 90 days?

3) A list of transactions contains 100 numbers: 60 gains and 40 losses.
The gains are positive numbers and the losses are negative numbers. The
units are thousands of dollars. For the 60 gains, the average is 18 and
the SD is 7.5. For the 40 losses the average is -20 and the SD is 9.2.
a) Find the average of the 100 transactions.
b) Find the SD of the 100 transactions. (If you didn't get an answer
for part a), use overall average=4.)

4) The following information was collected for a group of women:
Average height = 65 inches    SD of height = 2 inches
Average weight = 130 pounds   SD of weight = 20 pounds
r = 0.4
(a) Predict the weight of a woman who is 68 inches tall.
(b) Of the women 68 inches tall, what percent weigh more than 130 pounds?

5) I roll a fair six-sided die 4 times.
a) What is the chance that I get exactly 2 fours?
b) What is the chance that not all the rolls are 2 or more?
c) What is the chance that I get exactly 2 ones or exactly 2 sixes?

6) Incomes in a certain town follow the distribution below.

1000's of $/year    % of Town
10-20               10
20-30               25
30-50               30
50-70               20
70-100              15

a) Draw a histogram for this data, labeling axes and including density scale.
b) About what percent of the people in this town earned between 50 and 55 thousand dollars?
c) About what are the average, median, and SD for these incomes?
Choices are: 10,000, 20,000, 35,000, 40,000, 45,000, and 50,000. You obviously won't use all the numbers and shouldn't use any number
more than once.
Average: ______   Median: ______   SD: ______

7) At one university, the average Verbal SAT (VSAT) score of the incoming
freshmen is 550, and the average Math SAT (MSAT) score is 530. The
correlation between VSAT and MSAT score is 0.6. The equation for
predicting MSAT score from VSAT score is reported as:

predicted MSAT score = 0.9 (VSAT score) + 10

Does this line make sense? Pick one of the following and explain.
(i) It doesn't make sense, that isn't the regression line.
(ii) It does make sense, that's the regression line.
(iii) We do not have enough information to tell if it is correct or not.

8) At a university, a group of 200 freshmen has an average VSAT score of
550, and the average MSAT score is 540. The correlation between VSAT and MSAT
score is 0.6. The SDs for both exams are 100. A group of 100 sophomores has
an average VSAT score of 550, and an average MSAT score of 600. The SDs for
both exams are also 100. The correlation between VSAT and MSAT scores for the
sophomores is 0.5. (This is too long for a midterm, but good practice.)
a) Find the average VSAT and MSAT scores for all 300 students.
b) Find the SDs for VSAT and MSAT scores for all 300 students.
c) Find the correlation between VSAT and MSAT scores for all 300 students.

9) The distribution of heights of a group of women closely follows the
normal curve. A woman at the 84th percentile of heights is 67 inches
tall, and a woman at the 7th percentile of heights is 61 inches tall.
(a) The average height is ______ inches.
(b) Someone at the 20th percentile is ______ inches tall.

10) A drawer contains 20 socks. Ten of the socks are white, five are brown, four are gray, and one is red.
a) You take two socks from this drawer at random without replacement. What is the chance that they are both the same color?
b) 3 socks are drawn without replacement. What is the chance that exactly 2 white socks are drawn?

11) The midterm scores for a large math class were recorded as follows:
Midterm 1: average = 110, SD = 15
Midterm 2: average = 95, SD = 13, r = 0.5
(a) Find the equation of the regression line for predicting midterm 2 score from midterm 1 score.
(b) Predict the mt 2 score of someone at the 47th percentile on mt 1.

Answers to my problems:
1) a) (1/2)(4/6)(3/5)(2/4)(1/3)+(1/2)(5/8)(4/7)(3/6)(2/5)
b) 1-((1/2)(4/6)(3/5)(2/4)(1/3))
2) avg=63.33, SD=8.47
3) avg=2.8, SD=20.35
4) a) 142 pounds b) 74%
5) a) 4!/(2!2!) (1/6)^2 (5/6)^2
b) 1-(5/6)^4
c) 2*(answer to a) - 4!/(2!2!) (1/6)^2 (1/6)^2
6) a) don't forget density scale... b) about 5%
c) avg=45,000, median= 40,000, SD=20,000
7) (i) point of averages isn't on the line.
8) a) 550,560 b) 100, 103.923 c) 0.54528 (rounding is OK)
9) a) avg=64.6 b) 62.6
10) a) (10/20)(9/19)+(5/20)(4/19)+(4/20)(3/19)
b) 3(10/20)(9/19)(10/18)
11) a) predicted mt2 = 0.433 mt1 + 47.33
b) about 94.5, answer may vary a bit, I used -0.075 for z.
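
Several of the answers above can be checked numerically. The following is a minimal Python sketch, not part of the original handouts: the function names are made up for illustration, and "SD" is assumed to mean the r.m.s. deviation from the average (dividing by n). It recomputes the combined average and SD for problem 3, the regression line for problem 11 a), and the chance in problem 5 a) by listing all 6^4 outcomes.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def combined_avg_sd(groups):
    """Average and SD of several lists pooled into one.

    groups: list of (count, average, SD) triples, one per sub-list.
    Assumes SD divides by n, so mean of squares = SD^2 + average^2.
    """
    n = sum(count for count, _, _ in groups)
    avg = sum(count * a for count, a, _ in groups) / n
    mean_sq = sum(count * (sd ** 2 + a ** 2) for count, a, sd in groups) / n
    return avg, sqrt(mean_sq - avg ** 2)


def regression_line(avg_x, sd_x, avg_y, sd_y, r):
    """Slope and intercept of the regression line for predicting y from x."""
    slope = r * sd_y / sd_x
    intercept = avg_y - slope * avg_x  # the line passes through the point of averages
    return slope, intercept


def chance_exactly_two(face, rolls=4):
    """Chance of exactly two `face`s in `rolls` rolls of a fair die, by enumeration."""
    outcomes = list(product(range(1, 7), repeat=rolls))
    hits = sum(1 for outcome in outcomes if outcome.count(face) == 2)
    return Fraction(hits, len(outcomes))


# Problem 3: 60 gains (avg 18, SD 7.5) and 40 losses (avg -20, SD 9.2).
avg, sd = combined_avg_sd([(60, 18, 7.5), (40, -20, 9.2)])
print(round(avg, 2), round(sd, 2))  # the key gives 2.8 and 20.35

# Problem 11 a): predict midterm 2 from midterm 1.
slope, intercept = regression_line(110, 15, 95, 13, 0.5)
print(round(slope, 3), round(intercept, 2))  # the key gives 0.433 and 47.33

# Problem 5 a): exactly 2 fours in 4 rolls.
print(chance_exactly_two(4))  # same value as 4!/(2!2!) (1/6)^2 (5/6)^2
```

The pooled SD comes from applying the identity mean of squares = SD^2 + average^2 to each group and then to the whole list. The same function handles problem 8 b) if it is given the freshman and sophomore MSAT summaries.
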

Answers to problems from Roger Purves' Stat 21 past midterms:
1. a) 1 b) 2.83
2. 1
3. a) False b) must be negative
4. 6%
5. 50 pts
6. a) 62 b) 67%
7. a) 0 b) 4
8. about 96 students
9. a) 1 b) % of families per thousand dollars c) 5% d) less than $40,000
10. a) 0.58 b) -1
11. a) 200 b) 135
12. a) 52,000 b) 12,247.5
13. 52.2
14. a) 27.5 b) 3.23
15. a) 31st percentile b) 60%
16. smaller than 190
17. about 255 students
18. 4.58
19. 1/45, 1/90
20. 26/52 x 25/51 x 26/52 x 25/51
21. a) (1/6 x 1/6) + (5/6 x 4/6) b) 1-[(1/6 x 1/6) + (5/6 x 4/6)]
22. 20/100 = 1/5
23. 16/25
24. a) 51.8% b) 19.75% c) 1.08%
25. 39/52 x 38/51
26. 13/52 x 39/51 x 38/50 x 37/49 x 12/48
27. 8/20

Statistics 21
Problems from past midterms (Roger Purves)

Midterm 1

1. (10 points) A list of numbers has an average of 51. A new list is formed by subtracting 50
from every number on the original list. The r.m.s. of the numbers on the new list is 3.0.
(a) Find the average of the new list.
(b) Find the SD of the original list.

2. (10 points) One number is missing in the data set below:

[data table not recoverable from the source]

If possible, fill in the blank to make the correlation coefficient equal to 0. If it is not possible,
say why not.

3. (10 points) An instructor gives two quizzes to the ten people in her course. On the first quiz,
five of the people were above average; but on the second quiz, these people all scored below
average. The other five people moved in the opposite direction. They were all below average
on the first quiz, and above average on the second one.
(a) True or False and explain: "This is an example of the regression effect."
(b) The correlation coefficient between the scores on the two quizzes:
___ must be zero.
___ must be a positive number.
___ must be a negative number.
___ could be any of the above, depending on whether or not there are outliers in the data.
Check (√) one option. Explain your choice.

4. (10 points) Here are the summary statistics for a large group of male students at a certain
university.

Height: average = 70 inches, SD = 3 inches
Weight: average = 162 pounds, SD = 30 pounds
correlation coefficient = 0.50

The scatter diagram is football shaped.

About what percent of the men 74 inches tall weigh less than the average weight of the men 66
inches tall?

5. (10 points) In the mid-1980's, the Educational Testing Service compared the SAT scores of
college-bound seniors with those obtained by a large representative sample of high school
juniors. On the verbal SAT, the 40th percentile of the scores for the college-bound seniors
happened to be equal to the 60th percentile of the scores for the sample of juniors. For both
groups, the SD was 100 points. The two histograms followed the normal curve closely.

Find, approximately, the number of points separating the average of the two groups.

6. (10 points) Here are the summary statistics for a very large class:

midterm score: average = 57 points, SD = 18 points
final score: average = 52 points, SD = 25 points
correlation = 0.80

The scatter diagram is football shaped.
(a) One person in the class scored 66 points on the midterm and 59 points on the final.
What would be the regression estimate of his final score from his midterm score?
(b) Out of all those who got the same score on the final as he did, what percent scored
below him on the midterm?

7. (5 points) A list of numbers has an r.m.s. of 4.0. A new list is formed by adding 3 to every
number on the list. The new list has an r.m.s. of 5.0.
(a) Find the average of the original list.
(b) Find the SD of the original list.

8. (5 points) An instructor in a class of 300 students enters all test results in a computer file. A
program calculates the following summary statistics:

midterm score: average = 57 points, SD = 18 points
final score: average = 52 points, SD = 25 points
correlation = 0.60

The program also runs through the 300 students and calculates, for each, the regression estimate
of final score from midterm score. For some of the students, the regression estimate was off
by more than 20 points. About how many?

9. (10 points) The histogram below shows the distribution of family income in a small town.
The data is hypothetical.
(a) If the density scale is used on the vertical axis, what number belongs at the arrow?
(b) What are the units for the answer to (a)?
(c) What percent of the families earned between $10,000 and $20,000?
(d) Is the average income under $40,000, over $40,000, or equal to $40,000? Circle your
choice and explain your reasoning. (Note: Assume that family income is spread evenly
within each of the six class intervals used in the histogram: 0-10, 10-20, 20-30, 30-50,
50-70 and 70-80.)

10. (10 points)
(a) What is the correlation coefficient between x and y in the data set below?

x     y
15    3
15    17
14    19
12    20
14    20
16    21
19    40

Note: To save you a little time, the sum of the x column is 105, and the sum of the y column
is 140.

(b) What is the correlation coefficient between x and y in the data set below?

x     y
15
15
14
12
14
16
19
[y column not recoverable from the source]

11. (10 points) An aerobics study involved 645 men. The average weight of the men was 166.5
pounds. The histogram of weight followed the normal curve closely. Out of the 645 men,
there were 200 men who weighed under 150 pounds.
(a) How many weighed over 183 pounds?
(b) How many weighed under 140 pounds?

12. (10 points) A group of 100 managers has an average salary of $52,000; the SD is $10,000.
There are 40 women in the group. The average of their salaries is also $52,000, but the SD is
$5,000.
(a) Find the average of the men's salaries.
(b) Find the SD of the men's salaries.

13. (10 points) A list of numbers has an average of 40 and an SD of 15. A new list is formed
by adding 10 to every number on the list. If possible, find the r.m.s. of the numbers on the new list. If it is not possible, explain why not.

14. (5 points) The average age of a group of 25 programmers is 28 years. The SD is 4 years.
One member of the group, who is 40, leaves. No one is hired to replace him. For the
reduced group:
(a) average age = ______
(b) SD age = ______

15. (10 points) A study is made of the Math and Verbal SAT scores for the entering class at a
certain college. The summary statistics are as follows:

average M-SAT = 560, SD = 120
average V-SAT = 520, SD = 110
correlation coefficient = 0.64

The scatter diagram is football-shaped.
(a) Someone who scored 500 on the M-SAT would have a percentile rank (within the entering
class) on the M-SAT of ______.
(b) Of all those who scored 500 on the M-SAT, about ______ percent had a higher
percentile rank on the V-SAT than on the M-SAT.

16. (5 points) The men enrolled in a large sports medicine course had an average weight of 160
pounds and an SD of 30 pounds. Their weights followed the normal curve closely.

Consider the men in the course who weighed somewhere between 180 and 200 pounds. The
average weight of these men would be:
___ equal to 190 pounds.
___ bigger than 190 pounds.
___ smaller than 190 pounds.
___ can't tell without more information.
Check (√) one of the above options. Please explain your choice.

17. (5 points) There are 1600 first-year students at a certain university. Their scores on the
Verbal SAT followed the normal curve closely, and the average score was 550 points.
Around 360 students had scores in the range from 550 to 600 points. How many had scores
in the range from 600 to 650 points?

18. The average of a list of numbers is 108. A statistician calculates the r.m.s. of
the deviations from the average of the list and gets 5.0. Unfortunately, he happened to misread
the average of the list as 106, and used that instead of 108 when calculating the deviations. So the SD is not 5.0. Find the correct SD.

19. (5 points) Ten people are in a room, waiting to be interviewed. There are two brothers in the
group, but these are the only two that are related. The people are called in one at a time to be
interviewed. The choice of who goes next is done at random.
(a) Find the chance the brothers are the first two people to be interviewed.
(b) Find the chance the older brother is the first person to be interviewed and the younger brother is the second.

20. (5 points) Someone shuffles a deck of cards and deals out two cards. Then he does this again with a second deck of cards. Find the chance that the two cards from the first deck are
red and the two from the second deck are black.

21. A die is rolled three times. Find the chance that:
(a) The three numbers rolled are either all the same or all different.
(b) Two of the numbers are the same and the third is different.

22. (5 points) Two draws are made at random with replacement from the box:

[box contents not recoverable from the source]

Find the chance the second number is bigger than twice the first one.

23. (5 points) A box contains two red, two white, and one blue marble. Tomorrow, two marbles
will be drawn at random, with replacement, from the box. Find the chance the second marble is a different color than the first one.

For both parts, answer yes or no and explain your answer.

24. (10 points) In a certain game, a player picks a number from 1 to 6 and bets on it. Then a die
is rolled 4 times. If the player's number shows up one or more times, the player wins.
Otherwise, the player loses.

Six people are playing the game: Oliver bets on 1, Tanya on 2, Thad on 3, Felice on 4, Filene
on 5, and Sam on 6.
(a) Find the chance that Sam wins.
(b) Find the chance neither Oliver nor Tanya wins.
(c) Find the chance both Oliver and Tanya win but no one else does.

25. (5 points) Two cards are dealt from a deck of cards. Find the chance that neither card is a
diamond. (A deck of cards contains 52 cards: 13 spades, 13 hearts, 13 diamonds, and 13 clubs.)

26. (5 points) Five cards are dealt, one at a time, from a deck of cards. Find the chance that the
first and fifth cards are hearts, and these are the only hearts in the five cards. (You do not
have to work out the arithmetic.)

27. (5 points) First one ticket, and then a second one, are drawn at random from the box shown
below. The draws are made without replacement. Us ...
This note was uploaded on 02/08/2011 for the course STAT 21 taught by Professor Anderes during the Fall '08 term at University of California, Berkeley.