(a) the standard deviation.

(b) the mean.

(c) the correlation coefficient.

(d) the median.

(e) all of the above.

(f) only (a), (b), and (c).

Explain why you did, or did not, pick (a).

Explain why you did, or did not, pick (b).

Explain why you did, or did not, pick (c).

Explain why you did, or did not, pick (d).


2. (10 points) Suppose a social scientist, investigating death certificates, records the age at

which fifteen people died. The ages are displayed in the following stemplot:

[Stemplot: leaf unit = 1, N = 15]

(a) Describe the shape [mode, skewness/symmetry, and outliers] of the

distribution.

(b) Find the median and the mean. Show your calculations to derive the mean.

3. (10 points) The histogram and some descriptive statistics for a dataset are shown:

Descriptive Statistics: C1

N     Mean     Standard Deviation   Minimum   Q1       Median   Q3       Maximum
520   1.2975   1.6834               0.00189   0.3187   0.7326   1.5733   8.0000

(a) Report (or compute as necessary) the value of the most appropriate measure of

center, and the value of the most appropriate measure of spread. If you compute, show your

calculations.

(b) There are some data values that are clear outliers in the plot. If these outlying

values were removed, the effect on mean and standard deviation would be which

one of the following:

(i) Mean and standard deviation would both increase.

(ii) Mean would increase, but standard deviation would decrease.

(iii) Mean and standard deviation would both decrease.

(iv) Mean would decrease, but standard deviation would increase.

(v) Neither mean nor standard deviation would be affected.

Explain why you did, or did not, pick (i).

Explain why you did, or did not, pick (ii).

Explain why you did, or did not, pick (iii).

Explain why you did, or did not, pick (iv).

Explain why you did, or did not, pick (v).
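
For part (a) of question 3, a resistant measure of spread can be computed directly from the printed quartiles. The sketch below assumes the interquartile range (IQR) is the measure wanted, and it also applies the common 1.5 × IQR rule of thumb for flagging high outliers. The variable names are illustrative; only the numbers come from the descriptive statistics for C1.

```python
# Values copied from the descriptive statistics table for C1.
q1 = 0.3187
q3 = 1.5733
maximum = 8.0000

# Interquartile range: the spread of the middle 50% of the data.
iqr = q3 - q1

# Assumption: the 1.5 * IQR rule of thumb is used to flag high outliers.
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr:.4f}")
print(f"Upper fence = {upper_fence:.4f}")
print("Maximum lies beyond the upper fence:", maximum > upper_fence)
```

Because the IQR uses only the quartiles, the few very large values in the right tail have little influence on it.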


4. (10 points) The boxplot below displays the delivery time for 1,000 deliveries from a certain

pizza shop.

(a) 50% of the delivery times were (choose the correct answer and EXPLAIN why you DID pick

that answer and DID NOT pick the other possible answers, for a total of 5 explanations):

(i) Between 5 and 50 minutes.

(ii) Greater than 50 minutes.

(iii) Between 10 and 25 minutes.

(iv) Below 10 minutes.

(v) Greater than 15 minutes.

(b) 25% of the delivery times were: (choose the BEST answer and EXPLAIN why you DID pick

that answer and DID NOT pick the other possible answers, for a total of 5 explanations):

(i) Below 5 minutes.

(ii) Between 15 and 30 minutes.

(iii) Between 30 and 45 minutes.

(iv) Greater than 50 minutes.

(v) Greater than 35 minutes.


(c) Based on the plot, if THERE WERE an outlier at 60, the mean delivery time of the 1,000

deliveries is most likely

(choose the correct answer):

(i) 15 minutes.

(ii) Greater than 15 minutes.

(iii) Less than 15 minutes.

(iv) 15/1000 minutes.

Explain why you chose the answer you picked.

5. (5 points) Suppose that the distribution of lengths of time for connection between a student’s

dorm computer and the remotely-located University server is normal in shape,

with mean of 5 seconds and standard deviation of 1.2 seconds. The middle 95%

of all connections will occur between what times?
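
One way to approach this kind of question is the 68-95-99.7 rule, under which about 95% of a normal distribution lies within two standard deviations of the mean. The sketch below applies that rule to the stated mean and standard deviation. For comparison it also computes the exact central 95% interval with Python's `statistics.NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

mean = 5.0  # seconds
sd = 1.2    # seconds

# 68-95-99.7 rule: roughly 95% of values fall within 2 SDs of the mean.
lower = mean - 2 * sd
upper = mean + 2 * sd
print(f"Rule of thumb: {lower:.1f} to {upper:.1f} seconds")

# Exact central 95%: cut-offs at the 2.5th and 97.5th percentiles.
dist = NormalDist(mu=mean, sigma=sd)
exact_lower = dist.inv_cdf(0.025)
exact_upper = dist.inv_cdf(0.975)
print(f"Exact: {exact_lower:.2f} to {exact_upper:.2f} seconds")
```

The two intervals differ slightly because the exact multiplier is about 1.96 rather than 2.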


6. (10 points) What are the effects of exposure to an advertising message?

The answer may depend both on the length of the ad, and how often it is

repeated.

A study investigated this question using 80 undergraduate students as subjects.

In order to make sure that the sample is balanced in terms of gender, the researchers

randomly selected 40 male students, and randomly selected 40 female students.

All the students saw a 40-minute television program that included ads

for a digital camera. The length of the commercial was either 30 seconds or 90

seconds, and it was repeated either 1, 3, or 5 times during the program.

The subjects were randomized to the different treatments, and at the end of the

show rated their intention of purchasing the camera on a scale from 1 to 10.

(a) This study is: (pick one) controlled experiment / observational study

(b) What are the explanatory and response variables?

(c) How many treatments are there altogether in this study?

(d) Can you draw causal conclusions from this study? If not, explain why. If yes,

what feature of this study allows you to do so? (one sentence is enough!!)

(e) The sampling method that was used in this study is: (indicate the correct answer)

Simple random sampling / stratified sampling / convenience sampling /

voluntary response / cluster sampling.

Explain why you picked the answer you picked.
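
For part (c) of question 6, the treatments can be listed by pairing every ad length with every number of repetitions described in the study. The sketch below enumerates those combinations. It assumes that a treatment is one (length, repetitions) pair; the variable names are illustrative.

```python
from itertools import product

ad_lengths = [30, 90]    # commercial length in seconds
repetitions = [1, 3, 5]  # times the ad is shown during the program

# Assumption: each treatment is one (length, repetitions) combination.
treatments = list(product(ad_lengths, repetitions))

for length, reps in treatments:
    print(f"{length}-second ad shown {reps} time(s)")
print("Number of treatments:", len(treatments))
```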


7. (10 points) A random sample of 26 people was selected. Each person was asked about their

daily consumption of olive oil, and their cholesterol value. The results are represented in

the following plot:

(a) For the plot shown, choose the most reasonable value of the correlation coefficient r (pick one):

(i) -2.1

(ii) -0.8

(iii) -0.03

(iv) 0

(v) 0.5

(vi) -1

Explain why you chose the answer you picked.


Suppose the equation of the least-squares regression line for predicting cholesterol point value

from grams of olive oil consumed is: cholesterol = 202 – 4.5 × oil

(b) Complete the following:

For each extra gram of olive oil consumed, cholesterol value is predicted

to increase/ decrease (pick one),

by _____________ points (fill in the blank).

(c) What is the predicted cholesterol level for a person who consumes 3 grams of

olive oil daily?

(d) True or false: We can conclude from this study that consuming olive oil causes

changes in cholesterol level. Briefly explain.
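
Parts (b) and (c) of question 7 both come down to evaluating the regression line. The sketch below codes the stated equation as a small function. The function and constant names are illustrative; only the intercept and slope come from the problem.

```python
INTERCEPT = 202.0  # predicted cholesterol value at 0 grams of olive oil
SLOPE = -4.5       # predicted change in cholesterol per extra gram of olive oil


def predict_cholesterol(oil_grams: float) -> float:
    """Predicted cholesterol value from the least-squares line in the problem."""
    return INTERCEPT + SLOPE * oil_grams


# Prediction for a person who consumes 3 grams of olive oil daily.
print(predict_cholesterol(3))

# Change in the prediction for one extra gram: this equals the slope.
print(predict_cholesterol(4) - predict_cholesterol(3))
```

A prediction like this is only trustworthy for oil amounts within the range seen in the sample.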


8. (10 points) The question of whether juvenile delinquency is related to birth order was examined

in a large study. A total of 1,060 boys attending public school were given a questionnaire that

measures delinquent behavior, and each boy was also asked to indicate his birth order. The results are

summarized in the following two-way table:

              Delinquent   Not Delinquent
Oldest            77             56
In-Between        59             37
Youngest          52             53
Total                                      334

(a) What is the explanatory variable, and what is the response variable?

(b) Based on your answer to (a), use the empty table below to supplement the two-way

table with the appropriate conditional percentages.

Delinquent Not Delinquent

Oldest

In-Between

Youngest

(c) Interpret your results in the context of the question. That is, based on part (b) write a

brief description of the relationship between birth order and juvenile delinquency as it

appears from the data.

(d) Complete the following sentence:

Since this is _________________________ (choose: a randomized experiment or

an observational study), we ______________________ (choose: can or cannot)

conclude that birth order is the cause for delinquent behavior.
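
For part (b) of question 8, conditional percentages are computed within each category of the explanatory variable. The sketch below assumes birth order is treated as the explanatory variable, so each row is converted to percentages of its own row total. It uses the counts exactly as they are printed in the two-way table; the variable names are illustrative.

```python
# Counts as printed in the table: (delinquent, not delinquent).
counts = {
    "Oldest": (77, 56),
    "In-Between": (59, 37),
    "Youngest": (52, 53),
}

# Assumption: birth order is explanatory, so percentages are taken within rows.
row_percentages = {}
for birth_order, (delinquent, not_delinquent) in counts.items():
    row_total = delinquent + not_delinquent
    row_percentages[birth_order] = (
        100 * delinquent / row_total,
        100 * not_delinquent / row_total,
    )

for birth_order, (pct_del, pct_not) in row_percentages.items():
    print(f"{birth_order:<12}{pct_del:6.1f}%{pct_not:8.1f}%")
```

Each row of the result sums to 100%, which is a quick check that the percentages were conditioned on the right variable.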


9. (5 points) When conducting a survey, it is important to use a random sample in order to:

(a) get a sample that represents the population well.

(b) reduce bias resulting from poorly worded questions.

(c) reduce bias resulting from poorly ordered questions.

(d) reduce bias resulting from sensitive questions.

(e) None of the above.

Explain why you chose the answer you picked.

10. (10 points) Consider the following types of displays:

(1) histogram (2) pie-chart (3) scatterplot (4) two-way table (5) side-by-side boxplots

In each of the following situations data is recorded. Indicate which of the five choices above is

the most appropriate display for the data and why it is the most appropriate.

(a) A social scientist studying racial bias in the court system records the race and guilty verdict

for 300 people on trial. [At least some people were found guilty and some people not guilty.]

(b) A health scientist investigates how well we can predict an athlete’s maximum bench press

weight (a measure of his/her strength) from knowing the number of 60-pound bench presses that

he/she can perform (before fatigue). The relevant data was collected from 125 athletes.

(c) A USA Today poll asked a sample of single men (ages 18-44) the

following question:

If I had an “X-rated” bachelor party, I’d…

The possible answers were: (i) tell fiancée all (ii) edit details (iii) say nothing. Data was recorded.

(d) Does cell phone use while driving impair reaction times? A recent experiment compared the

reaction times (in milliseconds) of drivers who were engaged in a conversation on a cell phone to

drivers who were not.