500 MGMT BUSINESS STATISTICS
MIDTERM STUDY QUESTIONS
Q1.
The table shown below contains information technology (IT) investment as a percentage
of total investment for eight countries during the 1990s. It also contains the average
annual percentage change in employment during the 1990s. Explain how these data shed
light on the question of whether IT investment creates or costs jobs. (Hint: Make use of
relevant graphical tools)
Country         % IT      % Change
Netherlands      2.5%     1.6%
Italy            4.1%     2.2%
Germany          4.5%     2.0%
France           5.5%     1.8%
Canada           8.3%     2.7%
Japan            8.3%     2.7%
Britain          8.3%     3.3%
U.S.            12.4%     3.7%
To analyze this, we plot each country's average annual percentage change in employment during the
1990s on the vertical axis and its information technology (IT) investment as a percentage of total
investment on the horizontal axis. To show the relationship, I created a scatter plot with the
StatTools add-in for Excel. The points tend to move up and to the right, and the correlation between
the two variables is 0.930, which implies a strong positive linear relationship between IT investment
and employment growth. Briefly, in this sample higher IT investment goes with faster job growth,
which is consistent with IT investment creating rather than costing jobs, although a correlation
across eight countries cannot by itself establish causation.
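As a cross-check that does not depend on Excel, the correlation can be recomputed from the eight data points in the table. This is only a minimal sketch: the variable names and the helper `pearson_r` are illustrative and are not part of the original answer. The result should come out close to the 0.930 quoted above.

```python
from math import sqrt

# Data transcribed from the Q1 table (both columns are percentages).
it_share = [2.5, 4.1, 4.5, 5.5, 8.3, 8.3, 8.3, 12.4]
job_growth = [1.6, 2.2, 2.0, 1.8, 2.7, 2.7, 3.3, 3.7]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(it_share, job_growth)
print(f"r = {r:.3f}")
```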
Q2.
The percentage of the US population without health insurance coverage for samples from the 50
states and District of Columbia for both 2003 and 2004 produced the following tables of
summary measures and correlations.
Summary Measures Table:
                      Percentage in 2003    Percentage in 2004
Count                       51.000                51.000
Mean                        14.455                14.855
Median                      13.700                14.200
Standard deviation           3.724                 4.098
Minimum                      9.100                 8.000
Maximum                     24.900                26.300
First quartile              11.600                12.200
Third quartile              16.800                16.500
Skewness                     0.910                 0.699
Table of Correlations:
                  Percent 2003    Percent 2004
Percent 2003         1.000
Percent 2004         0.903           1.000
a. Describe the distribution of state percentages of Americans without health insurance coverage
in 2004. Be sure to employ both measures of central location and dispersion in developing your
characterization of this sample. (5 pts)
The variance is essentially the average of the squared deviations from the mean. A more intuitive
measure is the standard deviation, defined as the square root of the variance. The distribution of
state percentages of Americans without health insurance coverage in 2004 can be summarized as
follows.
The mean is 14.855 and the median is 14.200. The standard deviation is 4.098, the values range from
8.000 to 26.300, and the middle half of the states lies between the quartiles 12.200 and 16.500.
If the data were approximately symmetric and bell shaped, then:
Approximately 68% of the observations would be within 1 standard deviation of the mean, that is,
within the interval X̄ ± S, which gives the range (10.757, 18.953).
Approximately 95% of the observations would be within 2 standard deviations of the mean, that is,
within the interval X̄ ± 2S, which gives the range (6.659, 23.051).
Approximately 99.7% of the observations would be within 3 standard deviations of the mean, that is,
within the interval X̄ ± 3S, which gives the range (2.561, 27.149).
However, the skewness is 0.699 and the mean exceeds the median, so the distribution is skewed to
the right. We can therefore say that the 2004 state percentages of Americans without health
insurance coverage were not normally distributed, and that there was a wide gap (from 8.0% to
26.3%) between the states with the lowest and the highest uninsured percentages.
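The three empirical-rule intervals are simple arithmetic on the 2004 mean and standard deviation from the summary table. The short sketch below recomputes them. The variable names are illustrative, and the intervals only describe the data well if the distribution is roughly bell shaped.

```python
# 2004 summary measures taken from the table in the question.
mean_2004 = 14.855
sd_2004 = 4.098

# Interval mean ± k * sd for k = 1, 2, 3 (the 68% / 95% / 99.7% rule of thumb).
intervals = {
    k: (round(mean_2004 - k * sd_2004, 3), round(mean_2004 + k * sd_2004, 3))
    for k in (1, 2, 3)
}

for k, (low, high) in intervals.items():
    print(f"within {k} sd: ({low}, {high})")
```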
b. Compare the 2004 distribution of percentages with the corresponding set of
percentages taken in 2003. How are these two sets similar? In what ways are they
different? (5 pts)
When I compare the 2004 distribution of percentages with the corresponding set of
percentages taken in 2003, I can see that the 2003 distribution was more skewed (0.910
versus 0.699). The number of observations is the same for both variables (51), and the
means are close (14.455 versus 14.855), as are the medians (13.700 versus 14.200). The
standard deviation of the 2003 values (3.724) is smaller than that of the 2004 values
(4.098), so the 2003 values exhibit less variability about the mean than the 2004 values
do. Consistent with this, the range of the 2003 values (9.1 to 24.9) is narrower than the
range of the 2004 values (8.0 to 26.3).
c. What does the table of correlations for the two given sets of percentages tell you in this
case? (5 pts)
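The correlation between the 2003 and the 2004 percentages is 0.903. It shows a strong
positive linear relationship between the two variables: states with a high uninsured
percentage in 2003 tended to have a high percentage in 2004 as well, which suggests that
both years are affected by the same underlying factors.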
d. Based on your answers to Questions b and c, what would you expect to find upon
analyzing similar data for 2005? (5 pts)
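If the change from 2003 to 2004 continues, I would expect to find a distribution that is
less skewed, has a wider range, and has a larger standard deviation. Because of the high
year-to-year correlation, I would also expect each state's 2005 percentage to be close to
its 2004 percentage.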
Q3.
The salaries of all Michigan State University business college professors have a mean salary of
$79,580, median salary of $79,000, and standard deviation of $13,500.
a. If you increased every professor's salary by $2,000, what would happen to the mean and
median salary? (5 pts)
As long as the number of professors does not change, the mean would increase by $2,000 (to
$81,580) and the median would increase by $2,000 (to $81,000).
b. If you increased every professor's salary by $2,000, what would happen to the sample
standard deviation of the salaries? Why? (5 pts)
Variance is essentially the average of the squared deviations from the mean, and the
standard deviation is the square root of this value. The standard deviation would not change
(it stays at $13,500), because every observation increases by $2,000 and so does the mean,
which leaves each deviation from the mean unchanged.
c. If you increased every professor's salary by 5%, what would happen to the sample
standard deviation of the salaries? (5 pts)
The standard deviation of the sample would change. Every salary is multiplied by 1.05, so
every deviation from the mean is also multiplied by 1.05, and the standard deviation
increases by 5%, from $13,500 to $14,175.
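The effect of adding a constant versus multiplying by a constant can be illustrated on a small made-up sample. The five salaries below are hypothetical (they are not the Michigan State data, which the question does not list), and the sketch only shows the general rule: a shift moves the mean and median but not the standard deviation, while a 5% raise scales all three by 1.05.

```python
from statistics import mean, median, stdev

# Hypothetical salaries for illustration only; NOT data from the question.
salaries = [62000, 71000, 79000, 84000, 102000]

shifted = [s + 2000 for s in salaries]   # every salary raised by $2,000
scaled = [s * 1.05 for s in salaries]    # every salary raised by 5%

print("original:", mean(salaries), median(salaries), round(stdev(salaries), 2))
print("+2000:   ", mean(shifted), median(shifted), round(stdev(shifted), 2))
print("x1.05:   ", round(mean(scaled), 2), round(median(scaled), 2), round(stdev(scaled), 2))
```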
Q4.
An ice cream vendor sells three flavors: chocolate, strawberry, and vanilla. Forty five percent of
the sales are chocolate, while 30% are strawberry, with the rest vanilla flavored. Sales are by
the cone or the cup. The percentages of cones sales for chocolate, strawberry, and vanilla, are
75%, 60%, and 40%, respectively. For a randomly selected sale, define the following events:
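A1 = chocolate chosen, P(A1) = 0.45
A2 = strawberry chosen, P(A2) = 0.30
A3 = vanilla chosen, P(A3) = 0.25
B = ice cream on a cone
B' = ice cream in a cup
The cone percentages are conditional on the flavor: P(B|A1) = 0.75, P(B|A2) = 0.60, P(B|A3) = 0.40.
Flavor and container are therefore not independent, so joint probabilities must be computed with
the conditional probabilities rather than with P(B).
a. Find the probability that the ice cream was sold on a cone and was chocolate flavor. (8
pts)
P(A1 and B) = P(A1) * P(B|A1) = 0.45 * 0.75 = 0.3375
b. Find the probability that the ice cream was sold in a cup and was chocolate flavor. (8 pts)
P(A1 and B') = P(A1) * P(B'|A1) = 0.45 * (1 - 0.75) = 0.1125
c. Find the probability that the ice cream was sold on a cone. (9 pts)
P(B) = P(A1)P(B|A1) + P(A2)P(B|A2) + P(A3)P(B|A3) = 0.3375 + 0.1800 + 0.1000 = 0.6175
(about 0.62, so P(B') = 0.3825, about 0.38)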
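The three probabilities can be checked with the multiplication rule and the law of total probability. The sketch below assumes, as the wording of the question suggests, that 75%, 60%, and 40% are the cone shares within each flavor, that is, P(cone | flavor). The dictionary and variable names are illustrative.

```python
# Flavor shares as given in the question.
p_flavor = {"chocolate": 0.45, "strawberry": 0.30, "vanilla": 0.25}
# Assumption: these are conditional probabilities P(cone | flavor).
p_cone_given = {"chocolate": 0.75, "strawberry": 0.60, "vanilla": 0.40}

# Multiplication rule: P(A and B) = P(A) * P(B | A).
p_choc_and_cone = p_flavor["chocolate"] * p_cone_given["chocolate"]
p_choc_and_cup = p_flavor["chocolate"] * (1 - p_cone_given["chocolate"])

# Law of total probability: add the joint cone probabilities over all flavors.
p_cone = sum(p_flavor[f] * p_cone_given[f] for f in p_flavor)

print(round(p_choc_and_cone, 4), round(p_choc_and_cup, 4), round(p_cone, 4))
```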
Q5.
A set of final exam scores in a business statistics course was found to be normally distributed,
with a mean of 73 and a standard deviation of 8.
a. What is the probability of getting a score between 65 and 89 on this exam? (5 pts)
We can compute these probabilities by converting the scores to Z values and using the
standard normal distribution.
P(65<X<89) = P((65-73)/8 < Z < (89-73)/8) = P(-1<Z<2) = P(Z<2) - (1 - P(Z<1))
= 0.97725 - 0.15866 = 0.8186
b. What percentage of students scored between 75 and 80 on this exam? (5 pts)
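P(75<X<80) = P((75-73)/8 < Z < (80-73)/8) = P(0.25<Z<0.875) = P(Z<0.875) - P(Z<0.25)
= 0.8092 - 0.5987 = 0.2105, so about 21% of the students scored between 75 and 80.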
c. Only 5% of the students taking the test scored higher than what value? (5 pts)
Z = 1.645 (the 95th percentile of the standard normal distribution)
Upper limit = Mean + Z × St.dev = 73 + 1.645*8 = 86.16
d. If the professor grades on a curve (i.e., gives As to the top 10% of the class, regardless of the
score), are you better off with a score of 81 on this exam or a score of 68 on a different
exam, where the mean is 62 and the standard deviation is 3? Show your answer in detail, and
explain. (10 pts)
In order to be in the top 10% of the class, the minimum score has to be:
Z = 1.2815
Upper limit = Mean + Z × St.dev = 73 + 1.2815*8 = 83.25
With a score of 81 you are not better off, as you have to score higher than 83.25 in order to be
in the top 10%.
For the second case:
Upper limit = Mean + Z × St.dev = 62 + 1.2815*3 = 65.84
As 68 is higher than 65.84, you would be better off in this case. Equivalently, the Z values are
(81-73)/8 = 1.00 and (68-62)/3 = 2.00, and only the second exceeds 1.2815.
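The table look-ups in this question can be cross-checked with Python's standard library, which provides `statistics.NormalDist` (Python 3.8 or later) with `cdf` and `inv_cdf` methods. This is a sketch for verification only; the variable names are illustrative.

```python
from statistics import NormalDist

exam = NormalDist(mu=73, sigma=8)

# a. P(65 < X < 89) and b. P(75 < X < 80) as differences of the CDF.
p_65_89 = exam.cdf(89) - exam.cdf(65)
p_75_80 = exam.cdf(80) - exam.cdf(75)

# c. Score exceeded by only 5% of students (95th percentile).
top5_cutoff = exam.inv_cdf(0.95)

# d. Top-10% cutoffs on this exam and on the other exam (mean 62, sd 3).
top10_cutoff = exam.inv_cdf(0.90)
other_top10_cutoff = NormalDist(mu=62, sigma=3).inv_cdf(0.90)
gets_a_with_81 = 81 >= top10_cutoff
gets_a_with_68 = 68 >= other_top10_cutoff

print(round(p_65_89, 4), round(p_75_80, 4))
print(round(top5_cutoff, 2), round(top10_cutoff, 2), round(other_top10_cutoff, 2))
print(gets_a_with_81, gets_a_with_68)
```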
Q6.
A financial analyst collected useful information for 30 employees at Gamma Technologies,
Inc. These data include each selected employee's gender, age, number of years of relevant
work experience prior to employment at Gamma, number of years of employment at Gamma,
the number of years of post-secondary education, and annual salary.
a. Indicate the type of data for each of the six variables included in this set. (5 pts)
A variable is numerical if meaningful arithmetic can be performed on it. Otherwise the
variable is categorical.
Age, number of years of relevant work experience prior to employment at Gamma, number of
years of employment at Gamma, and the number of years of post-secondary education are numerical
and discrete, as the values can be counted.
Annual salary is also numerical, but it is continuous, as it depends on an essentially
continuous measurement.
Gender is categorical, as it takes only the values male and female.
b. Based on the histogram shown below, how would you describe the age distribution for
these data? (5 pts)
The age distribution shows no clear skewness. It can be said that the distribution of age is
approximately symmetric, as the ages appear to follow the bell-shaped normal distribution.
[Figure: Histogram for Age. Horizontal axis: Category (<=20, 20-30, 30-40, 40-50, >50); vertical
axis: frequency, scaled from 0 to 10.]
c. Based on the histogram shown below, how would you describe the salary distribution for these
data? (5 pts)
For salaries between 20K and 60K the distribution looks roughly normal, but there are no
observations in the 60-70K and 70-80K categories, and there are only 6 observations in the 80-90K
and above-90K categories. Taken as a whole, I would describe the salary distribution as positively
skewed.
[Figure: Histogram for Annual Salary. Horizontal axis: Category (<=20000, 20000-30000, 30000-40000,
40000-50000, 50000-60000, 60000-70000, 70000-80000, 80000-90000, >90000); vertical axis: frequency,
scaled from 0 to 8.]
Q7.
State                Verbal    Math    Perc_Take
Alabama               560      553        10
Alaska                518      514        53
Arizona               523      524        32
Arkansas              569      555         6
California            501      519        49
Colorado              554      553        27
Connecticut           515      515        85
Delaware              500      499        73
Dist. of Columbia     489      476        77
Florida               499      499        67
Georgia               494      493        73
Hawaii                487      514        60
Idaho                 540      539        20
Illinois              585      597        10
Indiana               501      506        64
Iowa                  593      602         5
Kansas                584      585         9
Kentucky              559      557        12
Louisiana             564      561         8
Maine                 505      501        76
Maryland              511      515        68
Massachusetts         518      523        85
Michigan              563      573        11
Minnesota             587      593        10
Mississippi           562      547         5
Missouri              587      585         8
Montana               537      539        29
Nebraska              569      576         8
Nevada                507      514        40
New Hampshire         522      521        80
New Jersey            501      514        83
New Mexico            554      543        14
New York              497      510        87
North Carolina        499      507        70
North Dakota          582      601         5
Ohio                  538      542        28
Oklahoma              569      566         7
Oregon                527      528        56
Pennsylvania          501      502        74
Rhode Island          503      502        72
South Carolina        491      495        62
South Dakota          594      597         5
Tennessee             567      557        16
Texas                 493      499        52
Utah                  565      556         7
Vermont               516      512        66
Virginia              515      509        71
Washington            528      531        52
West Virginia         524      514        19
Wisconsin             587      596         7
Wyoming               551      546        12
Examine the relationship between the average scores on the verbal and math components of the
SAT test across the 50 states and the District of Columbia by generating scatter plots. Explore the
relationship between each of these variables and the proportion of high school graduates taking
the SAT. Interpret each of these scatter plots.
Scatter plots across the states were created for this question. When we compare the District of
Columbia with the states, we can easily say that the District of Columbia's average verbal
score (489) is one of the lowest. Only one state has a verbal score lower than the District of
Columbia, and it is Hawaii (487).
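The relationships the question asks about can also be summarized numerically. The sketch below recomputes the three pairwise correlations from the state data. It assumes the values as transcribed from the table in this question (state, Verbal, Math, Perc_Take), and the helper `pearson_r` and the other names are illustrative. A strong positive Verbal/Math correlation and negative correlations with Perc_Take would mean that states where few graduates take the SAT tend to report higher average scores.

```python
from math import sqrt

# (state, verbal, math, percent taking the SAT), transcribed from the Q7 table.
sat = [
    ("Alabama", 560, 553, 10), ("Alaska", 518, 514, 53),
    ("Arizona", 523, 524, 32), ("Arkansas", 569, 555, 6),
    ("California", 501, 519, 49), ("Colorado", 554, 553, 27),
    ("Connecticut", 515, 515, 85), ("Delaware", 500, 499, 73),
    ("Dist. of Columbia", 489, 476, 77), ("Florida", 499, 499, 67),
    ("Georgia", 494, 493, 73), ("Hawaii", 487, 514, 60),
    ("Idaho", 540, 539, 20), ("Illinois", 585, 597, 10),
    ("Indiana", 501, 506, 64), ("Iowa", 593, 602, 5),
    ("Kansas", 584, 585, 9), ("Kentucky", 559, 557, 12),
    ("Louisiana", 564, 561, 8), ("Maine", 505, 501, 76),
    ("Maryland", 511, 515, 68), ("Massachusetts", 518, 523, 85),
    ("Michigan", 563, 573, 11), ("Minnesota", 587, 593, 10),
    ("Mississippi", 562, 547, 5), ("Missouri", 587, 585, 8),
    ("Montana", 537, 539, 29), ("Nebraska", 569, 576, 8),
    ("Nevada", 507, 514, 40), ("New Hampshire", 522, 521, 80),
    ("New Jersey", 501, 514, 83), ("New Mexico", 554, 543, 14),
    ("New York", 497, 510, 87), ("North Carolina", 499, 507, 70),
    ("North Dakota", 582, 601, 5), ("Ohio", 538, 542, 28),
    ("Oklahoma", 569, 566, 7), ("Oregon", 527, 528, 56),
    ("Pennsylvania", 501, 502, 74), ("Rhode Island", 503, 502, 72),
    ("South Carolina", 491, 495, 62), ("South Dakota", 594, 597, 5),
    ("Tennessee", 567, 557, 16), ("Texas", 493, 499, 52),
    ("Utah", 565, 556, 7), ("Vermont", 516, 512, 66),
    ("Virginia", 515, 509, 71), ("Washington", 528, 531, 52),
    ("West Virginia", 524, 514, 19), ("Wisconsin", 587, 596, 7),
    ("Wyoming", 551, 546, 12),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


verbal = [row[1] for row in sat]
math_scores = [row[2] for row in sat]
pct_take = [row[3] for row in sat]

r_verbal_math = pearson_r(verbal, math_scores)
r_verbal_pct = pearson_r(verbal, pct_take)
r_math_pct = pearson_r(math_scores, pct_take)

# The two lowest average verbal scores, lowest first.
lowest_verbal = [row[0] for row in sorted(sat, key=lambda row: row[1])[:2]]

print(f"verbal vs math:      {r_verbal_math:.3f}")
print(f"verbal vs perc_take: {r_verbal_pct:.3f}")
print(f"math vs perc_take:   {r_math_pct:.3f}")
print("lowest verbal:", lowest_verbal)
```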
Q8.
A sample of 1000 households was selected in Los Angeles to determine information concerning
consumer behavior. Among the questions asked was "Do you enjoy shopping for clothing?" Of
480 males, 272 answered yes. Of 520 females, 448 answered yes.
The 2 x 2 contingency table is given below.
               Enjoy Shopping for Clothing
Gender         Yes       No        Total
Male           272       208        480
Female         448        72        520
Total          720       280       1000
a. What is the probability that a respondent chosen at random is a male and enjoys shopping
for clothing? (4 pts)
There are 272 respondents who are male and also enjoy shopping. Therefore the probability that a
respondent chosen at random is a male and enjoys shopping for clothing is:
P(Male and Yes) = 272/1000 = 0.272
b. What is the probability that a respondent chosen at random is a female and enjoys shopping for
clothing? (4 pts)
There are 448 respondents who are female and also enjoy shopping. Therefore the probability that a
respondent chosen at random is a female and enjoys shopping for clothing is:
P(Female and Yes) = 448/1000 = 0.448
c. What is the probability that a respondent chosen at random is a female or enjoys
shopping for clothing? (4 pts)
P(Female or Yes) = P(Female) + P(Yes) - P(Female and Yes) = (520/1000) + (720/1000) - (448/1000)
= 0.792
d. What is the probability that a respondent chosen at random is a male or a female? (4 pts)
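P(Male) = 480/1000 = 0.48
P(Female) = 520/1000 = 0.52
The two events are mutually exclusive and cover every respondent, so
P(Male or Female) = 0.48 + 0.52 = 1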
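e. What is the probability that a respondent chosen at random enjoys or does not enjoy shopping
for clothing? (4 pts)
P(Yes) = 720/1000 = 0.72
P(No) = 280/1000 = 0.28
The two events are mutually exclusive and cover every respondent, so P(Yes or No) = 0.72 + 0.28 = 1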
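All five probabilities in this question follow from the four cell counts of the contingency table. The sketch below computes them directly from the counts. The helper `prob` and the variable names are illustrative, not part of the original answer.

```python
# Cell counts from the 2 x 2 contingency table: (gender, answer) -> count.
counts = {
    ("Male", "Yes"): 272,
    ("Male", "No"): 208,
    ("Female", "Yes"): 448,
    ("Female", "No"): 72,
}
total = sum(counts.values())


def prob(predicate):
    """Probability that a randomly chosen respondent satisfies predicate(gender, answer)."""
    return sum(n for (gender, answer), n in counts.items() if predicate(gender, answer)) / total


p_male_and_yes = prob(lambda g, a: g == "Male" and a == "Yes")      # part a
p_female_and_yes = prob(lambda g, a: g == "Female" and a == "Yes")  # part b
p_female_or_yes = prob(lambda g, a: g == "Female" or a == "Yes")    # part c
p_male_or_female = prob(lambda g, a: g in ("Male", "Female"))       # part d
p_yes_or_no = prob(lambda g, a: a in ("Yes", "No"))                 # part e

print(p_male_and_yes, p_female_and_yes, p_female_or_yes, p_male_or_female, p_yes_or_no)
```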
Q9.
The service manager for a new appliances store reviewed sales records of the past 20 sales of new
microwaves to determine the number of warranty repairs he will be called on to perform in the
next 90 days. Corporate reports indicate that the probability any one of their new microwaves
needs a warranty repair in the first 90 days is 0.05. The manager assumes that calls for warranty
repair are independent of one another and is interested in predicting the number of warranty
repairs he will be called on to perform in the next 90 days for this batch of 20 new microwaves
sold.
a. What is the probability that none of the 20 new microwaves sold will require a warranty repair
in the first 90 days? (5 pts)
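Number of microwaves: 20
Probability that a microwave needs a warranty repair = 0.05
P(F=0) = 0.95^20 = 0.358
Function: BINOMDIST(0, 20, 0.05, 0) or, counting the microwaves that need no repair,
BINOMDIST(20, 20, 0.95, 0)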
e. What is the probability that at most two of the 20 new microwaves sold will require a
warranty repair in the first 90 days? (5 pts)
P(F<=2) = 0.9245
Function: BINOMDIST(2,20,0.05,1)
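f. What is the probability that only one of the 20 new microwaves sold will require a
warranty repair in the first 90 days? (5 pts)
P(F=1) = 20 * 0.05 * 0.95^19 = 0.3774
Function: BINOMDIST(1, 20, 0.05, 0)
(The cumulative form BINOMDIST(1, 20, 0.05, 1) gives P(F<=1) = 0.7358, which also includes the
case of no repairs.)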
g. What is the probability that between two and four (inclusive) of the 20 new microwaves
sold will require a warranty repair in the first 90 days? ( 5 pts)
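P(2<=F<=4) = P(F<=4) - P(F<=1) = 0.9974 - 0.7358 = 0.2616
At most 4 failures: P(F<=4) = 0.997426
At most 2 failures: P(F<=2) = 0.924516
At most 1 failure: P(F<=1) = 0.735840
Because the interval includes two repairs, P(F<=1) must be subtracted. Subtracting P(F<=2) instead
would give 0.0729, which is only P(3<=F<=4).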
h. What is the expected number of the new microwaves sold that will require a warranty
repair in the first 90 days? ( 5 pts)
E(X)=N.P=20*0.05=1
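The BINOMDIST values in this question can be reproduced without Excel using the binomial formula directly. The sketch below uses `math.comb` from the standard library (Python 3.8 or later). The helper names `binom_pmf` and `binom_cdf` are illustrative stand-ins for the non-cumulative and cumulative forms of BINOMDIST.

```python
from math import comb

n, p = 20, 0.05  # 20 microwaves, each needing a repair with probability 0.05


def binom_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) variable; like BINOMDIST(k, n, p, 0)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for a binomial(n, p) variable; like BINOMDIST(k, n, p, 1)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


p_none = binom_pmf(0, n, p)                              # no repairs
p_at_most_two = binom_cdf(2, n, p)                       # at most two repairs
p_exactly_one = binom_pmf(1, n, p)                       # exactly one repair
p_two_to_four = binom_cdf(4, n, p) - binom_cdf(1, n, p)  # two to four, inclusive
expected_repairs = n * p                                 # mean of a binomial

print(round(p_none, 4), round(p_at_most_two, 4), round(p_exactly_one, 4),
      round(p_two_to_four, 4), expected_repairs)
```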
Q10.
The height of a typical American male adult is normally distributed with a mean of 68 inches and
a standard deviation of 5 inches. We observe the heights of 12 American male adults.
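X ~ N(68, 5^2): the height of one adult has mean 68 and standard deviation 5. (The variance
25/12 = 2.08 applies to the mean height of the 12 adults, which is not what is asked here.)
a. What is the probability that exactly half the male adults will be less than 62 inches tall? (10 pts)
p = P(X<62) = P(Z < (62-68)/5) = P(Z < -1.2) = 1 - P(Z<1.2) = 1 - 0.8849 = 0.1151
This is for one person only. The number of the 12 adults who are shorter than 62 inches is
binomial with n = 12 and p = 0.1151, so the probability that exactly 6 of them are shorter is:
P(6 of 12) = C(12,6) * 0.1151^6 * 0.8849^6 = 924 * 0.1151^6 * 0.8849^6 = 0.0010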
i. Let Y be the number of the 12 male adults who are less than 62 inches tall. Determine
the mean and standard deviation of Y. (10 pts)
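One way to approach both parts of this question is to treat Y as a binomial count: each of the 12 adults is independently shorter than 62 inches with probability p = P(X < 62). The sketch below works under that assumption, using `statistics.NormalDist` and `math.comb` from the standard library. The variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

# Height of one adult: normal with mean 68 and standard deviation 5 inches.
height = NormalDist(mu=68, sigma=5)
p_short = height.cdf(62)  # probability that one adult is under 62 inches

n = 12
# Assumption: the 12 heights are independent, so Y ~ Binomial(n, p_short).
p_exactly_six = comb(n, 6) * p_short**6 * (1 - p_short) ** 6

mean_y = n * p_short
sd_y = sqrt(n * p_short * (1 - p_short))

print(round(p_short, 4), round(p_exactly_six, 5), round(mean_y, 3), round(sd_y, 3))
```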