Thanks

Statistics Problems.docx
Instructions:
1. Word-process your solutions or use Excel. If using Excel, use 1 spreadsheet and label each tab with problems. DO NOT send multiple spreadsheets.
2. Show all steps used in arriving at the final answers. Incomplete solutions will receive partial credit.
3. Use of software package is at your discretion. You can use any software package to solve Problems 3 – 5.
4. Be sure to clearly show the answers to each question, software generated or your computation.
10 Total Points: True or False? 1 point each, please highlight your responses or use a different font color
a. Assume that the histogram of a data set is symmetric and bell shaped, with a mean of 75 and
standard deviation of 10. Then, approximately 95% of the data values were between 55 and 95.
b. A low p-value provides evidence for accepting the null hypothesis and rejecting the alternative.
c. A t-test is used to determine whether the coefficients of the regression model are significantly
different from zero.
d. Decision trees are more appropriate tools than decision tables when a sequence of decisions must
be made.
e. If a solution to an LP problem satisfies all of the constraints, then it must be feasible and
bounded.
f. Correlation is measured on a scale from 0 to 1, where 0 indicates no linear relationship between
two variables, and 1 indicates a perfect linear relationship.
g. In multiple regression, the problem of multicollinearity affects the t-tests of the individual
coefficients as well as the F-test in the analysis of variance for regression, since the F-test
combines these t-tests into a single test.
h. In a random walk model, there are significantly more runs than expected, and the
autocorrelations are not significant.
i. When we maximize or minimize the value of a decision variable by running several simulations
simultaneously, we have found an optimal solution to the problem and attitude toward risk
becomes irrelevant.
20 Total Points Multiple choice (2 Points each) Please highlight or clearly mark your responses
1.
Expressed in percentiles, the interquartile range is the difference between the
a. 10th and 60th percentiles
b. 15th and 65th percentiles
c. 20th and 70th percentiles
d. 25th and 75th percentiles
e. 35th and 85th percentiles
2.
Which of the following are considered measures of association?
a. Mean and variance
b. Variance and correlation
c. Correlation and covariance
d. Covariance and variance
e. First quartile and third quartile
3.
If A and B are mutually exclusive events with P(A) = 0.70, then P(B):
a. can be any value between 0 and 1
b. can be any value between 0 and 0.70
c. cannot be larger than 0.30
d. Cannot be determined with the information given
4.
Sampling error is evident when:
a. a question is poorly worded
b. the sample is too small
c. the sample is not random
d. the sample mean differs from the population mean
5.
After calculating the sample size needed to estimate a population proportion to within 0.05, you have
been told that the maximum allowable error (B) must be reduced to just 0.025. If the original calculation
led to a sample size of 1000, the sample size will now have to be:
a. 2000
b. 4000
c. 1000
d. 8000
6.
A multiple regression analysis including 50 data points and 5 independent variables results in
[quantity lost in extraction] = 40. The multiple standard error of estimate will be:
a. 0.901
b. 0.888
c. 0.800
d. 0.953
e. 0.894
7.
In a random walk model the
a. series itself is random
b. series itself is not random but its differences are random
c. series itself and its differences are random
d. series itself and its differences are not random
8.
When using exponential smoothing, if you want the forecast to react quickly to movements in the series,
you should choose:
a. values of the smoothing constant near 1
b. values of the smoothing constant near 0
c. values of the smoothing constant midway between 0 and 1
d. it depends on the data set
9.
Consider the following linear programming problem:
Maximize [objective function lost in extraction]
Subject to [constraints lost in extraction]
The above linear programming problem:
a. has only one optimal solution
b. has more than one optimal solution
c. exhibits infeasibility
d. exhibits unboundedness
10.
The expected value of perfect information (EVPI) is equal to:
a. EMV with posterior information – EMV with prior information
b. EMV with free perfect information – EMV with information
c. EMV with free perfect information – EMV with no information
d. EMV with perfect information – EMV with less than perfect information
Problem 3 (10 points)
Hunter Chemical Company claims that its major product contains on the average 4.0 fluid ounces of
caustic materials per gallon. It further states that the distribution of caustic materials per gallon is normal
and has a standard deviation of 1.3 fluid ounces.
a.
What proportion of the individual gallon containers for this product will contain more than 5.0 fluid
ounces of caustic materials?
b.
A government inspector randomly selects 100 gallon-size containers of the product and finds the
mean weight of caustic material to be 4.5 fluid ounces per gallon. What is the probability of finding
the mean of a sample of 100 that is 4.5 or greater? Do you think the production process was
producing its usual level of caustic materials when this sample was taken?
Problem 4 (10 points)
Southern Textiles wishes to predict employee wages by using the employee’s experience X1 and the
employee’s education X2. Employees are categorized as having a college degree or not having a college
degree in their personnel files, so the variable “education” is a qualitative variable. Thus, X2 is an
indicator (0, 1) variable. Data for the employees are given below:
Wages Y                  Experience X1   Education (College Degree = 1,
(Thousands of dollars)   (Months)        No College Degree = 0)
27.1                     47.2            1
20.1                     40.1            0
25.1                     37.1            1
22.3                     44.7            0
25.2                     41.9            1
27.4                     46.1            1
13.8                     17.0            1
11.0                     29.2            0
22.4                     30.7            1
30.3                     59.8            0
28.5                     48.0            1
26.7                     55.3            0
21.9                     42.9            0
22.1                     47.2            0
18.7                     40.1            0
21.8                     36.5            1
11.8                     20.0            0
14.1                     30.7            0
23.1                     36.8            1
30.8                     49.9            1
(Economics for Management and Economics, Watson, Billingsley, Croft and Huntsberger, Fifth Edition,
1993, Page 685)
(a) Copy and paste the data from this document to an Excel file. Select Wages as the dependent variable
and experience and education as the independent variables. Conduct multiple regression using Excel.
Paste the output report below. Note: Follow the instructions given in module 5 to conduct simple
regression. At the step where you specify the input data range, instead of selecting the data for one
independent variable, select data for all the independent variables.
(b) Write the equation from the regression output report. If you are using symbols in the equation for the
variables, define the symbols before using them in the equation.
(c) Provide a clear and complete interpretation of the coefficients b1 and b2 in the equation. There is no
need to interpret b0. Note: Use actual variable names and numbers in answering your question. "b1 and
b2 are slopes" is not a sufficient answer.
(d) What is the value of R² for this model? Do you think that the model does a good job of explaining the
variation in wages? Why or why not?
(e) Set up the hypotheses to test whether the model is significant. Is the regression model significant at
0.05 as the level of significance? What does this mean?
(f) Set up the hypotheses to test for each of the regression coefficients individually and perform the test
at the 0.05 level of significance.
(g) What average wages do you predict for employees with college degrees and experience = 40 months?
Interpret your prediction.
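For readers working outside Excel, parts (a), (b), (d) and (g) can be checked with a minimal sketch like the one below. It is an illustration under stated assumptions, not the assignment's prescribed method: it fits ordinary least squares through the normal equations using only the Python standard library, and the helper names `solve` and `fit_ols` are made up for this example. It does not produce the t and F statistics needed for parts (e) and (f).

```python
# Ordinary least squares for wages ~ experience + degree (standard library only).
# Data are the 20 employees listed in the problem statement.
wages = [27.1, 20.1, 25.1, 22.3, 25.2, 27.4, 13.8, 11.0, 22.4, 30.3,
         28.5, 26.7, 21.9, 22.1, 18.7, 21.8, 11.8, 14.1, 23.1, 30.8]
experience = [47.2, 40.1, 37.1, 44.7, 41.9, 46.1, 17.0, 29.2, 30.7, 59.8,
              48.0, 55.3, 42.9, 47.2, 40.1, 36.5, 20.0, 30.7, 36.8, 49.9]
degree = [1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination (hypothetical helper)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(y, *predictors):
    """Return ([b0, b1, ...], R^2) for y regressed on the predictors plus an intercept."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - sse / sst


(b0, b1, b2), r_squared = fit_ols(wages, experience, degree)
predicted = b0 + b1 * 40 + b2 * 1  # part (g): 40 months of experience, college degree
print(f"wages = {b0:.3f} + {b1:.3f}*experience + {b2:.3f}*degree, R^2 = {r_squared:.3f}")
print(f"predicted wage at 40 months with a degree: {predicted:.2f} thousand dollars")
```

Here `b1` is the estimated change in wages (thousands of dollars) per additional month of experience with education held fixed, and `b2` is the estimated wage difference between employees with and without a college degree at the same experience.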
Problem 5 (10 points)
The accompanying data are the times (in seconds) that it took a sample of employees to assemble a toy
truck at a Cole Industries assembly plant. Assembly times are normally distributed. At the 10% level, can
we conclude that the mean assembly time for this toy truck is not equal to 3 minutes? Use Ha: µ ≠ 180
seconds as the alternative hypothesis.
Data
190
176
180
174
181
183
208
188
198
165
199
198
Problem 6 (10 points)
The accompanying data indicate the number of mergers that took place in an industry over a 19-year
period.
Year   Mergers
1      23
2      23
3      31
4      23
5      32
6      32
7      42
8      64
9      47
10     96
11     125
12     140
13     160
14     150
15     165
16     192
17     210
18     250
19     300
a) Fit a least squares trend line to the merger data.
b) What type of trend (linear or curved) might best fit this time series?
c) Compute the forecast for year 20 based on the trend (linear or curved) that best fits the data.
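One way to approach parts a) through c) is sketched below. It fits a straight line by least squares and, as one possible "curved" alternative, an exponential trend obtained by fitting a line to the logarithm of the series. The choice of an exponential curve and the function names are assumptions made for this illustration; a quadratic trend would be another reasonable candidate.

```python
import math

mergers = [23, 23, 31, 23, 32, 32, 42, 64, 47, 96, 125, 140, 160,
           150, 165, 192, 210, 250, 300]
years = list(range(1, len(mergers) + 1))


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(actual, fitted):
    """Share of the variation in `actual` explained by `fitted` (original scale)."""
    mean_a = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean_a) ** 2 for a in actual)
    return 1 - sse / sst


# a) Linear trend: mergers = a + b * year
a, b = least_squares_line(years, mergers)
linear_fit = [a + b * t for t in years]
linear_forecast = a + b * 20

# b) Curved alternative (assumed exponential): ln(mergers) = c + d * year
c, d = least_squares_line(years, [math.log(v) for v in mergers])
exp_fit = [math.exp(c + d * t) for t in years]
exp_forecast = math.exp(c + d * 20)

print(f"linear: mergers = {a:.3f} + {b:.3f} * year, R^2 = {r_squared(mergers, linear_fit):.3f}")
print(f"exponential: R^2 on the original scale = {r_squared(mergers, exp_fit):.3f}")
# c) Year-20 forecasts under each trend
print(f"year 20 forecast: linear {linear_forecast:.1f}, exponential {exp_forecast:.1f}")
```

The linear forecast for year 20 comes out below the observed year-19 value of 300, which is one sign that a straight line may understate the recent growth in this series.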
Problem 7 (10 points)
Create another problem similar to Problem 3, 4, 5, or 6 and provide solutions as well.

Statistics and Probability8346572.doc
Instructions:
1. Word-process your solutions or use Excel. If using Excel, use 1 spreadsheet and label each tab with problems. DO NOT send multiple spreadsheets.
2. Show all steps used in arriving at the final answers. Incomplete solutions will receive partial credit.
3. Use of software package is at your discretion. You can use any software package to solve Problems 3 – 5.
4. Be sure to clearly show the answers to each question, software generated or your computation.
10 Total Points: True or False? 1 point each, please highlight your responses or use a different font color
a. Assume that the histogram of a data set is symmetric and bell shaped, with a mean of
75 and standard deviation of 10. Then, approximately 95% of the data values were
between 55 and 95.
True
b. A low p-value provides evidence for accepting the null hypothesis and rejecting the
alternative.
False
c. A t-test is used to determine whether the coefficients of the regression model are
significantly different from zero.
True
d. Decision trees are more appropriate tools than decision tables when a sequence of
decisions must be made.
True
e. If a solution to an LP problem satisfies all of the constraints, then it must be feasible and
bounded.
False
f. Correlation is measured on a scale from 0 to 1, where 0 indicates no linear relationship
between two variables, and 1 indicates a perfect linear relationship.
False
g. In multiple regression, the problem of multicollinearity affects the t-tests of the individual
coefficients as well as the F-test in the analysis of variance for regression, since the F-test combines these t-tests into a single test.
False
h. In a random walk model, there are significantly more runs than expected, and the
autocorrelations are not significant.
False
i. When we maximize or minimize the value of a decision variable by running several
simulations simultaneously, we have found an optimal solution to the problem and
attitude toward risk becomes irrelevant.
False
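Statement a. rests on the empirical rule: for a bell-shaped distribution, about 95% of values lie within two standard deviations of the mean, here 75 ± 2(10). A quick check, assuming an exactly normal distribution (the statement only says "symmetric and bell shaped"):

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=10)
# Probability of a value falling within two standard deviations of the mean.
within_two_sd = scores.cdf(95) - scores.cdf(55)
print(f"P(55 < X < 95) = {within_two_sd:.4f}")
```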
20 Total Points Multiple choice (2 Points each) Please highlight or clearly mark your
responses
1.
Expressed in percentiles, the interquartile range is the difference between the
a. 10th and 60th percentiles
b. 15th and 65th percentiles
c. 20th and 70th percentiles
d. 25th and 75th percentiles
e. 35th and 85th percentiles
2.
Which of the following are considered measures of association?
a. Mean and variance
b. Variance and correlation
c. Correlation and covariance
d. Covariance and variance
e. First quartile and third quartile
3.
If A and B are mutually exclusive events with P(A) = 0.70, then P(B):
a. can be any value between 0 and 1
b. can be any value between 0 and 0.70
c. cannot be larger than 0.30
d. Cannot be determined with the information given
4.
Sampling error is evident when:
a. a question is poorly worded
b. the sample is too small
c. the sample is not random
d. the sample mean differs from the population mean
5.
After calculating the sample size needed to estimate a population proportion to within 0.05, you
have been told that the maximum allowable error (B) must be reduced to just 0.025. If the
original calculation led to a sample size of 1000, the sample size will now have to be:
a. 2000
b. 4000
c. 1000
d. 8000
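The arithmetic behind question 5 can be sketched as follows. The usual sample size formula for a proportion, n = z²·p(1−p)/B², makes n inversely proportional to the square of the allowable error B, so with the confidence level and the planning value of p unchanged (an assumption, since the question does not restate them) only the ratio of the two error bounds matters. The function name is hypothetical.

```python
def rescaled_sample_size(n_old, b_old, b_new):
    """Sample size after changing the allowable error, all else held fixed."""
    # n is proportional to 1 / B^2, so halving B multiplies n by 4.
    return n_old * (b_old / b_new) ** 2


new_n = rescaled_sample_size(1000, 0.05, 0.025)
print(f"required sample size: {new_n:.0f}")
```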
6.
A multiple regression analysis including 50 data points and 5 independent variables results in
[quantity lost in extraction] = 40. The multiple standard error of estimate will be:
a. 0.901
b. 0.888
c. 0.800
d. 0.953
e. 0.894
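The symbol in front of the 40 did not survive extraction. If it is assumed to be the sum of squared residuals (SSE), which is only a guess from context, the standard error of estimate is sqrt(SSE / (n − k − 1)) with n = 50 observations and k = 5 independent variables:

```python
from math import sqrt

n_obs = 50        # data points
k_predictors = 5  # independent variables
sse = 40          # ASSUMED to be the sum of squared residuals; the symbol is missing in the source

# Residual degrees of freedom are n - k - 1 (one more is lost to the intercept).
standard_error = sqrt(sse / (n_obs - k_predictors - 1))
print(f"standard error of estimate = {standard_error:.3f}")
```

Under that assumption the result agrees with one of the listed options, which lends some support to the reading.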
7.
In a random walk model the
a. series itself is random
b. series itself is not random but its differences are random
c. series itself and its differences are random
d. series itself and its differences are not random
8.
When using exponential smoothing, if you want the forecast to react quickly to movements in
the series, you should choose:
a. values of the smoothing constant near 1
b. values of the smoothing constant near 0
c. values of the smoothing constant midway between 0 and 1
d. it depends on the data set
d. it depends on the data set
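To see how the smoothing constant controls responsiveness, here is a small sketch of simple exponential smoothing applied to a made-up series that jumps from 10 to 20. The series, the function name, and the choice to initialize with the first observation are all illustrative assumptions.

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed level after the last observation.

    Update rule: level = alpha * observation + (1 - alpha) * previous level,
    with the level initialized to the first observation.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical series with a level shift from 10 to 20.
series = [10, 10, 10, 10, 10, 20, 20, 20]

fast = exponential_smoothing(series, alpha=0.9)  # weights recent data heavily
slow = exponential_smoothing(series, alpha=0.1)  # weights history heavily
print(f"alpha = 0.9 -> {fast:.2f}, alpha = 0.1 -> {slow:.2f}")
```

Three periods after the shift, the level with a constant of 0.9 has almost reached 20, while the level with a constant of 0.1 is still close to 10.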
9.
Consider the following linear programming problem:
Maximize [objective function lost in extraction]
Subject to [constraints lost in extraction]
The above linear programming problem:
a. has only one optimal solution
b. has more than one optimal solution
c. exhibits infeasibility
d. exhibits unboundedness
10.
The expected value of perfect information (EVPI) is equal to:
a. EMV with posterior information – EMV with prior information
b. EMV with free perfect information – EMV with information
c. EMV with free perfect information – EMV with no information
d. EMV with perfect information – EMV with less than perfect information
Problem 3 (10 points)
Hunter Chemical Company claims that its major product contains on the average 4.0 fluid
ounces of caustic materials per gallon. It further states that the distribution of caustic materials
per gallon is normal and has a standard deviation of 1.3 fluid ounces.
a.
What proportion of the individual gallon containers for this product will contain more than
5.0 fluid ounces of caustic materials?
Normal Probabilities
Common Data
Mean                  4
Standard Deviation    1.3
Probability for X >
X Value               5
Z Value               0.7692308
P(X>5)                0.2209
About 22.09 percent of the individual gallon containers for this product will contain more than 5.0
fluid ounces of caustic materials.
b.
A government inspector randomly selects 100 gallon-size containers of the product and
finds the mean weight of caustic material to be 4.5 fluid ounces per gallon. What is the
probability of finding the mean of a sample of 100 that is 4.5 or greater? Do you think the
production process was producing its usual level of caustic materials when this sample was
taken?
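Normal Probabilities
Common Data
Mean                  4
Standard Deviation    0.13
Probability for X >
X Value               4.5
Z Value               3.84615385
P(X>4.5)              0.0001
The standard deviation of the sample mean is 1.3/√100 = 0.13, so Z = (4.5 – 4)/0.13 = 3.846. The
probability of finding the mean of a sample of 100 that is 4.5 or greater is therefore about 0.0001.
A sample mean this high would be very unlikely if the process were running at its claimed average
of 4.0 fluid ounces, so the production process was probably not producing its usual level of
caustic materials when this sample was taken.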
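Both parts of Problem 3 can be recomputed from the stated mean of 4.0 and standard deviation of 1.3 with the standard library's `NormalDist`. This is a sketch for checking the figures, not the spreadsheet template used in the answer; the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 4.0, 1.3, 100

# Part a: a single container holds more than 5.0 fluid ounces.
p_single = 1 - NormalDist(mu, sigma).cdf(5.0)

# Part b: the mean of n containers is 4.5 or more.
# The sample mean has standard deviation sigma / sqrt(n).
standard_error = sigma / sqrt(n)
z_mean = (4.5 - mu) / standard_error
p_mean = 1 - NormalDist(mu, standard_error).cdf(4.5)

print(f"P(X > 5.0)     = {p_single:.4f}")
print(f"z for the mean = {z_mean:.3f}")
print(f"P(mean >= 4.5) = {p_mean:.6f}")
```

An upper-tail probability this small for part b means a sample mean of 4.5 is hard to reconcile with a process centered at 4.0.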
Problem 4 (10 points)
Southern Textiles wishes to predict employee wages Y by using the employee's experience X1
and the employee's education X2. Employees are categorized as having a college degree or not
having a college degree in their personnel files, so the variable "education" is a qualitative
variable. Thus, X2 is an indicator (0, 1) variable. Data for the employees are given below:
Wages Y                  Experience X1   Education (College Degree = 1,
(Thousands of dollars)   (Months)        No College Degree = 0)
27.1                     47.2            1
20.1                     40.1            0
25.1                     37.1            1
22.3                     44.7            0
25.2                     41.9            1
27.4                     46.1            1
13.8                     17.0            1
11.0                     29.2            0
22.4                     30.7            1
30.3                     59.8            0
28.5                     48.0            1
26.7                     55.3            0
21.9                     42.9            0
22.1                     47.2            0
18.7                     40.1            0
21.8                     36.5            1
11.8                     20.0            0
14.1                     30.7            0
23.1                     36.8            1
30.8                     49.9            1
(Economics for Management and Economics, Watson, Billingsley, Croft and Huntsberger, Fifth
Edition, 1993, Page 685)
(a) Copy and paste the data from this document to an Excel file. Select Wages as the
dependent variable and experience and education as the independent variables. Conduct
multiple regression using Excel....
A5 Output Hypothesis
t Test for Hypothesis of the Mean
Data
Null Hypothesis µ =          180
Level of Significance        0.1
Sample Size                  12
Sample Mean                  186.666666667
Sample Standard Deviation    12.470571419...
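The preview above is cut off before the test statistic. A minimal sketch of how the two-tailed t test for Problem 5 could be completed from the twelve assembly times is given below. The critical value 1.796 is the tabled t value for 11 degrees of freedom with 0.05 in each tail; it is entered by hand because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

times = [190, 176, 180, 174, 181, 183, 208, 188, 198, 165, 199, 198]  # seconds
mu0 = 180  # hypothesized mean (3 minutes); H0: mu = 180, Ha: mu != 180

n = len(times)
sample_mean = mean(times)
sample_sd = stdev(times)  # sample standard deviation (n - 1 in the denominator)
t_stat = (sample_mean - mu0) / (sample_sd / sqrt(n))

T_CRITICAL = 1.796  # from a t table: df = 11, two-tailed alpha = 0.10
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"mean = {sample_mean:.4f}, sd = {sample_sd:.4f}, t = {t_stat:.4f}")
print("reject H0 at the 10% level" if reject_h0 else "fail to reject H0 at the 10% level")
```

The sample mean and standard deviation reproduce the values shown in the preview. Because the t statistic exceeds the critical value, the data support the conclusion, at the 10% level, that the mean assembly time differs from 180 seconds.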