Online Excel file submission via Canvas is required. Please make sure that you read the
submission guidelines.
Grading scale:
Proper Submission: 5 points
Proper File Name: 5 points
Qs     1    2    3
Pts   35   15   40
Questions
1. The fill volume of an automated filling
More on residual analysis
The jackknife residuals follow a t distribution with $n - k - 2$ degrees of freedom when model assumptions
hold, so it is possible to use them to check for outliers. Also, the hat or leverage values $h_i$ are used to assess
the extremen
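As a rough illustration of these two diagnostics, the sketch below computes leverage values and jackknife residuals for a simple linear regression (one predictor, so k = 1). The data are made-up placeholders and the function name is hypothetical; the formulas used are the standard closed forms for the one-predictor case, $h_i = 1/n + (x_i - \bar{x})^2 / S_{xx}$ and the residual divided by $s_{(-i)}\sqrt{1 - h_i}$, where $s_{(-i)}$ is the error standard deviation estimated with observation $i$ deleted.

```python
import math


def jackknife_residuals(x, y):
    """Return (leverages, jackknife residuals) for simple linear regression, k = 1."""
    n, k = len(x), 1
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    # Least squares slope and intercept
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in resid)

    # Leverage (hat) values for the one-predictor case
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    jack = []
    for e, h in zip(resid, leverage):
        # Error variance estimated with this observation deleted,
        # on n - k - 2 degrees of freedom
        s2_deleted = (sse - e ** 2 / (1 - h)) / (n - k - 2)
        jack.append(e / math.sqrt(s2_deleted * (1 - h)))
    return leverage, jack


# Hypothetical data, for illustration only
x = [0, 1, 2, 3, 4]
y = [110, 115, 135, 140, 155]
leverage, jack = jackknife_residuals(x, y)
```

Each jackknife residual can then be compared with a t distribution on $n - k - 2$ degrees of freedom, and the leverages always sum to $k + 1$, which gives a quick sanity check.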
Analysis of Variance versus Experimental Design
Not all data analyzed by ANOVA are from a designed experiment. On the other hand, some designed
experiments lead to data for which ANOVA methods are inappropriate. However, there is a strong historical
con
Visualization, checking assumptions for ANOVA
As with regression analysis, it is important both to visualize the data and to assess the assumptions of the
ANOVA model. We can use boxplots to visualize data for ANOVA models, and we can look at residualby
Testing hypotheses in regression
There are three general types of hypotheses that we may be interested in testing in multiple
regression: 1) $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$, none of the x variables explain variation in
y (overall regression F test), 2) $H_0: \beta_j =$
The best fitting line
In our previous lecture we considered $\hat{y}_i = 110 + 10x_i$ to predict $y_i$ =
calories from $x_i$ = fat. One measure of how well this line fits the data is
given by:
$$(y_1 - \hat{y}_1)^2 + (y_2 - \hat{y}_2)^2 + (y_3 - \hat{y}_3)^2 + (y_4 - \hat{y}_4)^2 + (y_5 - \hat{y}_5)^2 = \sum_{i=1}^{5} (y_i - \hat{y}_i)^2$$
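As a small illustration of this fit measure, the sketch below sums the squared differences between observed and predicted values for the line $\hat{y}_i = 110 + 10x_i$. The five (fat, calories) pairs are made-up placeholders, since the lecture's data are not reproduced here, and the function name is hypothetical.

```python
def sum_squared_residuals(x, y, intercept, slope):
    """Sum of (y_i - yhat_i)^2 for the line yhat = intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, NOT the lecture's cereal data
fat = [0, 1, 2, 3, 4]
calories = [110, 115, 135, 140, 155]

sse = sum_squared_residuals(fat, calories, intercept=110, slope=10)
print(sse)  # 75 for these placeholder values
```

A smaller value means the line passes closer to the points, so the same function can be used to compare candidate lines on one data set.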
Inferences about regression parameters
For our linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, we have not made any assumptions about
the data to calculate the parameter estimates $b_0$ and $b_1$, since we are simply applying the
method of least squares. To cond
Confidence interval, Prediction interval examples
For our small cereal data set, suppose we want to construct both a confidence interval for $E(y_{n+1})$ and also
a prediction interval for $y_{n+1}$ when $x_{n+1} = 1$, using $\alpha = .05$. Remember that $\hat{y}_{n+1} = 104.62 + (13.85$
Matrix based approach to regression
For our small data set we have:
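$y_1 = \beta_0 + \beta_1 x_{11} + \varepsilon_1$,
$y_2 = \beta_0 + \beta_1 x_{21} + \varepsilon_2$,
$y_3 = \beta_0 + \beta_1 x_{31} + \varepsilon_3$,
$y_4 = \beta_0 + \beta_1 x_{41} + \varepsilon_4$,
$y_5 = \beta_0 + \beta_1 x_{51} + \varepsilon_5$, where $x_{i1}$, for example, is the $i$th observation's value of variable $x_1$.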
This can
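A minimal sketch of the matrix view, assuming the five equations are stacked as $y = X\beta + \varepsilon$ with a column of ones in $X$ (the excerpt breaks off before stating this, so the stacking is an assumption), and using the standard least squares solution $b = (X'X)^{-1}X'y$. The data are made-up placeholders and the helper names are hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    # Rows of a times columns of b
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("X'X is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical data, for illustration only
x1 = [0, 1, 2, 3, 4]
y = [[110], [115], [135], [140], [155]]   # column vector

X = [[1, xi] for xi in x1]                # design matrix: intercept column, then x1
Xt = transpose(X)

# Solve the normal equations (X'X) b = X'y
b = matmul(inverse_2x2(matmul(Xt, X)), matmul(Xt, y))
b0, b1 = b[0][0], b[1][0]
```

The explicit 2x2 inverse keeps the example dependency-free; with more predictors a linear algebra library would be the more practical choice.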
Multiple Regression Analysis
New challenges: 1) More difficult to choose a best model, 2) More difficult to visualize the model, 3)
Sometimes more difficult to interpret, 4) Extra computing is required. Written as
$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon$ or as $y_i = \beta_0$
A sampling distribution for the sample mean
From the population distribution of y, we can take repeated samples of
size n and calculate a sample mean from each sample:
Sample 1, $\bar{y}_1$
Sample 2, $\bar{y}_2$
$\vdots$
Sample N, $\bar{y}_N$
Which yields a sampling distribution of the
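The repeated-sampling idea can be simulated in a few lines. The sketch below assumes a normal population with mean 50 and standard deviation 10; these values, the sample size n = 25, and the number of samples N = 2000 are all arbitrary choices for illustration and do not come from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

POP_MEAN, POP_SD = 50.0, 10.0  # assumed population parameters
n = 25     # size of each sample
N = 2000   # number of repeated samples

# One sample mean per repeated sample
sample_means = [
    statistics.mean([random.gauss(POP_MEAN, POP_SD) for _ in range(n)])
    for _ in range(N)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
```

Under these assumptions the sample means should centre near 50 with a standard deviation near $10 / \sqrt{25} = 2$, noticeably tighter than the population itself.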
Chapter 13, Mathematical Statistics and Data Analysis by John Rice
#1
Data (rows: alleles; columns: diabetic or not)
Alleles     Diabetic     Normal   Total
Bb or bb          12          4      16
BB                39         49      88
Total             51         53     104

Expected Counts
Alleles     Diabetic     Normal   Total
Bb or bb    7.846154   8.153846      16
BB          43.15385   44.84615      88
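The expected counts can be reproduced from the observed table as (row total × column total) / grand total. The sketch below does that and, as an assumed next step that the excerpt itself does not show, also computes the Pearson chi-square statistic $\sum (O - E)^2 / E$; the variable names are illustrative.

```python
# Observed counts: rows are "Bb or bb" and "BB", columns are Diabetic and Normal
observed = [[12, 4],
            [39, 49]]

row_totals = [sum(row) for row in observed]          # [16, 88]
col_totals = [sum(col) for col in zip(*observed)]    # [51, 53]
n = sum(row_totals)                                  # 104

# Expected count under independence of allele group and diabetic status
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic (assumed follow-up, not shown in the excerpt)
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

for row in expected:
    print([round(e, 5) for e in row])
```

For a 2 × 2 table the statistic would be referred to a chi-square distribution with (2 − 1)(2 − 1) = 1 degree of freedom.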