Sample Final Exam
1.
What is the ANOVA test? What is its relation to the t-test?
(5%)
In statistics, analysis of variance (ANOVA) is a collection of statistical models, and their
associated procedures, in which the observed variance in a particular variable is partitioned into
components attributable to different sources of variation. In its simplest form ANOVA provides a
statistical test of whether or not the means of several groups are all equal, and therefore generalizes
the t-test to more than two groups. ANOVA has an advantage over running a separate two-sample
t-test on each pair of groups: every additional test is another chance of committing a type I error,
so the overall false-positive rate grows with the number of tests. A single ANOVA keeps that rate
at the chosen significance level, which makes it useful for comparing three or more means.
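The inflation of the type I error rate can be made concrete with a small calculation. The sketch below assumes, as a simplification, that the pairwise tests are independent (in practice they share data, so the figures are only approximate); the function name and the 0.05 level are illustrative choices.

```python
from math import comb


def familywise_error_rate(num_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive over all pairwise tests.

    Assumes the tests are independent, which is only an approximation
    when every pair is drawn from the same set of groups.
    """
    num_tests = comb(num_groups, 2)  # number of distinct pairs of groups
    return 1.0 - (1.0 - alpha) ** num_tests


for k in (2, 3, 4, 5):
    # Columns: number of groups, number of pairwise tests, approximate error rate.
    print(k, comb(k, 2), round(familywise_error_rate(k), 3))
```

With two groups there is one test and the rate equals the nominal level; with five groups there are ten tests and the approximate rate is far above it.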
A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if
the null hypothesis is true. It is most commonly applied when the test statistic would follow a
normal distribution if the value of a scaling term in the test statistic were known. When the scaling
term is unknown and is replaced by an estimate based on the data, the test statistic (under certain
conditions) follows a Student's t distribution.
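The relation between the two tests can be checked numerically: with exactly two groups, the one-way ANOVA F statistic equals the square of the pooled (equal-variance) two-sample t statistic. The following is a minimal sketch using only the standard library; the two samples are made-up numbers for illustration, not data from the course.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def pooled_t(a, b):
    """Two-sample t statistic using a pooled estimate of the variance."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    std_err = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean(a) - mean(b)) / std_err


# Hypothetical samples, chosen only to illustrate the identity.
a = [4.1, 5.0, 5.6, 4.8, 5.3]
b = [6.2, 5.9, 6.8, 7.1, 6.4]

f_stat = one_way_anova_f([a, b])
t_stat = pooled_t(a, b)
print(f_stat, t_stat ** 2)  # the two values agree up to rounding error
```

The identity holds only for the pooled-variance t-test; a t-test that does not assume equal variances (Welch's test) generally gives a different statistic.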
2.
What is the correlation coefficient between two sets of numbers?
What can you say when
the correlation coefficient is zero? (Why?)
(10%)
All correlation coefficients range from -1.00 to +1.00. A correlation coefficient of -1.00 tells you
that there is a perfect negative relationship between the two variables. This means that as values on
one variable increase there is a perfectly predictable decrease in values on the other variable. In
other words, as one variable goes up, the other goes in the opposite direction (it goes down).
A correlation coefficient of +1.00 tells you that there is a perfect positive relationship between the
two variables. This means that as values on one variable increase there is a perfectly predictable
increase in values on the other variable. In other words, as one variable goes up so does the other.
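A correlation coefficient of 0.00 tells you that there is zero correlation, that is, no linear relationship,
between the two variables. In other words, knowing that one variable goes up or down tells you
nothing about whether the other tends to go up or down. Because the coefficient measures only
linear association, a value of zero does not rule out a nonlinear relationship between the variables.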
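A short numerical sketch of these cases, using the usual Pearson formula (covariance divided by the product of the standard deviations). The data are made up for illustration; the last case shows that a coefficient of zero rules out only a linear trend, since y = x^2 is completely determined by x.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [-2, -1, 0, 1, 2]

r_positive = pearson_r(xs, [2 * x + 1 for x in xs])  # increasing straight line
r_negative = pearson_r(xs, [-3 * x for x in xs])     # decreasing straight line
r_parabola = pearson_r(xs, [x ** 2 for x in xs])     # symmetric curve, no linear trend

print(r_positive, r_negative, r_parabola)
```

The first two values come out at (approximately) +1 and -1, and the third at 0, even though in the third case y can be computed exactly from x.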
3.
State the conditions, i.e., assumptions, of simple linear regression and explain how you
check against these assumptions.
(15%)
There are four principal assumptions which justify the use of linear regression models for purposes
of prediction:
(i) Linearity of the relationship between dependent and independent variables
(ii) Independence of the errors (no serial correlation)
(iii) Homoscedasticity (constant variance) of the errors
(a) Versus time
(b) Versus the predictions (or versus any independent variable)
(iv) Normality of the error distribution.
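Checks of these assumptions are usually based on the residuals of the fitted line. The sketch below is one possible illustration, not the answer key's method: it fits a least-squares line, computes the residuals, and evaluates the Durbin-Watson statistic as a rough indicator of serial correlation (values near 2 suggest little first-order correlation; values near 0 or 4 suggest positive or negative correlation). The data and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Observed value minus fitted value for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def durbin_watson(res):
    """Sum of squared successive differences over sum of squared residuals."""
    numerator = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    denominator = sum(e * e for e in res)
    return numerator / denominator


# Made-up observations, ordered in time, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

intercept, slope = fit_line(xs, ys)
res = residuals(xs, ys, intercept, slope)
dw = durbin_watson(res)
print(intercept, slope, dw)
```

The same residuals can be plotted against the fitted values or against time to look for curvature (nonlinearity) or a changing spread (heteroscedasticity), and examined with a histogram or normal probability plot to judge normality.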
If any of these assumptions is violated (i.e., if there is nonlinearity, serial correlation,
 Spring '10