Homework 1 Solutions
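1. (a) $|\Sigma| = \lambda_1 \lambda_2 \lambda_3 = 144$
$\mathrm{tr}(\Sigma) = \lambda_1 + \lambda_2 + \lambda_3 = 20$
$V(e_1' x) = e_1' \Sigma e_1 = e_1'(\lambda_1 e_1) = \lambda_1 = 12$
since, by definition, the eigenvector $e_1$ has the properties
$\Sigma e_1 = \lambda_1 e_1$ and
$e_1' e_1 = 1$.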
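The eigenvalue identities used in this solution can be checked numerically. The sketch below is only an illustration: the covariance matrix from the homework is not shown here, so the eigenvalues 12, 6, 2 are a hypothetical choice that happens to match the stated product (144), sum (20), and largest value (12), and the matrix `Sigma` is built from a random orthogonal basis.

```python
import numpy as np

# Hypothetical eigenvalues chosen to match the totals in the solution
# (product 144, sum 20, largest 12); the actual matrix is an assumption.
eigenvalues = np.array([12.0, 6.0, 2.0])

# Build a symmetric matrix with those eigenvalues from a random orthogonal basis.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
Sigma = Q @ np.diag(eigenvalues) @ Q.T
Sigma = (Sigma + Sigma.T) / 2  # remove floating-point asymmetry

det_Sigma = np.linalg.det(Sigma)   # equals the product of the eigenvalues
trace_Sigma = np.trace(Sigma)      # equals the sum of the eigenvalues

# eigh returns eigenvalues in ascending order, with unit-length
# eigenvectors stored as columns.
vals, vecs = np.linalg.eigh(Sigma)
e1 = vecs[:, -1]                   # eigenvector of the largest eigenvalue
var_e1x = e1 @ Sigma @ e1          # variance of the linear combination e1'x

print(det_Sigma, trace_Sigma, var_e1x, e1 @ e1)
```

Any symmetric positive definite matrix with the same eigenvalues would give the same determinant, trace, and variance for $e_1' x$.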
Comparing Mean Vectors for Several
Compare mean vectors for g treatments (or populations).
Randomly assign $n_\ell$ units to the $\ell$-th treatment (or take
independent random samples from g populations)
Measure p characteristics of each unit. Observat
Other hypotheses of interest (contd)
In addition to the simple null hypothesis of no treatment
effects, we might wish to test other hypotheses of the general
form (examples follow):
$H_0 : C_{k \times g}\,\beta_{g \times p} = 0$, Comparisons among treatments
$H_0 : \beta_{g \times p}\,M_{p \times q} = 0$, Comparisons
In many observational or designed studies, observations are
collected simultaneously on more than one variable on each
Multivariate analysis is the collection of methods that can be
used to analyze these multiple measurem
Comparing Mean Vectors for Two Populations
Independent random samples, one sample from each of two
Randomized experiment: n1 units are randomly allocated to
treatment 1 and n2 units are randomly allocated to treatment
2. Sample sizes need no
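Confidence regions are multivariate extensions of univariate
Recall the definition of a $100(1-\alpha)\%$ CI for a parameter
$\theta$: for $X \sim f(x \mid \theta)$, $\theta \in \Theta$, the interval $(t_1(X), t_2(X))$ is a
$100(1-\alpha)\%$ CI for $\theta$ if
$\Pr[t_1(X) \le \theta \le t_2(X)] = 1 - \alpha$.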
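The coverage statement in this definition can be illustrated by simulation. The sketch below assumes a setting that is not in the notes: a normal sample with known standard deviation, for which the usual interval is $\bar{x} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$. The values of `theta`, `sigma`, `n`, and `reps` are hypothetical.

```python
import numpy as np
from statistics import NormalDist

alpha = 0.05
theta, sigma, n = 2.0, 1.0, 25      # hypothetical true mean, known sd, sample size
z = NormalDist().inv_cdf(1 - alpha / 2)

rng = np.random.default_rng(1)
reps = 20000                         # number of simulated data sets
samples = rng.normal(theta, sigma, size=(reps, n))
xbar = samples.mean(axis=1)

# Interval endpoints t1(X) and t2(X) for each simulated data set.
half_width = z * sigma / np.sqrt(n)
t1 = xbar - half_width
t2 = xbar + half_width

# Fraction of intervals that contain theta; should be close to 1 - alpha.
coverage = np.mean((t1 <= theta) & (theta <= t2))
print(coverage)
```

The random quantities here are the endpoints, not $\theta$: the probability statement refers to the proportion of repeated samples whose interval covers the fixed parameter.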
Inferences about a Mean Vector
In the following lectures, we test hypotheses about a
$p \times 1$ population mean vector $\mu = (\mu_1, \mu_2, \ldots, \mu_p)'$
We could test p disjoint hypotheses (one for each $\mu_j$ in $\mu$) but
that would not take advantage of the correlations between
Paired Comparisons and Repeated Measures
Hotelling's T-squared tests for a single mean vector have
useful applications for studies with paired comparisons or
1. Paired comparison designs in which two treatments are
applied to each s
We now consider designs with two factors. Factor 1 has g
levels and factor 2 has b levels.
If $X_{ikr}$ is the $p \times 1$ vector of measurements on the rth unit
in the ith level of factor 1 and the kth level of factor 2:
$X_{ikr} = \mu + \tau_i + \beta_k + \gamma_{ik} + e_{ikr}$,
Basic Concepts in Matrix Algebra
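A column array of p elements is called a vector of dimension p and is
written as
$$x_{p \times 1} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{bmatrix}.$$
The transpose of the column vector $x_{p \times 1}$ is the row vector
$x' = [\, x_1 \;\; x_2 \;\; \ldots \;\; x_p \,]$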
A vector can be represented in p-spa
Carapace Measurements for Female Turtles
Data on three dimensions of female turtle carapaces (shells):
Since the measurements are all on the same scale, we
extracted the PCs from
Graphical Representation of Multivariate Data
One difficulty with multivariate data is their visualization, in
particular when p > 3.
At the very least, we can construct pairwise scatter plots of
variables. Data from exercise 1.1 (transpose of Figure 1.1)
Principal Components I
When a very large number p of variables is measured on each
sample unit, interpreting results of analyses might be difficult.
It is often possible to reduce the dimensionality of the data
by finding a smaller set k of linear combinatio
Multivariate Linear Regression Models
Regression analysis is used to predict the value of one or
more responses from a set of predictors.
It can also be used to estimate the linear association between
the predictors and responses.
Predictors can be cont
Multivariate Normal Distribution I
We will almost always assume that the joint distribution of
the $p \times 1$ vectors of measurements on each sample unit is the
p-dimensional multivariate normal distribution.
The MVN assumption is often appropriate:
Moment-generating Function of the Multivariate
If $X \sim N_p(\mu, \Sigma)$, then the moment-generating function is
given by $m_X(t) \equiv E\{\exp(t'X)\} = \exp\left(t'\mu + \tfrac{1}{2}\, t'\Sigma t\right)$.
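The closed form of the moment-generating function can be compared with a Monte Carlo estimate of $E\{\exp(t'X)\}$. This is a minimal sketch; the values of `mu`, `Sigma`, and `t` are hypothetical and chosen only so that the estimate is stable with a moderate number of draws.

```python
import numpy as np

# Hypothetical parameters for a bivariate normal distribution.
mu = np.array([0.5, -0.2])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
t = np.array([0.3, -0.4])

# Closed form: exp(t'mu + 0.5 * t'Sigma t).
mgf_exact = np.exp(t @ mu + 0.5 * t @ Sigma @ t)

# Monte Carlo estimate of E[exp(t'X)] from simulated draws of X.
rng = np.random.default_rng(3)
X = rng.multivariate_normal(mu, Sigma, size=200_000)
mgf_mc = np.exp(X @ t).mean()

print(mgf_exact, mgf_mc)
```

The two values should agree up to simulation error, which grows quickly as the entries of `t` get larger because $\exp(t'X)$ becomes heavy-tailed.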
More Features of the
Multivariate Normal Distribution
If $X \sim N_p(\mu, \Sigma)$, then a li
Characterization of the Multivariate Normal
Cramér (1946) showed that the following characterizes a
multivariate normal distribution:
$X \sim N_p(\mu, \Sigma)$ if and only if $a'X \sim N(a'\mu, a'\Sigma a)$ for every p-variate real vector a.
The only if part of the proof is s
Write your answers in the space provided on this exam. If you need more space,
use the back of a page or attach extra sheets of paper, but clearly indicate where
this is done.
At the beginning of the
Assessing Normality: The Univariate Case
In general, most multivariate methods will depend on the
distribution of $\bar{X}$ or on distances of the form
$n(\bar{X} - \mu)' S^{-1} (\bar{X} - \mu)$
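As an illustration of the distance above, the following sketch computes $n(\bar{X} - \mu)' S^{-1} (\bar{X} - \mu)$ for simulated data. The sample size, dimension, and the value of `mu` are hypothetical, and `S` is taken to be the sample covariance matrix with divisor $n - 1$, which is an assumption about the notation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 3                          # hypothetical sample size and dimension
mu = np.zeros(p)                      # hypothetical population mean vector
X = rng.standard_normal((n, p)) + mu  # rows are sample units

xbar = X.mean(axis=0)                 # sample mean vector
S = np.cov(X, rowvar=False)           # sample covariance, divisor n - 1

diff = xbar - mu
# Solve S y = diff instead of forming the inverse of S explicitly.
d2 = n * diff @ np.linalg.solve(S, diff)
print(d2)
```

Using `np.linalg.solve` rather than `np.linalg.inv` gives the same quantity and is usually the more numerically stable choice.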
Large sample theory tells us that if the sample observations
$X_1, \ldots, X_n$ are iid from some popul