Stat 501
Spring 2005
Homework 1 Solutions
1. (a) $|\Sigma| = \lambda_1 \lambda_2 \lambda_3 = 144$
(b) $\operatorname{trace}(\Sigma) = \lambda_1 + \lambda_2 + \lambda_3 = 20$
(c) $V(e_1' x) = e_1' \Sigma e_1 = e_1' (\lambda_1 e_1) = \lambda_1 e_1' e_1 = \lambda_1 = 12$,
since, by definition, the eigenvector $e_1$ has the properties
$\Sigma e_1 = \lambda_1 e_1$ and $e_1' e_1 = 1$.
(
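The numbers in parts (a) to (c) can be cross-checked with a little arithmetic: the product of the three eigenvalues is 144, their sum is 20, and the largest is 12, so the other two must multiply to 12 and sum to 8. A minimal sketch of that check follows; the function name `remaining_eigenvalues` is hypothetical and is not part of the original solutions.

```python
import math

def remaining_eigenvalues(det, trace, lam1):
    """Recover the other two eigenvalues of a 3x3 symmetric matrix
    from its determinant, its trace, and one known eigenvalue."""
    prod = det / lam1        # lambda_2 * lambda_3
    total = trace - lam1     # lambda_2 + lambda_3
    # lambda_2 and lambda_3 are the roots of t^2 - total * t + prod = 0
    disc = total ** 2 - 4 * prod
    if disc < 0:
        raise ValueError("no real pair of eigenvalues matches these inputs")
    root = math.sqrt(disc)
    return (total + root) / 2, (total - root) / 2

lam2, lam3 = remaining_eigenvalues(det=144, trace=20, lam1=12)
print(lam2, lam3)  # prints 6.0 2.0
```

The values 12, 6, and 2 reproduce both the determinant (144) and the trace (20) stated in the solution.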
Comparing Mean Vectors for Several
Populations
Compare mean vectors for g treatments (or populations).
Randomly assign $n_\ell$ units to the $\ell$-th treatment (or take
independent random samples from g populations).
Measure p characteristics of each unit. Observat
Other hypotheses of interest (cont'd)
In addition to the simple null hypothesis of no treatment
effects, we might wish to test other hypotheses of the general
form (examples follow):
$H_0: C_{k \times g}\,\beta_{g \times p} = 0$, Comparisons among treatments
$H_0: \beta_{g \times p} M_{p \times q} = 0$, Comparisons
Introduction
In many observational or designed studies, observations are
collected simultaneously on more than one variable on each
experimental unit.
Multivariate analysis is the collection of methods that can be
used to analyze these multiple measurem
Comparing Mean Vectors for Two Populations
Independent random samples, one sample from each of two
populations
Randomized experiment: n1 units are randomly allocated to
treatment 1 and n2 units are randomly allocated to treatment
2. Sample sizes need no
Confidence Regions
Confidence regions are multivariate extensions of univariate
confidence intervals.
Recall the definition of a $100(1-\alpha)\%$ CI for a parameter
$\theta$: for $X \sim f(x \mid \theta)$, $\theta \in \Theta$, the interval $(t_1(X), t_2(X))$ is a
$100(1-\alpha)\%$ CI for $\theta$ if
$\Pr[t_1(X) \le \theta \le t_2(X)] = 1 - \alpha$.
If repre
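The coverage statement in the univariate definition can be illustrated by simulation. The sketch below is an illustration only, under assumptions that are not in the original notes: a normal population with known standard deviation, the usual z-interval as $(t_1(X), t_2(X))$, and hypothetical helper names `z_interval` and `coverage`.

```python
import random
from statistics import NormalDist, fmean

def z_interval(sample, sigma, alpha=0.05):
    """Return (t1, t2), a 100(1 - alpha)% CI for a normal mean with known sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / len(sample) ** 0.5
    center = fmean(sample)
    return center - half_width, center + half_width

def coverage(theta, sigma, n, reps, alpha=0.05, seed=0):
    """Estimate Pr[t1(X) <= theta <= t2(X)] by repeated sampling."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        t1, t2 = z_interval(sample, sigma, alpha)
        hits += t1 <= theta <= t2
    return hits / reps

# The estimated coverage should be close to 1 - alpha = 0.95.
print(coverage(theta=3.0, sigma=2.0, n=25, reps=4000))
```

The interval endpoints are random and the parameter is fixed, so the proportion of simulated intervals that contain the true value estimates the coverage probability.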
Inferences about a Mean Vector
In the following lectures, we test hypotheses about a
$p \times 1$ population mean vector $\mu = (\mu_1, \mu_2, \ldots, \mu_p)'$.
We could test p disjoint hypotheses (one for each $\mu_j$ in $\mu$) but
that would not take advantage of the correlations between
th
Paired Comparisons and Repeated Measures
Hotelling's T-squared tests for a single mean vector have
useful applications for studies with paired comparisons or
repeated measurements.
1. Paired comparison designs in which two treatments are
applied to each s
Two-way MANOVA
We now consider designs with two factors. Factor 1 has g
levels and factor 2 has b levels.
If $X_{ikr}$ is the $p \times 1$ vector of measurements on the $r$th unit
in the $i$th level of factor 1 and the $k$th level of factor 2:
$X_{ikr} = \mu + \tau_i + \beta_k + \gamma_{ik} + e_{ikr}$,
Basic Concepts in Matrix Algebra
A column array of p elements is called a vector of dimension p and is
written as
$$x_{p \times 1} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{bmatrix}.$$
The transpose of the column vector $x_{p \times 1}$ is the row vector
$x' = [\, x_1 \;\; x_2 \;\; \ldots \;\; x_p \,]$
A vector can be represented in p-spa
Carapace Measurements for Female Turtles
Data on three dimensions of female turtle carapaces (shells):
X1=log(carapace length)
X2=log(carapace width)
X3=log(carapace height)
Since the measurements are all on the same scale, we
extracted the PCs from
Graphical Representation of Multivariate Data
One difficulty with multivariate data is their visualization, in
particular when p > 3.
At the very least, we can construct pairwise scatter plots of
variables. Data from exercise 1.1 (transpose of Figure 1.1)
Principal Components I
When a very large number p of variables is measured on each
sample unit, interpreting results of analyses might be difficult.
It is often possible to reduce the dimensionality of the data
by finding a smaller set k of linear combinatio
Multivariate Linear Regression Models
Regression analysis is used to predict the value of one or
more responses from a set of predictors.
It can also be used to estimate the linear association between
the predictors and responses.
Predictors can be cont
Multivariate Normal Distribution I
We will almost always assume that the joint distribution of
the $p \times 1$ vectors of measurements on each sample unit is the
p-dimensional multivariate normal distribution.
The MVN assumption is often appropriate:
Variables
Moment-generating Function of the Multivariate
Normal Distribution
If $X \sim N_p(\mu, \Sigma)$, then the moment-generating function is
given by $m_X(t) = E\{\exp(t'X)\} = \exp(t'\mu + \tfrac{1}{2} t'\Sigma t)$.
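One standard use of the moment-generating function is to read off the first two moments by differentiating at $t = 0$. The derivation below is a sketch added for illustration, not part of the original slides:

```latex
\begin{align*}
m_X(t) &= \exp\!\left(t'\mu + \tfrac{1}{2}\, t'\Sigma t\right), \\
\frac{\partial m_X(t)}{\partial t} &= (\mu + \Sigma t)\, m_X(t)
  \quad\Rightarrow\quad
  E(X) = \left.\frac{\partial m_X(t)}{\partial t}\right|_{t=0} = \mu, \\
\frac{\partial^2 m_X(t)}{\partial t\, \partial t'} &=
  \left[\Sigma + (\mu + \Sigma t)(\mu + \Sigma t)'\right] m_X(t)
  \quad\Rightarrow\quad
  E(XX') = \Sigma + \mu\mu', \\
\operatorname{Cov}(X) &= E(XX') - E(X)\,E(X)' = \Sigma .
\end{align*}
```

The parameters $\mu$ and $\Sigma$ are therefore the mean vector and covariance matrix of $X$.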
More Features of the
Multivariate Normal Distribution
If $X \sim N_p(\mu, \Sigma)$, then a li
Characterization of the Multivariate Normal
Distribution
Cramér (1946) showed that the following characterizes a
multivariate normal distribution:
$X \sim N_p(\mu, \Sigma)$ if and only if $a'X \sim N(\mu_0, \sigma^2)$ for every p-variate real vector a.
The only if part of the proof is s
STAT 501
Spring 2001
FINAL EXAM
Name _
Instructions:
1. Write your answers in the space provided on this exam. If you need more space,
use the back of a page or attach extra sheets of paper, but clearly indicate where
this is done.
At the beginning of the
Assessing Normality: The Univariate Case
In general, most multivariate methods will depend on the
distribution of $\bar{X}$ or on distances of the form
$n(\bar{X} - \mu)' S^{-1} (\bar{X} - \mu)$.
Large sample theory tells us that if the sample observations
$X_1, \ldots, X_n$ are iid from some popul
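As an illustration of the quadratic-form distance $n(\bar{X} - \mu)' S^{-1} (\bar{X} - \mu)$, here is a minimal sketch for the bivariate case ($p = 2$) using only the Python standard library. The function name `quadratic_form_distance`, the toy data, and the choice of divisor $n - 1$ for $S$ are assumptions made for this example, not taken from the notes.

```python
from statistics import fmean

def quadratic_form_distance(data, mu0):
    """Compute n (xbar - mu0)' S^{-1} (xbar - mu0) for bivariate data.

    data: list of (x1, x2) pairs; mu0: hypothesized mean (m1, m2).
    S is the sample covariance matrix with divisor n - 1 (an assumption).
    """
    n = len(data)
    xbar = [fmean(row[j] for row in data) for j in range(2)]

    def s(j, k):
        # (j, k) entry of the sample covariance matrix
        return sum((r[j] - xbar[j]) * (r[k] - xbar[k]) for r in data) / (n - 1)

    s11, s12, s22 = s(0, 0), s(0, 1), s(1, 1)
    det = s11 * s22 - s12 ** 2
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    d1, d2 = xbar[0] - mu0[0], xbar[1] - mu0[1]
    # Explicit 2x2 inverse: S^{-1} = (1 / det) * [[s22, -s12], [-s12, s11]]
    return n * (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det

toy = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]  # made-up data
print(quadratic_form_distance(toy, mu0=(2.0, 2.0)))
```

The distance is zero when the hypothesized mean equals the sample mean and grows as the two move apart, scaled by the sample covariance.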