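Stat 430/Math468 – Notes #5

Chapter 3 Sample Geometry and Random Sampling

A single multivariate observation is the collection of measurements of $p$ different variables taken on the same item or trial. If $n$ observations have been obtained, the entire data set can be placed in an $n \times p$ matrix, with one row per item and one column per variable:

\[
\mathbf{X} =
\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
=
\begin{pmatrix}
\mathbf{x}_1' \\ \mathbf{x}_2' \\ \vdots \\ \mathbf{x}_n'
\end{pmatrix}
\]

Here row $j$ holds item $j$ and column $i$ holds Variable $i$. Each row $\mathbf{x}_j'$ of $\mathbf{X}$ is a $p$-variate observation. We say that the data are a sample of size $n$ from a $p$-variate "population".

Another way to view the data matrix is by columns:

\[
\mathbf{X} = \left( \mathbf{y}_1 \mid \mathbf{y}_2 \mid \cdots \mid \mathbf{y}_p \right)
\]

The elements in each column

\[
\mathbf{y}_i = \begin{pmatrix} x_{1i} \\ x_{2i} \\ \vdots \\ x_{ni} \end{pmatrix}
\]

are the $n$ measurements on the $i$-th variable, $i = 1, 2, \ldots, p$.

Let the sample mean of the $i$-th variable be

\[
\bar{x}_i = \frac{1}{n}\left( x_{1i} + x_{2i} + \cdots + x_{ni} \right), \quad i = 1, 2, \ldots, p.
\]

Let $\mathbf{1} = (1, 1, \ldots, 1)'$ be the $n \times 1$ vector that forms an equal angle with each of the coordinate axes. The length of $\mathbf{1}$ is

\[
L_{\mathbf{1}} = \sqrt{\mathbf{1}'\mathbf{1}} = \sqrt{n}.
\]

The projection of $\mathbf{y}_i$ on $\mathbf{1}$ is

\[
\frac{\mathbf{y}_i'\mathbf{1}}{\mathbf{1}'\mathbf{1}}\,\mathbf{1}
= \frac{x_{1i} + x_{2i} + \cdots + x_{ni}}{n}\,\mathbf{1}
= \bar{x}_i \mathbf{1}
= \begin{pmatrix} \bar{x}_i \\ \bar{x}_i \\ \vdots \\ \bar{x}_i \end{pmatrix}.
\]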
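As a quick numerical check of the projection formula, the following is a minimal sketch in plain Python. The 3 x 2 data matrix and the names `X`, `ones`, `columns`, `dot`, `sample_means`, and `projections` are assumptions chosen for illustration; they do not come from the notes.

```python
import math

# Hypothetical data matrix: n = 3 items (rows), p = 2 variables (columns).
# The values are illustrative only.
X = [
    [4.0, 1.0],
    [-1.0, 3.0],
    [3.0, 5.0],
]
n = len(X)
p = len(X[0])


def dot(u, v):
    """Inner product u'v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# The equal-angle vector 1 = (1, ..., 1)' and its length sqrt(1'1).
ones = [1.0] * n
length_of_ones = math.sqrt(dot(ones, ones))

# Column view of X: columns[i] is the vector y_i of n measurements on variable i.
columns = [[X[j][i] for j in range(n)] for i in range(p)]

# Sample mean of each variable.
sample_means = [sum(y) / n for y in columns]

# Projection of each y_i on 1: (y_i'1 / 1'1) * 1.
projections = []
for y in columns:
    coefficient = dot(y, ones) / dot(ones, ones)
    projections.append([coefficient * c for c in ones])

print("length of 1:", length_of_ones)
print("sample means:", sample_means)
print("projections on 1:", projections)
```

Under these assumed values, each projection has the corresponding sample mean in every coordinate, which is what the formula $\bar{x}_i \mathbf{1}$ says.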

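Each column $\mathbf{y}_i$ can be decomposed into two perpendicular vectors:

\[
\mathbf{y}_i = \bar{x}_i \mathbf{1} + \left( \mathbf{y}_i - \bar{x}_i \mathbf{1} \right) = \bar{x}_i \mathbf{1} + \mathbf{d}_i,
\qquad
\mathbf{d}_i = \mathbf{y}_i - \bar{x}_i \mathbf{1} =
\begin{pmatrix} x_{1i} - \bar{x}_i \\ x_{2i} - \bar{x}_i \\ \vdots \\ x_{ni} - \bar{x}_i \end{pmatrix},
\]

where $\mathbf{d}_i$ is called the $i$-th deviation vector (residual).

Geometric Interpretation of the Sample

1. The projection of a column $\mathbf{y}_i$ of the data matrix $\mathbf{X}$ onto the equal angular vector $\mathbf{1}$ is the vector $\bar{x}_i \mathbf{1}$, which has length $\sqrt{n}\,|\bar{x}_i|$.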
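The decomposition into a mean part and a deviation vector can be checked numerically. The sketch below uses plain Python with a hypothetical 3 x 2 data matrix; the names `X`, `ones`, `dot`, `norm`, and `results` are illustrative assumptions, not part of the notes.

```python
import math

# Hypothetical data matrix: n = 3 items (rows), p = 2 variables (columns).
X = [
    [4.0, 1.0],
    [-1.0, 3.0],
    [3.0, 5.0],
]
n = len(X)
ones = [1.0] * n  # the equal-angle vector 1


def dot(u, v):
    """Inner product u'v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean length sqrt(u'u)."""
    return math.sqrt(dot(u, u))


results = []
for i in range(len(X[0])):
    y = [row[i] for row in X]                  # column y_i
    mean = sum(y) / n                          # sample mean of variable i
    mean_part = [mean * c for c in ones]       # projection (mean) * 1
    d = [a - b for a, b in zip(y, mean_part)]  # deviation vector d_i
    results.append({"mean": mean, "mean_part": mean_part, "d": d})

    # d_i is perpendicular to 1, so this inner product should be zero.
    print("variable", i + 1, "d =", d, "d'1 =", dot(d, ones))
    # Length of the projection compared with sqrt(n) * |mean|.
    print("  projection length:", norm(mean_part),
          " sqrt(n)*|mean|:", math.sqrt(n) * abs(mean))
```

Because the two pieces are perpendicular, the squared length of $\mathbf{y}_i$ is the sum of the squared lengths of $\bar{x}_i \mathbf{1}$ and $\mathbf{d}_i$, which can be confirmed from the same quantities.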

