Principal Component Analysis

Principal Component Analysis: a linear dimension-reduction technique.

Example (Johnson and Wichern): weekly rates of return for five stocks (Allied Chemical, du Pont, Union Carbide, Exxon, Texaco). Let x1, x2, ..., x5 denote the observed weekly rates of return.

x̄' = [.0054  .0048  .0057  .0063  .0037]

Σ̂ = | 1.000   .577   .509   .387   .462 |
    |  .577  1.000   .599   .389   .322 |
    |  .509   .599  1.000   .436   .426 |
    |  .387   .389   .436  1.000   .523 |
    |  .462   .322   .426   .523  1.000 |

(Note the unit diagonal: the returns were standardized, so Σ̂ here is the sample correlation matrix.)

λ̂1 = 2.857    ê1' = [.464  .457  .470  .421  .421]
λ̂2 = .809     ê2' = [.240  .509  .260  .526  .582]
λ̂3 = .540     ê3' = [.612  .178  .335  .541  .435]
λ̂4 = .452     ê4' = [.387  .206  .662  .472  .382]
λ̂5 = .343     ê5' = [.451  .676  .400  .176  .385]

So: algebraically, PCs are particular linear combinations of the p random variables. Geometrically, these linear combinations represent the selection of a new coordinate system whose axes point in the directions of maximum variability.
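The dominant eigenpair above can be reproduced numerically. Below is a minimal pure-Python sketch (not part of the original notes; the helper names `matvec` and `power_iteration` are illustrative) that applies power iteration to the sample correlation matrix from the stock-return example:

```python
# Sample correlation matrix from the Johnson and Wichern stock-return example.
R = [
    [1.000, .577, .509, .387, .462],
    [.577, 1.000, .599, .389, .322],
    [.509, .599, 1.000, .436, .426],
    [.387, .389, .436, 1.000, .523],
    [.462, .322, .426, .523, 1.000],
]

def matvec(A, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def power_iteration(A, iters=500):
    """Approximate the dominant eigenvalue/eigenvector of a symmetric matrix."""
    v = [1.0] * len(A)
    for _ in range(iters):
        w = matvec(A, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit-length eigenvector v.
    lam = sum(x * y for x, y in zip(v, matvec(A, v)))
    return lam, v

lam1, e1 = power_iteration(R)
print(round(lam1, 3))              # ~2.857, matching lambda-hat_1 above
print([round(x, 3) for x in e1])   # ~[.464, .457, .470, .421, .421]
```

Power iteration only recovers the largest eigenpair; a full eigendecomposition (e.g. repeated deflation, or a library routine) is needed for the remaining PCs.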
Reasons for using PCA:
a) Data screening.
b) Clustering.
c) Discriminant analysis.
d) Regression.

Objectives of PCA:
1) Data reduction.
2) Interpretation.

Definition. Let the random vector X' = [x1 x2 ... xp] have covariance matrix Σ with eigenvalues λ1 ≥ λ2 ≥ ... ≥ λp ≥ 0. Consider the linear combination

Y1 = a1'x = a11 x1 + a12 x2 + ... + a1p xp
Then

Var(Yi)     = ai' Σ ai     i = 1, 2, ..., p
Cov(Yi, Yk) = ai' Σ ak     i, k = 1, 2, ..., p

The PCs are those uncorrelated linear combinations Y1, ..., Yp whose variances are as large as possible. The first PC is the linear combination with maximum variance; each succeeding PC accounts for as much of the remaining variability as possible.

Also,

∑_{i=1}^p var(Xi) = Tr(Σ) = Tr(Λ) = ∑_{i=1}^p var(Yi)

Principal component score:
Vector loading and component loading vectors: eigenvectors are normalized to unit length.
Estimation of PCs: replace µ and Σ by their sample estimates µ̂ and Σ̂, giving estimated eigenvalues λ̂i and eigenvectors êi (= âi).
Determining the number of PCs:
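The trace identity says the PCs redistribute, but do not change, the total variance. A quick illustration with a made-up 2x2 covariance matrix (values are assumed for the sketch, not from the notes):

```python
# Made-up 2x2 covariance matrix [[s11, s12], [s12, s22]].
s11, s12, s22 = 4.0, 1.5, 2.0
total_var = s11 + s22                # tr(Sigma) = var(X1) + var(X2)

# Eigenvalues from the characteristic polynomial
#   lambda^2 - tr(Sigma)*lambda + det(Sigma) = 0.
det = s11 * s22 - s12 ** 2
disc = (total_var ** 2 - 4 * det) ** 0.5
ev1 = (total_var + disc) / 2         # var(Y1), the first PC
ev2 = (total_var - disc) / 2         # var(Y2)

# ev1 + ev2 equals tr(Sigma) (up to rounding): total variance is preserved.
print(ev1, ev2, total_var)
```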
a) Proportion of total variance explained.
b) Scree plots.
c) Correlation between the original variable measurements and the PC scores.
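The proportion criterion can be sketched with the eigenvalues from the stock-return example (a correlation matrix, so the total variance equals p = 5 up to rounding):

```python
# Eigenvalues of the sample correlation matrix from the stock-return example.
eigenvalues = [2.857, .809, .540, .452, .343]
total = sum(eigenvalues)             # ~5 = p for a correlation matrix

# Share of total variance per PC, and the running (cumulative) share.
proportion = [lam / total for lam in eigenvalues]
cumulative = []
running = 0.0
for share in proportion:
    running += share
    cumulative.append(running)

print([round(s, 3) for s in proportion])   # first PC explains ~57%
print([round(c, 3) for c in cumulative])
```

A scree plot is just these eigenvalues plotted against their index; one looks for the "elbow" where the curve flattens.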
Options ps=50 ls=74 pageno=1 nodate;
Title 'Cereal First Example';

Data Cereal_1;
   Infile "C:\CPSC MULTI\Data Sets\Cereal_1.csv" dlm="," firstobs=2;
   Input ID calories protein fat sodium;
   drop ID;
run;

proc print;
run;

proc princomp data=Cereal_1 covariance out=princereal1;
run;

proc print data=princereal1;
run;

Cereal First Example                                                 1
The PRINCOMP Procedure

Observations   6
Variables      4

Simple Statistics
             calories      protein          fat        sodium
Mean      88.33333333  3.166666667  1.833333333   154.1666667
StD       28.57738033  0.983192080  1.722401424    82.6085145

Covariance Matrix
              calories     protein         fat       sodium
calories    816.666667  -23.666667   41.666667  -761.666667
protein     -23.666667    0.966667   -0.766667    -0.833333
fat          41.666667   -0.766667    2.966667   -94.166667
sodium     -761.666667   -0.833333  -94.166667  6824.166667

Total Variance  7644.7666667
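The "Total Variance" line is just the trace of the covariance matrix. A pure-Python cross-check (values transcribed from the SAS output above):

```python
# Diagonal of the covariance matrix printed by PROC PRINCOMP above.
variances = {
    "calories": 816.666667,
    "protein": 0.966667,
    "fat": 2.966667,
    "sodium": 6824.166667,
}
total_variance = sum(variances.values())
print(total_variance)   # ~7644.7667, matching SAS's "Total Variance"
```

Sodium alone contributes about 89% of the total, which is why a covariance-based PCA like this one tends to be dominated by the highest-variance variable; a correlation-based PCA removes that scale effect.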
Eigenvalues of the Covariance Matrix

        Eigenvalue    Difference    Proportion    Cumulative
   1    6920.63710
This note was uploaded on 12/26/2010 for the course CPSC 499 taught by Professor Staff during the Spring '08 term at University of Illinois, Urbana Champaign.
