A TUTORIAL ON PRINCIPAL COMPONENT ANALYSIS
Derivation, Discussion and Singular Value Decomposition

Jon Shlens
jonshlens@ucsd.edu
25 March 2003, Version 1

Principal component analysis (PCA) is a mainstay of modern data analysis, a black box that is widely used but poorly understood. The goal of this paper is to dispel the magic behind this black box. This tutorial focuses on building a solid intuition for how and why principal component analysis works; furthermore, it crystallizes this knowledge by deriving the mathematics behind PCA from first principles. This tutorial does not shy away from explaining the ideas informally, nor does it shy away from the mathematics. The hope is that by addressing both aspects, readers of all levels will be able to gain a better understanding of the power of PCA as well as the when, the how and the why of applying this technique.

1 Overview

Principal component analysis (PCA) has been called one of the most valuable results from applied linear algebra. PCA is used abundantly in all forms of analysis, from neuroscience to computer graphics, because it is a simple, nonparametric method of extracting relevant information from confusing data sets. With minimal additional effort, PCA provides a roadmap for how to reduce a complex data set to a lower dimension to reveal the sometimes hidden, simplified dynamics that often underlie it.

The goal of this tutorial is to provide both an intuitive feel for PCA and a thorough discussion of this topic. We will begin with a simple example and provide an intuitive explanation of the goal of PCA. We will continue by adding mathematical rigor to place it within the framework of linear algebra and explicitly solve this problem. We will see how and why PCA is intimately related to the mathematical technique of singular value decomposition (SVD). This understanding will lead us to a prescription for how to apply PCA in the real world.
We will discuss both the assumptions behind this technique as well as possible extensions to overcome these limitations.

The discussion and explanations in this paper are informal in the spirit of a tutorial. The goal of this paper is to educate. Occasionally, rigorous mathematical proofs are necessary, although they are relegated to the Appendix. Although not as vital to the tutorial, the proofs are presented for the adventurous reader who desires a more complete understanding of the math. The only assumption is that the reader has a working knowledge of linear algebra. Nothing more. Please feel free to contact me with any suggestions, corrections or comments.

2 Motivation: A Toy Example

Here is the perspective: we are an experimenter. We are trying to understand some phenomenon by measuring various quantities (e.g. spectra, voltages, velocities, etc.) in our system. Unfortunately, we cannot figure out what is happening because the data appears clouded, unclear and even redundant. This is not a trivial problem, but rather a fundamental...
This note was uploaded on 05/08/2010 for the course CS 6.345 taught by Professor Glass during the Spring '10 term at MIT.
