Lecture 3: Principal Components Analysis (PCA)

Principal Components Analysis (PCA)
Tommy W S Chow, Dept of EE, CityU, 2011

Outline
- What is feature reduction, and why?
- Feature reduction algorithms
- What is PCA?
- Basic definitions
- Computational analysis of PCA
- Examples and applications
- Summary
What is a high-dimensional data set?
- A 13-D financial data set.
- An image, where each pixel is considered a variable: with N images, we have N such high-dimensional data points.
Other high-dimensional data: RF module testing data set
A total of 180 dimensions, each corresponding to a different output signal. Different instances correspond to different test modules.
Other high-dimensional data: cDNA
A cDNA profile may have a dimensionality of up to 25k, where each attribute is a gene of the tumor cell. Rows correspond to the cDNA of different patients, and the numerical value under each attribute (gene) indicates the amount of transcription of that gene.
UCI benchmark dataset
Wine Data Set (3 types of wine)
Number of Instances: 178 (different samples)
Number of Attributes: 13 (features)
From the raw data set alone, it is difficult or impossible to tell which sample corresponds to which type of wine!
Example Illustration: Wine Data (178 × 13), first eight rows, six of the 13 attributes shown

Alcohol  Malic Acid  Ash   Magnesium  Phenol  Proline
14.23    1.71        2.43  127        2.80    1065
13.20    1.78        2.14  100        2.65    1050
13.16    2.36        2.67  101        2.80    1185
14.37    1.95        2.50  113        3.85    1480
13.24    2.59        2.87  118        2.80     735
13.27    4.28        2.26  120        1.59     835
13.17    2.59        2.37  120        1.65     840
14.13    4.10        2.74   96        2.05     560
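As a quick sanity check, the rows above can be held in a small array. A minimal sketch using NumPy (an illustration only; the lecture does not prescribe a tool):

```python
import numpy as np

# First eight rows of the Wine table above; columns are
# Alcohol, Malic Acid, Ash, Magnesium, Phenol, Proline.
X = np.array([
    [14.23, 1.71, 2.43, 127, 2.80, 1065],
    [13.20, 1.78, 2.14, 100, 2.65, 1050],
    [13.16, 2.36, 2.67, 101, 2.80, 1185],
    [14.37, 1.95, 2.50, 113, 3.85, 1480],
    [13.24, 2.59, 2.87, 118, 2.80,  735],
    [13.27, 4.28, 2.26, 120, 1.59,  835],
    [13.17, 2.59, 2.37, 120, 1.65,  840],
    [14.13, 4.10, 2.74,  96, 2.05,  560],
])
print(X.shape)       # (8, 6)
# Feature scales differ by orders of magnitude (Ash vs Proline),
# one reason the raw 13-D data is hard to interpret by eye.
print(X.mean(axis=0))
```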
What is feature reduction?
- Feature reduction refers to mapping the original high-dimensional data points onto a lower-dimensional feature space while preserving most of the intrinsic information.
- Let x be the original high-dimensional data with dimensionality p, and y the low-dimensional data with dimensionality d (y is unknown and d << p). We want to find a mapping matrix G ∈ R^(p×d) satisfying

    y = G^T x

so that the high-dimensional data x is mapped to the low-dimensional data y.
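The mapping y = G^T x can be sketched numerically. In the sketch below, G is an arbitrary orthonormal matrix and the data are random (both assumptions for illustration; PCA, introduced later, chooses G from the data's covariance eigenvectors):

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions mirroring the wine example: p original features,
# d reduced features, n samples.
p, d, n = 13, 2, 178
X = rng.normal(size=(n, p))    # rows are p-dimensional samples x

# A p x d mapping matrix G with orthonormal columns (G^T G = I),
# here built from a random matrix via QR for illustration only.
G, _ = np.linalg.qr(rng.normal(size=(p, d)))

Y = X @ G                      # each row is y = G^T x, now d-dimensional
print(Y.shape)                 # (178, 2)
```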
Why feature reduction?
- Most machine learning and data mining techniques may not be effective for high-dimensional data:
  - Curse of dimensionality
  - Query accuracy and efficiency degrade rapidly as the dimension increases.
- The intrinsic dimension may be small. For example, the number of genes responsible for a certain type of disease may be small.
Curse of Dimensionality (1)
- Complexity (running time) increases with dimension d.
- If the number of features d is large, the number of samples m may be too small for accurate parameter estimation. For example, the covariance matrix has d² parameters; for accurate estimation, m should be much bigger than d², otherwise the model is too complicated for the data.
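The point about covariance estimation can be seen directly: a d-dimensional covariance estimate is a d × d matrix, and with few samples it deviates badly from the truth. A sketch under the assumption of standard-normal data, whose true covariance is the identity:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 10                                   # the estimate has d*d = 100 entries

for m in (5, 10000):                     # few samples vs many samples
    X = rng.normal(size=(m, d))          # true covariance is the identity
    S = np.cov(X, rowvar=False)          # d x d sample covariance
    err = np.linalg.norm(S - np.eye(d))  # estimation error vs the truth
    print(m, S.shape, round(float(err), 2))
```

With m = 5 << d² the estimate is rank-deficient and far from the identity; with m = 10000 >> d² the error is small, matching the m >> d² requirement stated above.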
Curse of Dimensionality (2)
In general, suppose m samples are dense enough in 1D, say with 3 bins along the axis. If we incorporate a second feature and aim to preserve the granularity of each axis, the number of bins rises from 3 (in 1D) to 3² = 9 (in 2D); in general the bin count grows as 3^d, so the number of samples needed to keep the bins populated grows exponentially with the dimension.
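The exponential growth of the bin count is a one-liner to verify:

```python
# Keeping 3 bins per axis, the number of cells grows as 3^d,
# so the samples needed to populate every cell explode with d.
bins_per_axis = 3
for d in (1, 2, 3, 10):
    print(d, bins_per_axis ** d)  # 1D: 3, 2D: 9, 3D: 27, 10D: 59049
```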


