Dimension Reduction
PR, ANN, & ML


Curse of dimensionality
- With 50 features (dimensions), each quantized to 20 levels, there are 20^50 possible feature combinations; imagine how many samples you would need to estimate p(x|ω).
- How do you visualize the structure in a 50-dimensional space?

Other problems
- The local regions needed for density estimation keep getting larger.
  - To capture a fraction r of uniformly distributed data in n dimensions, the required edge length is $h = r^{1/n}$.
  - Examples: n = 10, r = 0.01 gives h ≈ 0.63; n = 10, r = 0.1 gives h ≈ 0.8.
- Data tend to lie near the boundary, creating boundary skew.
  - Consider a uniform distribution over the unit hypercube with an interior cube of edge length p; the exterior probability is $1 - p^n$.
  - Examples: n = 10, p = 0.8 gives about 0.89 exterior; n = 100, p = 0.8 gives about 0.999... exterior.
  (These figures are checked numerically in the sketch at the end of these notes.)

Solutions - Reduction
- Fisher's linear discriminant: preserves class separation (a special case of principal component analysis).
- Multi-dimensional scaling: preserves distance measures.
- Principal component analysis: best data representation (not necessarily best class separation).

Fisher's linear discriminant (2-class)
- Given n d-dimensional samples $X = \{x_1, x_2, \ldots, x_n\}$, split into classes $X_1$ and $X_2$ with $|X_1| = n_1$, $|X_2| = n_2$, and $n = n_1 + n_2$, find a linear transform $y = w^t x$ that
  - maps the d-dimensional samples onto a line, and
  - best preserves class separation.
- Intuitively, good features are those with a large separation of means relative to the variances.

[Figure: two-dimensional samples $(x_1, x_2)$ projected onto the direction $w = (w_1, w_2)$, giving the scalar $y$.]

Caveats
- Ambiguity may arise when you reduce the problem dimension; a good reduction algorithm may minimize this ambiguity, but may not eliminate it completely.
- The figures also suggest that, sometimes, better performance requires increasing the dimension (more features), not decreasing it.

In the original d-dimensional space
- Class means: $m_i = \frac{1}{n_i} \sum_{x \in X_i} x$
- Between-class scatter: $|m_1 - m_2|^2$
- Within-class scatter: $s_i^2 = \sum_{x \in X_i} (x - m_i)^t (x - m_i)$
- Ideally, the criterion $\frac{|m_1 - m_2|^2}{s_1^2 + s_2^2}$ should be large.

In the transformed 1-dimensional space
- Projected means: $\tilde{m}_i = \frac{1}{n_i} \sum_{y \in Y_i} y = \frac{1}{n_i} \sum_{x \in X_i} w^t x = w^t m_i$
- Between-class scatter: $|\tilde{m}_1 - \tilde{m}_2|^2$
- Within-class scatter: $\tilde{s}_i^2 = \sum_{y \in Y_i} (y - \tilde{m}_i)^2$
- Ideally, the criterion $F(w) = \frac{|\tilde{m}_1 - \tilde{m}_2|^2}{\tilde{s}_1^2 + \tilde{s}_2^2}$ should be large.

Or, in matrix form
- $|\tilde{m}_1 - \tilde{m}_2|^2 = (w^t m_1 - w^t m_2)^2 = w^t (m_1 - m_2)(m_1 - m_2)^t w = w^t S_B w$
- $\tilde{s}_i^2 = \sum_{x \in X_i} (w^t x - w^t m_i)^2 = w^t \Big[ \sum_{x \in X_i} (x - m_i)(x - m_i)^t \Big] w = w^t S_i w$, so $\tilde{s}_1^2 + \tilde{s}_2^2 = w^t (S_1 + S_2) w = w^t S_W w$
- Hence $F(w) = \frac{|\tilde{m}_1 - \tilde{m}_2|^2}{\tilde{s}_1^2 + \tilde{s}_2^2} = \frac{w^t S_B w}{w^t S_W w}$

The Analysis
- $F(w)$ is a generalized Rayleigh quotient.
- To maximize $F(w)$, w is the generalized eigenvector associated with the largest generalized eigenvalue: $S_B w = \lambda S_W w$, or equivalently $S_W^{-1} S_B w = \lambda w$, which gives $w = S_W^{-1}(m_1 - m_2)$ up to scale (see the sketches below).
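The curse-of-dimensionality figures quoted in the "Other problems" section follow directly from $h = r^{1/n}$ and $1 - p^n$. A minimal Python check (the function names are illustrative, not from the lecture):

```python
# Verify the edge-length and boundary-skew numbers from the "Other problems" slide.
# Assumes data uniformly distributed over the unit hypercube in n dimensions.

def edge_length(r, n):
    """Edge length of a sub-cube that captures a fraction r of the data."""
    return r ** (1.0 / n)

def exterior_probability(p, n):
    """Probability mass outside an interior cube of edge length p."""
    return 1.0 - p ** n

print(edge_length(0.01, 10))           # ~0.63  (n=10, capture 1% of the data)
print(edge_length(0.1, 10))            # ~0.79  (n=10, capture 10%; the slide rounds to 0.8)
print(exterior_probability(0.8, 10))   # ~0.89  exterior (n=10, p=0.8)
print(exterior_probability(0.8, 100))  # ~0.9999999998 exterior (n=100, p=0.8)
```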
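The two-class Fisher discriminant derived above has the closed-form solution $w = S_W^{-1}(m_1 - m_2)$. The following NumPy sketch is one possible implementation of that formula; the data layout and variable names are my assumptions, not the lecture's:

```python
import numpy as np

def fisher_direction(X1, X2):
    """Two-class Fisher linear discriminant.

    X1, X2 : arrays of shape (n_i, d), one sample per row.
    Returns w such that the projection y = w . x maximizes the
    between-class separation relative to the within-class scatter.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter S_W = S_1 + S_2
    S1 = (X1 - m1).T @ (X1 - m1)
    S2 = (X2 - m2).T @ (X2 - m2)
    S_W = S1 + S2
    # Closed-form maximizer of F(w) = (w^t S_B w) / (w^t S_W w)
    return np.linalg.solve(S_W, m1 - m2)

# Toy usage on two synthetic 2-D Gaussian classes
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
X2 = rng.normal(loc=[3.0, 1.0], scale=1.0, size=(100, 2))
w = fisher_direction(X1, X2)
y1, y2 = X1 @ w, X2 @ w          # projections onto the Fisher line
print(w, y1.mean(), y2.mean())   # projected class means should be well separated
```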
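Alternatively, the generalized eigenproblem $S_B w = \lambda S_W w$ from "The Analysis" can be solved directly. The sketch below uses scipy.linalg.eigh for the symmetric-definite generalized problem; this solver choice is mine, not something prescribed in the slides, and it recovers the same direction as the closed form up to scale.

```python
import numpy as np
from scipy.linalg import eigh

def fisher_direction_eig(X1, X2):
    """Maximize the generalized Rayleigh quotient F(w) by solving
    S_B w = lambda * S_W w for the largest generalized eigenvalue."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    d = m1 - m2
    S_B = np.outer(d, d)                       # between-class scatter matrix
    S_W = ((X1 - m1).T @ (X1 - m1) +
           (X2 - m2).T @ (X2 - m2))            # within-class scatter matrix
    vals, vecs = eigh(S_B, S_W)                # eigenvalues in ascending order
    return vecs[:, -1]                         # eigenvector of the largest eigenvalue
```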