# marrayclass17 - Reduction of Dimensionality, Dr. Edgar Acuna


## Reduction of Dimensionality

Dr. Edgar Acuna
Departamento de Matematicas
Universidad de Puerto Rico - Mayaguez
math.uprrm.edu/~edgar

## Dimension Reduction

**Feature selection:** The main aim of feature selection is to reduce the dimensionality of the feature space by selecting relevant, non-redundant features and removing the remaining irrelevant ones. That is, feature selection selects q features from the entire set of p features such that q ≤ p; ideally q << p.

**Feature extraction:** A smaller set of features is constructed by applying a linear (or nonlinear) transformation to the original set of features. The best-known method is principal component analysis (PCA). Others: PLS, principal curves.

## Feature selection

We will consider only supervised classification problems. Goal: choose a small subset of features such that:

a) The accuracy of the classifier on the dataset does not decrease significantly.
b) The resulting conditional distribution of a class C, given the selected feature vector G, is as close as possible to the original conditional distribution given all the features F.

## Advantages of feature selection

- The computational cost of classification is reduced, since the number of features is smaller than before.
- The complexity of the classifier is reduced, since redundant and irrelevant features are eliminated.
- It helps to deal with the "curse of dimensionality" effect.

## Steps of feature selection

1. A generation procedure: the search for the optimal subset can be complete, heuristic, or random.
2. An evaluation function: distance measures, information measures, consistency measures, dependency measures, or the classification error rate.
3. A stopping criterion: a threshold, a prefixed number of iterations, or a prefixed size of the best subset of features.
4. (Optional) A validation procedure to check whether the subset is valid.
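The three mandatory steps above can be sketched as a greedy sequential forward selection (SFS) loop: a heuristic generation procedure driven by an evaluation function, with two stopping criteria (a prefixed subset size, and no further improvement). The evaluation function, relevance values, and redundancy penalty below are illustrative assumptions, not part of the original material; a real wrapper would plug in a classifier's cross-validated accuracy instead.

```python
# Sequential forward selection (SFS): a heuristic generation procedure.
# At each step, add the feature that most improves the evaluation
# function; stop at a prefixed subset size or when no addition helps.

def sfs(n_features, evaluate, max_size):
    """Greedy forward search over feature indices 0..n_features-1."""
    selected = []
    best_score = evaluate(selected)
    while len(selected) < max_size:          # stopping criterion: prefixed size
        candidates = [f for f in range(n_features) if f not in selected]
        score, feat = max((evaluate(selected + [f]), f) for f in candidates)
        if score <= best_score:              # stopping criterion: no improvement
            break
        selected.append(feat)
        best_score = score
    return selected, best_score

# Toy evaluation function (illustrative only): each feature has a fixed
# relevance, and selecting both members of a redundant pair is penalized.
RELEVANCE = [0.9, 0.3, 0.7, 0.05]
REDUNDANT = {(0, 2)}  # features 0 and 2 carry overlapping information

def toy_eval(subset):
    score = sum(RELEVANCE[f] for f in subset)
    for a, b in REDUNDANT:
        if a in subset and b in subset:
            score -= 0.6
    return score

subset, score = sfs(4, toy_eval, max_size=3)
```

Note how the redundancy penalty changes the search order: feature 2 is individually more relevant than feature 1, yet it is picked after feature 1 because it overlaps with the already-selected feature 0.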
## Guidelines for choosing a feature selection method

- Ability to handle different types of features (continuous, binary, nominal, ordinal).
- Ability to handle multiple classes.
- Ability to handle large datasets.
- Ability to handle noisy data.
- Low time complexity.

## Categorization of feature selection methods (Dash and Liu, 1997)

| Evaluation measure    | Heuristic generation | Complete generation | Random generation |
|-----------------------|----------------------|---------------------|-------------------|
| Distance              | Relief               | Branch and Bound    | -                 |
| Information           | Trees                | MDL                 | -                 |
| Dependency            | POEIACC              | -                   | -                 |
| Consistency           | FINCO                | Focus               | LVF               |
| Classifier error rate | SFS, SBS             | Beam Search         | Genetic Algorithm |

The methods in the last row are also known as the "wrapper" methods.

## Filter methods

Filter methods do not require a classifier; instead they use measures that allow us to select the features that best distinguish the classes.