Dimensionality Reduction: A Comparative Review

L.J.P. van der Maaten*, E.O. Postma, H.J. van den Herik
MICC, Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands.

Abstract

In recent years, a variety of nonlinear dimensionality reduction techniques have been proposed, many of which rely on the evaluation of local properties of the data. The paper presents a review and systematic comparison of these techniques. The performances of the techniques are investigated on artificial and natural tasks. The results of the experiments reveal that nonlinear techniques perform well on selected artificial tasks, but do not outperform the traditional PCA on real-world tasks. The paper explains these results by identifying weaknesses of current nonlinear techniques, and suggests how the performance of nonlinear dimensionality reduction techniques may be improved.

Key words: Dimensionality reduction, manifold learning, feature extraction.

1. Introduction

Real-world data, such as speech signals, digital photographs, or fMRI scans, usually has a high dimensionality. In order to handle this data adequately, its dimensionality needs to be reduced. Dimensionality reduction is the transformation of high-dimensional data into a meaningful representation of reduced dimensionality. Ideally, the reduced representation should have a dimensionality that corresponds to the intrinsic dimensionality of the data. The intrinsic dimensionality of data is the minimum number of parameters needed to account for the observed properties of the data. Dimensionality reduction is important in many domains, since it mitigates the curse of dimensionality and other undesired properties of high-dimensional spaces. As a result, dimensionality reduction facilitates, among others, classification, visualization, and compression of high-dimensional data.
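The notion of intrinsic dimensionality described above can be made concrete with a small sketch (illustrative, not from the paper): data generated from two latent parameters but observed in five dimensions has intrinsic dimensionality two, which a linear method detects as exactly two non-negligible eigenvalues of the sample covariance matrix. The threshold value used here is an assumption chosen for this toy example.

```python
import numpy as np

# Illustrative toy example: points on a 2-D linear subspace embedded in
# 5-D space have intrinsic dimensionality 2, visible as exactly two
# non-negligible eigenvalues of the sample covariance matrix.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))   # 2 intrinsic parameters
basis = rng.normal(size=(2, 5))      # linear embedding into 5-D
X = latent @ basis                   # observed 5-D data

# Eigenvalues of the sample covariance matrix, sorted largest first.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]

# Count eigenvalues that are non-negligible relative to the largest one
# (the 1e-8 cutoff is an assumption suited to this noise-free example).
intrinsic_dim = int(np.sum(eigvals > 1e-8 * eigvals[0]))
print(intrinsic_dim)  # 2
```

Real data is noisy and typically curved rather than linear, which is why dedicated intrinsic-dimensionality estimators (and the nonlinear techniques reviewed in this paper) are needed in practice.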
Traditionally, dimensionality reduction was performed using linear techniques such as Principal Components Analysis (PCA) and factor analysis. However, these linear techniques cannot adequately handle complex nonlinear data. Therefore, in the last decade, a large number of nonlinear techniques for dimensionality reduction have been proposed (see for an overview, e.g., [23,96,114]). In contrast to the traditional linear techniques, the nonlinear techniques have the ability to deal with complex nonlinear data. In particular for real-world data, these nonlinear dimensionality reduction techniques may offer an advantage, because real-world data is likely to be highly nonlinear. Previous studies have shown that nonlinear techniques outperform their linear counterparts on complex artificial tasks. For instance, the Swiss roll dataset comprises a set of points that lie on a spiral-like two-dimensional manifold within a three-dimensional space. A vast number of nonlinear techniques are perfectly able to find this embedding, whereas linear tech-...

* Corresponding author. Email address: [email protected] (L.J.P. van der Maaten).
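The Swiss roll contrast mentioned above can be sketched as follows (a minimal illustration, not the paper's experimental setup; the sample size and neighborhood size are assumptions): a linear projection of the roll by PCA keeps the spiral folded onto itself, whereas a nonlinear technique such as Isomap recovers a coordinate that tracks the position along the unrolled manifold.

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap

# Sample the Swiss roll; t is the true position along the rolled-up
# dimension of the underlying 2-D manifold.
X, t = make_swiss_roll(n_samples=1000, random_state=0)

# Reduce to two dimensions with a linear and a nonlinear technique
# (n_neighbors=10 is an illustrative choice, not a value from the paper).
pca_2d = PCA(n_components=2).fit_transform(X)
iso_2d = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

# Isomap's first coordinate follows the manifold parameter t much more
# closely than a linear projection can, since unrolling the spiral is
# an inherently nonlinear operation.
corr_iso = abs(np.corrcoef(iso_2d[:, 0], t)[0, 1])
corr_pca = abs(np.corrcoef(pca_2d[:, 0], t)[0, 1])
print(corr_iso, corr_pca)
```

Inspecting the two correlations (or plotting the embeddings colored by t) makes the qualitative difference visible: the nonlinear embedding orders points by their position along the roll, while the linear one superimposes distant turns of the spiral.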
This note was uploaded on 11/09/2011 for the course CIS 6930 taught by Professor Staff during the Fall '08 term at University of Florida.