Decision II - Morgan C. Wang Department of Statistics...

Auxiliary Uses of Decision Trees
Morgan C. Wang
Department of Statistics
Orlando, Florida 32816-2370
3/29/2010

Outline
- Introduction
- Data Exploration
- Data Preparation
- Conclusions

Introduction

Data Exploration
- Interpretability
- No strict assumptions concerning the functional form of the model
- Computational efficiency
- Robust against the presence of outliers
- Resistant to the curse of dimensionality
- Less data preparation work

Interpretability: The model produced by the decision tree methodology is a collection of if statements. It is much easier to interpret than models produced by other data mining methods such as neural networks and regression.

No strict assumptions concerning the functional form of the model: Decision trees are a very flexible modeling tool that makes few assumptions about the final form and the parameters of the model. They are a natural choice for exploring the data when we know very little about what the model should look like.

Computational efficiency: Data typically contain many irrelevant and redundant variables. For some modeling tools, such as neural networks and regression, these variables can significantly increase computation time. Since decision trees are less affected by the presence of extra variables, they can be used at the data exploration stage of the data mining process.

Robust against the presence of outliers: To identify the best split on a given variable, only the rank of each observation is needed, not its value. Decision trees are therefore more robust to outlying observations than value-based data mining methods.
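The rank argument can be checked with a small sketch. It is an illustration only, not part of the lecture: the Gini impurity criterion, the function names `gini` and `best_split`, and the income data are all assumptions made for this example. The sketch finds the best binary split on one variable and reports which rows go to the left branch, then repeats the search after replacing the largest value with an extreme outlier and after a log transform.

```python
import math


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (left_row_indices, weighted_impurity) for the best binary split.

    The split is reported as the set of row indices sent to the left branch,
    so that results can be compared across different scalings of `values`.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    best = None
    for k in range(1, n):
        # A split point can only sit between two distinct values.
        if values[order[k - 1]] == values[order[k]]:
            continue
        left = [labels[i] for i in order[:k]]
        right = [labels[i] for i in order[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (frozenset(order[:k]), score)
    return best


# Hypothetical data: income in thousands and a 0/1 purchase flag.
income = [21, 25, 30, 34, 41, 48, 52, 60]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

# Same rows, but the largest income becomes an extreme outlier.
income_outlier = income[:-1] + [6_000_000]
# Same rows under a monotone (order-preserving) transformation.
income_log = [math.log(v) for v in income]

split_plain = best_split(income, bought)
split_outlier = best_split(income_outlier, bought)
split_log = best_split(income_log, bought)

# The ordering of the rows is unchanged, so the chosen partition is too.
print(split_plain == split_outlier, split_plain == split_log)
```

Because the search only ever looks at the order of the rows, any change that preserves that order leaves the selected partition and its impurity unchanged. A regression fitted to the same data would, in contrast, be pulled toward the outlying value.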
Resistant to the curse of dimensionality: Although the curse of dimensionality causes problems for any data mining method, decision trees are relatively resistant to it.

Less data preparation work: Data preparation is a preliminary step in any meaningful predictive modeling. Decision trees do not need a dummy-variable representation of categorical inputs and have a built-in mechanism for dealing with missing values. This can significantly reduce data preparation time, which makes decision trees an ideal tool for exploratory data analysis.

Data Preparation
- Dimension Reduction
- Variable Selection
- Collapsing ordinal variables with many levels
- Collapsing nominal variables with many levels
- Dimension Enhancement
- Discretizing interval-scaled variables
- Missing Value Imputation
- Interaction Detection

...
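One of the data preparation uses listed above, discretizing an interval-scaled variable, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the procedure used in the lecture: the Gini criterion, the helper names `tree_cut_points` and `to_bin`, the `max_depth` and `min_leaf` parameters, and the age data are all hypothetical. A shallow tree is grown on a single input, and its split points become the bin boundaries.

```python
import bisect


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def tree_cut_points(values, labels, max_depth=2, min_leaf=2):
    """Grow a one-variable tree and return its split points, sorted."""
    pairs = sorted(zip(values, labels))
    cuts = []

    def grow(segment, depth):
        if depth == 0 or len(segment) < 2 * min_leaf:
            return
        parent = gini([y for _, y in segment])
        best = None
        for k in range(min_leaf, len(segment) - min_leaf + 1):
            # A cut can only sit between two distinct values.
            if segment[k - 1][0] == segment[k][0]:
                continue
            left = [y for _, y in segment[:k]]
            right = [y for _, y in segment[k:]]
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(segment)
            # Keep the split only if it reduces impurity.
            if child < parent and (best is None or child < best[0]):
                best = (child, k)
        if best is None:
            return
        k = best[1]
        # Place the cut halfway between the two neighbouring values.
        cuts.append((segment[k - 1][0] + segment[k][0]) / 2)
        grow(segment[:k], depth - 1)
        grow(segment[k:], depth - 1)

    grow(pairs, max_depth)
    return sorted(cuts)


def to_bin(value, cuts):
    """Map a raw value to the index of the bin it falls in."""
    return bisect.bisect_right(cuts, value)


# Hypothetical data: age and a 0/1 response flag.
age = [18, 22, 25, 31, 35, 40, 44, 52, 58, 63, 67, 71]
responded = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]

cuts = tree_cut_points(age, responded)
binned = [to_bin(a, cuts) for a in age]
print(cuts, binned)
```

The resulting bins are chosen with the target in mind, so they can capture a non-monotone relationship (here, response concentrated in the middle ages) that equal-width bins might blur. The binned variable can then be passed to a method such as regression as a categorical input.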

This note was uploaded on 09/22/2011 for the course STA 6714 taught by Professor Staff during the Spring '11 term at University of Central Florida.
