11s1: COMP9417 Machine Learning and Data Mining
Lecture: Decision Tree Learning
Topic: Pruning Parameters (C4.5 / J48)
Last revision: Tue Mar 15 17:51:52 EST 2011

1 Introduction

Lecture slides 40-55 introduce decision tree pruning for overfitting avoidance. Pruning falls into two categories:

pre-pruning: stop growing the tree when a data split is not statistically significant
post-pruning: grow the full tree, then remove sub-trees that are overfitting

2 Pre-pruning

As stated in the lecture, pre-pruning (or stopping) was implemented in early decision-tree learning systems (e.g. Quinlan's ID3). However, it was found that pre-pruning using a statistical test (such as chi-square in ID3) could give uneven results, working well on some data sets but not on others.

In Quinlan's C4.5 (the successor to ID3), and in the Weka implementation of C4.5 called J48, stopping is controlled by a parameter usually called the M parameter ("minNumObj" in J48). This parameter (default value M = 2) sets a minimum number of instances per branch: a candidate split is only accepted if at least two of the resulting branches each contain at least M instances.
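The chi-square stopping test mentioned above can be sketched as follows. This is a minimal illustration, not ID3's actual implementation: the function names are hypothetical, the table is a 2x2 contingency table (split branch vs. class label), and the hard-coded critical value is the standard chi-square threshold for 1 degree of freedom at the 0.05 significance level.

```python
def chi2_statistic(observed):
    """Pearson chi-square statistic for a 2x2 contingency table
    whose rows are split branches and columns are class labels."""
    row_totals = [sum(r) for r in observed]
    col_totals = [sum(c) for c in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence of branch and class
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Standard critical value: df = 1, alpha = 0.05
CHI2_CRIT_DF1_P05 = 3.841

def split_is_significant(observed):
    """Pre-pruning rule: only keep the split if branch and class
    are significantly associated."""
    return chi2_statistic(observed) > CHI2_CRIT_DF1_P05
```

For example, a split that separates the classes perfectly (branches [10, 0] and [0, 10]) passes the test, while a split whose branches have identical class mixes ([5, 5] and [5, 5]) fails it and would be stopped.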
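The effect of the M parameter can be sketched in a few lines. This is an illustrative simplification, not the C4.5 or J48 source: it assumes binary splits on a single numeric feature, and the function names are invented for this example.

```python
def split_respects_m(branch_counts, m=2):
    """C4.5-style M test: a split is kept only if at least two
    branches receive at least m instances each."""
    return sum(1 for c in branch_counts if c >= m) >= 2

def candidate_splits(values, m=2):
    """Return the midpoint thresholds on a numeric feature whose
    induced binary split satisfies the M parameter."""
    values = sorted(values)
    thresholds = []
    for i in range(1, len(values)):
        if values[i - 1] < values[i]:  # only split between distinct values
            left, right = i, len(values) - i
            if split_respects_m((left, right), m):
                thresholds.append((values[i - 1] + values[i]) / 2)
    return thresholds
```

With the default m=2, the feature values [1, 2, 3, 4] admit only the middle threshold 2.5, since splitting off a single instance would leave one branch below the minimum. Raising M therefore prunes more aggressively by discarding splits that isolate very few instances.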