Sec. 5.1] Rules and trees from data: first principles 53

tiated by ML-leaning statisticians (see Spiegelhalter, 1986) and statistically inclined ML theorists (see Pearl, 1988) may change this. Although marching to a different drum, ML people have for some time been seen as a possibly useful source of algorithms for certain data analyses required in industry. There are two broad circumstances that might favour applicability:

1. categorical rather than numerical attributes;
2. strong and pervasive conditional dependencies among attributes.

As an example of what is meant by a conditional dependency, let us take the classification of vertebrates and consider two variables, namely "breeding-ground" (values: sea, fresh-water, land) and "skin-covering" (values: scales, feathers, hair, none). As a value for the first, "sea" votes overwhelmingly for FISH. If the second attribute has the value "none", then on its own this would virtually clinch the case for AMPHIBIAN. But in combination with "breeding-ground = sea" it switches the identification decisively to MAMMAL: whales and some other sea mammals now remain the only possibility. "Breeding-ground" and "skin-covering" are said to exhibit strong conditional dependency.

Problems characterised by violent attribute-interactions of this kind can sometimes be important in industry. In predicting automobile accident risks, for example, the information that a driver is in the age-group 17–23 acquires great significance if and only if sex = male.

To examine the "horses for courses" aspect of comparisons between ML, neural-net and statistical algorithms, a reasonable principle might be to select datasets approximately evenly among the four main categories shown in Figure 5.2.

                                     attributes
                           all or mainly     all or mainly
                           numerical         categorical
    strong and
    pervasive                   +                 +
    conditional
    dependencies
    weak or
    absent                     (-)               (+)

    Key:  +    ML expected to do well
          (+)  ML expected to do well, marginally
          (-)  ML expected to do poorly, marginally

Fig. 5.2: Relative performance of ML algorithms.

In StatLog, the collection of datasets necessarily followed opportunity rather than design, so that for light upon these particular contrasts the reader will find much that is suggestive, but less that is clear-cut. Attention is, however, called to the Appendices, which contain additional information for readers interested in following up particular algorithms and datasets for themselves.

Classification learning is characterised by (i) the data-description language, (ii) the language for expressing the classifier (e.g. as formulae, rules, etc.), and (iii) the learning algorithm itself. Of these, (i) and (ii) correspond to the "observation language" and
"hypothesis language" respectively of Section 12.2. Under (ii) we consider in the present chapter the machine learning of if-then rule-sets and of decision trees. The two kinds of language are interconvertible, and group themselves around two broad inductive inference
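The interconvertibility just mentioned can be sketched concretely: each root-to-leaf path of a decision tree yields one if-then rule. The following is a minimal illustration only, not code from StatLog; the nested-tuple tree representation and the toy vertebrate tree (built from the "skin-covering" and "breeding-ground" attributes used earlier in this section) are assumptions for the sake of the example.

```python
# Illustrative sketch: converting a decision tree into an equivalent
# if-then rule set by enumerating its root-to-leaf paths.
# Tree representation (an assumption for this sketch):
#   - a leaf is a class label (str)
#   - an internal node is a pair (attribute, {value: subtree, ...})

def tree_to_rules(tree, conditions=()):
    """Return one 'IF cond AND ... THEN class' rule per leaf of the tree."""
    if isinstance(tree, str):                     # leaf: a class label
        if conditions:
            return ["IF " + " AND ".join(conditions) + " THEN " + tree]
        return ["THEN " + tree]
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():       # one branch per attribute value
        rules.extend(
            tree_to_rules(subtree, conditions + (f"{attribute}={value}",))
        )
    return rules

# A tiny tree capturing the conditional dependency described earlier:
# "skin-covering = none" alone suggests AMPHIBIAN, but combined with
# "breeding-ground = sea" it identifies a MAMMAL (e.g. whales).
tree = ("skin-covering", {
    "none": ("breeding-ground", {"sea": "MAMMAL", "land": "AMPHIBIAN"}),
    "feathers": "BIRD",
})

for rule in tree_to_rules(tree):
    print(rule)
# IF skin-covering=none AND breeding-ground=sea THEN MAMMAL
# IF skin-covering=none AND breeding-ground=land THEN AMPHIBIAN
# IF skin-covering=feathers THEN BIRD
```

Going the other way (grouping rules that share a leading condition back into a tree) is equally mechanical, which is why the two hypothesis languages are treated together in this chapter.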