leaf
node. By navigating the decision tree you can assign a value or class to a case by deciding which
branch to take, starting at the root node and moving to each subsequent node until a leaf node is
reached. Each node uses the data from the case to choose the appropriate branch.
Decision tree models are commonly used in data mining to examine the data and induce a tree
and its rules that will be used to make predictions. A number of different algorithms may be used
for building decision trees including CHAID (Chi-squared Automatic Interaction Detection),
CART (Classification And Regression Trees), QUEST, and C5.0.
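The navigation described above can be sketched in a few lines of Python. The tree below is a hypothetical illustration (the feature names, thresholds, and class labels are invented, and no particular induction algorithm such as CHAID or CART is implied); it only shows how a case is assigned a class by following branches from the root to a leaf.

```python
# A minimal sketch of classifying a case by walking a decision tree.
# The tree, feature names, and thresholds are illustrative assumptions.

def predict(node, case):
    """Follow branches from the root until a leaf node is reached."""
    while "leaf" not in node:
        # Each internal node uses the case's data to choose a branch.
        if case[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["leaf"]

# Hypothetical induced tree: the root splits on income,
# and one child node splits again on age.
tree = {
    "feature": "income", "threshold": 40_000,
    "left":  {"leaf": "decline"},
    "right": {
        "feature": "age", "threshold": 25,
        "left":  {"leaf": "review"},
        "right": {"leaf": "approve"},
    },
}

print(predict(tree, {"income": 55_000, "age": 40}))  # approve
```

Real implementations differ mainly in how the splits (features and thresholds) are chosen during induction; the prediction step is this same root-to-leaf walk.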
Neural networks are of particular interest because they offer a means of efficiently modeling large
and complex problems in which there may be hundreds of predictor variables that have many
interactions. (Actual biological neural networks are incomparably more complex.) Neural nets are
most commonly used for regression but may also be used for classification problems.
A neural network (see figure) starts with an input layer, where each node corresponds to a
predictor variable. These input nodes are connected to a number of nodes in a hidden layer. Each
input node is connected to every node in the hidden layer. The nodes in the hidden layer may be
connected to nodes in another hidden layer, or to an output layer. The output layer consists of one
or more response variables.

[Figure: a neural network with one hidden layer, showing the input, hidden, and output layers.]
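The layered, fully connected structure just described can be sketched in plain Python. The layer sizes and the random initial weights below are illustrative assumptions, not values from the text; the point is only that every input node connects to every hidden node, and the hidden layer feeds the output layer.

```python
import math
import random

random.seed(0)  # reproducible illustrative weights

def layer(n_in, n_out):
    """Fully connected layer: every input node feeds every node here."""
    return [[random.uniform(-1, 1) for _ in range(n_in)]
            for _ in range(n_out)]

def forward(weights, inputs):
    """Each node outputs the sigmoid of the weighted sum of its inputs."""
    return [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
            for row in weights]

# Hypothetical network: 3 predictor variables, 4 hidden nodes,
# and 1 response variable.
hidden_w = layer(3, 4)
output_w = layer(4, 1)

hidden = forward(hidden_w, [0.5, -1.0, 2.0])
output = forward(output_w, hidden)
print(len(hidden), len(output))  # 4 1
```

Adding a second hidden layer is just another `layer(...)` and `forward(...)` step between the hidden and output layers.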
After the input layer, each node takes in a set of inputs, multiplies each by a connection weight,
adds them together, applies a function (called the activation or squashing function) to the sum, and
passes the output to the node(s) in the next layer.

[Figure: a single node with inputs x1 = +1, x2 = -1, x3 = +1, x4 = +1, and bias input x0 = -1, connection weights .3, .7, -.2, .4, and -.5 respectively, weighted sum I, and output y.]

For example, the node above has five inputs (x0 through x4), each of which is multiplied by a
weight; the products are then added together, resulting in a sum I:

I = .3x1 + .7x2 - .2x3 + .4x4 - .5x0
  = (.3)(+1) + (.7)(-1) + (-.2)(+1) + (.4)(+1) + (-.5)(-1)
  = .3 - .7 - .2 + .4 + .5 = .3

The output y is this sum transformed by the non-linear activation function, in
this case to a value of .57.
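The arithmetic of this worked example can be checked directly. The sketch below assumes the logistic (sigmoid) activation, which is consistent with the .57 output quoted in the text:

```python
import math

def node_output(inputs, weights):
    """Weighted sum of the inputs, then the logistic (sigmoid) activation."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return total, 1.0 / (1.0 + math.exp(-total))

# Inputs and weights from the worked example: x1..x4 plus the bias input x0.
inputs  = [+1, -1, +1, +1, -1]           # x1, x2, x3, x4, x0
weights = [0.3, 0.7, -0.2, 0.4, -0.5]    # matching connection weights

I, y = node_output(inputs, weights)
print(round(I, 2), round(y, 2))  # 0.3 0.57
```

The weighted sum comes out to I = .3, and the sigmoid maps it to y ≈ .57, matching the values in the text.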
The goal of training the neural net is to estimate the connection weights so that the output of the
neural net accurately predicts the target value for a given input set of values. The most common
training method is backpropagation. Each training method has a set of parameters that control
various aspects of training, such as avoiding local optima or adjusting the speed of convergence.

Neural networks differ in philosophy from many statistical methods in several ways. First, a
neural network usually has more parameters than does a typical statistical model. For example, a
neural network with 100 inputs and 50 hidden nodes will have over 5,000 parameters. Because
they are so numerous, and because so many combinations of parameters result in similar
predictions, the parameters become uninterpretable and the network serves as a “black box”
predictor. However, this is acceptable in CRM applications. A bank may assign the probability of
This note was uploaded on 11/25/2010 for the course CENG ceng taught by Professor Ceng during the Spring '10 term at Universidad Europea de Madrid.