However neural networks require as much data



o an account and may not care what are the causes.

One myth of neural networks is that data of any quality can be used to provide reasonable predictions, and that the network will sift through it to find the truth. However, neural networks require as much data preparation as any other method, which is to say they require a lot of it. The most successful implementations of neural networks (or decision trees, or logistic regression, or any other method) involve very careful data cleansing, selection, preparation and preprocessing. For instance, neural nets require that all variables be numeric. Therefore categorical data such as “state” is usually broken up into multiple dichotomous variables (e.g., “California,” “New York”), each with a “1” (yes) or “0” (no) value. The resulting increase in the number of variables is called the categorical explosion.

Clustering

Clustering divides a database into different groups. The goal of clustering is to find groups that are very different from each other, and whose members are very similar to each other. Unlike classification, you don’t know what the clusters will be when you start, or by which attributes the data will be clustered. Consequently, someone who is knowledgeable in the business must interpret the clusters. After you have found clusters that reasonably segment your database, these clusters may then be used to classify new data. Some of the common algorithms used to perform clustering include Kohonen feature maps and K-means.

Don’t confuse clustering with segmentation. Segmentation refers to the general problem of identifying groups that have common characteristics. Clustering is a way to segment data into groups that are not previously defined, whereas classification is a way to segment data by assigning it to groups that are already defined.
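The two ideas above, breaking a categorical variable into dichotomous 0/1 variables and then clustering with K-means, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the sample records, the "spend" attribute (taken to be already scaled to the 0..1 range), and the helper names one_hot, nearest and k_means are hypothetical and do not come from the original notes. It uses a plain Lloyd-style K-means loop rather than any particular library.

```python
import random

def one_hot(values):
    """Turn one categorical column into dichotomous 0/1 columns, one per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def k_means(points, k, iterations=100, seed=0):
    """Basic K-means: assign points to the nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:  # assignments have stabilised
            break
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical records: a categorical "state" and a numeric "spend" scaled to 0..1.
states = ["California", "New York", "California", "New York", "California", "New York"]
spend = [0.10, 0.90, 0.20, 0.80, 0.15, 0.95]

# One categorical column becomes one numeric column per category.
categories, encoded = one_hot(states)
points = [row + [s] for row, s in zip(encoded, spend)]

centroids, labels = k_means(points, k=2)

# Once clusters exist, a new record is classified by its nearest centroid.
new_point = [1, 0, 0.12]  # a California record with low spend
new_label = nearest(new_point, centroids)
print(categories, labels, new_label)
```

With only two states the encoding adds one extra column, but a variable with fifty categories would add fifty, which is the growth the text calls the categorical explosion. The cluster numbers themselves carry no meaning; as the text notes, someone who knows the business still has to interpret what each group represents.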

This note was uploaded on 11/25/2010 for the course CENG ceng taught by Professor Ceng during the Spring '10 term at Universidad Europea de Madrid.
