Collecting, Cleaning, and Preparing Data
Obtain necessary data from various internal and external sources. Resolve representation and encoding differences. Join data from various tables to create a homogeneous source. Check for and resolve data conflicts, outliers (unusual or exceptional values), missing data, and ambiguity. Use conversions and combinations to generate new data fields such as ratios or rolled-up summaries.
Source: Claire Castell, Data Mining for Dummies
Validating the Models
Test the model for accuracy on an independent dataset, one that has not been used to create the model. Assess the sensitivity of a model. Pilot test the model for usability.
Source: Claire Castell, Data Mining for Dummies
Deploying the Model
For a predictive model, use the model to predict results for new cases, then use the prediction to alter organizational behavior. Deployment may require building computerized systems that capture the appropriate data and generate a prediction in real time so that a decision maker can apply the prediction. For example, a model can determine if a credit card transaction is likely to be fraudulent.
Source: Claire Castell, Data Mining for Dummies
Monitoring
Whatever you are modeling, it is likely to change over time. Monitoring models requires constant revalidation of the model on new data to assess if the model is still appropriate.
Source: Claire Castell, Data Mining for Dummies
[Figure: Clementine Screen Shot 1]
[Figure: Clementine Screen Shot 2]
Specific Data Mining Applications
U.S. News Colle...
This note was uploaded on 09/17/2009 for the course IT it771 taught by Professor Jenisha during the Fall '09 term at University of Advancing Technology.