r understand the process.
1. Define the business problem. Each CRM application will have one or more business
objectives for which you will need to build the appropriate model. Depending on your
specific goal, such as “increasing the response rate” or “increasing the value of a response,”
you will build a very different model. An effective statement of the problem will include a
way of measuring the results of your CRM project.
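As a minimal sketch of why the stated goal determines the measurement, the following Python computes both metrics for two hypothetical campaigns. The campaign labels, field names, and all figures are assumptions made up for illustration; they are not taken from any real project.

```python
# Hypothetical campaign results; every number here is illustrative only.
campaigns = {
    "A": {"mailed": 1000, "order_values": [20.0] * 50},   # 50 small orders
    "B": {"mailed": 1000, "order_values": [100.0] * 20},  # 20 large orders
}

def response_rate(c):
    """Fraction of contacted customers who responded."""
    return len(c["order_values"]) / c["mailed"]

def value_per_response(c):
    """Average revenue per responding customer."""
    return sum(c["order_values"]) / len(c["order_values"])

# The "best" campaign depends on which objective was chosen up front.
best_by_rate = max(campaigns, key=lambda k: response_rate(campaigns[k]))
best_by_value = max(campaigns, key=lambda k: value_per_response(campaigns[k]))
print(best_by_rate, best_by_value)
```

In this made-up data, one campaign wins on response rate and the other on value per response, which is why the problem statement should name the measure before any model is built.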
2. Build a marketing database. Steps two through four constitute the core of the data
preparation. Together, they take more time and effort than all the other steps combined. There
may be repeated iterations of the data preparation and model building steps as you learn
something from the model that suggests you modify the data. These data preparation steps
may take anywhere from 50% to 90% of the time and effort of the entire data mining process!
You will need to build a marketing database because your operational databases and
corporate data warehouse will often not contain the data you need in the form you need it.
Furthermore, your CRM applications may interfere with the speedy and effective execution
of these systems.
When you build your marketing database you will need to clean it up; if you want good
models, you need clean data. The data you need may reside in multiple databases such
as the customer database, product database, and transaction databases. This means you will
need to integrate and consolidate the data into a single marketing database and reconcile
differences in data values from the various sources. Improperly reconciled data is a major
source of quality problems. There are often large differences in the way data is defined and
used in different databases. Some inconsistencies may be easy to uncover, such as different
addresses for the same customer. Other problems are harder to resolve because they are
subtle. For example, the same customer may have different names or, worse, multiple
customer identification numbers.
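The consolidation and reconciliation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source tables, their field names, and the rule that a normalized e-mail address identifies a customer are all hypothetical, and a real project would need a matching rule suited to its own data.

```python
from collections import defaultdict

# Hypothetical records from two source systems; all values are illustrative.
customer_db = [
    {"cust_id": "C001", "name": "Robert Smith",
     "email": "rsmith@example.com", "address": "12 Oak St"},
    {"cust_id": "C002", "name": "Ann Lee",
     "email": "ann.lee@example.com", "address": "5 Elm Ave"},
]
transaction_db = [
    {"cust_id": "T917", "name": "Bob Smith",
     "email": "RSmith@Example.com ", "address": "48 Pine Rd"},
    {"cust_id": "T918", "name": "Ann Lee",
     "email": "ann.lee@example.com", "address": "5 Elm Ave"},
]

def match_key(record):
    """Assumed matching rule: a normalized e-mail identifies a customer."""
    return record["email"].strip().lower()

# Consolidate: group records from every source under one match key.
groups = defaultdict(list)
for source, records in (("customer", customer_db), ("transaction", transaction_db)):
    for rec in records:
        groups[match_key(rec)].append({**rec, "source": source})

def conflicts(records):
    """Return the fields whose values disagree across matched records."""
    return {
        field: sorted({r[field] for r in records})
        for field in ("cust_id", "name", "address")
        if len({r[field] for r in records}) > 1
    }

# Reconcile: report every disagreement so it can be resolved explicitly.
report = {key: conflicts(recs) for key, recs in groups.items()}
for key, diff in report.items():
    print(key, diff)
```

The report surfaces both the easy case (different addresses) and the subtle ones (different names and multiple identification numbers for what appears to be one customer).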
3. Explore the data. Before you can build good predictive models, you must understand your
data. Start by gathering a variety of numerical summaries (including descriptive statistics
such as averages, standard deviations and so forth) and looking at the distribution of the data.
You may want to produce cross tabulations (pivot tables) for multi-dimensional data.
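A minimal sketch of these numerical summaries and a cross tabulation, using only the Python standard library. The rows and the column meanings (region, responded, annual spend) are hypothetical and chosen purely for illustration.

```python
import statistics
from collections import Counter

# Hypothetical customer rows: (region, responded, annual_spend).
rows = [
    ("East", "yes", 120.0),
    ("East", "no", 80.0),
    ("West", "yes", 200.0),
    ("West", "no", 60.0),
    ("West", "no", 40.0),
]

# Descriptive statistics for one numeric variable.
spend = [r[2] for r in rows]
summary = {
    "mean": statistics.mean(spend),
    "stdev": statistics.stdev(spend),  # sample standard deviation
    "min": min(spend),
    "max": max(spend),
}

# Cross tabulation: count of customers for each (region, responded) pair.
crosstab = Counter((region, responded) for region, responded, _ in rows)

print(summary)
for region in sorted({r[0] for r in rows}):
    print(region, {resp: crosstab[(region, resp)] for resp in ("yes", "no")})
```
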
Graphing and visualization tools are a vital aid in data preparation, and their importance to
effective data analysis cannot be overemphasized. Data visualization most often provides the
“Aha!” moment that leads to new insights and success. Some of the common and very useful graphical
displays of data are histograms or box plots that display distributions of values. You may also
want to look at scatter plots in two or three dimensions of different pairs of variables. The
ability to add a third, overlay variable greatly increases the usefulness of some types of
4. Prepare data for modeling. This is the f...