Data Mining Methods Basics.txt - Data Mining is the process of extracting valid useful unknown and comprehensible information from data and using it to

Data Mining Methods Basics.txt - Data Mining is the process...

This preview shows page 1 - 2 out of 4 pages.

Data Mining is the process of extracting valid, useful, unknown and comprehensible information from data and using it to make proactive knowledge- driven business decisions. Data mining uses statistical procedures to find unexpected patterns in data and identifies associations between variables. --------- Which of the following is not applicable to Data Mining? Involves working with known information The process of extracting valid, useful, unknown info from data and using it to make proactive knowledge driven business is called. Data mining --------- Data Mining Tasks Let's now move on to the common classes of Data Mining tasks - Anomaly Detection, Associate Learning, Cluster Detection, Classification and Regression Anomaly Detection refers to identifying items, events or observations that do not adhere to the expected pattern or the other items in the dataset. A good example is how the tax department models typical tax returns and then identifies returns that differ from this model using anomaly detection. This is used for audits and reviews. Association learning is the ability to learn and remember the relationship between unrelated items or stimuli or behavior. Association learning is the type of data mining that drives the recommendation engines in major sites like Amazon and Netflix. This would let you know that customers who bought a particular item also bought another item. Cluster Detection is a type of pattern recognition particularly useful in recognizing distinct clusters or sub-categories within the data. The purchasing habits of hobbyists like gardeners, artists and model builders would look quite different. By analyzing the purchasing behavior using clustering algorithms, one can detect the various subgroups within the dataset. Classification - If an existing structure is already known, you can use data mining to classify new cases into these pre-determined categories. The algorithms can be trained to detect systematic differences between items in each group by learning from a large set of pre-classified examples. The algorithm can then apply these rules to the new classification problems. For instance, a classifier can predict borrowers who cheat on loan payments. Regression/Prediction uses the historical relationship between a dependent and one or more independent variables to predict values of the dependent variable.
Image of page 1
Image of page 2

You've reached the end of your free preview.

Want to read all 4 pages?

  • Left Quote Icon

    Student Picture

  • Left Quote Icon

    Student Picture

  • Left Quote Icon

    Student Picture