PART I
Foundations of Business Analytics (BA)
CHAPTER 1
Business Analytics (BA) at a Glance
Introduction to Business Analytics
Business Analytics and Its Importance in Modern Business Decisions
Types of Business Analytics
Tools of Business Analytics
Descriptive Analytics: Graphical and Numerical Methods in BA
Tools of Descriptive Analytics
Most Widely Used Predictive Analytics Models
Data Mining, Regression Models, and Time Series Forecasting
Other Predictive Analytics Models
Recent Applications and Tools of Predictive Modeling
Other Areas Associated with Predictive Modeling
Data Mining, Machine Learning, Neural Network, and Deep Learning
Prescriptive Analytics and Tools of Prescriptive Analytics
Applications and Implementation
Summary and Application of Business Analytics (BA) Tools:
Analytical Models and Decision Making Using Models
Glossary of Terms Related to Analytics
Introduction to Business Analytics
A recent trend in data analysis is the emerging field of business analytics (BA).
This book deals with BA, an emerging area in modern business decision making.
BA is a data-driven decision-making approach that uses statistical and
quantitative analysis, information technology, and management science
(mathematical modeling, simulation), along with data mining and fact-based
data, to measure past business performance and guide an organization
in business planning and effective decision making.
BA tools are also used to visualize and explore the patterns and trends in the
data to predict future business outcomes with the help of forecasting and
In this age of technology, companies collect massive amounts of data.
Successful companies treat their data as an asset and use it for competitive
advantage. Most businesses collect and analyze massive amounts of data,
referred to as Big Data, using specially designed big data software and data
analytics. Big data analysis is now becoming an integral part of BA.
Companies use BA tools as part of an organizational commitment to data-driven
decision making. BA helps businesses make informed business decisions.
It is also critical in automating and optimizing business processes.
BA makes extensive use of data and descriptive statistics, statistical analysis,
mathematical and statistical modeling, and data mining to explore, investigate,
and understand business performance. Through data, BA helps to gain
insight and drive business planning and decisions. The tools of BA focus on
understanding business performance based on the data and a number of
models derived from statistics, management science, and operations research.
BA also uses statistical, mathematical, optimization, and quantitative tools for
explanatory and predictive modeling.
Predictive modeling uses statistical models, such as different types of
regression, to predict outcomes and is often treated as synonymous with the fields of data
mining and machine learning. It is also referred to as predictive analytics. We
will provide more details and tools of predictive analytics in subsequent
Business Analytics and Its Importance in Modern Business Decisions
BA helps to address, explore, and answer a number of questions that are
critical in driving business decisions. It tries to answer the following
questions:
What is happening, and why did something happen?
Will it happen again?
What will happen if we make changes to some of the inputs?
What is the data telling us that we were not able to see before?
BA uses statistical analysis and predictive modeling to establish trends,
figure out why things are happening, and predict how
things will turn out in the future.
BA combines advanced statistical analysis and predictive modeling to give us
an idea of what to expect so that we can anticipate developments or make
changes now to improve outcomes.
BA is concerned mainly with anticipated future trends in the key performance indicators.
It uses past data and models to make predictions, which is
different from the reporting in business intelligence (BI). Analytics models use
the data with a view to drawing out new, useful insights to improve business
planning and boost future performance. BA helps the company adapt to the
changes and take advantage of future developments.
One of the major tools of analytics is data mining, which is a part of
predictive analytics. In business, data mining is used to analyze business data.
Business transaction data, along with other customer- and product-related data,
are continuously stored in databases. Data mining software is used to
analyze the vast amount of customer data to reveal hidden patterns, trends,
and other customer behavior. Businesses use data mining to perform market
analysis to identify and develop new products, analyze their supply chain, find
the root cause of manufacturing problems, study customer behavior for
product promotion, improve sales by understanding the needs and
requirements of their customers, prevent customer attrition, and acquire new
customers. For example, Wal-Mart collects and processes over 20 million
point-of-sale transactions every day. These data are stored in a centralized
database and are analyzed using data mining software to understand and
determine customer behavior, needs, and requirements. The data are analyzed
to determine sales trends and forecasts, develop marketing strategies, and
predict customer buying habits.
A large amount of data and information about products, companies, and
individuals are available through Google, Facebook, Amazon, and several
other sources. Data mining and analytics tools are used to extract meaningful
information and patterns to learn customer behavior. Financial institutions
analyze data of millions of customers to assess risk and customer behavior.
Data mining techniques are also used widely in the areas of science and
engineering, such as bioinformatics, genetics, medicine, education, and
electrical power engineering.
BA, data analytics, and advanced analytics are growing areas. They all come
under the broad umbrella of BI. There is going to be an increasing demand for
professionals trained in these areas. Many of the tools of data analysis and
statistics discussed here are prerequisites to understanding data mining and
BA. We will describe the analytics tools, including data analytics and advanced
analytics, later in this chapter.
Types of Business Analytics
The BA area can be divided into different categories depending upon the types
of analytics and tools being used. The major categories of BA are:
• Descriptive analytics
• Predictive analytics
• Prescriptive analytics
Each of the previous categories uses different tools and the use of these
analytics depends on the type of business and the operations a company is
involved in. For example, an organization may only use descriptive analytics
tools; whereas another company may use a combination of descriptive and
predictive modeling and analytics to predict future business performance to
drive business decisions. Other companies may use prescriptive analytics to
optimize business processes.
Tools of Business Analytics
The different types of analytics and the tools used in each are described below.
1. Descriptive analytics: graphical and numerical methods and tools in BA
Descriptive analytics involves the use of descriptive statistics, including graphical
and numerical methods, to describe the data.
Descriptive analytics tools are used to understand the occurrence of certain business
phenomena or outcomes and explain these outcomes through graphical, quantitative,
and numerical analysis. Through visual and simple analysis of the collected
data, we can visualize and explore what has been happening and the possible reasons
for the occurrence of certain phenomena. Many of the hidden patterns and features
not apparent through mere examination of data can be exposed through graphical and
numerical analysis. Descriptive analytics uses simple tools to uncover many
problems quickly and easily. The results enable us to question many of the outcomes so
that corrective actions can be taken.
Successful use and implementation of descriptive analytics requires an understanding
of types of data, graphical/visual representation of data, and graphical techniques
using a computer. The graphical and visual techniques are explained in detail
in Chapter 4. The descriptive analytics tools include the commonly used graphs and
charts along with some newly developed graphical tools such as bullet graphs, tree
maps, and data dashboards. Dashboards are now becoming very popular with big
data. They are used to display multiple views of the business data graphically.
The other aspect of descriptive analytics is an understanding of numerical methods,
including the measures of central tendency, measures of position, measures of
variation, and measures of shape, and how different measures and statistics are used
to draw conclusions and make decisions from the data. Some other topics of interest
are the Empirical Rule and measures of the relationship between two variables:
the covariance and the correlation coefficient. The tools of descriptive analytics are
helpful in understanding the data, identifying trends or patterns in the data, and
making sense of the data contained in the databases of companies. An
understanding of databases, data warehouses, web search and query, and big data
concepts is important in extracting data and applying descriptive analytics tools.
Figure 1.1 Tools of descriptive analytics
Tools of Descriptive Analytics: Figure 1.1 outlines the tools and methods used in
descriptive analytics. These tools are explained in subsequent chapters.
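The numerical measures named above (central tendency, variation, covariance, and the correlation coefficient) can be sketched in a few lines of Python. This is a minimal illustration only: the monthly advertising and sales figures are invented, and the helper names `sample_covariance` and `correlation` are hypothetical, not taken from the book.

```python
import statistics

# Hypothetical monthly figures, invented purely for illustration.
ad_spend = [10.0, 12.0, 15.0, 11.0, 18.0, 20.0]
sales = [100.0, 115.0, 140.0, 108.0, 170.0, 185.0]


def sample_covariance(x, y):
    """Sample covariance using the usual n - 1 denominator."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return sample_covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))


# Measures of central tendency and variation for one variable.
summary = {
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "stdev": statistics.stdev(sales),
    "range": max(sales) - min(sales),
}

# Measures of the relationship between two variables.
cov = sample_covariance(ad_spend, sales)
r = correlation(ad_spend, sales)

print(summary)
print(f"covariance = {cov:.2f}, correlation = {r:.3f}")
```

A correlation close to +1 or -1 suggests a strong linear relationship between the two variables; it does not by itself establish that one causes the other.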
2. Predictive analytics: As the name suggests, predictive analytics is the application of
predictive models to predict future business outcomes and trends.
Most Widely Used Predictive Analytics Models
The most widely used predictive analytics models are regression, forecasting,
and data mining techniques. These are briefly explained in the following.
Data mining techniques are used to extract useful information from huge
amounts of data using predictive analytics, computer algorithms, software, and
mathematical and statistical tools.
Regression models are used for predicting future outcomes. Variations of
regression models include: (a) Simple regression models, (b) Multiple
regression models, (c) Non-linear regression models including the quadratic
or second-order models, and polynomial regression models, (d) Regression
models with indicator or qualitative independent variables, and (e) Regression
models with interaction terms or interaction models. Regression models are
among the most widely used models in various types of applications. These
models are used to explain the relationship between a response variable and
one or more independent variables. The relationship may be linear or
curvilinear. The objective of these regression models is to predict the response
variable using one or more independent variables or predictors.
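As a rough illustration of the simplest case described above, a simple regression model with one predictor, the following sketch fits a straight line by ordinary least squares. The function names and the data are assumptions made for this example; real analyses would normally rely on a statistics package and would also examine the fit.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates (intercept, slope) for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted response for a new value of the independent variable."""
    return intercept + slope * x_new


# Hypothetical data: advertising spend (x) and units sold (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_simple_regression(x, y)
print(f"fitted line: y = {intercept:.2f} + {slope:.2f} x")
print(f"prediction at x = 6: {predict(intercept, slope, 6.0):.2f}")
```

Multiple, polynomial, and interaction models extend the same idea by adding more terms to the right-hand side of the equation.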
Forecasting techniques are widely used predictive models that involve a class
of Time Series Analysis and Forecasting models. The commonly used
forecasting models are regression-based models that use regression analysis
to forecast future trends. Other time series forecasting models are simple
moving average, moving average with trend, exponential smoothing,
exponential smoothing with trend, and forecasting seasonal data. All these
predictive models are used to forecast the future trend. Figure 1.2 shows the
widely used tools of predictive analytics.
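Two of the time series methods listed above, the simple moving average and simple exponential smoothing (both without trend), can be sketched as follows. The function names, the demand series, and the choice to initialize the smoothed forecast with the first observation are assumptions made for illustration.

```python
def moving_average_forecast(series, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series, alpha):
    """Simple exponential smoothing; returns the forecast for the next period.

    Each update is: new forecast = alpha * actual + (1 - alpha) * old forecast.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    forecast = series[0]  # assumed initialization: first observation
    for actual in series[1:]:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


# Hypothetical weekly demand, invented for illustration.
demand = [20, 22, 24, 26]

print(moving_average_forecast(demand, window=2))
print(exponential_smoothing_forecast(demand, alpha=0.5))
```

A larger `alpha` makes the smoothed forecast react faster to recent observations; a smaller value gives more weight to the history.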
Other Predictive Analytics Tools
Besides the tools described in Figure 1.2, an understanding of a number of
other analytics tools is critical in describing and drawing meaningful
conclusions from the data. These include: (a) Probability theory and its role in
decision making, (b) Sampling and inference procedures, (c) Estimation and
confidence intervals, (d) Hypothesis testing/inference procedures for one and
two population parameters, and (e) Chi-square and non-parametric tests. The
understanding of these tools is critical in understanding and applying
inferential statistics tools—a critical part of data analysis and decision making.
These tools are outlined in Figure 1.3.
Figure 1.2 Tools of predictive analytics
Figure 1.3 Prerequisite to predictive analytics
Additional Tools and Applications of Predictive Analytics
Predictive analytics methods are also used for anomaly (or outlier)
detection, pattern detection, association learning, and classification
and clustering to predict probabilities and future business outcomes. We
briefly describe here anomaly detection, association learning, classification, and
clustering. Figure 1.4 shows the broad categories and applications of
predictive analytics.
Figure 1.4 Categories of predictive analytics
Association learning is used to identify the items that may co-occur and the
possible reasons for their co-occurrence. Classification and clustering
techniques are used for association learning.
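One simple way to see which items co-occur is to count how often each pair of items appears in the same transaction. The sketch below computes this share (often called the support of the pair) for a few invented shopping baskets. The function name `pair_support` and the data are hypothetical, and this counting approach is only one basic form of association analysis, not a method prescribed by the text.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions):
    """Share of transactions in which each pair of items appears together."""
    counts = Counter()
    for basket in transactions:
        # Sort the distinct items so each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: count / total for pair, count in counts.items()}


# Hypothetical shopping baskets, invented for illustration.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]

support = pair_support(baskets)
for pair, share in sorted(support.items(), key=lambda kv: -kv[1]):
    print(pair, share)
```

Pairs with high support are candidates for further study, for example to decide which products to promote together.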
Anomaly detection is also known as outlier detection and is used to identify
specific events or items that do not conform to the usual or expected pattern in
the data. A typical example would be the detection of bank fraud.
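A very basic form of outlier detection flags values that lie far from the mean when measured in standard deviations (a z-score rule). The sketch below is an illustration under stated assumptions: the transaction amounts are invented, the threshold of 2 is an arbitrary choice, and real fraud detection uses far richer models.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Indices of values whose absolute z-score exceeds `threshold`."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


# Hypothetical card transaction amounts; one is unusually large.
amounts = [52, 48, 50, 51, 49, 47, 53, 50, 500, 49]

flagged = find_outliers(amounts)
print([amounts[i] for i in flagged])
```

Because extreme values inflate the mean and standard deviation themselves, this rule can miss outliers in small samples; median-based rules are a common, more robust alternative.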
Classification and clustering algorithms are used to divide the data into
categories or classes. The purpose is to predict the probabilities of future
outcomes based on the classification. Clustering and classification both divide
the data into classes and therefore seem to be similar, but they are two
different techniques. They are learning techniques used widely to obtain
reliable information from a collection of raw data. Classification and
clustering are widely used in data mining.
Classification is a process of assigning items to prespecified classes or
categories. For example, a financial institution may study potential
borrowers to predict whether a group of new borrowers may be classified as
having a high degree of risk. Spam filtering is another example of
classification, where the inputs are e-mail messages that are classified
as “spam” or “not spam.”
Classification uses algorithms to categorize new data according to the
observations of the training set. Classification is a supervised learning
technique where a training set is used to find similarities in classes. This
means that the input data are divided into two or more classes or categories
and the learner creates a model that assigns inputs to one or more of these
classes. The objects are classified on
the basis of the training set of data.
The algorithm that implements classification is known as the classifier. Some
of the most commonly used classification algorithms are K-Nearest Neighbor
algorithm and decision tree algorithms. These are widely used in data mining.
An example of classification would be credit card processing. A credit card
company may want to segment its customer database based on similar buying
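The K-Nearest Neighbor algorithm mentioned above can be sketched in a few lines: a new item receives the most common label among the k closest items in the training set. The borrower data, the two features, and the function name `knn_classify` are invented for illustration, and the features are used unscaled, which a real application would usually correct.

```python
import math
from collections import Counter


def knn_classify(training_set, new_point, k=3):
    """Label `new_point` by majority vote among its k nearest training items.

    `training_set` is a list of (features, label) pairs.
    """
    neighbors = sorted(
        training_set, key=lambda item: math.dist(item[0], new_point)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical borrowers: (debt-to-income ratio, missed payments) and a risk label.
training_set = [
    ((0.10, 0), "low risk"),
    ((0.20, 1), "low risk"),
    ((0.15, 0), "low risk"),
    ((0.60, 4), "high risk"),
    ((0.70, 5), "high risk"),
    ((0.65, 3), "high risk"),
]

print(knn_classify(training_set, (0.62, 4)))
print(knn_classify(training_set, (0.12, 0)))
```

The labeled training set is what makes this a supervised technique: the classes are fixed in advance, and the algorithm only decides which one a new item belongs to.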
Clustering is used to find natural groupings or clusters in a set of
data without prespecifying a set of categories. It is unlike classification, where
the objects are classified based on prespecified classes or categories. Thus,
clustering is an unsupervised learning technique where a training set is not
used. It uses statistical tools and concepts to create clusters with similar
features within the data. Some examples of clustering are:
• Houses in a town may be clustered into neighborhoods based on similar features, such as houses
with an overall value of over a million dollars.
• Marketing analysts may define distinct groups in their customer bases to develop
targeted marketing programs.
• City planners may be interested in identifying groups of houses according to their
house value, type, and location.
• In cellular manufacturing, clustering algorithms are used to form clusters of
similar machines and processes to form machine-component cells.
• Scientists and geologists may study earthquake epicenters to identify clusters of
fault lines with a high probability of earthquake occurrences.
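To make the idea of finding groups without predefined categories concrete, the sketch below uses k-means, one common clustering algorithm. The text does not name a specific algorithm, so this choice is an assumption, as are the house data and the starting centroids, which are fixed here so that the result is repeatable.

```python
import math


def k_means(points, initial_centroids, max_iterations=100):
    """Basic k-means: assign each point to its nearest centroid, then recompute."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else centroids[i]  # keep an empty cluster's centroid where it was
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters


# Hypothetical houses: (value in $100,000s, distance from city center in km).
houses = [(2.0, 1.0), (2.5, 1.5), (3.0, 1.0), (10.0, 8.0), (11.0, 9.0), (10.5, 8.5)]

centroids, clusters = k_means(houses, initial_centroids=[houses[0], houses[3]])
for center, members in zip(centroids, clusters):
    print(center, members)
```

No labels are supplied: the two groups emerge only from the similarity of the houses' features, which is what distinguishes clustering from classification.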
Cluster Analysis
Cluster analysis is the assignment of a set of observations into subsets
(called clusters) so that observations within the same cluster are similar
according to some prespecified criterion or criteria, while observations drawn
from different clusters are dissimilar. Clustering techniques differ in
application and make different assumptions about the structure of the data. In
clustering, the clusters are commonly defined by some similarity metric or
similarity coefficient and may be evaluated by internal
compactness (similarity between members of the same cluster)
and separation between different clusters. Other clustering methods are
based on estimated density and graph connectivity. It is important to note
that clustering is unsupervised learning and a commonly used method
in statistical data analysis.
The Difference Between Clustering and Classification
Clustering is an unsupervised learning technique used to find groups or
clusters of similar instances on the basis of features. The purpose of clustering
is to group similar objects to determine whether there is any
relationship between them. Classification is a supervised learning technique
used to assign items to classes based on a training set. It uses
algorithms to categorize the new data according to the observations in the
training set.
Other Areas Associated with Predictive Analytics
Figure 1.5 outlines recent applications and tools of predictive analytics.
The tools outlined in Figure 1.5 are briefly explained in the following.
Extensive applications have emerged in recent years using these methods,
which are active topics of research. A number of applications in business,
engineering, manufacturing, medicine, signal processing, and computer
engineering using machine learning, neural networks, and deep learning are
being reported.
Figure 1.5 Recent applications and tools of predictive modeling
Machine Learning, Data Mining, and Neural Networks
In the broad area of data and predictive analytics, machine learning is a
method used to develop complex models and algorithms that are used to make
predictions. The analytical models in machine learning allow the analysts to
make predictions by learning from the trends, patterns, and relationships in
the historical data. Machine learning automates model building. The
algorithms in machine learning are designed to learn iteratively from data
without being explicitly programmed.
According to Arthur Samuel, machine learning gives “computers the ability to
learn without being explicitly programmed.” Samuel, an American pioneer in the
field of computer gaming and artificial intelligence, coined the
term “machine learning” in 1959 while at IBM.
Machine learning algorithms are extensively used for data-driven predictions
and in decision making. Some applications where machine learning has been
used are e-mail filtering, detection of network intruders or data
breaches, optical character recognition (OCR), learning to rank, and computer
vision. Machine learning is employed in a range of computing tasks where
designing and programming explicit algorithms that are reproducible and
perform well is difficult or infeasible.
Machine Learning and Data Mining
Machine learning and data mining are similar in some ways and often overlap
in applications. Machine learning is used for prediction, based
on known properties learned from the training data; whereas data mining
algorithms are used for discovery of (previously) unknown patterns. Data
mining is concerned with knowledge discovery in databases (KDD).
Data mining uses many machine learn...