Regression Analysis III
Model building
Variable selection
Residual Analysis
General Linear Model
Models in which the parameters (β0, β1, . . . , βp) all
have exponents of one are called linear models.
A general linear model involving p independent
variables
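The equation itself is cut off above. As a hedged sketch, a general linear model in p independent variables is commonly written in the form below; the symbols z_j (each a function of the original predictors) and ε (the error term) follow the usual textbook notation and are assumptions here, not taken from the original slide.

```latex
y = \beta_0 + \beta_1 z_1 + \beta_2 z_2 + \cdots + \beta_p z_p + \epsilon
```

Each parameter \beta_j enters with exponent one, which is what makes the model linear even when a z_j is itself a nonlinear function of the original predictors.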

Experimental Design and Analysis of
Variance
Elements of a designed experiment
Completely Randomized Designs
Multiple comparison of means
Randomized Block Design
Factorial Experiments
Elements of experimental design
Statistical studies can be classifi

Test of Hypotheses
Hypothesis: An assumption, theory, or claim concerning a
parameter of a population.
For example:
Average annual salary of business undergraduate students =
$60,000
Average miles per gallon of a U.S.-built SUV = 20 miles /
gallon.
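A claim such as "average miles per gallon = 20" can be checked against a sample with a one-sample t statistic. The sketch below is a minimal illustration using only the Python standard library; the function name and the mpg readings are made up for this example and do not come from the original notes.

```python
import math
import statistics


def one_sample_t_statistic(sample, hypothesized_mean):
    """Return the t statistic for H0: population mean == hypothesized_mean."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    sample_sd = statistics.stdev(sample)
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Hypothetical mpg readings, for illustration only.
mpg_sample = [21.0, 23.0, 22.0, 24.0, 20.0]
t_stat = one_sample_t_statistic(mpg_sample, 20.0)
print(f"t statistic = {t_stat:.3f} with {len(mpg_sample) - 1} degrees of freedom")
```

The statistic would then be compared with a critical value from the t distribution with n - 1 degrees of freedom; that lookup is left out to keep the sketch dependency-free.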
A na

Optimization
In this section we are going to look at optimization problems. In
optimization problems we are looking for the largest value or the
smallest value that a function can take. We saw how to solve
one kind of optimization problem in the Absolute
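As a minimal sketch of looking for the largest and smallest values a function can take, the code below compares the function's values at the endpoints of a closed interval and at its critical points. The closed interval, the example function f(x) = x^3 - 3x, and the hand-supplied critical points are all assumptions made for illustration; they are not from the text above.

```python
def absolute_extrema(f, critical_points, a, b):
    """Return ((min_value, x), (max_value, x)) of f on the closed interval [a, b].

    critical_points must be supplied by the caller (for example, found by
    solving f'(x) = 0 by hand); only those strictly inside (a, b) are used.
    """
    candidates = [a, b] + [x for x in critical_points if a < x < b]
    values = [(f(x), x) for x in candidates]
    return min(values), max(values)


def f(x):
    # Hypothetical example function; f'(x) = 3x^2 - 3 vanishes at x = -1 and x = 1.
    return x ** 3 - 3 * x


smallest, largest = absolute_extrema(f, [-1, 1], -2, 3)
print("smallest (value, x):", smallest)
print("largest  (value, x):", largest)
```

When the minimum value is attained at more than one point, this sketch reports only one of them.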

ONE WAY ANOVA
Source          D.F.    Sum of Squares   Mean Square (Variance)   F
Among Groups    c - 1   SSA              MSA = SSA / (c - 1)      FSTAT = MSA / MSW
Within Groups   n - c   SSW              MSW = SSW / (n - c)
Total           n - 1   SST
Note: SSA = MSA(c - 1); SST = SSW + SSA
Tukey-Kramer Procedure
MSE
r
Step

The Definition of the Definite Integral
In this section we will formally define the definite integral and
give many of the properties of definite integrals. Let's start off
with the definition of a definite integral.
Definite Integral
Given a function
that

Resampling Method
Resampling methods are an indispensable tool in modern statistics.
They involve repeatedly drawing samples from a training set and refitting a model
of interest on each sample in order to obtain additional information about the
fitted mod
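The idea of repeatedly drawing samples and refitting can be sketched with a simple bootstrap. In the sketch below, the "model of interest" is replaced by a plain estimator (the sample mean), and the function name, data values, and parameter defaults are all hypothetical choices made for illustration.

```python
import random
import statistics


def bootstrap_standard_error(data, estimator, n_resamples=1000, seed=0):
    """Estimate the standard error of `estimator` by resampling `data` with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw a resample of the same size as the original data, with replacement.
        resample = [rng.choice(data) for _ in range(n)]
        # "Refit" on the resample; here that just means recomputing the estimator.
        estimates.append(estimator(resample))
    # The spread of the refitted estimates approximates the estimator's standard error.
    return statistics.stdev(estimates)


# Made-up observations, for illustration only.
values = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
se_of_mean = bootstrap_standard_error(values, statistics.mean)
print(f"bootstrap standard error of the mean: {se_of_mean:.3f}")
```

Any function that maps a sample to a number (a median, a fitted regression slope) could be passed as `estimator` in the same way.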

Linear Regression
Linear regression is a useful tool for predicting a quantitative response.
The importance of having a good understanding of linear regression before
studying more complex learning methods cannot be overstated.
Simple Linear Regression
It i

Statistical Learning
Example: Suppose that we are statistical consultants hired by a client to provide
advice on how to improve sales of a particular product. The Advertising data set
consists of the sales of that product in 200 different markets, along w

Multiple Linear Regression
Simple linear regression is a useful approach for predicting a response on the
basis of a single predictor variable.
However, in practice we often have more than one predictor.
One option is to run three separate simple linear re

Linearity
Linear models are relatively simple to describe and implement, and have
advantages over other approaches in terms of interpretation and inference.
However, standard linear regression can have significant limitations in terms of
predictive power.
P

Linear Model
The linear model has distinct advantages in terms of inference and, on real-world
problems, is often surprisingly competitive in relation to non-linear methods.
Alternative fitting procedures can yield better prediction accuracy and model
int

Support Vector Machine
Support vector machines have been shown to perform well in a variety of
settings, and are often considered one of the best "out of the box" classifiers.
The support vector machine is a generalization of a simple and intuitive classifi

Models
There is no free lunch in statistics: no one method dominates all others over all
possible data sets.
On a particular data set, one specific method may work best, but some other
method may work better on a similar but different data set.
Quality of F

Basics of Statistics
Statistical learning refers to a set of tools for modeling and understanding
complex datasets.
It is a recently developed area in statistics and blends with parallel developments
in computer science and, in particular, machine learning.

Classification
The linear regression model assumes that the response variable Y is quantitative.
But in many situations, the response variable is instead qualitative.
Often qualitative variables are referred to as categorical; we will use these terms
inter

ONE WAY ANOVA
Source          D.F.    Sum of Squares   Mean Square (Variance)   F
Among Groups    c - 1   SSA              MSA = SSA / (c - 1)      FSTAT = MSA / MSW
Within Groups   n - c   SSW              MSW = SSW / (n - c)
Total           n - 1   SST
Note: SSA = MSA(c - 1)
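The quantities in the one-way ANOVA table can be computed directly from grouped data. The sketch below is a minimal pure-Python illustration; the function name, the dictionary keys, and the three small groups are made up for this example.

```python
def one_way_anova(groups):
    """Compute the one-way ANOVA table entries for a list of groups of numbers."""
    c = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Among-groups sum of squares: group size times squared distance to the grand mean.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-groups sum of squares: squared distance of each value to its group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Total sum of squares: squared distance of each value to the grand mean.
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": sst,
            "MSA": msa, "MSW": msw, "FSTAT": msa / msw}


# Hypothetical data: three groups of three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(f"{name} = {value:.4f}")
```

Computing SST separately rather than as SSA + SSW gives a simple consistency check on the decomposition.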
It would take [RE] times as many observations in a

Homework 7
Due On Monday December 5th, 2016 at 11:50 PM
Please Note: I will not post a spreadsheet for this assignment. Instead,
this assignment is based on the spreadsheets bank.xls and
bankWithInteraction.xls. You need to create a spreadsheet called
HW7.

Inference on two population means and
proportions
Inferences About the Difference Between
Two Population Means: σ1 and σ2 Known
Inferences About the Difference Between
Two Population Means: σ1 and σ2 Unknown
Inferences About the Difference Between
Two