Chapter 9
Nonparametric Regression Analysis
Regression, in one form or another, is among the most widely used statistical techniques. It models a
relationship (usually linear) between a predictor x (also called the independent variable, assumed to be
nonrandom) and a random variable Y, called the response variable (or dependent variable, because
its value depends on the value of x).
General Setup
• Let x_1, x_2, …, x_n be known values of an independent variable.
• Model the response variable corresponding to each x_i as
o Y_i = α + βx_i + e_i, for i = 1, 2, …, n, where
α and β are the intercept and the slope of the true regression line (both unknown), and
e_1, e_2, …, e_n are a random sample from a population that has a continuous distribution.
Since the e’s come from the same population, they have the same distribution.
Also, because we have a random sample, the e’s are independent of each other.
• Objective: Make inferences about the slope, β
o Point estimation of β
o Confidence interval for β
o Testing hypotheses about β
Ho: β = β_0 vs. Ha: β > β_0
Ho: β = β_0 vs. Ha: β < β_0
Ho: β = β_0 vs. Ha: β ≠ β_0
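To make the general setup concrete, the following is a minimal sketch that simulates data from the model Y_i = α + βx_i + e_i. The function name simulate_responses, the parameter values, and the choice of a Laplace error distribution are illustrative assumptions, not part of the notes; any continuous error distribution would satisfy the setup.

```python
import random

def simulate_responses(x, alpha, beta, rng):
    """Generate Y_i = alpha + beta * x_i + e_i for fixed (nonrandom) x values.

    The errors are i.i.d. draws from a Laplace (double-exponential)
    distribution, an assumed example of a continuous, non-normal
    error population.
    """
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    errors = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in x]
    y = [alpha + beta * xi + ei for xi, ei in zip(x, errors)]
    return y, errors

# Hypothetical values chosen only for illustration.
rng = random.Random(0)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y, errors = simulate_responses(x, alpha=1.0, beta=2.0, rng=rng)
for xi, yi in zip(x, y):
    print(f"x = {xi:.1f}, Y = {yi:.3f}")
```

In practice α, β, and the errors are unobserved; only the pairs (x_i, Y_i) are available, which is why inference about β is needed.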
Hypothesis Testing – Theil’s Test
• Suppose we want to test Ho: β = β_0 vs. Ha: β > β_0
• Remember the regression model, Y_i = α + βx_i + e_i, for i = 1, 2, …, n.
• For each i, i = 1, 2, …, n, define D_i = Y_i – β_0 x_i
• Look at D_i in terms of the model: D_i = Y_i – β_0 x_i = α + (β – β_0)x_i + e_i
• D_i values for each x_i are given in the general case, as well as under Ho and under Ha, in
the following table:
x      D                      D under Ho       D under Ha
x_1    D_1 = Y_1 – β_0 x_1    D_1 = α + e_1    D_1 = α + (β – β_0)x_1 + e_1
x_2    D_2 = Y_2 – β_0 x_2    D_2 = α + e_2    D_2 = α + (β – β_0)x_2 + e_2
…      …                      …                …
x_n    D_n = Y_n – β_0 x_n    D_n = α + e_n    D_n = α + (β – β_0)x_n + e_n
• Under Ho, the D_i's are independent and identically distributed random variables and D_i and x_i
are independent.
• Under Ha, D_i is positively correlated with x_i: since β – β_0 > 0, D_i = α + (β – β_0)x_i + e_i tends to increase with x_i.
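The contrast between Ho and Ha above can be checked numerically. The sketch below computes D_i = Y_i – β_0 x_i and then counts concordant minus discordant pairs of (x_i, D_i), a Kendall-type measure of association. The pair-counting statistic is an assumption used here for illustration, since the notes have not yet said how the association will be measured, and the data and function names are hypothetical.

```python
from itertools import combinations

def shifted_responses(x, y, beta0):
    """Return D_i = Y_i - beta0 * x_i for each observation."""
    return [yi - beta0 * xi for xi, yi in zip(x, y)]

def sign(v):
    """Return +1, 0, or -1 according to the sign of v."""
    return (v > 0) - (v < 0)

def concordance_count(x, d):
    """Concordant minus discordant pairs among the (x_i, d_i).

    A pair (i, j) is concordant when x and d move in the same
    direction and discordant when they move in opposite directions.
    """
    return sum(
        sign((x[j] - x[i]) * (d[j] - d[i]))
        for i, j in combinations(range(len(x)), 2)
    )

# Hypothetical toy data whose slope is roughly 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# beta0 near the slope of the data: D_i shows little trend in x.
print(concordance_count(x, shifted_responses(x, y, beta0=2.0)))
# beta0 well below the slope of the data: D_i increases with x.
print(concordance_count(x, shifted_responses(x, y, beta0=0.0)))
```

A count near zero is what independence of D_i and x_i would suggest, while a large positive count points toward β > β_0.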
At this stage we have n pairs of (x_i, D_i):