Logistic Regression:
Models in which the response y is a qualitative variable with two levels (success or failure).
We use a dummy variable (coded 0, 1) to represent the qualitative response variable.
For example, if y = 1 denotes that a new business succeeds, the mean response E(y) represents the probability that a new business with certain owner-related
characteristics will be a success.
When the response is binary, the expected response E(y) has a special meaning:
Let π = P(y=1) and 1 − π = P(y=0), 0 ≤ π ≤ 1.
E(y) = Σ y · p(y)
= (1)· P(y=1) + (0) · P(y=0)
= P(y=1)
= π
(π is the probability that y = 1 for given values of x1, x2, ..., xk)
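The derivation above can be checked numerically. The following is a minimal sketch on made-up 0/1 outcomes (the data and variable names are assumptions for illustration only); it shows that the mean of a 0/1 variable equals the proportion of ones.

```python
# Hypothetical 0/1 outcomes (1 = success, 0 = failure); the data are made up.
y = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]

n = len(y)
p1 = sum(1 for v in y if v == 1) / n  # empirical P(y = 1)
p0 = sum(1 for v in y if v == 0) / n  # empirical P(y = 0)

# E(y) = (1) * P(y = 1) + (0) * P(y = 0)
expected_y = 1 * p1 + 0 * p0

# The ordinary sample mean of the 0/1 values.
sample_mean = sum(y) / n

print(expected_y, sample_mean, p1)
```

All three printed values agree, which is the sense in which E(y) = π for a binary response.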
PROBLEMS FOR FITTING A LINEAR MODEL WITH BINARY RESPONSE
OLS Assumptions
The error term has a population mean of zero (unbiased model).
All independent variables are uncorrelated with the error term (unpredictable random error).
Observations of the error term are uncorrelated with each other (no pattern in the residuals).
The error term has a constant variance.
No independent variable is a perfect linear function of other explanatory variables (no perfect multicollinearity).
The error term is normally distributed.
When OLS is used to fit models with a binary response, the well-known problems are:
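1. Non-normal errors: since y takes only the values 0 and 1, the error ε = y − (β0 + β1x) takes only two values for a given x, so it cannot be normally distributed.
y = 1: ε = 1 − (β0 + β1x)
y = 0: ε = −β0 − β1x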
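To make the two-valued error concrete, here is a small sketch that evaluates both possible errors at a few values of x. The coefficients b0 and b1 are hypothetical numbers chosen for illustration, not estimates from any data.

```python
# Hypothetical coefficients for the linear model E(y) = b0 + b1 * x.
b0, b1 = 0.2, 0.1

def possible_errors(x):
    """Return the two values the error can take at a given x."""
    mean = b0 + b1 * x             # E(y) at this x
    return (1 - mean, 0 - mean)    # error when y = 1, error when y = 0

for x in (1, 3, 5):
    e1, e0 = possible_errors(x)
    print(f"x = {x}: eps = {e1:.2f} if y = 1, eps = {e0:.2f} if y = 0")
```

At every x the error is one of exactly two numbers that differ by 1, which is incompatible with a continuous normal distribution.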
2. Unequal variances: the variance σ² of the random error is a function of π, the probability that the response y equals 1. For a 0/1 response, σ² = π(1 − π); because π changes with x, the variance is not constant.
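As a quick illustration of how the error variance moves with π, the sketch below tabulates the variance of a 0/1 variable, π(1 − π), for a few assumed values of π. The function name is hypothetical.

```python
def error_variance(pi):
    """Variance of a 0/1 response with P(y = 1) = pi."""
    return pi * (1 - pi)

# Assumed probabilities, chosen only to show the pattern.
for pi in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"pi = {pi:.1f}  variance = {error_variance(pi):.2f}")
```

The variance is largest at π = 0.5 and shrinks toward 0 as π approaches 0 or 1, so observations at different x values do not share a common error variance.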

3. Restricting the predicted response to be between 0 and 1: since the predicted value ŷ estimates E(y) = π, the probability that the response y equals 1, we would like ŷ to have the property that 0 ≤ ŷ ≤ 1. A straight line fitted by OLS does not guarantee this.
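The range problem can be seen with a small least-squares fit. This is a minimal sketch on made-up data (the x and y values are assumptions), using the standard simple linear regression formulas for the slope and intercept.

```python
# Made-up data: x is a predictor, y is a 0/1 response.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept for simple linear regression.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Fitted values, and those that fall outside the interval [0, 1].
fitted = [b0 + b1 * xi for xi in x]
outside = [(xi, round(f, 3)) for xi, f in zip(x, fitted) if f < 0 or f > 1]
print(outside)
```

For this data set the fitted line dips below 0 at the smallest x and rises above 1 at the largest x, so those fitted values cannot be read as probabilities.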