Statistical Data Mining ORIE 474 Fall 2007 Tatiyana Apanasovich 10/15/07 Generalized Linear Models & Logistic Regression

11.3 Generalized Linear Models (GLMs)

Recall: Linear model
(i) The Y(i) are independent random variables with distribution $N(\mu(i), \sigma^2)$, where $\mu(i) = \sum_{j=0}^{p} a_j x_j(i)$
(ii) The parameters enter the model in a linear way via the sum $\nu(i) = \sum_{j=0}^{p} a_j x_j(i)$
(iii) The $\nu(i)$ and $\mu(i)$ are linked by $\nu(i) = \mu(i)$

This permits two immediate generalizations, in (i) and (iii), while retaining the advantages of (ii).

Generalizations of Linear Models to GLMs
In (i): We can relax the requirement that the Y(i) follow a normal distribution
In (iii): We can replace the identity link with a more general link function g: $\nu(i) = g(\mu(i))$, or equivalently $\sum_{j=0}^{p} a_j x_j(i) = g(\mu(i))$
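
The roles of the linear predictor ν(i), the mean μ(i), and the link g can be illustrated with a minimal Python sketch. The coefficient and covariate values, the function names, and the choice of a log link as the second example are all assumptions made for illustration; they do not come from the lecture. The identity link reproduces the ordinary linear model, while any other invertible g gives μ(i) = g^{-1}(ν(i)).

```python
import math

def linear_predictor(a, x):
    """nu(i) = sum_j a_j * x_j(i); x[0] is assumed to be 1 (intercept term)."""
    return sum(a_j * x_j for a_j, x_j in zip(a, x))

# Each link is stored as a pair (g, g_inverse). The names are hypothetical.
LINKS = {
    "identity": (lambda mu: mu, lambda nu: nu),  # ordinary linear model
    "log": (math.log, math.exp),                 # illustrative alternative link
}

def mean_response(a, x, link="identity"):
    """mu(i) = g^{-1}(nu(i)), i.e. the mean implied by nu(i) = g(mu(i))."""
    _, g_inverse = LINKS[link]
    return g_inverse(linear_predictor(a, x))

# Made-up coefficients a_0..a_2 and covariates x_0..x_2 for one object i.
a = [1.0, 2.0, -0.5]
x = [1.0, 0.5, 2.0]

nu = linear_predictor(a, x)                    # 1.0*1.0 + 2.0*0.5 - 0.5*2.0
mu_identity = mean_response(a, x, "identity")  # equals nu under the identity link
mu_log = mean_response(a, x, "log")            # exp(nu) under the log link
```

In both cases the parameters still enter only through the linear sum, which is the property (ii) that the generalization keeps.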

Ex: Logistic Regression
Consider the following situation: the observed response is binary, that is, y(i) in {0,1}
Ex: Whether or not a person gets a certain question right
    Whether or not a person is diagnosed with a certain disease
Assume:
The probability that the i-th object yields 1 is p(i)
Responses of different objects are independent
This is the generalization of (i) above: the Bernoulli distribution replaces the normal distribution.
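
The two assumptions above (Bernoulli responses with success probability p(i), and independence across objects) can be sketched in Python as follows. The probabilities and function names are made up for illustration, and the sketch takes the p(i) as given rather than tying them to covariates. Independence is what lets the joint likelihood factor into a product, so the log-likelihood is a sum of per-object terms.

```python
import math
import random

def simulate_responses(p, seed=0):
    """Draw independent y(i) in {0, 1} with P(y(i) = 1) = p(i)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_i else 0 for p_i in p]

def bernoulli_log_likelihood(y, p):
    """log of prod_i p(i)^y(i) * (1 - p(i))^(1 - y(i)).

    Independence across objects turns the product into a sum of logs.
    """
    return sum(
        math.log(p_i) if y_i == 1 else math.log(1.0 - p_i)
        for y_i, p_i in zip(y, p)
    )

# Hypothetical success probabilities for three objects.
p = [0.9, 0.2, 0.5]

y_simulated = simulate_responses(p)  # one random draw of binary responses

# Log-likelihood of a fixed outcome (1, 0, 1): log(0.9) + log(0.8) + log(0.5)
log_lik = bernoulli_log_likelihood([1, 0, 1], p)
```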
