6_Logistic - What is Logistic Regression Advanced Topics in...

Advanced Topics in Forest Biometrics (FOR6934)
Logistic Regression

What is Logistic Regression?
A form of regression that allows the prediction of discrete (categorical) variables using continuous and/or discrete predictors.
Addresses the same questions that discriminant analysis and multiple regression do, but without distributional assumptions on the predictors:
  The predictors need not be linearly related to the dependent variable.
  Homoscedasticity (equal variance in each group) is not necessary.
Uses a logit link function to transform the probability, p:
  $$\operatorname{logit}(p) = \ln\left(\frac{p}{1 - p}\right)$$

Why use Logistic Regression?
Logistic regression is often used when:
  The dependent variable is discrete.
  The relationship between the dependent and independent variables is non-linear.
Example: the probability of forking changes very little with height for slow-growing trees, but a 30-cm change in height greatly increases the probability of forking.

Questions
Can categories be correctly predicted given a set of predictors?
  Usually, once this is established, the predictors are manipulated to see if the equation can be simplified.
  Compare the equation with predictors plus intercept to a model with just the intercept.
What is the strength of association between the outcome variable and a set of predictors?
What is the relative importance of each predictor?
  How does each variable affect the outcome?
  Does a predictor make the solution better, worse, or have no effect?

Questions - 2
Are there interactions among predictors?
  Does adding interactions among predictors (continuous or categorical) improve the model?
  Continuous predictors may have to be “centered” (more on this later!) in order to avoid multicollinearity when interactions are present.
How good is the model?
  Can parameters be accurately predicted?
  Can the model accurately classify cases for which the outcome is known?
  How good is the “fit” of the model?
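The logit link and the forking example above can be sketched numerically. This is a minimal illustration, assuming a hypothetical one-predictor model logit(p) = B0 + B1 * height; the coefficient values and function names are invented for illustration and do not come from the lecture. It compares the effect of the same 30-cm change in height at two different starting heights.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(eta: float) -> float:
    """Inverse of the logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients (assumptions, NOT from the lecture):
# logit(p) = B0 + B1 * height_cm
B0 = -6.0
B1 = 0.02


def forking_probability(height_cm: float) -> float:
    """Probability of forking under the hypothetical model above."""
    return inv_logit(B0 + B1 * height_cm)


def change_for_30cm(height_cm: float) -> float:
    """Change in forking probability when height increases by 30 cm."""
    return forking_probability(height_cm + 30.0) - forking_probability(height_cm)


if __name__ == "__main__":
    for start in (50.0, 285.0):
        print(f"start={start:6.1f} cm  "
              f"p={forking_probability(start):.4f}  "
              f"change over +30 cm={change_for_30cm(start):.4f}")
```

Because the model is linear on the logit scale rather than on the probability scale, equal steps in height produce unequal steps in probability.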
Assumptions
The outcome must be discrete.
There must be “enough” responses in every given category. If there are too many cells with no responses:
  parameter estimates and standard errors can “blow up”
  groups may be perfectly separable (e.g. multicollinear), which makes maximum likelihood estimation impossible
If the distributional assumptions of discriminant analysis (DA) or multiple regression are met, they may be more powerful.
  Note, however, that DA has been shown to overestimate the association using discrete predictors in some cases.
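The point above about perfectly separable groups can be illustrated with a small sketch. It assumes a hypothetical no-intercept model logit(p) = slope * x and two invented four-point data sets; none of the names or numbers come from the lecture. With separable data the log-likelihood keeps rising as the slope grows, so no finite maximum exists, while with overlapping data it peaks at a finite slope.

```python
import math


def log_likelihood(slope, xs, ys):
    """Bernoulli log-likelihood for the model logit(p) = slope * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = slope * x
        if y == 1:
            total -= math.log1p(math.exp(-eta))  # adds log(p)
        else:
            total -= math.log1p(math.exp(eta))   # adds log(1 - p)
    return total


# Invented data (assumptions for illustration only).
XS = [-2.0, -1.0, 1.0, 2.0]
SEPARABLE = [0, 0, 1, 1]     # every x < 0 is a 0, every x > 0 is a 1
OVERLAPPING = [0, 1, 0, 1]   # outcomes are mixed on both sides of 0
SLOPES = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]

separable_ll = [log_likelihood(s, XS, SEPARABLE) for s in SLOPES]
overlapping_ll = [log_likelihood(s, XS, OVERLAPPING) for s in SLOPES]

if __name__ == "__main__":
    print(" slope   separable   overlapping")
    for s, a, b in zip(SLOPES, separable_ll, overlapping_ll):
        print(f"{s:6.2f}  {a:10.4f}  {b:12.4f}")
```

An iterative fitting routine run on the separable data would keep increasing the slope without converging, which is one way the estimates and standard errors "blow up".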
Assumptions - 2
There must be a linear relationship in the logit.
  The systematic component of the equation should have a linear relationship with the logit form of the dependent variable.
The “usual assumptions”:
  Lack of multicollinearity
  Lack of “driving” outliers
  Independent data

Odds ratios
The odds in favor of an event are p / (1 − p), where p is the probability of the event.
Odds are usually written as: