

# 6_Logistic_v2 - What is Logistic Regression Advanced Topics...


Advanced Topics in Forest Biometrics (FOR6934): Logistic Regression

## What is Logistic Regression?

- A form of regression that allows the prediction of discrete (categorical) variables using continuous and/or discrete predictors
- Addresses the same questions that discriminant analysis and multiple regression do, but:
  - without distributional assumptions on the predictors
  - the predictors need not be linearly related to the dependent variable
  - homoscedasticity (equal variance in each group) is not necessary
- Uses a logit link function to transform the probability, $p$:

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1 - p}\right)$$

## Why use Logistic Regression?

Logistic regression is often used when:

- The dependent variable is discrete
- The relationship between the dependent and independent variables is non-linear
  - Example: the probability of forking changes very little with height for slow-growing trees, but a 30-cm change in height greatly increases the probability of forking

## Questions

- Can categories be correctly predicted given a set of predictors?
  - Usually, once this is established, the predictors are manipulated to see if the equation can be simplified
  - Comparison of the equation with predictors plus intercept to a model with just the intercept
- What is the strength of association between the outcome variable and a set of predictors?
- What is the relative importance of each predictor?
  - How does each variable affect the outcome?
  - Does a predictor make the solution better, worse, or have no effect?

## Questions - 2

- Are there interactions among predictors?
  - Does adding interactions among predictors (continuous or categorical) improve the model?
  - Continuous predictors may have to be “centered” (further reading) in order to avoid multicollinearity when interactions are present
- How good is the model?
  - Can parameters be accurately predicted?
  - Can the model accurately classify cases for which the outcome is known?
  - How good is the “fit” of the model?
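The logit link described above, and the non-linear relationship it implies between a predictor and the probability, can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the course materials: the function names `logit` and `inv_logit` and the log-odds values in the loop are assumptions chosen for the example.

```python
import math


def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Inverse of the logit: map a log-odds value z back to a probability."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the algebraically equivalent form to avoid overflow.
    ez = math.exp(z)
    return ez / (1.0 + ez)


# A one-unit step in log-odds changes the probability by different amounts
# depending on where the step starts (illustrative values only).
for start in (-4.0, 0.0):
    change = inv_logit(start + 1.0) - inv_logit(start)
    print(f"log-odds {start:+.1f} -> {start + 1.0:+.1f}: "
          f"probability changes by {change:.3f}")
```

The same step on the log-odds scale moves the probability only slightly when the starting probability is near 0, and much more when it is near 0.5. That is the pattern in the forking example: a model that is linear in the logit is non-linear in the probability.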
## Assumptions

- The outcome must be discrete
- There must be “enough” responses in every given category
- If there are too many cells with no responses:
  - parameter estimates and standard errors can “blow up”
  - groups may be perfectly separable (e.g. multicollinear), which makes maximum likelihood estimation impossible
- If the distributional assumptions of discriminant analysis (DA) or multiple regression are met, they may be more powerful
  - Note, however, that DA has been shown to overestimate the association using discrete predictors in some cases
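To see why perfectly separable groups defeat maximum likelihood estimation, the sketch below evaluates the Bernoulli log-likelihood of a one-predictor logistic model at increasingly steep slopes. The data set, the cut point of 2.5, and the helper names `log_sigmoid` and `log_likelihood` are all made up for illustration. The model is assumed to have the form logit(p) = slope * (x - cut).

```python
import math


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_likelihood(slope, xs, ys, cut):
    """Bernoulli log-likelihood for the model logit(p) = slope * (x - cut)."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = slope * (x - cut)
        # log(p) when y == 1, log(1 - p) when y == 0
        total += log_sigmoid(z) if y == 1 else log_sigmoid(-z)
    return total


# Hypothetical data: every y = 0 case lies below x = 2.5 and every
# y = 1 case lies above it, so the two groups are perfectly separable.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0, 0, 1, 1]

slopes = [1.0, 5.0, 25.0, 125.0]
values = [log_likelihood(b, xs, ys, cut=2.5) for b in slopes]
for b, ll in zip(slopes, values):
    print(f"slope = {b:6.1f}  log-likelihood = {ll:.3e}")
```

With these data the log-likelihood keeps rising toward 0 as the slope grows, so no finite slope maximizes it. In practice this shows up as very large parameter estimates and standard errors, or as a fitting routine that fails to converge.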
