HW4_Peer_Assessment.pdf - Homework 4 Peer Assessment...


Homework 4 Peer Assessment

Background

The owner of a company would like to be able to predict whether employees will stay with the company or leave. The data contains information about various characteristics of employees. See below for the description of these characteristics.

Data Description

The data consists of the following variables:

1. Age.Group : 1-9 (1 corresponds to teen, 2 corresponds to twenties, etc.) (numerical)
2. Gender : 1 if male, 0 if female (numerical)
3. Tenure : Number of years with the company (numerical)
4. Num.Of.Products : Number of products owned (numerical)
5. Is.Active.Member : 1 if active member, 0 if inactive member (numerical)
6. Staying : Fraction of employees that stayed with the company for a given set of predicting variables

Note: Please do not treat any variables as categorical.

Read the data

A data.frame: 6 × 8

      Age.Group  Gender  Tenure  Num.Of.Products  Is.Active.Member  Stay   Employees  Staying
      <int>      <int>   <int>   <int>            <int>             <int>  <int>      <dbl>
    1 2          1       3       1                0                 5      11         0.4545455
    2 2          1       4       1                0                 5      10         0.5000000
    3 2          1       4       1                1                 2      13         0.1538462
    4 2          0       7       1                0                 3      10         0.3000000
    5 2          1       7       1                0                 2      14         0.1428571
    6 2          0       4       2                0                 4      12         0.3333333

Question 1: Fitting a Model - 6 pts

Fit a logistic regression model using Staying as the response variable with Num.Of.Products as the predictor and logit as the link function. Call it model1 .

(a) 2 pts - Display the summary of model1. What are the model parameters and estimates?

    Call:
    glm(formula = Staying ~ Num.Of.Products, family = "binomial",
        data = data, weights = Employees)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -4.2827  -1.4676  -0.1022   1.4490   4.7231

    Coefficients:
                    Estimate Std. Error z value Pr(>|z|)
    (Intercept)       2.1457     0.1318   16.27   <2e-16 ***
    Num.Of.Products  -1.7668     0.1031  -17.13   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 981.04  on 157  degrees of freedom
    Residual deviance: 632.04  on 156  degrees of freedom
    AIC: 1056.8

    Number of Fisher Scoring iterations: 4

The model parameters are β0 and β1, whose estimates are:

β0 : 2.1457
β1 : -1.7668

Because logistic regression has no error term, there is no variance parameter to estimate, unlike in standard linear regression.

(b) 2 pts - Write down the equation for the odds of staying.

The equation for the odds of staying, where p is the probability of staying, is

    p / (1 - p) = exp(2.1457 - 1.7668 × Num.Of.Products)

(c) 2 pts - Provide a meaningful interpretation for the coefficient for Num.Of.Products with respect to the log-odds of staying and the odds of staying.

With a 1-unit increase in Num.Of.Products, the log-odds of an employee staying with the company decrease by 1.7668. Equivalently, a 1-unit increase in Num.Of.Products multiplies the odds of staying by exp(-1.7668) ≈ 0.171, so the odds decrease by about 82.9%. This is found as

    1 - exp(-1.7668) ≈ 0.829

Question 2: Inference - 9 pts

(a) 3 pts - Using model1, find a 90% confidence interval for the coefficient for Num.Of.Products .

    Waiting for profiling to be done...
    5 %: -1.93836096298482
    95 %: -1.59896517002452

The 90% confidence interval for the coefficient for Num.Of.Products is approximately (-1.94, -1.60).

(b) 3 pts - Is model1 significant overall? How do you come to your conclusion?

    0

We use a chi-square test to compare the fitted model to the null model. The test statistic is the drop in deviance, 981.04 - 632.04 = 349.00, on 157 - 156 = 1 degree of freedom. The calculated p-value shown above is essentially 0, so we reject the null hypothesis that the coefficient for Num.Of.Products is zero: the model is statistically significant overall.
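
The odds interpretation in Question 1(c) and the overall test in Question 2(b) can be checked from the printed summary alone, without the data file. The following is a minimal sketch under that assumption: the numbers are copied from the summary output as displayed (so results are only as precise as that rounding), and the variable names and the helper odds_staying are hypothetical, not part of the original notebook. The Wald interval is included only as a rough cross-check; it is not the profile-likelihood interval reported in Question 2(a).

```r
# Values copied from the printed summary of model1 (rounded as displayed).
b0 <- 2.1457                  # intercept estimate
b1 <- -1.7668                 # Num.Of.Products estimate
se_b1 <- 0.1031               # standard error of b1
null_dev <- 981.04;  null_df <- 157
resid_dev <- 632.04; resid_df <- 156

# Hypothetical helper: fitted odds of staying for x products.
odds_staying <- function(x) exp(b0 + b1 * x)

# Multiplicative change in the odds per extra product, and the % decrease.
odds_ratio <- exp(b1)
pct_decrease <- 100 * (1 - odds_ratio)

# Overall significance: deviance (likelihood-ratio) chi-square test.
lr_stat <- null_dev - resid_dev
lr_df <- null_df - resid_df
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Rough 90% Wald interval for b1 (an approximation, for comparison only).
wald_ci <- b1 + c(-1, 1) * qnorm(0.95) * se_b1

print(c(odds_ratio = odds_ratio, pct_decrease = pct_decrease))
print(c(lr_stat = lr_stat, lr_df = lr_df, p_value = p_value))
print(wald_ci)
```

Using lower.tail = FALSE asks pchisq for the upper tail directly. Computing 1 - pchisq(...) instead would round to exactly 0 for a statistic this large, which may be why the output above prints 0.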
