
# A Simulation Study on Logistic Regression

## First define a function to generate the simulated data. Use the following function:

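sim_quadratic_logistic_data = function(sample_size = 50) {
  x = rnorm(n = sample_size)                       # predictor: standard normal draws
  eta = -1 + 0.5 * x + x ^ 2                       # true linear predictor
  p = 1 / (1 + exp(-eta))                          # inverse logit gives P(y = 1)
  y = rbinom(n = sample_size, size = 1, prob = p)  # binary response
  data.frame(y, x)
}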

## You will need the following two lines in the simulation of 2000 runs and n = 50.

sim_data = sim_quadratic_logistic_data(sample_size = 50)

fit_logistic = glm(y ~ x + I(x^2), data = sim_data, family = binomial)
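Before running the full simulation, it can help to confirm that these two lines recover the true coefficients (-1, 0.5 and 1, read off `eta`) when the sample is large. The sketch below is only a sanity check under assumptions that are not part of the assignment: the sample size of 100,000 and the seed of 1 are arbitrary, the default `sample_size = 50` in the function is an assumed fix, and `big_data`, `big_fit` and `check` are illustrative names. The intervals come from `confint.default()`, which gives Wald intervals.

```r
# Data generator from the assignment (the default of 50 is an assumption)
sim_quadratic_logistic_data = function(sample_size = 50) {
  x = rnorm(n = sample_size)
  eta = -1 + 0.5 * x + x ^ 2
  p = 1 / (1 + exp(-eta))
  y = rbinom(n = sample_size, size = 1, prob = p)
  data.frame(y, x)
}

set.seed(1)  # arbitrary seed for this check only; not the assignment's seed
big_data = sim_quadratic_logistic_data(sample_size = 100000)
big_fit = glm(y ~ x + I(x^2), data = big_data, family = binomial)

# Compare the true coefficients with the estimates and 95% Wald intervals
true_beta = c(-1, 0.5, 1)
check = cbind(true = true_beta,
              estimate = coef(big_fit),
              confint.default(big_fit))
print(round(check, 3))
```

With a sample this large the estimates should sit close to the true values; with n = 50 they will vary much more from run to run, which is what the simulation is meant to show.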

set.seed(12345) # you must use this seed to initiate the 2000 simulations, but place it outside the loop

### Repeat the simulation 2000 times with sample size = 50. Then check whether the simulated results confirm the claims we typically obtain from running logistic regression through glm( ... , family = binomial). More specifically,

(1) check whether the simulated values of the coefficient of the linear term approximately follow a normal distribution;

(2) check whether the simulated values of the coefficient of the quadratic term approximately follow a normal distribution;

(3) how many of the 2000 simulated 95% confidence intervals for the coefficient of the linear term contain the true value?

(4) how many of the 2000 simulated 95% confidence intervals for the coefficient of the quadratic term contain the true value?

(5) Obtain a scatterplot of the 2000 simulated pairs of estimates of the coefficients of the linear and quadratic terms; can an ellipse capture the majority of these points? What are the implications of your conclusion?

(6) What other useful information would you be able to learn or extract from this simulation? Explain.

(7) Refer to the attached "Given_Data"; sample size = 50. Assume the same logistic model as inside the function defined above for generating the simulated data. Can we use a similar simulation to find an approximate p-value for testing whether the coefficient of the quadratic term is equal to zero for the given data? Why or why not? Carry out the work if your answer is yes; otherwise explain why not.
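Items (1) through (5) can be approached with a single loop that stores, for each run, the two slope estimates and whether each 95% interval contains its true value (0.5 for the linear term and 1 for the quadratic term, read off `eta`). The following is a minimal sketch rather than a model answer. It assumes that Wald intervals (estimate plus or minus 1.96 standard errors) are the intervals intended, that a default of `sample_size = 50` is acceptable in the generator, and that an ellipse built from the sample mean and covariance of the estimates is a reasonable one to try. The names `n_sims`, `true_beta`, `est`, `covered`, `ellipse` and `inside` are illustrative.

```r
# Data generator from the assignment (the default of 50 is an assumption)
sim_quadratic_logistic_data = function(sample_size = 50) {
  x = rnorm(n = sample_size)
  eta = -1 + 0.5 * x + x ^ 2
  p = 1 / (1 + exp(-eta))
  y = rbinom(n = sample_size, size = 1, prob = p)
  data.frame(y, x)
}

n_sims = 2000
true_beta = c(linear = 0.5, quadratic = 1)  # slopes in eta = -1 + 0.5 * x + x^2
z = qnorm(0.975)                            # multiplier for a 95% Wald interval

est = matrix(NA_real_, nrow = n_sims, ncol = 2,
             dimnames = list(NULL, names(true_beta)))
covered = matrix(NA, nrow = n_sims, ncol = 2,
                 dimnames = list(NULL, names(true_beta)))

set.seed(12345)  # required seed, set once outside the loop
for (i in seq_len(n_sims)) {
  sim_data = sim_quadratic_logistic_data(sample_size = 50)
  # Warnings (e.g. fitted probabilities of 0 or 1) are silenced here;
  # such runs show up as extreme estimates with very wide intervals.
  fit_logistic = suppressWarnings(
    glm(y ~ x + I(x^2), data = sim_data, family = binomial)
  )
  tab = coef(summary(fit_logistic))[c("x", "I(x^2)"), ]
  est[i, ] = tab[, "Estimate"]
  covered[i, ] = abs(tab[, "Estimate"] - true_beta) <= z * tab[, "Std. Error"]
}

# (1), (2): normal Q-Q plots of the simulated estimates
op = par(mfrow = c(1, 2))
qqnorm(est[, "linear"], main = "Linear term"); qqline(est[, "linear"])
qqnorm(est[, "quadratic"], main = "Quadratic term"); qqline(est[, "quadratic"])
par(op)

# (3), (4): number of the 2000 intervals that contain the true value
print(colSums(covered))

# (5): scatterplot with a 95% ellipse based on the sample mean and covariance
center = colMeans(est)
S = cov(est)
plot(est, xlab = "Linear coefficient estimate",
     ylab = "Quadratic coefficient estimate")
theta = seq(0, 2 * pi, length.out = 200)
unit_circle = cbind(cos(theta), sin(theta))
ellipse = sweep(unit_circle %*% chol(S) * sqrt(qchisq(0.95, df = 2)),
                2, center, "+")
lines(ellipse, col = "red")

# Share of points inside the ellipse (squared Mahalanobis distance below cutoff)
inside = mahalanobis(est, center, S) <= qchisq(0.95, df = 2)
print(mean(inside))
```

If some runs show near-separation, the resulting outliers inflate `S` and stretch the ellipse, so comparing `mean(inside)` with the nominal 0.95 and inspecting the Q-Q plot tails is one rough way to judge how well the normal approximation holds at n = 50.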
