# Q1 1/1 point graded Define a dataset using the following...

5/9/2020 Comprehension Check: Logistic Regression | 3.1: Linear Regression for Prediction | PH125.8x Courseware | edX

Q1, 1/1 point (graded)

Define a dataset using the following code:

    library(caret)    # createDataPartition()
    library(dplyr)    # %>% and slice()
    library(ggplot2)  # ggplot() and geom_density()

    # Run only the set.seed() call that matches your R version.
    set.seed(2) #if you are using R 3.5 or earlier
    set.seed(2, sample.kind="Rounding") #if you are using R 3.6 or later

    make_data <- function(n = 1000, p = 0.5,
                          mu_0 = 0, mu_1 = 2,
                          sigma_0 = 1, sigma_1 = 1){
      # Binary outcome, then one candidate feature value per class
      y <- rbinom(n, 1, p)
      f_0 <- rnorm(n, mu_0, sigma_0)
      f_1 <- rnorm(n, mu_1, sigma_1)
      # Keep the value drawn from the distribution matching each outcome
      x <- ifelse(y == 1, f_1, f_0)

      # Split the data in half into training and test sets
      test_index <- createDataPartition(y, times = 1, p = 0.5, list = FALSE)

      list(train = data.frame(x = x, y = as.factor(y)) %>% slice(-test_index),
           test = data.frame(x = x, y = as.factor(y)) %>% slice(test_index))
    }
    dat <- make_data()

Note that we have defined a variable `x` that is predictive of a binary outcome `y`:

    dat$train %>% ggplot(aes(x, color = y)) + geom_density()

Set the seed to 1, then use the `make_data()` function defined above to generate 25 different datasets with:
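
Returning to the `make_data()` definition above: one way to see why `x` is predictive of `y` is to write out the model that the function simulates. The following is a side derivation, not part of the original question. It assumes the function's default arguments (`p = 0.5`, `mu_0 = 0`, `mu_1 = 2`, `sigma_0 = sigma_1 = 1`) and applies Bayes' rule, writing $\sigma$ for the shared standard deviation.

$$
y \sim \mathrm{Bernoulli}(p), \qquad x \mid y = k \sim \mathcal{N}(\mu_k, \sigma_k^2), \quad k \in \{0, 1\}
$$

$$
\log \frac{P(y = 1 \mid x)}{P(y = 0 \mid x)}
= \log \frac{p}{1 - p} + \frac{\mu_1 - \mu_0}{\sigma^2}\, x - \frac{\mu_1^2 - \mu_0^2}{2 \sigma^2}
\qquad (\sigma_0 = \sigma_1 = \sigma)
$$

$$
p = 0.5,\ \mu_0 = 0,\ \mu_1 = 2,\ \sigma = 1
\;\Longrightarrow\;
\log \frac{P(y = 1 \mid x)}{P(y = 0 \mid x)} = 2x - 2,
\qquad
P(y = 1 \mid x) = \frac{1}{1 + e^{-(2x - 2)}}
$$

Under these assumptions the log-odds are linear in `x`, which is the form a logistic regression fits, and the two classes are equally likely at `x = 1`, midway between the two means.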