Session 4 Workshop - Solution
Logistic regression I: Binary logistic regression
MSBX-5130: Customer Analytics
2/11/2019

1) Objectives & setup

• Workshop task: Estimate “demand” for a potential partner
• We will use online dating data on profile views for inference
  – Website users browse profiles of potential partners
  – After viewing, they decide whether or not to send the profile owner an email
  – Outcomes = send email (1) or not (0)
  – We observe certain characteristics of the profile owner and the “match” with the browsing user
• Using these data we will demonstrate how to:
  – Estimate a binary logit model using `glm()`
  – Predict expected utilities for profiles and the probability of email contact
  – Calculate marginal effects (average effect on outcome probabilities)
• Here is the data description:

You have access to online dating profile viewing data. In total, we observe 160,000 profile views and associated outcomes (send email or not). The data are in the file `Online-Dating.RData` (the file is available on Canvas).

The variables in the dataset are:

| Variable | Description |
|---|---|
| `profile_gender` | Gender of person in profile, male or female |
| `first_contact` | 1 = first-contact e-mail sent, 0 = otherwise |
| `age` | Age of the person in the profile, in years |
| `age_older` | 1 = potential mate in profile is at least 5 years older |
| `age_younger` | 1 = potential mate in profile is at least 5 years younger |
| `looks` | Numerical looks rating |
| `height` | Height, in inches |
| `height_taller` | 1 = potential mate at least 2 inches taller |
| `height_shorter` | 1 = potential mate at least 2 inches shorter |
| `bmi` | Body mass index |
| `yrs_education` | Years of education |
| `educ_more` | 1 = potential mate has at least 2 more years of education |
| `educ_less` | 1 = potential mate has at least 2 fewer years of education |
| `income` | Annual income, in $1,000s |
| `diff_ethnicity` | 1 = potential mate has different ethnicity than browser |

Workshop task workflow

1. Setup
   1. Download data & R Markdown file
   2. Import data
   3. Subset and summarize data
2. Model estimation and comparison
   1. Simple logit model
   2. Logit model with all available regressors
3. Model prediction
   1. Baseline prediction - mean utilities (V)
      1. Using `predict()`
      2. Using matrix algebra
      3. Show equivalence of methods
   2. Baseline prediction - choice probabilities (Pr(first_contact=1))
      1. Using `predict()`
      2. Using predicted mean utilities
      3. Show equivalence of methods
4. Marginal effects
   1. Computation of marginal effects
      1. Using `maBina()`
      2. Using predicted expected utilities
   2. Application of marginal effects
      1. Average effect on email probability from 5% increase in `income`
      2. Average effect on email probability from 25% increase in `income`

1.1) Download data & R Markdown file

If you have not already done so, download the data file `Online-Dating.RData` from Canvas (available in the Session 4 module). Also download this R Markdown file, `Session4-Workshop.Rmd`.

Now launch RStudio and change the working directory to the folder where you saved these two files.
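
The working-directory step can also be done from the R console. The following is a minimal sketch, assuming the files were saved to a folder such as `~/Downloads/session4`. That path and the names `workshop_dir`, `needed_files`, and `found` are hypothetical and not part of the workshop materials, so substitute your own. It changes the working directory only if the folder exists, then reports whether the two files are visible from there.

```r
# Hypothetical download folder (an assumption, not from the workshop);
# edit this path to match where you saved the files.
workshop_dir <- "~/Downloads/session4"

# Change the working directory only if that folder actually exists.
if (dir.exists(workshop_dir)) {
  setwd(workshop_dir)
}

# Show where R is currently looking for files.
print(getwd())

# Check whether both workshop files are visible from the working directory.
needed_files <- c("Online-Dating.RData", "Session4-Workshop.Rmd")
found <- file.exists(needed_files)
names(found) <- needed_files
print(found)
```

If either entry prints `FALSE`, the working directory is probably not the download folder. In RStudio, the menu item Session > Set Working Directory > Choose Directory... sets it interactively.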