
Numerical variables: Hotel Spend, Holiday Spend, Flight Spend, Days since first book, Days since last book, Total bookings till date
  o Discrete numerical variables: Days since first book, Days since last book, Total bookings till date
  o Continuous numerical variables: Hotel Spend, Holiday Spend, Flight Spend
Categorical variables: Has an app, Discount seeker, Frequency score
  o Ordinal categorical variables: Frequency score
  o Nominal categorical variables: Has an app, Discount seeker

b. Univariate and Bi-variate analysis

Univariate analysis: This is done to get the descriptive statistics for each variable (recall the box plot and histogram from statistics, term 1)
  o Univariate analysis helps understand the central tendency (mean, median, mode) and dispersion (range, quartiles, IQR) of continuous variables
  o It helps identify outliers and missing values
  o Note: for categorical variables, we use a frequency distribution
  o Example: below is a snapshot of univariate analysis of the living area of a house from JMP
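The univariate summary described above can be sketched in a few lines of Python using only the standard library. This is an illustration rather than the handbook's JMP workflow: the sample values, the helper name describe_numeric, the use of None for a missing value, and the 1.5 x IQR rule for flagging outliers are all assumptions made for the example.

```python
from collections import Counter
import statistics


def describe_numeric(values):
    """Summarise one numeric variable; None marks a missing value."""
    present = sorted(v for v in values if v is not None)
    # Quartiles computed over the observed values only.
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: flag points beyond 1.5 * IQR from the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": median,
        "range": present[-1] - present[0],
        "iqr": iqr,
        "outliers": [v for v in present if v < low or v > high],
    }


# Hypothetical sample data, invented for illustration.
hotel_spend = [120, 135, 150, 160, 175, 180, 200, None, 950]
has_app = ["yes", "no", "yes", "yes", "no", "yes", "yes", "no", "yes"]

summary = describe_numeric(hotel_spend)   # continuous variable
app_counts = Counter(has_app)             # categorical: frequency distribution

print(summary)
print(app_counts)
```

For the continuous variable the summary reports central tendency, dispersion, the count of missing values and any flagged outliers; for the categorical variable a simple frequency count plays the same role.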
Analytics Handbook Business Technology Club | Indian School of Business

Bi-variate analysis: This is done to identify the relationship between two variables
  o For 2 continuous numerical variables, we use a scatter plot
  o For 2 categorical variables, we use stacked column charts or a chi-square test
  o For one categorical and one continuous variable, we use a Z-test (to compare two groups) or ANOVA (for more than two groups)
  o Example: below is the scatterplot of different stock returns from JMP

c. Missing value treatment and outlier detection

Missing values can be due to issues with data collection or with data extraction (e.g. while importing data in Excel)
Missing values can be treated either by imputation (filling the value with the mean/median/mode) or by removing the data row (however, this lowers the sample size)
Outliers are observations that lie far away from the rest of the values and diverge from the overall sample. Note that not all outliers are bad; sometimes outliers reveal interesting insights
Outliers can be due to data collection, data extraction, or natural variation (i.e. they are correct)
Outliers can be treated by binning (that is, grouping the data) or by imputation (similar to missing value treatment)

d. Feature engineering and variable transformations

This step involves extracting more information from the existing data
To do this, we can transform the variables or create additional variables
Variable transformation: This includes replacing a variable with its logarithm, square root, or exponential (to name a few functions)
  o Variable transformation helps change the scale and the distribution
  o For example, below is the distribution of living area from JMP. While the original data is skewed, log(living area) is approximately normally distributed. This transformation makes it easier to apply statistical models
Creation of additional variables: This is required if we believe (that is, hypothesize) that richer information can be captured from the variables.
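The imputation and log-transformation steps described above can be sketched as follows. The living-area figures are invented for illustration, the helper names impute_median and skewness are hypothetical, and median imputation is only one of the options the text lists. The sketch checks that the log transform reduces skewness; it does not claim the result is exactly normal.

```python
import math
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]


def skewness(values):
    """Population skewness: the mean of the cubed standardised deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical right-skewed sample with one missing value.
living_area = [800, 950, 1000, 1100, None, 1200, 1500, 2000, 6000]

filled = impute_median(living_area)        # missing value treatment
log_area = [math.log(v) for v in filled]   # variable transformation

print("skewness before log:", round(skewness(filled), 2))
print("skewness after log: ", round(skewness(log_area), 2))
```

Median imputation is used here because the mean is pulled upward by the large value at the end of the sample; with roughly symmetric data the mean would serve equally well.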