BUS708 Statistics and Data Analysis
LECTURE 03
DESCRIPTIVE STATISTICS

Outline

One categorical variable
    Summary statistics: frequency table, proportion
    Visualization: bar chart, pie chart
Two categorical variables
    Summary statistics: two-way table, difference in proportions
    Visualization: segmented or side-by-side bar chart
One quantitative variable
    Summary statistics: mean, standard deviation, median, IQR
    Visualization: dot plot, histogram, box plot
One quantitative and one categorical variables
    Summary statistics: statistics by group, difference in means
    Visualization: side-by-side box plots, dot plots, or histograms
Two quantitative variables
    Summary statistics: correlation coefficient
    Visualization: scatter plot

Textbook Section 2.1 – 2.5

Data Description

Recall last week …

Statistics: Unlocking the Power of Data Lock 5

One Categorical Variable: Proportion

The proportion in a category is found by

    $\text{proportion} = \dfrac{\text{number in category}}{\text{total sample size}}$

Proportion for a sample: $\hat{p}$ ("p-hat")
Proportion for a population: $p$

Proportion

What proportion of adults sampled do not own a cell phone?

    Android            458
    iPhone             437
    Blackberry         141
    Non Smartphone     924
    No cell phone      293
    Total             2253

    $\hat{p} = \dfrac{293}{2253} \approx 0.13$, or 13%

Proportions and percentages can be used interchangeably.

Two Categorical Variables

Look at the relationship between two categorical variables.

Data collected from university students on
1. Relationship status
2. Gender
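
For the cell-phone survey counts above, the proportion formula can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`counts`, `proportions`, `p_hat_no_phone`) are illustrative choices, not part of the lecture material.

```python
# Counts from the cell-phone survey table in the Proportion slide.
counts = {
    "Android": 458,
    "iPhone": 437,
    "Blackberry": 141,
    "Non Smartphone": 924,
    "No cell phone": 293,
}

# Total sample size is the sum of all category counts.
total = sum(counts.values())

# Sample proportion for each category: number in category / total sample size.
proportions = {category: n / total for category, n in counts.items()}

# p-hat for the category asked about on the slide.
p_hat_no_phone = proportions["No cell phone"]

print(f"n = {total}")
print(f"p-hat (no cell phone) = {p_hat_no_phone:.2f}")
```

Because every adult in the sample falls into exactly one category, the five proportions add up to 1.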

Two-Way Table

                         Female    Male    Total
    In a Relationship        32      10       42
    It’s Complicated         12       7       19
    Single                   63      45      108
    Total                   107      62      169

1. What proportion of students are in a relationship? 42/169 ≈ 25%
2. What proportion of females are in a relationship? 32/107 ≈ 30%
3. What proportion of students in a relationship are female? 32/42 ≈ 76%
4. What proportion of males are in a relationship? 10/62 ≈ 16%

Note that questions 2 and 3 use the same count (32) but different denominators: the total depends on which group the proportion is taken within.

Difference in Proportions

A difference in proportions compares the proportion in one category of a categorical variable across different levels of the other categorical variable.

Example: proportion of females in a relationship minus proportion of males in a relationship

    $\hat{p}_F - \hat{p}_M = 0.30 - 0.16 = 0.14$
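
The two-way table calculations above can be reproduced in plain Python. This is a sketch under stated assumptions: the nested-dictionary layout and the helper names `row_total` and `column_total` are hypothetical, chosen only for illustration.

```python
# Two-way table from the slide: rows = relationship status, columns = gender.
table = {
    "In a Relationship": {"Female": 32, "Male": 10},
    "It's Complicated": {"Female": 12, "Male": 7},
    "Single": {"Female": 63, "Male": 45},
}


def row_total(status):
    """Number of students with the given relationship status."""
    return sum(table[status].values())


def column_total(gender):
    """Number of students of the given gender."""
    return sum(row[gender] for row in table.values())


grand_total = sum(row_total(status) for status in table)

in_rel = table["In a Relationship"]

# 1. Proportion of all students who are in a relationship (42/169).
p_relationship = row_total("In a Relationship") / grand_total
# 2. Proportion of females who are in a relationship (32/107).
p_f = in_rel["Female"] / column_total("Female")
# 3. Proportion of students in a relationship who are female (32/42).
p_female_given_rel = in_rel["Female"] / row_total("In a Relationship")
# 4. Proportion of males who are in a relationship (10/62).
p_m = in_rel["Male"] / column_total("Male")

# Difference in proportions: females minus males.
diff = p_f - p_m

print(f"p(in a relationship)          = {p_relationship:.2f}")
print(f"p(in a relationship | female) = {p_f:.2f}")
print(f"p(female | in a relationship) = {p_female_given_rel:.2f}")
print(f"p(in a relationship | male)   = {p_m:.2f}")
print(f"p_F - p_M                     = {diff:.2f}")
```

The code subtracts the unrounded proportions, whereas the slide subtracts the rounded values 0.30 and 0.16. Both give 0.14 to two decimal places.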

Difference in Proportions

$\hat{p}_R$ = proportion of students in a relationship who are female
$\hat{p}_S$ = proportion of single students who are female

Find the difference in proportions: $\hat{p}_R - \hat{p}_S$

                         Female    Male    Total
    In a Relationship        32      10       42
    It’s Complicated         12       7       19
    Single                   63      45      108
    Total                   107      62      169

    $\hat{p}_R - \hat{p}_S = \dfrac{32}{42} - \dfrac{63}{108} = 0.762 - 0.583 = 0.179$

One Quantitative Variable

Recall from last week: to describe a quantitative variable, we need to consider:
Shape