# Lecture #2 – Descriptive Statistics & Graphical Representations + Measures of Central Tendency & Dis


HS 324: Biostatistics
Lecture #3 – Descriptive Statistics & Graphical Representations + Measures of Central Tendency & Dispersion
Tuesday January 14th, 2020

## Frequency and Percentage Distributions

- Frequency Distribution: a table reporting the number of observations falling into each category of the variable
- Percentage Distribution: a table showing the percentage of observations falling into each category of the variable

Exposure to second-hand smoke at home by age group, 2016

| Age Group | f | % |
|---|---|---|
| 12 to 19 years | 1251 | 13.30 |
| 20 to 34 years | 1498 | 15.93 |
| 35 to 44 years | 901 | 9.58 |
| 45 to 64 years | 3818 | 40.60 |
| 65 years and older | 1935 | 20.58 |
| Total | 9403 | 100.00 |

## Proportions and Percentages

- Proportion (P): a relative frequency obtained by dividing the frequency in each category by the total number of cases
- Percentage (%): a relative frequency obtained by dividing the frequency in each category by the total number of cases and multiplying by 100
  - N = total number of cases

$$P = \frac{f}{N} \qquad \% = P \times 100$$

where f is the frequency in a category and N is the total number of cases.

## Cumulative Percentage Distribution

- Cumulative Percentage Distribution: a distribution showing the percentage at or below each category (class interval or score) of the variable

Cumulative Percentage Distribution for exposure to second-hand smoke at home by age group, 2016

| Age Group | f | % | Cum. % |
|---|---|---|---|
| 12 to 19 years | 1251 | 13.30 | 13.30 |
| 20 to 34 years | 1498 | 15.93 | 29.24 |
| 35 to 44 years | 901 | 9.58 | 38.82 |
| 45 to 64 years | 3818 | 40.60 | 79.42 |
| 65 years and older | 1935 | 20.58 | 100.00 |
| Total | 9403 | 100.00 | |

- The table tells us that the largest share of cases (40.60%) falls in the 45 to 64 (late boomer) age group.
- Exposure to second-hand smoke varies with age, and its decline may reflect the influence of no-smoking policies and anti-tobacco campaigns.

## Collapsing Variable Categories in Frequency/Percentage Distributions
- At times we will encounter frequency or percentage distributions that contain detailed categorizations which are not necessarily relevant to our analysis. We can resolve this by collapsing categories.
- Collapsing eliminates unnecessary information, making the presentation of data clearer and more efficient.
- We need to be careful, since there is a risk of eliminating too much information.

EXERCISE:

- Open CCHS 2015-16.
- Select variable DHHGAGE and run a frequency table for DHHGAGE.

1. What age categories were collapsed to make the table presented in the previous slides?
2. Can you think of a reason why these categories were collapsed in this way?
   - If there are more than 10 categories, typically try collapsing them.

## Charts and Graphs: A Summary

- For nominal variables: pie charts; bar graphs
- For ordinal variables: pie charts; bar graphs
- For interval-ratio variables: histograms; line graphs; time-series charts

## Measures of Central Tendency

- Measure of Central Tendency: a descriptive statistic that gives us the average (i.e. the most frequently occurring or middle) value in a variable's distribution
  - Mode
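
The ideas in these notes can be checked by hand, but a short script may make the arithmetic easier to follow. The sketch below is only an illustration: it uses the frequencies from the second-hand smoke table, rebuilds the % and Cum. % columns with % = (f / N) × 100, collapses the five age groups into three, and picks out the mode. The function names and the three-group scheme in `broad_groups` are assumptions made for this example and do not come from the lecture or from CCHS.

```python
# Frequencies (f) copied from the second-hand smoke table in these notes.
frequencies = {
    "12 to 19 years": 1251,
    "20 to 34 years": 1498,
    "35 to 44 years": 901,
    "45 to 64 years": 3818,
    "65 years and older": 1935,
}


def percentage_distribution(freqs):
    """Return (category, f, %, cumulative %) rows, with % = (f / N) * 100."""
    n = sum(freqs.values())  # N = total number of cases
    rows, running = [], 0
    for category, f in freqs.items():
        running += f  # cases at or below this category
        rows.append(
            (category, f, round(100 * f / n, 2), round(100 * running / n, 2))
        )
    return rows


def collapse(freqs, mapping):
    """Merge categories by summing the frequencies that share a new label."""
    collapsed = {}
    for category, f in freqs.items():
        label = mapping[category]
        collapsed[label] = collapsed.get(label, 0) + f
    return collapsed


def mode(freqs):
    """Return the category with the highest frequency."""
    return max(freqs, key=freqs.get)


# Hypothetical three-group scheme, chosen only for illustration.
broad_groups = {
    "12 to 19 years": "12 to 34 years",
    "20 to 34 years": "12 to 34 years",
    "35 to 44 years": "35 to 64 years",
    "45 to 64 years": "35 to 64 years",
    "65 years and older": "65 years and older",
}

if __name__ == "__main__":
    for row in percentage_distribution(frequencies):
        print(row)
    for row in percentage_distribution(collapse(frequencies, broad_groups)):
        print(row)
    print("Mode:", mode(frequencies))
```

Run on the original five groups, the mode is the 45 to 64 years category, which matches the largest frequency in the table. After collapsing, the totals still sum to N = 9403, but the detail within each merged group is gone, which is the trade-off the notes warn about.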