CH2
* Context : who, what, when, where, why, how.
* Case : An individual we have data for.
* Categorical : Values are categories (words or numeric labels).
* Quantitative : Numerical values with units.
WCGW
* Numbers can be categorical (e.g., area codes).

CH3
* Frequency table : Table that lists the categories in a categorical variable and gives the count (or %) of observations for each category.
* Distribution : Gives the possible values of a variable and the relative frequency of each value.
* Area principle : In a stats display, each data value should be represented by the same amount of area.
* Categorical data condition : Categorical displays (bar charts, pie charts) are appropriate only for categorical variables, not quantitative ones.
* Contingency table : Table that displays counts (and sometimes %) for individuals falling into named categories on 2 or more variables. The table categorizes each individual on all variables at once, revealing patterns.
* Marginal distribution : In a contingency table, the distribution of either variable alone. The counts or %s are the totals found in the margins (last row or last column) of the table.
* Conditional distribution : Distribution of a variable when the "who" is restricted to a smaller group of individuals.
* Independence : Variables are independent if the conditional distribution of one variable is the same for each category of the other.
* Segmented bar chart : Displays the conditional distribution of a categorical variable within each category of another variable.
* Simpson's paradox : When averages are taken across different groups, they can appear to contradict the overall averages.
WCGW
* Don't violate the area principle!
* Make sure percentages add up and units match.
* Don't confuse similar-sounding percentages; read carefully.
* Don't forget to look at the variables separately.
* Don't overstate your case: most variables aren't entirely independent of each other.
* Simpson's paradox: when comparing different variables, make sure the quantities you are averaging are comparable (e.g., average elevation and population are unrelated).

CH4
* Histogram
