CH2
* Context: who, what, when, where, why, how.
* Case: An individual we have data for.
* Categorical: Values are categories (words or numeric labels).
* Quantitative: Numerical values with units.
WCGW
* Numbers can be categorical (area codes).
CH3
* Frequency table: Table that lists the categories in a categorical variable and gives the count (or %) of observations for each category.
* Distribution: Gives the possible values and the relative frequency of each value.
* Area principle: In a stats display, each data value should be represented by the same amount of area.
* Categorical data condition: Categorical displays (bar charts, pie charts) are only appropriate when the data are counts or % of individuals in categories; they aren’t good for quantitative variables.
* Contingency table: Table that displays counts and sometimes % for individuals falling into named categories on 2 or more variables. The table categorizes the individuals on all variables at once, revealing patterns.
* Marginal distribution: In a contingency table, the distribution of either variable alone is called the MD. The counts or %s are the totals found in the margins (last row or last column) of the table.
* Conditional distribution: Distribution of a variable when the who is restricted to a smaller group of individuals.
* Independence: Variables are independent if the conditional distribution of one variable is the same for each category of the other.
* Segmented bar chart: Displays the conditional distribution of a categorical variable within each category of another variable.
* Simpson’s paradox: When averages are taken across different groups, they can appear to contradict the overall averages.
WCGW
* Don’t violate the area principle!
* Make sure percentages add up and units match.
* Don’t confuse similar-sounding percentages; read carefully.
* Don’t forget to look at the variables separately.
* Don’t overstate your case: Most variables aren’t entirely independent of each other.
* Simpson’s paradox: When comparing different variables, make sure that the quantities you are averaging are comparable (e.g., average elevation and population are unrelated).
CH4
* Histogram
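Worked example for the CH3 terms (contingency table, marginal distribution, conditional distribution, independence). This is a minimal sketch in Python using only the standard library. The groups "A" and "B", the outcomes "Yes" and "No", and all the counts are made up for illustration.

```python
from collections import Counter

# Hypothetical data: each case is one individual, recorded as (group, outcome).
cases = (
    [("A", "Yes")] * 20 + [("A", "No")] * 10
    + [("B", "Yes")] * 15 + [("B", "No")] * 55
)

# Contingency table: count of individuals in every (group, outcome) cell.
table = Counter(cases)

# Margins of the table: totals for each variable on its own.
row_totals = Counter(group for group, _ in cases)
col_totals = Counter(outcome for _, outcome in cases)
n = len(cases)

# Marginal distribution of outcome: column totals as proportions of everyone.
marginal_outcome = {col: count / n for col, count in col_totals.items()}

# Conditional distribution of outcome within each group:
# restrict the "who" to one group, then take proportions.
conditional = {
    row: {col: table[(row, col)] / row_totals[row] for col in col_totals}
    for row in row_totals
}

# Independence check: every group's conditional distribution must match
# the marginal distribution of outcome.
independent = all(
    abs(conditional[row][col] - marginal_outcome[col]) < 1e-9
    for row in row_totals
    for col in col_totals
)

print(marginal_outcome)
print(conditional)
print("independent:", independent)
```

With these made-up counts, the proportion of "Yes" differs between groups A and B, so the conditional distributions don't match and the variables are not independent.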