Prelim_1_Cheat_Sheet - Chap 1 2 Categorical Variables#s are...

Chap. 1, 2
Categorical Variable- numbers are just labels and their values are arbitrary
Quantitative Variable- measured in units (income, height, weight, age)
Ordinal- values are ordered, so not purely categorical, but not quite quantitative
Context- ideally tells Who, What, How, Where, When, Why
Data- systematically recorded information, whether numbers or labels, together with its context
Case- an individual about whom or which we have data
Variable- holds information about the same characteristic for many cases

Chap. 3
Simpson’s Paradox- when averages are taken across different groups, the combined averages can appear to contradict the averages within each group
Frequency table- lists the categories in a categorical variable and gives the counts or percentages of observations in each category
Distribution- gives the possible values and relative frequency of the variable
Area Principle- each data value should be represented by the same amount of area
Contingency table- displays counts and percentages of individuals falling into named categories on 2 or more variables
Marginal distribution- in a contingency table, the distribution of either variable alone. The counts or percentages are the totals found in the margins of the table.
Conditional distribution- the distribution of a variable when the Who is restricted to a smaller group of individuals
Independence- variables are independent if the conditional distribution of one variable is the same for each category of the other

Chap. 4
Histogram- uses adjacent bars to show the distribution of values in a quantitative variable. Each bar represents the frequency of values falling in an interval of values.
Stem and Leaf- shows quantitative data values in a way that sketches the distribution of the data
Dotplot- graphs a dot for each case against a single axis
Shape- to describe the shape, look at:
-single v. multiple modes
-symmetry v. skewness
-outliers, clusters, or gaps
Mode- a hump or local high point in the shape of the distribution of a variable
Uniform- roughly flat distribution

Chap. 5
IQR- difference between the quartiles: IQR = Upper Q - Lower Q. A reasonable summary of spread, but because it uses only the 2 quartiles, it ignores much of the information about how individual values vary.
Variance- the sum of squared deviations from the mean, divided by the count minus 1 *Formula on Back
Standard Deviation- the square root of the variance *Formula on Back
Center- the mean or median
Median- the middle value, with half of the data above and half below
Mean- the sum of all the data values divided by the count
Spread- standard deviation, IQR, and range
Range- Range = Max - Min
5-Number Summary- the min, the max, Q1, Q3, and the median
Boxplot- displays the 5-number summary as a central box with whiskers
SHAPE, CENTER, and SPREAD
-If shape is skewed, report the median & IQR. The fact that the mean and median don’t agree is a sign that the distribution may be skewed.
-If shape is symmetric, report the mean and standard deviation and possibly the
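
The Chap. 5 definitions of center and spread can be checked on a small data set. The sketch below is illustrative only: it uses Python's standard statistics module, the helper names (five_number_summary, describe) and the ages data are made up for this example, and the quartiles follow the module's default "exclusive" method, which may differ slightly from the convention used in the course.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, Q2, Q3.
    # Assumption: the default "exclusive" method; textbooks vary on this.
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, statistics.median(ordered), q3, ordered[-1]


def describe(values):
    """Collect the center and spread measures defined in the notes."""
    lo, q1, med, q3, hi = five_number_summary(values)
    return {
        "mean": statistics.mean(values),          # sum of values / count
        "median": med,                            # middle value
        "variance": statistics.variance(values),  # squared deviations / (count - 1)
        "std_dev": statistics.stdev(values),      # square root of the variance
        "iqr": q3 - q1,                           # Upper Q - Lower Q
        "range": hi - lo,                         # Max - Min
        "five_number": (lo, q1, med, q3, hi),
    }


# Hypothetical data: one large value skews the distribution to the right.
ages = [21, 22, 22, 23, 24, 25, 60]
summary = describe(ages)
for name, value in summary.items():
    print(name, value)
```

With this data the single large value pulls the mean above the median, which is the disagreement the notes describe as a sign of skewness. In that case the median and IQR are the summaries to report.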

This note was uploaded on 02/27/2008 for the course ILRST 2120 taught by Professor Vellemanp during the Fall '06 term at Cornell.
