# Prelim_1_Cheat_Sheet


## Chap. 1-2

- Categorical Variable: numbers are just labels and their values are arbitrary.
- Quantitative Variable: measured values with units (income, height, weight, age).
- Ordinal: values are not purely categorical but not quite quantitative.
- Context: ideally tells Who, What, How, Where, When, and Why.
- Data: systematically recorded information, whether numbers or labels, together with its context.
- Case: an individual about whom or which we have data.
- Variable: holds information about the same characteristic for many cases.

## Chap. 3

- Simpson's Paradox: when averages are taken across different groups, they can appear to be contradictory.
- Frequency table: lists the categories in a categorical variable and gives the counts or percentages of observations in each category.
- Distribution: gives the possible values and the relative frequency of each value of the variable.
- Area Principle: each data value should be represented by the same amount of area.
- Contingency table: displays counts and percentages of individuals falling into named categories on 2 or more variables.
- Marginal distribution: in a contingency table, the distribution of either variable alone. The counts or percentages are the totals found in the margins of the table.
- Conditional distribution: the distribution of a variable when the Who is restricted to a smaller group of individuals.
- Independence: variables are independent if the conditional distribution of one variable is the same for each category of the other.

## Chap. 4

- Histogram: uses adjacent bars to show the distribution of values in a quantitative variable. Each bar represents the frequency of values falling in an interval of values.
- Stem and Leaf: shows quantitative data values in a way that sketches the distribution of the data.
- Dotplot: graphs a dot for each case against a single axis.
- Shape: to describe the shape, look at
  - single vs. multiple modes
  - symmetry vs. skewness
  - outliers, clusters, or gaps
- Mode: a hump or local high point in the shape of the distribution of a variable.
- Uniform: a roughly flat distribution.

## Chap. 5

- IQR: the difference between the quartiles, IQR = Upper Q - Lower Q. It is a reasonable summary of spread, but because it uses only the 2 quartiles of the data, it ignores much of the information about how individual values vary.
- Variance: the sum of squared deviations from the mean, divided by the count minus 1. *Formula on back.
- Standard Deviation: the square root of the variance. *Formula on back.
- Center: the mean or median.
- Median: the middle value, with half of the data above and half below.
- Mean: the sum of all the data values divided by the count.
- Spread: standard deviation, IQR, and range.
- Range: Range = Max - Min.
- 5-Number Summary: the min, Q1, the median, Q3, and the max.
- Boxplot: displays the 5-number summary as a central box with whiskers.
- SHAPE, CENTER, and SPREAD:
  - If the shape is skewed, report the median and IQR. The fact that the mean and median don't agree is a sign that the distribution may be skewed.
  - If the shape is symmetric, report the mean and standard deviation and possibly the …
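
The Chap. 5 summaries can be checked on a small sample. The sketch below is only an illustration: the function name `summarize` and the sample values are made up, and the quartiles use the "inclusive" method from Python's `statistics` module, which is one of several quartile conventions and may not match the textbook's or a calculator's.

```python
import statistics


def summarize(values):
    """Return center and spread summaries for a list of numbers."""
    data = sorted(values)
    n = len(data)
    mean = sum(data) / n
    # Variance: sum of squared deviations from the mean, divided by count minus 1.
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    # Standard deviation: square root of the variance.
    std_dev = variance ** 0.5
    # Assumption: "inclusive" quartiles; other conventions give other Q1/Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "iqr": q3 - q1,               # IQR = Upper Q - Lower Q
        "range": data[-1] - data[0],  # Range = Max - Min
    }


# Made-up sample, not data from the course.
summary = summarize([2, 4, 4, 5, 7, 9, 11])
print(summary)
```

For this sample the mean (6) sits above the median (5), the kind of disagreement the sheet flags as a hint of skewness, so the median and IQR would be the summaries to report.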

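The Chap. 3 ideas of marginal distribution, conditional distribution, and independence can also be sketched in code. Everything here is hypothetical: the counts, the category names, and the helper names are invented for illustration, and the tolerance in `looks_independent` is an arbitrary choice, since real data rarely give exactly equal conditional distributions.

```python
# Hypothetical contingency table: outer keys are one categorical variable
# (class year), inner keys are another (answer to a yes/no question).
table = {
    "freshman": {"yes": 30, "no": 20},
    "senior": {"yes": 60, "no": 40},
}


def marginal_rows(counts):
    """Marginal distribution of the row variable: row totals / grand total."""
    grand_total = sum(sum(row.values()) for row in counts.values())
    return {name: sum(row.values()) / grand_total for name, row in counts.items()}


def conditional_by_row(counts):
    """Conditional distribution of the column variable within each row."""
    result = {}
    for name, row in counts.items():
        row_total = sum(row.values())
        result[name] = {col: count / row_total for col, count in row.items()}
    return result


def looks_independent(counts, tol=1e-9):
    """True if every row has the same conditional distribution (within tol)."""
    dists = list(conditional_by_row(counts).values())
    first = dists[0]
    return all(abs(d[col] - first[col]) <= tol for d in dists for col in first)


print(marginal_rows(table))
print(conditional_by_row(table))
print(looks_independent(table))
```

In the made-up table both class years answer "yes" at the same rate, so the conditional distributions match and the check passes; changing one row's counts so the rates differ makes it fail.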

## This note was uploaded on 02/27/2008 for the course ILRST 2120 taught by Professor Vellemanp during the Fall '06 term at Cornell.
