STA309 Exam I Cheat Sheet

Examining distributions with graphs:

Variables: the characteristics of the individuals. Individuals may be people or objects. A variable can take different values for different individuals.
o Quantitative variables: take numerical values for which arithmetic makes sense.
o Categorical variables: record which category a person or thing falls into.

Distributions: describe a variable’s pattern of variation. A distribution tells the values the variable takes and how often they occur.

Categorical variables: the distribution lists the categories and gives either the count or the percent of individuals in each category.

Quantitative variables:
o Histograms: Divide the range of the data into classes of equal width. Count the number of observations in each class. In the data table, rows = cases and columns = variables. Use Excel's histogram only for shape; the data it shows are very wrong, esp. in the 1st interval. Skewness: (+) right skewed; (-) left skewed; > 1 pretty skewed.
o Stem plots: Separate each observation into a stem (all but the final digit) and a leaf (the final digit). Write the stems from top (smallest) to bottom (largest). Write each leaf to the right of its stem in increasing order. Advantages of stem plots: used for small data sets; quicker to make; present more detailed information.
o Time plots: Illustrate measurements taken over time. X-axis = time; Y-axis = measurement. Used to describe trends, seasonality, fluctuations, and cycles.

Examining distributions: Look for the overall pattern and striking deviations from the pattern (such as outliers).
o Use histograms and stem plots to describe shape:
  Symmetric: right and left sides are approximately mirror images (don't expect perfection in real data).
  Skewed: one tail extends longer than the other. Right skewed = long right tail (quite common).
  Bimodal: two humps (ex. Old Faithful’s eruptions).

Describing distributions with numbers:

Comparing median and mean: The mean is influenced by extreme observations and the median is not. The median is said to be resistant because it is not affected much by outliers.
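The comparison of mean and median can be checked with a minimal sketch. The data below are made up for illustration, and the use of Python's standard `statistics` module is an assumption of this example, not something the notes prescribe.

```python
from statistics import mean, median

# Hypothetical data, invented for illustration: five typical values.
values = [10, 12, 13, 15, 20]
# The same data plus one extreme observation (an outlier).
with_outlier = values + [200]

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(f"without outlier: mean={mean_before}, median={median_before}")
print(f"with outlier:    mean={mean_after}, median={median_after}")
```

With these numbers the mean jumps from 14 to 45, while the median only moves from 13 to 14.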
For a symmetric distribution, mean = median; otherwise, the median is the better measure of center because it is “resistant”.

Measuring the spread:
o Quartiles: The pth percentile of a distribution is the value such that p percent of the observations fall at or below it. The median is just the 50th percentile, the first quartile (Q1) is the 25th percentile, and the third quartile (Q3) is the 75th percentile.
o IQR: the range of the middle half of the data. IQR = Q3 - Q1. The IQR is resistant to outliers.
o Standard deviation: the most common measure of spread. It has the same units as the original measurements and is strongly influenced by outliers.

Summarizing a Distribution:
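As a recap of the spread measures (quartiles, IQR, standard deviation), here is a minimal sketch. Several things in it are assumptions: the data are invented, the helper names `quartiles` and `iqr` are hypothetical, the quartiles are taken as the medians of the lower and upper halves (one common textbook convention; software often interpolates and can give slightly different values), and the standard deviation is the sample version with an n - 1 divisor.

```python
from statistics import median, stdev

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumed convention: for an odd number of observations the middle
    value is left out of both halves. Needs at least two observations.
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]              # observations below the median's position
    upper = xs[len(xs) - half:]    # observations above the median's position
    return median(lower), median(upper)

def iqr(data):
    """Interquartile range: IQR = Q3 - Q1."""
    q1, q3 = quartiles(data)
    return q3 - q1

# Hypothetical data: the second list replaces the largest value with an outlier.
clean = [2, 4, 4, 5, 7, 8, 9]
with_outlier = [2, 4, 4, 5, 7, 8, 90]

q1, q3 = quartiles(clean)
iqr_clean, iqr_outlier = iqr(clean), iqr(with_outlier)
sd_clean, sd_outlier = stdev(clean), stdev(with_outlier)  # sample SD (n - 1)

print(f"Q1={q1}, Q3={q3}, IQR={iqr_clean}")
print(f"IQR with outlier={iqr_outlier}")
print(f"SD: {sd_clean:.2f} -> {sd_outlier:.2f}")
```

In this example the IQR is 4 with or without the outlier, while the standard deviation grows more than tenfold, which matches the notes' point about which measures are resistant.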