# Review Notes highlighted - STAT 2120 Notes on Topic 1...


STAT 2120, Notes on Topic 1, Spring 2010

Introduction to Examining Distributions:

• A variable records characteristics of individuals (i.e., objects of interest) in its values.
• Classify a variable by its possible values:
  o Categorical: records group labels; numeric labels mean nothing, except possibly order.
  o Quantitative: records meaningful numbers; may be discrete or continuous.
• A time series is a record of values across time.
• A variable’s distribution describes the counts or relative proportions of its values.
• Exploratory data analysis seeks to describe distributions and relationships in data.

Displaying a distribution with graphs:

• Bar graphs and pie charts describe the distribution of a categorical variable.
  o Bar graphs emphasize counts; pie charts emphasize proportions.
  o A Pareto chart is a bar graph with categories ordered by decreasing frequency.
• Histograms are essentially bar graphs of a quantitative variable.
  o Bar widths are not absolute; use equal bar widths and “eyeball” for the best picture.
  o Look for overall pattern, shape, center, spread, deviations in shape, and “outlier” deviations.
  o A symmetric distribution is one whose histogram mirrors itself about its center.
  o A right- or left-skewed distribution shows a long tail to the right or left in its histogram.
• Stemplots are back-of-the-envelope histograms drawn with the digits of quantitative values.
  o “Stem” digits define bars; “leaf” digits display counts and sub-counts.
  o Customize by rounding digits and splitting stems.
• Time plots graph time series values by time.
  o Emphasize patterns of change over time, such as trends and seasonal variations.
  o Some time plots are seasonally adjusted.

Describing distributions with numbers:

• Denote by $x_1, x_2, \ldots, x_n$ the values of $n$ observations.
• The $p$th percentile is a number such that $p$ percent of values fall on or below it.
• Describe a distribution with numerical summaries of shape, center, and spread.
• A summary is resistant if it is insensitive to changes in skewness or extreme values.
• Measure of center: mean, $\bar{x}$
  o $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, the arithmetic average.
  o $\bar{x}$ is not resistant.
• Measure of center: median, $M$
  o $M$ is the 50th percentile.
  o Calculate $M$ as the middle value or the average of the two middle values.
  o $M$ is resistant.
• Measure of spread: extreme values
  o Smallest and largest values.
  o Extreme values are not resistant.
• Measure of spread: quartiles, $Q_1$ and $Q_3$
  o $Q_1$ is the 25th percentile; $Q_3$ is the 75th percentile.
  o Calculate $Q_1$ and $Q_3$ as medians of the values falling to the left or right of (but not on) $M$.
  o $Q_1$ and $Q_3$ are resistant.
• Measure of spread: standard deviation, $s$
  o $s = \sqrt{s^2}$, where $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$, a rescaled average of squared deviations from $\bar{x}$.
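
The numerical summaries above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names and the sample data are assumptions chosen for the example. The quartile rule follows the notes (medians of the values to the left and right of the median's position, excluding the median itself), and the standard deviation divides by n - 1. Other references and software packages use slightly different quartile conventions, so results may not match every tool.

```python
import math


def mean(values):
    """Arithmetic average of the values."""
    return sum(values) / len(values)


def median(values):
    """Middle value, or the average of the two middle values."""
    xs = sorted(values)
    n = len(xs)
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]
    return (xs[mid - 1] + xs[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    The halves are taken by position in the sorted list. When the
    number of values is odd, the middle value is excluded from both
    halves, matching the "but not on" rule in the notes.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + (n % 2):]
    return median(lower), median(upper)


def std_dev(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(values)
    xbar = mean(values)
    s2 = sum((x - xbar) ** 2 for x in values) / (n - 1)
    return math.sqrt(s2)


# Hypothetical sample data, chosen only for illustration.
data = [1, 2, 3, 4, 5, 6, 7]
with_outlier = [1, 2, 3, 4, 5, 6, 70]  # largest value replaced by an extreme one

print(mean(data), median(data))                  # 4.0 4
print(quartiles(data))                           # (2, 6)
print(round(std_dev(data), 2))                   # 2.16
print(mean(with_outlier), median(with_outlier))  # 13.0 4
```

In this example a single extreme value moves the mean from 4 to 13 while the median stays at 4, which is what it means for the median to be resistant and the mean not to be.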
