Stat 226 - Section 1.2

Section 1.2 Describing Distributions with Numbers

Recall - Exploratory Data Analysis

Use statistical tools to describe the main features of the dataset. This process is known as exploratory data analysis.

Begin by examining each variable by itself (Chapter 1). Then look at relationships between variables (Chapter 2).

Begin by visualizing important features with graphs (Section 1.1). Then focus on specifics by using numerical summaries (Section 1.2).

Recall - Describing Distributions

In Section 1.1, we defined some aspects of the overall pattern of the distribution of a quantitative variable and how to identify them on a graph. In Section 1.2, we will specify some numerical descriptions.

Shape
  Graphical: Symmetric or skewed
  Numerical?
Center
  Graphical: Histogram peak
  Numerical: Mean and median
Spread
  Graphical: Extent of histogram classes
  Numerical: Standard deviation, range, quartiles

Measuring Center - Mean

The mean is the arithmetic average of a distribution. The mean is the most commonly used measure of center.

To find the mean of a set of observations, compute the sum of their values and divide by the number of observations.

\[ \bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} \]

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

Notation

The symbol $\Sigma$ is the Greek capital letter sigma. In mathematics, it is commonly used to represent summation (add all elements together).

The subscripts $i = 1, 2, \dots, n$ on the $x_i$ are simply a way of identifying each individual distinctly. The subscripts do not necessarily indicate any kind of order.

The overbar in $\bar{x}$ is used to symbolize the mean of $x$, the variable of interest. Often, the mean is referred to as "x-bar."

Measuring Center - Median

The median $M$ of a distribution is the number such that half of the observations are smaller and half are larger. This can be thought of as the midpoint of the distribution.

How do we find the median?

1. Arrange the observations from smallest to largest.
2. If the number of observations $n$ is odd, then the median is the center observation in the ordered list. Count $(n + 1)/2$ observations from the bottom to find it.
3. If the number of observations $n$ is even, then the median is the average of the center two observations in the ordered list. Count $n/2$ and $(n/2) + 1$ observations from the bottom to find these.
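The mean formula and the three median steps above can be sketched in Python. This is only an illustration: the function names `mean` and `median` and the small sample lists are hypothetical and do not come from the course materials.

```python
def mean(xs):
    """x-bar: the sum of the observations divided by their count n."""
    return sum(xs) / len(xs)


def median(xs):
    """Median M, following the three steps in the notes."""
    ordered = sorted(xs)  # Step 1: arrange from smallest to largest
    n = len(ordered)
    if n % 2 == 1:
        # Step 2: n is odd, so take observation (n + 1) / 2 from the bottom.
        # Python lists are 0-indexed, hence the "- 1".
        return ordered[(n + 1) // 2 - 1]
    # Step 3: n is even, so average observations n / 2 and (n / 2) + 1.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


# Made-up observations, for illustration only
odd_sample = [3, 1, 4, 1, 5]   # n = 5, ordered: 1 1 3 4 5
even_sample = [3, 1, 4, 1]     # n = 4, ordered: 1 1 3 4

print(mean(odd_sample))     # 2.8
print(median(odd_sample))   # 3
print(median(even_sample))  # 2.0
```

Note that the subscripts in the formula carry no order, so `mean` works on the list as given, while `median` must sort first.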
Example - 2005-06 Tuition

2005-06 Resident Tuition at Land Grant Universities

School            Tuition ($)    School            Tuition ($)
Iowa State         5,634         Connecticut        7,912
SUNY Binghamton    5,840         Michigan State     7,923
SUNY Buffalo       5,863         Ohio State         8,082
SUNY Albany        5,880         UC Davis           8,129
Wisconsin          6,220         Illinois           8,670
Texas A&M          6,234         Minnesota          8,822
Purdue             6,458         Michigan           9,213
Indiana            7,112         Massachusetts      9,456
Virginia           7,370         Vermont           10,748
California         7,434         Penn State        11,024
Texas              7,438

Measuring Center - Tuition

[Histogram: 2005-06 Resident Tuition; horizontal axis Tuition ($), vertical axis Count]

What is the mean for the tuition dataset?
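One way to check the answer is to let Python's standard `statistics` module do the arithmetic. This is a sketch, not the worked solution from the slides (which is not part of this preview): the list below is transcribed by hand from the tuition table above, and the variable names are chosen only for illustration.

```python
import statistics

# 2005-06 resident tuition in dollars, transcribed from the table above
tuition = [
    5634, 5840, 5863, 5880, 6220, 6234, 6458, 7112, 7370, 7434, 7438,
    7912, 7923, 8082, 8129, 8670, 8822, 9213, 9456, 10748, 11024,
]

n = len(tuition)                              # 21 observations
total = sum(tuition)                          # x_1 + x_2 + ... + x_n
tuition_mean = statistics.mean(tuition)       # total / n
tuition_median = statistics.median(tuition)   # n is odd: observation (n + 1) / 2 = 11

print(n, total)                 # 21 161462
print(round(tuition_mean, 2))   # 7688.67
print(tuition_median)           # 7438
```

With these values the mean (about $7,689) is larger than the median ($7,438), which is what one would expect when a few high tuitions pull the average upward.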