HOLLOMAN'S AP STATISTICS, APS NOTES 01, PAGE 1 OF 13

Exploratory Data Analysis

Variables

Definition

The big idea of statistics is that we have a question about some large group (the population) that can be answered through measurement. The characteristic that we measure is called the variable. Perhaps we want to know the average mass of a kumquat: mass is the variable. Perhaps we want to know the proportion of pink VW's in the U.S.: color is the variable. The things that we measure (kumquats, VW's) are called individuals. The collection of all individuals is called the population.

Quantitative vs. Qualitative

Variables come in two basic categories: quantitative and qualitative (this isn't the only way to classify variables, just the only distinction that's important to us). Quantitative variables measure quantities: mass, time, charge, number, length, etc. Qualitative variables measure qualities: color, flavor, opinions, etc.

Discrete vs. Continuous

Quantitative variables can be broken down into two further categories: discrete and continuous. Discrete variables have gaps in their possible values; they can only take on discrete (certain) values. The set of Integers (ℤ) is an example of a discrete set. Discrete variables will almost always measure the number of some thing: the number of houses, the number of people, the number of cars, etc. Continuous variables have no gaps in their possible values. The set of Real numbers (ℝ) is an example of a continuous set. Continuous variables will typically measure physical phenomena: mass, length, volume, ratio, etc.

The Distribution of a Variable

Definition

The Distribution of a Variable is a list (or chart, or picture, or something similar) that shows what values the variable can take, and how often it takes each value. It turns out that most of the calculations that we'll make this year depend on knowing some things about the distributions of various variables.
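As a minimal sketch of the definition above, the following Python snippet tabulates a distribution for a small sample of a discrete variable (the number of cars per household). The data are made up for illustration, and the names `cars_per_household` and `distribution` are assumptions rather than anything taken from the notes.

```python
from collections import Counter

# Hypothetical sample: number of cars owned by each of 10 households.
# These values are invented purely for illustration.
cars_per_household = [1, 2, 2, 0, 1, 3, 2, 1, 2, 1]


def distribution(values):
    """Return a dict mapping each observed value to how often it occurs,
    ordered from the smallest value to the largest."""
    counts = Counter(values)
    return dict(sorted(counts.items()))


dist = distribution(cars_per_household)

# Print a simple text chart: one asterisk per observation.
for value, count in dist.items():
    print(f"{value}: {'*' * count} ({count})")
```

The dictionary is the "list" form of the distribution, and the printed rows of asterisks are a crude "picture" of the same information.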
Main Points

There are three main features of a distribution that we want to know: center, spread and shape.

The center can be described as the typical value of the variable, or the most common value, or, well, there are lots of ways to say this. More on this later.

The spread can be described as the range of possible values: how wide is the distribution? Again, there are many ways to say this. Again, more on this later.

The shape is a feature that can only be seen. There are two categories of shape that are important to us: symmetric and skew. Symmetric is self-explanatory. Skew means that one end of the distribution is larger (taller) than the other, with the smaller end trailing off into a longer tail. The side that is smaller is the direction of the skew. For example, the distribution in Figure 1 is Skew Right, while Figure 2 shows a fairly symmetric distribution....
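The three features above can be illustrated numerically with a short Python sketch. The sample is invented, and the particular choices here (mean and median for center, range for spread) are just some of the "many ways to say this." The notes treat shape as something to be seen in a picture; the mean-versus-median comparison in `skew_direction` is a rough rule of thumb swapped in for illustration, not a definition from the notes.

```python
import statistics

# Hypothetical sample with one unusually large value (invented data),
# e.g. waiting times in minutes.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]

# Center: two common ways to describe a "typical" value.
mean = statistics.mean(data)
median = statistics.median(data)

# Spread: the simplest description is the range (largest minus smallest).
value_range = max(data) - min(data)


def skew_direction(values):
    """Rough heuristic (an assumption, not from the notes): a long tail
    tends to pull the mean toward it, away from the median."""
    m = statistics.mean(values)
    med = statistics.median(values)
    if m > med:
        return "right"
    if m < med:
        return "left"
    return "roughly symmetric"


print("mean:", mean)
print("median:", median)
print("range:", value_range)
print("skew:", skew_direction(data))
```

A heuristic like this can mislead on small or unusual samples, so it is best used alongside a plot of the distribution rather than in place of one.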
This note was uploaded on 12/09/2011 for the course STAT 101 taught by Professor O during the Fall '08 term at Lake Land.