Test 1

1.1 Displaying Distributions with Graphs
Individuals- the objects described by a set of data. Individuals may be people, but they may also be business firms, common stocks, or other objects.
Variable- any characteristic of an individual.
Categorical variable- places an individual into one of several groups or categories.
Quantitative variable- takes numerical values for which arithmetic operations such as adding and averaging make sense.
Distribution- tells what values a variable takes and how often it takes these values.
Pareto chart- bar graph whose categories are ordered from most frequent to least frequent.
Outlier- an individual value that falls outside the overall pattern.
Time plot- plots each observation of a variable against the time at which it was measured.
Seasonal Variation- a pattern in a time series that repeats itself at known regular intervals of time.
Exploratory Data Analysis- uses graphs and numerical summaries to describe the variables in a data set and the relations among them.

1.2 Describing Distributions with Numbers
Mean- (x̄) the average of the observations.
Median- (M) the midpoint of a distribution: half the observations are smaller and half are larger.
Five number summary- the minimum, first quartile, median, third quartile, and maximum.
Box plots- based on the five number summary; useful for comparing several distributions.
Variance and Standard Deviation- common measures of the spread about the mean as the center; the standard deviation is zero when there is no spread and gets larger as the spread increases.

1.3 The Normal Distributions
Density Curve- is always on or above the horizontal axis and has an area of exactly 1 underneath it. It describes the overall pattern of a distribution. The mean and median are equal for symmetric density curves.
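The numerical summaries from section 1.2 can be tried out on a small data set. The following is a minimal Python sketch, not part of the original notes: the two data sets are made up for illustration, and `five_number_summary` is a hypothetical helper that assumes the common textbook quartile rule (each quartile is the median of the observations on one side of the overall median's position). Other software may use slightly different quartile rules.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers.

    Assumption: Q1 is the median of the observations to the left of the
    overall median's position, and Q3 is the median of those to the right.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]             # observations left of the median's position
    upper = xs[half + (n % 2):]   # observations right of the median's position
    return (
        xs[0],
        statistics.median(lower),
        statistics.median(xs),
        statistics.median(upper),
        xs[-1],
    )


# Made-up illustration data: one symmetric set, one with a long right tail.
symmetric = [2, 4, 6, 8, 10]
skewed = [2, 4, 6, 8, 40]

for name, data in (("symmetric", symmetric), ("right-skewed", skewed)):
    print(name)
    print("  mean:", statistics.mean(data))
    print("  median:", statistics.median(data))
    print("  five number summary:", five_number_summary(data))
    # statistics.stdev divides by n - 1 (the sample standard deviation).
    print("  standard deviation:", round(statistics.stdev(data), 3))

# No spread at all gives a standard deviation of zero.
print("no spread:", statistics.stdev([5, 5, 5, 5]))
```

For the symmetric set the mean and median coincide, while the single large value in the second set raises the mean and the standard deviation but leaves the median unchanged.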
The mean of a skewed curve is located farther toward the long tail than is the median.
The 68-95-99.7 rule- in a Normal distribution, 68% of the observations fall within 1 standard deviation of the mean, 95% fall within 2 standard deviations of the mean, and 99.7% fall within 3 standard deviations of the mean.
Standardized Value- z-score = (X - mean) / standard deviation
Normal Curves (Normal Distributions)- described by a special family of bell-shaped, symmetric density curves.

2.1 Scatter plots
Explanatory Variable- usually X; it explains or even causes changes in another variable.
Response Variable- usually Y; it measures an outcome and changes in response to the explanatory variable.
Scatter plot- displays the relationship between two quantitative variables measured on the same individuals.
Form: Linear relationships, where the points show a straight-line pattern, are an important form of relationship between two variables. Curved relationships and clusters are two other forms.
Direction: If the relationship has a clear direction, we speak of either positive association (high values of the two variables tend to occur together) or negative association (high values of one variable tend to occur with low values of the other variable).
Strength: the strength of a relationship is determined by how close the points in the scatter plot lie to a simple form...
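The standardized value and the 68-95-99.7 rule from section 1.3 can be checked numerically. This is a rough Python sketch, not part of the original notes: the function names are hypothetical, the example mean and standard deviation are made up, and the area under the Normal curve is computed from the standard library's `math.erf`.

```python
import math


def z_score(x, mean, sd):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


def normal_cdf(z):
    """Area under the standard Normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return normal_cdf(k) - normal_cdf(-k)


# Made-up example: an observation of 69.5 from a distribution
# with mean 64.5 and standard deviation 2.5.
print("z-score:", z_score(69.5, 64.5, 2.5))

# Compare the exact Normal areas with the rounded 68-95-99.7 figures.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas are approximately 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.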
This note was uploaded on 06/24/2011 for the course STA 309 taught by Professor Gemberling during the Spring '07 term at University of Texas.