Anthony Tanbakuchi MAT167 Visualizing Data

Descriptive statistics

Definition 1.1 (descriptive statistics): summarize and describe collected data (a sample). Contrast with inferential statistics, which makes inferences and generalizations about populations.

Useful characteristics for summarizing data

Example 1 (Heights of students in class). If we are interested in the heights of students and we measure the heights of the students in this class, how can we summarize the data in useful ways?

What to look for in data

shape: are most of the data clustered around the same value, or are they spread evenly over a range of values? How are the data distributed?
center: what is the average value of the data?
variation: how much do the values vary from the center?
outliers: are any values significantly different from the rest?

Class height data

Load the class data, then look at what variables are defined.

R: load("ClassData.RData")
R: ls()
[1] "class.data"
R: names(class.data)
[1] "gender"            "height"
[3] "forearm"           "height_mother"
[5] "corrective_lenses" "hair_color"
[7] "transportation"    "work_hours"
[9] "credits"
R: class.data$height
[1] 65 68 71 66 68 65 62 68 77 62 69 70 65 63 67 73 68 70

Task: describe student heights

1. Determine the general distribution (shape) of the data.
2. Find a measure (or measures) of the center.
3. Find a measure of the variation.
4. Look for outliers.

1.1 Frequency distributions: univariate data

Definition 1.2 (univariate data): measurements made on only one variable per observation.
Definition 1.3 (bivariate data): measurements made on two variables per observation.
Definition 1.4 (multivariate data): measurements made on many variables per observation.

STANDARD FREQUENCY DISTRIBUTIONS

Definition 1.5 (frequency distribution): a table listing the frequency (number of times) data values occur in each interval.

Good first summary of data!

Steps:
1. Choose the number of classes (typically 5 to 20).
2. Calculate the class width:
   \[ \text{class width} \approx \frac{\text{max value} - \text{min value}}{\text{number of classes}} \qquad (1) \]
3. Choose a starting point, typically the minimum data value or 0.
4. List the lower class limits in the table, then the upper class limits.
5. Tally the data in each class; tick marks can be used.

Given:
R: class.data$height
[1] 65 68 71 66 68 65 62 68 77 62 69 70 65 63 67 73 68 70
(min=62, max=77, n=18)

Question 1. Construct a frequency table for the class heights:

Information in a frequency distribution

1. shape
2. center
3. variation
4. outliers

RELATIVE FREQUENCY DISTRIBUTIONS

Definition 1.6 (relative frequency distribution): a table of relative frequency counts....
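The steps for building a frequency distribution can be sketched in R for the class heights. This is a minimal sketch under stated assumptions, not the worked answer from the lecture. The choice of 5 classes, rounding the class width up to a whole number, and closing the last class so that it includes the maximum value are all illustrative choices. The heights are typed in directly rather than loaded from ClassData.RData, and the variable names are hypothetical. The last two lines divide each count by the number of observations, which is the usual meaning of a relative frequency.

```r
# Class heights, copied from the class.data$height output
heights <- c(65, 68, 71, 66, 68, 65, 62, 68, 77,
             62, 69, 70, 65, 63, 67, 73, 68, 70)

# Step 1: choose the number of classes (assumed here: 5)
num.classes <- 5

# Step 2: class width from equation (1); rounding up is an assumption
class.width <- ceiling((max(heights) - min(heights)) / num.classes)

# Step 3: start at the minimum data value
start <- min(heights)

# Step 4: class boundaries, one more boundary than there are classes
breaks <- seq(from = start, by = class.width, length.out = num.classes + 1)

# Step 5: tally the data in each class.
# right = FALSE makes each class [lower, upper);
# include.lowest = TRUE closes the last class so the maximum is counted.
classes <- cut(heights, breaks = breaks, right = FALSE, include.lowest = TRUE)
freq <- table(classes)
print(freq)

# Relative frequency: each count divided by the number of observations
rel.freq <- freq / length(heights)
print(round(rel.freq, 3))
```

With these choices the class width is 3 and the five classes hold 3, 5, 7, 2 and 1 heights, which sum to n = 18. A different number of classes or starting point would give a different but equally valid table.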
 Spring '11
 Tanbakuchi
 Statistics, StemAndLeaf Plots, Inferential Statistics

