chap4-b - Business Statistics (BUSA 3101) Dr. Lari H....


Business Statistics (BUSA 3101)
Dr. Lari H. Arjomand
lariarjomand@clayton.edu
Slide 1

Chapter 4 (Part B)
Descriptive Statistics: Numerical Measures
• Measures of Distribution Shape, Relative Location, and Detecting Outliers
• Exploratory Data Analysis
• Measures of Association Between Two Variables
• The Weighted Mean and Working with Grouped Data
Slide 2

Measures of Distribution Shape, Relative Location, and Detecting Outliers
• Distribution Shape
• z-Scores
• Chebyshev's Theorem
• Empirical Rule
• Detecting Outliers
Slide 3

Distribution Shape: Skewness
• An important measure of the shape of a distribution is called skewness.
• The formula for computing skewness for a data set is somewhat complex.
• Skewness can be easily computed using statistical software.
Slide 4

Distribution Shape: Skewness
Pearson's Coefficient of Skewness
\[ SK = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}, \qquad -3 \le SK \le +3 \]
[Figure: three distribution shapes, with the measures ordered left to right. Left-skewed: Mean, Median, Mode. Symmetric: Mean = Median = Mode. Right-skewed: Mode, Median, Mean.]
Slide 5

Distribution Shape: Skewness
• Symmetric (not skewed, SK = 0)
  • If skewness is zero, the mean and median are equal.
[Figure: relative frequency histogram (vertical axis 0 to .35), Skewness = 0]
Slide 6

Distribution Shape: Skewness
• Moderately Skewed Left
  • If skewness is negative (left-skewed, \( -3 \le SK < 0 \)), the mean will usually be less than the median.
[Figure: relative frequency histogram (vertical axis 0 to .35), Skewness = -.31]
Slide 7

Distribution Shape: Skewness
• Moderately Skewed Right
  • If skewness is positive (right-skewed, \( 0 < SK \le +3 \)), the mean will usually be more than the median.
[Figure: relative frequency histogram (vertical axis 0 to .35), Skewness = .31]
Slide 8

Distribution Shape: Skewness
• Highly Skewed Right
  • Skewness is positive (often above 1.0).
  • Mean will usually be more than the median.
[Figure: relative frequency histogram (vertical axis 0 to .35), Skewness = 1.25]
Slide 9

Distribution Shape: Skewness
• Example: Apartment Rents
Seventy efficiency apartments were randomly sampled in a small college town. The monthly rent prices for these apartments are listed in ascending order on the next slide.
Slide 10

Distribution Shape: Skewness
Monthly rents ($), in ascending order down each column:
425  440  450  465  480  510  575
430  440  450  470  485  515  575
430  440  450  470  490  525  580
435  445  450  472  490  525  590
435  445  450  475  490  525  600
435  445  460  475  500  535  600
435  445  460  475  500  549  600
435  445  460  480  500  550  600
440  450  465  480  500  570  615
440  450  465  480  510  570  615
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80 \]
\[ s = \sqrt{s^2} = \sqrt{2996.16} = 54.74 \]
Median = (475 + 475)/2 = 475
Mode = 450
Skewness = .87
Slide 11

Distribution Shape: Skewness
Example (Continued)
SK = 3(Mean - Median) / Standard deviation
SK = 3(490.80 - 475) / 54.74 = 0.87
Slide 12

Distribution Shape: Skewness
[Figure: relative frequency histogram of the apartment rents (vertical axis 0 to .35), Skewness = .87]
Slide 13

z-Score or Standardized Value
The z-score is often called the standardized value. It denotes the number of standard deviations a data value \( x_i \) is from the mean.
\[ z_i = \frac{x_i - \bar{x}}{s} \]
Slide 14

z-Scores
• An observation's z-score is a measure of the relative location of the observation in a data set.
• A data value less than the sample mean will have a z-score less than zero (the z-score is negative).
• A data value greater than the sample mean will have a z-score greater than zero (the z-score is positive).
• A data value equal to the sample mean will have a z-score of zero.
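
The apartment-rent calculation above can be checked with a short script. This is a minimal sketch that assumes Python and its standard-library `statistics` module; the variable names (`rents`, `sk`) are illustrative and do not come from the slides.

```python
import statistics

# Monthly rents ($) for the 70 sampled apartments, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean = statistics.mean(rents)      # sum of the values divided by n
median = statistics.median(rents)  # average of the 35th and 36th values
mode = statistics.mode(rents)      # most frequent value
s = statistics.stdev(rents)        # sample standard deviation (n - 1 divisor)

# Pearson's coefficient of skewness: SK = 3(mean - median) / s
sk = 3 * (mean - median) / s

print(f"mean={mean:.2f} median={median} mode={mode} s={s:.2f} SK={sk:.2f}")
```

A positive SK, as here, agrees with the mean lying above the median.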
Slide 15

z-Scores or Standardized Values Example
• z-Score of Smallest Value (425)
\[ z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20 \]
Standardized Values for Apartment Rents
-1.20  -0.93  -0.75  -0.47  -0.20   0.35   1.54
-1.11  -0.93  -0.75  -0.38  -0.11   0.44   1.54
-1.11  -0.93  -0.75  -0.38  -0.01   0.62   1.63
-1.02  -0.84  -0.75  -0.34  -0.01   0.62   1.81
-1.02  -0.84  -0.75  -0.29  -0.01   0.62   1.99
-1.02  -0.84  -0.56  -0.29   0.17   0.81   1.99
-1.02  -0.84  -0.56  -0.29   0.17   1.06   1.99
-1.02  -0.84  -0.56  -0.20   0.17   1.08   1.99
-0.93  -0.75  -0.47  -0.20   0.17   1.45   2.27
-0.93  -0.75  -0.47  -0.20   0.35   1.45   2.27
Slide 16

Chebyshev's Theorem
At least \( 1 - 1/z^2 \) of the items in any data set will be within z standard deviations of the mean, where z is any value greater than 1.
In other words, Chebyshev's theorem indicates that for any set of observations (sample or population), the proportion of the values that lie within z standard deviations of the mean is at least \( 100(1 - 1/z^2)\% \), where z is any constant greater than 1.
Slide 17

Chebyshev's Theorem
• At least 75% of the data values must be within z = 2 standard deviations of the mean.
• At least 89% of the data values must be within z = 3 standard deviations of the mean.
• At least 94% of the data values must be within z = 4 standard deviations of the mean.
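
The z-score formula and the Chebyshev bound above are both one-line computations. The sketch below is illustrative only: the function names `z_score` and `chebyshev_min_fraction` are hypothetical helpers, not part of the slides, and the mean and standard deviation are the rounded values from the rent example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def chebyshev_min_fraction(z):
    """Minimum fraction of values within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem applies only for z > 1")
    return 1 - 1 / z**2


# Rounded sample mean and standard deviation from the apartment-rent example.
mean, sd = 490.80, 54.74

print(f"z-score of the smallest rent (425): {z_score(425, mean, sd):.2f}")
for z in (2, 3, 4):
    share = chebyshev_min_fraction(z)
    print(f"z = {z}: at least {share:.0%} of values within {z} standard deviations")
```

The bound holds for any data set, so it is usually much looser than what a specific sample shows.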
Slide 18

Chebyshev's Theorem
For example, let z = 1.5 with \( \bar{x} = 490.80 \) and s = 54.74.
At least \( 1 - 1/(1.5)^2 = 1 - 0.44 = 0.56 \), or 56%, of the rent values must be between
\[ \bar{x} - z(s) = 490.80 - 1.5(54.74) = 409 \]
and
\[ \bar{x} + z(s) = 490.80 + 1.5(54.74) = 573 \]
Slide 19

Empirical Rule
For data having a bell-shaped distribution:
• 68.26% of the values of a normal random variable are within +/- 1 standard deviation of its mean.
• 95.44% of the values of a normal random variable are within +/- 2 standard deviations of its mean.
• 99.72% of the values of a normal random variable are within +/- 3 standard deviations of its mean.
Slide 20

Empirical Rule
[Figure: bell curve over x, with 68.26% of the area between µ - 1σ and µ + 1σ, 95.44% between µ - 2σ and µ + 2σ, and 99.72% between µ - 3σ and µ + 3σ]
Slide 21

Example
• Q: Suppose we give an exam to 100 students. How many students will score within two standard deviations of the mean?
• A: If the scores are bell-shaped, 95.44% of 100, or about 95 students. Only about 5 students will have scores more than two standard deviations from the mean.
Slide 22

Detecting Outliers
• An outlier is an unusually small or unusually large value in a data set.
• A data value with a z-score less than -3 or greater than +3 might be considered an outlier.
• It might be:
  • an incorrectly recorded data value
  • a data value that was incorrectly included in the data set
  • a correctly recorded data value that belongs in the data set
Slide 23

Detecting Outliers
• The most extreme z-scores are -1.20 and 2.27.
• Using |z| > 3 as the criterion for an outlier, there are no outliers in this data set.
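
The outlier screen and the z = 1.5 Chebyshev example can both be applied directly to the 70 rents. The following is a sketch under stated assumptions: it uses Python's standard-library `statistics` module, and the names `outliers`, `within`, and `bound` are illustrative rather than taken from the slides.

```python
import statistics

# Monthly rents ($) for the 70 sampled apartments, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean = statistics.mean(rents)
sd = statistics.stdev(rents)  # sample standard deviation
z_scores = [(x - mean) / sd for x in rents]

# Flag any value whose z-score is below -3 or above +3.
outliers = [x for x, z in zip(rents, z_scores) if abs(z) > 3]
print(f"most extreme z-scores: {min(z_scores):.2f} and {max(z_scores):.2f}")
print(f"outliers: {outliers}")

# Compare the observed share within 1.5 standard deviations
# with the minimum share guaranteed by Chebyshev's theorem.
z = 1.5
lower, upper = mean - z * sd, mean + z * sd
within = sum(lower <= x <= upper for x in rents)
bound = 1 - 1 / z**2
print(f"{within} of {len(rents)} rents lie in [{lower:.0f}, {upper:.0f}]; "
      f"Chebyshev guarantees at least {bound:.0%}")
```

The observed share is well above the guaranteed minimum, which is expected because Chebyshev's theorem makes no assumption about the shape of the distribution.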
Slide 24

Measures of Association Between Two Variables
• Covariance
• Correlation Coefficient
These concepts will be explained in more detail in Chapter 12.
Slide 25

The Weighted Mean and Working with Grouped Data
• Weighted Mean
• Mean for Grouped Data
• Variance for Grouped Data
• Standard Deviation for Grouped Data
Slide 26

Weighted Mean
• When the mean is computed by giving each data value a weight that reflects its importance, it is referred to as a weighted mean.
• In the computation of a grade point average (GPA), the weights are the number of credit hours earned for each grade.
• When data values vary in importance, the analyst must choose the weight that best reflects the importance of each value.
Slide 27

Weighted Mean
\[ \bar{x} = \frac{\sum w_i x_i}{\sum w_i} \]
where:
\( x_i \) = value of observation i
\( w_i \) = weight for observation i
Slide 28

Weighted Mean Example
During a one-hour period on a busy Friday night, fifty soft drinks were sold at the Kruzin Cafe. Compute the weighted mean of the price of the soft drinks.
Price ($)   Number sold
0.50         5
0.75        15
0.90        15
1.10        15
\[ \bar{x} = \frac{\sum w_i x_i}{\sum w_i} = \frac{0.50(5) + 0.75(15) + 0.90(15) + 1.10(15)}{5 + 15 + 15 + 15} = \frac{43.75}{50} = 0.875 \]
The weighted mean price is $0.875.
Slide 29

Grouped Data
• The weighted mean computation can be used to obtain approximations of the mean, variance, and standard deviation for grouped data.
• To compute the weighted mean, we treat the midpoint of each class as though it were the mean of all items in the class.
• We compute a weighted mean of the class midpoints using the class frequencies as weights.
• Similarly, in computing the variance and standard deviation, the class frequencies are used as weights.
Slide 30

Mean for Grouped Data
• Sample Data
\[ \bar{x} = \frac{\sum f_i M_i}{n} \]
• Population Data
\[ \mu = \frac{\sum f_i M_i}{N} \]
where:
\( f_i \) = frequency of class i
\( M_i \) = midpoint of class i
Slide 31

Sample Mean for Grouped Data
Example
Given below is the previous sample of monthly rents for 70 efficiency apartments, presented here as grouped data in the form of a frequency distribution.
Rent ($)   Frequency
420-439     8
440-459    17
460-479    12
480-499     8
500-519     7
520-539     4
540-559     2
560-579     4
580-599     2
600-619     6
Slide 32

Sample Mean for Grouped Data (Example Continued)
Rent ($)   f_i   M_i     f_i M_i
420-439     8    429.5    3436.0
440-459    17    449.5    7641.5
460-479    12    469.5    5634.0
480-499     8    489.5    3916.0
500-519     7    509.5    3566.5
520-539     4    529.5    2118.0
540-559     2    549.5    1099.0
560-579     4    569.5    2278.0
580-599     2    589.5    1179.0
600-619     6    609.5    3657.0
Total      70            34525.0
\[ \bar{x} = \frac{\sum f_i M_i}{n} = \frac{34{,}525}{70} = 493.21 \]
This approximation differs by $2.41 from the actual sample mean of $490.80.
Slide 33

Variance for Grouped Data
• For sample data
\[ s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1} \]
• For population data
\[ \sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N} \]
Slide 34

Sample Variance for Grouped Data (Example Continued)
Rent ($)   f_i   M_i     M_i - x̄   (M_i - x̄)^2   f_i (M_i - x̄)^2
420-439     8    429.5    -63.7      4058.96        32471.71
440-459    17    449.5    -43.7      1910.56        32479.59
460-479    12    469.5    -23.7       562.16         6745.97
480-499     8    489.5     -3.7        13.76          110.11
500-519     7    509.5     16.3       265.36         1857.55
520-539     4    529.5     36.3      1316.96         5267.86
540-559     2    549.5     56.3      3168.56         6337.13
560-579     4    569.5     76.3      5820.16        23280.66
580-599     2    589.5     96.3      9271.76        18543.53
600-619     6    609.5    116.3     13523.36        81140.18
Total      70                                      208234.29
(continued)
Slide 35

Sample Variance for Grouped Data (Example Continued)
• Sample Variance
\[ s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1} = \frac{208{,}234.29}{70 - 1} = 3{,}017.89 \]
• Sample Standard Deviation
\[ s = \sqrt{3{,}017.89} = 54.94 \]
This approximation differs by only $.20 from the actual standard deviation of $54.74.
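
The weighted-mean and grouped-data formulas above share one computation, which a short script makes explicit. This is a minimal sketch assuming Python; `weighted_mean` is a hypothetical helper name, and the class midpoints and frequencies are copied from the grouped rent table.

```python
from math import sqrt


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Kruzin Cafe example: prices weighted by the number of drinks sold.
prices = [0.50, 0.75, 0.90, 1.10]
sold = [5, 15, 15, 15]
cafe_mean = weighted_mean(prices, sold)

# Grouped rent data: class midpoints weighted by class frequencies.
midpoints = [429.5, 449.5, 469.5, 489.5, 509.5, 529.5, 549.5, 569.5, 589.5, 609.5]
freqs = [8, 17, 12, 8, 7, 4, 2, 4, 2, 6]
n = sum(freqs)

grouped_mean = weighted_mean(midpoints, freqs)
# Sample variance for grouped data: sum of f * (M - mean)^2 over (n - 1).
grouped_var = sum(f * (m - grouped_mean) ** 2 for m, f in zip(midpoints, freqs)) / (n - 1)
grouped_sd = sqrt(grouped_var)

print(f"weighted mean price: {cafe_mean:.3f}")
print(f"grouped mean: {grouped_mean:.2f}")
print(f"grouped variance: {grouped_var:.2f}")
print(f"grouped standard deviation: {grouped_sd:.2f}")
```

The script carries the unrounded mean through the variance calculation, so intermediate squared deviations can differ slightly from the table, which uses the mean rounded to 493.21.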
Slide 36

An Example of Grouped Data Using Excel (Student)
Given:
Rent ($)   f_i
420-439     8
440-459    17
460-479    12
480-499     8
500-519     7
520-539     4
540-559     2
560-579     4
580-599     2
600-619     6
Total      70
Use Excel to find: mean, median, mode, variance, standard deviation, range, first and third quartiles, coefficient of variation (CV), and interquartile range. Is the data skewed? Explain.
Slide 37

Using SWStat+ for Grouped Data
[Figure: SWStat+ data screen]
Slide 38

Problem to be Solved by Students
• The following is a frequency distribution of grades for a statistics examination.
Examination Grade   Frequency
40-49                3
50-59                5
60-69               11
70-79               22
80-89               15
90-99                6
• Treating these data as a sample, compute the following:
  a. Mean, median, and mode
  b. Variance, standard deviation, and coefficient of variation
  c. First, second, and third quartiles
  d. 25th, 50th, and 75th percentiles
Slide 39

Solution Using SWStat+
[Figure: SWStat+ output showing the data, the 50th percentile, and the 25th and 75th percentiles]
Slide 40

End of Chapter 4, Part B
Slide 41
...

This document was uploaded on 11/25/2011.
