chap4-a - Business Statistics (BUSA 3101) Dr. Lari H....


Business Statistics (BUSA 3101)
Dr. Lari H. Arjomand
lariarjomand@clayton.edu

Slide 1
Chapter 4 (Part A)
Descriptive Statistics: Numerical Measures
• Measures of Location
• Measures of Variability

Numerical Data Properties
• Central Tendency: Mean, Median, Mode, Midrange, Midhinge
• Variation: Range, Interquartile Range, Variance, Standard Deviation, Coeff. of Variation
• Shape: Skew, Kurtosis

Slide 2
Measures of Location
• Mean
• Median
• Mode
• Percentiles
• Quartiles
If the measures are computed for data from a sample, they are called sample statistics.
If the measures are computed for data from a population, they are called population parameters.
A sample statistic is referred to as the point estimator of the corresponding population parameter. For example, the sample mean is a point estimator of the population mean.

Slide 3
Mean
• The mean of a data set is the average of all the data values.
• As we said, the sample mean $\bar{x}$ is the point estimator of the population mean $\mu$.

Slide 4
Sample Mean $\bar{x}$
$$\bar{x} = \frac{\sum x_i}{n}$$
Numerator: sum of the values of the n observations.
Denominator: number of observations in the sample.

Slide 5
Population Mean $\mu$
$$\mu = \frac{\sum x_i}{N}$$
Numerator: sum of the values of the N observations.
Denominator: number of observations in the population.

Slide 6
Sample Mean Example: Apartment Rents
Seventy efficiency apartments were randomly sampled in a small college town. The monthly rent prices for these apartments are listed in ascending order on the next slide.
Slide 7
Sample Mean Example Continued

Monthly Rent for 70 Apartments
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Slide 8
Sample Mean Example Continued
$$\bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80$$
Data: Monthly Rent for 70 Apartments, as listed above.

Slide 9
Properties of the Arithmetic Mean
1. Every set of interval-level and ratio-level data has a mean.
2. All the values are included in computing the mean.
3. A set of data has a unique mean.
4. The mean is affected by unusually large or small data values.
5. The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero. See the next slide for an example.
$$\sum (X - \bar{X}) = 0$$

Slide 10
Illustration of Item Number 5 on Previous Slide
Consider the set of values: 3, 8, and 4. The mean is 5. So
(3 - 5) + (8 - 5) + (4 - 5) = -2 + 3 - 1 = 0.
Symbolically we write:
$$\sum (X - \bar{X}) = 0$$

Slide 11
Median
• The median of a data set is the value in the middle when the data items are arranged in ascending order.
• Whenever a data set has extreme values, the median is the preferred measure of central location.
• The median is the measure of location most often reported for annual income and property value data.
• A few extremely large incomes or property values can inflate the mean.
$$\text{Positioning Point} = \frac{n + 1}{2}$$

Slide 12
Median
For an odd number of observations:
Raw data: 26 18 27 12 14 27 19
In ascending order (7 observations): 12 14 18 19 26 27 27
The median is the middle value.
Median = 19

Slide 13
Median
For an even number of observations:
Raw data: 26 18 27 12 14 27 30 19
In ascending order (8 observations): 12 14 18 19 26 27 27 30
The median is the average of the middle two values.
Median = (19 + 26)/2 = 22.5

Slide 14
Median: Example
Averaging the 35th and 36th data values:
Median = (475 + 475)/2 = 475
Data: Monthly Rent for 70 Apartments, as listed earlier.

Slide 15
Mode
• The mode of a data set is the value that occurs with greatest frequency.
• The greatest frequency can occur at two or more different values.
• If the data have exactly two modes, the data are bimodal.
• If the data have more than two modes, the data are multimodal.

Slide 16
Mode: Example
450 occurred most frequently (7 times).
Mode = 450
Data: Monthly Rent for 70 Apartments, as listed earlier.

Slide 17
Mode: Another Example
• No Mode. Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
• One Mode. Raw Data: 6.0 4.9 6.0 8.9 6.3 4.9 4.9
• More Than 1 Mode. Raw Data: 21 28 28 41 43 43

Slide 18
Use Excel to Compute the Mean, Median, and Mode of the Following Data and Explain the Answers:
STUDENTS
Data: Monthly Rent for 70 Apartments, as listed earlier.

Slide 19
Percentiles
A percentile provides information about how the data are spread over the interval from the smallest value to the largest
value.
Admission test scores for colleges and universities are frequently reported in terms of percentiles.
• You are familiar with percentile scores on national educational tests such as the ACT and SAT, which tell you where you stand in comparison with others.
• For example, if you are in the 83rd percentile, then 83% of the test-takers scored below you and you are in the top 17% of the test-takers.

Slide 20
Percentiles Definition
• The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more.

Slide 21
Steps for Finding Percentiles
• Arrange the data in ascending order.
• Compute index i, the position of the pth percentile:
  i = (p/100)n
• If i is not an integer, round up. The pth percentile is the value in the ith position.
• If i is an integer, the pth percentile is the average of the values in positions i and i + 1.

Slide 22
80th Percentile: Example
i = (p/100)n = (80/100)70 = 56
Averaging the 56th and 57th data values:
80th Percentile = (535 + 549)/2 = 542
Data: Monthly Rent for 70 Apartments, as listed earlier.
Note: The data are in ascending order.
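The location measures and the percentile steps above can be sketched in Python. This is a minimal sketch rather than part of the original slides: the function name `percentile_by_steps` is a hypothetical helper, the rent list is typed in from the 70-apartment data, and `p` is assumed to be a whole number strictly between 0 and 100 so that the index can be computed exactly.

```python
from statistics import mean, median, mode

# Monthly rents ($) for the 70 sampled apartments, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def percentile_by_steps(values, p):
    """p-th percentile (whole number p, 0 < p < 100) using the slide's steps.

    Hypothetical helper name; not an Excel or library function.
    """
    data = sorted(values)                          # arrange in ascending order
    whole, remainder = divmod(p * len(data), 100)  # i = (p/100) n, kept exact
    if remainder:
        # i is not an integer: round up to 1-based position whole + 1,
        # which is 0-based index `whole`.
        return data[whole]
    # i is an integer: average the values in positions i and i + 1.
    return (data[whole - 1] + data[whole]) / 2


print(mean(rents))                     # sample mean
print(median(rents))                   # average of the 35th and 36th values
print(mode(rents))                     # most frequent value
print(percentile_by_steps(rents, 80))  # 80th percentile
```

Run on the rent data, the sketch reproduces the values worked out on the slides: a mean of 490.8, a median of 475, a mode of 450, and an 80th percentile of 542.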
Slide 23
80th Percentile: Example Continued
"At least 80% of the items take on a value of 542 or less." 56/70 = .8 or 80%
"At least 20% of the items take on a value of 542 or more." 14/70 = .2 or 20%

Slide 24
Use Excel to Find 80th Percentile
Excel Formula Worksheet

Row   A           B
1     Apartment   Monthly Rent ($)
2     1           525
3     2           440
4     3           450
5     4           615
6     5           480

Columns D and E: 80th Percentile   =PERCENTILE(B2:B71,.8)
Note: Rows 7-71 are not shown. It is not necessary to put the data in ascending order.

Slide 25
80th Percentile
Excel Value Worksheet
Columns A and B as on the previous slide.
Columns D and E: 80th Percentile   537.8
Note: Rows 7-71 are not shown.

Slide 26
EXAMPLE
Given the following data, use Excel to find the 25th percentile:
357 654 763 621 900 550 290 700 789 605

Slide 27
Quartiles
Quartiles are specific percentiles.
First Quartile = 25th Percentile
Second Quartile = 50th Percentile = Median
Third Quartile = 75th Percentile
• Unless the sample size is large, percentiles may not make sense, since percentiles divide the data into 100 groups. In smaller samples, we might divide the data into four groups (quartiles).
• Since almost any sample can be divided into four groups, the quartiles are important descriptive statistics to explain.

Slide 28
Third Quartile
Excel Formula Worksheet
Columns A and B as in the 80th percentile worksheet.
Columns D and E: Third Quartile   =QUARTILE(B2:B71,3)
Note: Rows 7-71 are not shown. It is not necessary to put the data in ascending order.
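The Excel value of 537.8 differs from the 542 obtained by hand because Excel's PERCENTILE and QUARTILE functions do not follow the round-up-or-average steps; they interpolate linearly between neighboring ranked values. The sketch below illustrates that rule in Python under the assumption that the fractional position is (n - 1)p counted from zero. The function name `percentile_interpolated` is hypothetical, and this is an illustration of the interpolation idea, not Excel's actual implementation.

```python
import math

# Monthly rents ($) for the 70 sampled apartments, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def percentile_interpolated(values, p):
    """Percentile for a fraction p (0 <= p <= 1) by linear interpolation.

    Hypothetical helper; assumes the 0-based fractional rank (n - 1) * p.
    """
    data = sorted(values)
    position = (len(data) - 1) * p          # fractional rank, counted from 0
    lower = math.floor(position)            # rank just below the position
    upper = min(lower + 1, len(data) - 1)   # rank just above (capped at the end)
    fraction = position - lower             # how far between the two ranks
    return data[lower] + fraction * (data[upper] - data[lower])


print(round(percentile_interpolated(rents, 0.80), 1))  # 80th percentile
print(percentile_interpolated(rents, 0.75))            # third quartile
```

For the rent data the 80th percentile falls 0.2 of the way from the 56th value (535) to the 57th value (549), giving 537.8, and the third quartile falls 0.75 of the way from 515 to 525, giving 522.5. Both conventions are in common use, so small differences between hand and software answers are expected.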
Slide 29
Third Quartile
Excel Value Worksheet
Columns A and B as in the 80th percentile worksheet.
Columns D and E: Third Quartile   522.5
Note: Rows 7-71 are not shown.

Slide 30
EXAMPLE
Given the following data, use Excel to find the second quartile:
357 654 763 621 900 550 290 700 789 605

Slide 31
Measures of Variability (Dispersion)
It is often desirable to consider measures of variability (dispersion), as well as measures of location.
For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.

Slide 32
Measures of Variability (Dispersion)
• Range
• Interquartile Range or Midspread
• Variance
• Standard Deviation
• Coefficient of Variation

Slide 33
Range
• The range of a data set is the difference between the largest and smallest data values.
• It is the simplest measure of variability.
• It is very sensitive to the smallest and largest data values.

Slide 34
Range: Example
Range = largest value - smallest value
Range = 615 - 425 = 190
Data: Monthly Rent for 70 Apartments, as listed earlier.

Slide 35
Interquartile Range or Midspread
• The interquartile range of a data set is the difference between the third quartile and the first quartile.
• It is the range for the middle 50% of the data.
• It overcomes the sensitivity to extreme data values: it is not affected by the extreme values.
Interquartile Range = Q3 - Q1

Slide 36
Interquartile Range: Example
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
Data: Monthly Rent for 70 Apartments, as listed earlier.

Slide 37
EXAMPLE
Given the following data, use Excel to find the interquartile range:
357 654 763 621 900 550 290 700 789 605

Slide 38
Variance
• The variance is a measure of variability that utilizes all the data.
• It is based on the difference between the value of each observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a population).

Slide 39
Variance
The variance is the average of the squared differences between each data value and the mean.
The variance is computed as follows:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \quad \text{for a sample}$$
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \quad \text{for a population}$$

Slide 40
Standard Deviation
• The standard deviation of a data set is the positive square root of the variance.
• It is measured in the same units as the data, making it more easily interpreted than the variance.

Slide 41
Standard Deviation
The standard deviation is computed as follows:
$$s = \sqrt{s^2} \quad \text{for a sample}$$
$$\sigma = \sqrt{\sigma^2} \quad \text{for a population}$$

Slide 42
Coefficient of Variation
The coefficient of variation indicates how large the standard deviation is in relation to the mean.
The coefficient of variation is computed as follows:
$$\frac{s}{\bar{x}} \times 100\% \quad \text{for a sample}$$
$$\frac{\sigma}{\mu} \times 100\% \quad \text{for a population}$$

Slide 43
Coefficient of Variation (Continued)
• Measure of relative dispersion
• Always a %
• CV is the standard deviation expressed as percent of the mean
• Used to compare two or more groups
• Weakness: CV is undefined if the mean is zero or if data are negative.
Thus, CV is used only for variables whose values are X >= 0.

Slide 44
Example Continued
Given the following monthly rent prices for 70 apartments, find the variance, standard deviation, and coefficient of variation. Use equations and Excel.
Data: Monthly Rent for 70 Apartments, as listed earlier.

Slide 45
Solutions
• Variance
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16$$
• Standard Deviation
$$s = \sqrt{s^2} = \sqrt{2996.16} = 54.74$$
• Coefficient of Variation
$$\frac{s}{\bar{x}} \times 100\% = \frac{54.74}{490.80} \times 100\% = 11.15\%$$
The standard deviation is about 11% of the mean.
Note that CV is the standard deviation expressed as percent of the mean.

Slide 46
Thinking Challenge Example
You're a financial analyst for Prudential-Bache Securities. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11. Describe the volatility (variation) of the stock prices.

Slide 47
EXAMPLE
Given the following data, use Excel to find the following:
357 357 654 654 763 621 900 550 290 700 789 605

Slide 48
EXAMPLE
Given the following data:
357 357 654 763 621 900 550 290 700 789 605
Use Excel to find:
A. The mean
B. The mode
C. The median
D. The 75th percentile
E. The first and the third quartile
F. The range
G. The interquartile range or midspread
H. The standard deviation
I. The coefficient of variation
If you need help with this, see the next slides.

Slide 49
A Problem Using Excel
• A private research organization studying families in various countries reported the following data for the amount of time 4-year-old children spent alone with their fathers each day.
Country    Time with Dad (minutes)
Belgium    30
Canada     44
China      54
Finland    50
Germany    36
Nigeria    42
Sweden     46
U.S.A.     42

Slide 50
A Problem Using Excel (Continued)
• Using Excel, answer the following questions and explain your answers (round all numbers to two decimal places):
• A. The mean
• B. The mode
• C. The median
• D. The 75th percentile
• E. The first and the third quartile
• F. The range
• G. The interquartile range or midspread
• H. The standard deviation
• I. The coefficient of variation
Note: All results are rounded to two decimal places.

Slide 51
Using SWStat+ (Creating Data Area)
Data Area

Slide 52
Using SWStat+ (Choose Statistics; Ungrouped Data; Choose Measures)

Slide 53
Using SWStat+ (Numerical Data, Summary Measures (Sample); Calculate)

Slide 54
Using SWStat+ (Results)

Slide 55
Using SWStat+ (Numerical Data; Percentile; Calculate)

Slide 56
Using SWStat+ (Results)

Slide 57
End of Chapter 4, Part A

Slide 58
...
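The measures of variability worked out for the apartment rents earlier in the chapter can be cross-checked in Python. This is a minimal sketch, not part of the original slides: it uses the standard library's `statistics` module, whose `variance` and `stdev` functions use the sample (n - 1) divisor, and the variable names are illustrative.

```python
from statistics import mean, stdev, variance

# Monthly rents ($) for the 70 sampled apartments, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

x_bar = mean(rents)                    # sample mean
data_range = max(rents) - min(rents)   # largest value minus smallest value
s2 = variance(rents)                   # sample variance, divisor n - 1
s = stdev(rents)                       # positive square root of the variance
cv = s / x_bar * 100                   # standard deviation as a percent of the mean

# The same variance written out from the definition on the Variance slide.
s2_by_hand = sum((x - x_bar) ** 2 for x in rents) / (len(rents) - 1)

print(f"range = {data_range}")
print(f"s^2   = {s2:.2f}")
print(f"s     = {s:.2f}")
print(f"CV    = {cv:.2f}%")
```

Rounded to two decimal places, the sketch gives a range of 190, a variance of 2,996.16, a standard deviation of 54.74, and a coefficient of variation of 11.15%, matching the solutions slide.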