chap2 & 3-b

chap2 & 3-b - Business Statistics (BUSA 3101) Dr....


Business Statistics (BUSA 3101)
Dr. Lari H. Arjomand
lariarjomand@clayton.edu

Slide 1

Chapter 2 & 3 (Part B)
Descriptive Statistics: Tabular and Graphical Presentations
• Exploratory Data Analysis
• Crosstabulations and Scatter Diagrams
Descriptive Statistics: These are statistical methods used to describe data that have been collected.

Slide 2

Exploratory Data Analysis
The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.
One such technique is the stem-and-leaf display.
Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Slide 3

Stem-and-Leaf Display
A stem-and-leaf display shows both the rank order and shape of the distribution of the data.
It is similar to a histogram on its side, but it has the advantage of showing the actual data values.
The first digits of each data item are arranged to the left of a vertical line.
To the right of the vertical line we record the next digit for each item in rank order. When the leaf unit is not shown, it is assumed to equal 1.
Each line in the display is referred to as a stem. Each digit on a stem is a leaf.

Slide 4

Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Slide 5

Example: Hudson Auto Repair
• Sample of Parts Cost for 50 Tune-ups

 91   71  104   85   62   78   69   74   97   82
 93   72   62   88   98   57   89   68   68  101
 75   66   97   83   79   52   75  105   68  105
 99   79   77   71   79   80   75   65   69   69
 97   72   80   67   62   62   76  109   74   73

First step is to rearrange these data in rank order. See next slide.
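The rank-ordering step and the stem-and-leaf construction described above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function names `stem_and_leaf` and `format_display` are hypothetical, and the sketch assumes whole-number data with a leaf unit of 1, so the last digit of each value is the leaf and the leading digit(s) form the stem.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole-number values into {stem: [leaves]} (leaf unit assumed to be 1)."""
    rows = defaultdict(list)
    for value in sorted(values):        # rank order first, so leaves come out sorted
        stem, leaf = divmod(value, 10)  # leading digit(s) and final digit
        rows[stem].append(leaf)
    return dict(rows)


def format_display(rows):
    """Render one text line per stem: stem, vertical line, then the leaves."""
    lines = []
    for stem in sorted(rows):
        leaves = " ".join(str(leaf) for leaf in rows[stem])
        lines.append(f"{stem:>2} | {leaves}")
    return "\n".join(lines)


# Parts costs for the 50 Hudson Auto Repair tune-ups, as listed on the slide.
parts_cost = [
    91, 71, 104, 85, 62, 78, 69, 74, 97, 82,
    93, 72, 62, 88, 98, 57, 89, 68, 68, 101,
    75, 66, 97, 83, 79, 52, 75, 105, 68, 105,
    99, 79, 77, 71, 79, 80, 75, 65, 69, 69,
    97, 72, 80, 67, 62, 62, 76, 109, 74, 73,
]

display = stem_and_leaf(parts_cost)
print(format_display(display))
```

Sorting before grouping keeps each stem's leaves in rank order, which is what lets the display show both the shape of the distribution and the actual data values.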

Slide 6

Solution: Hudson Auto Repair
• Sample of Parts Cost for 50 Tune-ups

 52   57   62   62   62   62   65   66   67   68
 68   68   69   69   69   71   71   72   72   73
 74   74   75   75   75   76   77   78   79   79
 79   80   80   82   83   85   88   89   91   93
 97   97   97   98   99  101  104  105  105  109

Data are rearranged in rank order.

Slide 7

Solution: Stem-and-Leaf Display

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Each value to the left of the line (5, 6, ..., 10) is a stem; each digit to the right of the line is a leaf.
So What? Explain!
• When the leaf unit is not shown, it is assumed to equal 1.

Slide 8

Stretched Stem-and-Leaf Display
If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).
Whenever a stem value is stated twice, the first value corresponds to leaf values of 0-4, and the second value corresponds to leaf values of 5-9.

Slide 9

Example: Hudson Auto Repair
Stretched Stem-and-Leaf Display
• Sample of Parts Cost for 50 Tune-ups

 91   71  104   85   62   78   69   74   97   82
 93   72   62   88   98   57   89   68   68  101
 75   66   97   83   79   52   75  105   68  105
 99   79   77   71   79   80   75   65   69   69
 97   72   80   67   62   62   76  109   74   73

First step is to rearrange these data in rank order. See next slide.

Slide 10

Solution: Hudson Auto Repair
• Sample of Parts Cost for 50 Tune-ups

 52   57   62   62   62   62   65   66   67   68
 68   68   69   69   69   71   71   72   72   73
 74   74   75   75   75   76   77   78   79   79
 79   80   80   82   83   85   88   89   91   93
 97   97   97   98   99  101  104  105  105  109

Data are rearranged in rank order.

Slide 11

Solution: Stretched Stem-and-Leaf Display

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9

The first value corresponds to leaf values of 0-4, and the second value corresponds to leaf values of 5-9.

Slide 12

Stem-and-Leaf Display
• Leaf Units
  • A single digit is used to define each leaf.
  • In the preceding example, the leaf unit was 1.
  • Leaf units may be 100, 10, 1, 0.1, and so on.
  • Where the leaf unit is not shown, it is assumed to equal 1.

Slide 13

Example: Leaf Unit = 0.1
If we have data with values such as
8.6  11.7  9.4  9.1  10.2  11.0  8.8
a stem-and-leaf display of these data will be

Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Slide 14

Example: Leaf Unit = 10
If we have data with values such as
1806  1717  1974  1791  1682  1910  1838
a stem-and-leaf display of these data will be

Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8. You do this for all of the data.

Slide 15

Another Example & Solution
The Dean of the School of Business at OU reports the following number of students in the 15 sections of basic statistics offered this semester. Construct a stem-and-leaf chart for the data.
27  36  29  21  24  26  32  30  36  30  28  23  17  41  19

STEM | LEAF
   1 | 7 9
   2 | 1 3 4 6 7 8 9
   3 | 0 0 2 6 6
   4 | 1

So What? Explain!

Slide 16

Crosstabulation Or Contingency Table
• Shows number of observations jointly in two categorical variables
  • Example: Male accounting student
  • Gender variable and Major variable
  • Can use categorized numerical variables
• May include row %, column %, or total %
• Helps find relationships
• Used widely in marketing

Slide 17

Crosstabulation Or Contingency Table Example
Residence:  C  C  O  O  C  C  O  O  C  O
Gender:     M  F  F  M  M  M  F  M  M  F
(C = On-Campus, O = Off-Campus; M = Male, F = Female)
Use gender as the explanatory variable.

                    Gender
Residence      Male    Female    Total
On-Campus         4         1        5
Off-Campus        2         3        5
Total             6         4       10

Slide 18

Crosstabulation Or Contingency Table Example (Row %)

                    Gender
Residence      Male      Female     Total
On-Campus      4 (80)    1 (20)     5 (100)
Off-Campus     2 (40)    3 (60)     5 (100)
Total          6 (60)    4 (40)    10 (100)

Row % = (Cell Count / Row Total)(100); for example, (3/5)(100) = 60%

Slide 19

Crosstabulation Or Contingency Table Example (Column %)

                    Gender
Residence      Male       Female      Total
On-Campus      4 (67)     1 (25)      5 (50)
Off-Campus     2 (33)     3 (75)      5 (50)
Total          6 (100)    4 (100)    10 (100)

Column % = (Cell Count / Column Total)(100); for example, (3/4)(100) = 75%

Slide 20

Crosstabulation Or Contingency Table Example (Total %)

                    Gender
Residence      Male      Female     Total
On-Campus      4 (40)    1 (10)     5 (50)
Off-Campus     2 (20)    3 (30)     5 (50)
Total          6 (60)    4 (40)    10 (100)

Total % = (Cell Count / Grand Total)(100); for example, (3/10)(100) = 30%

Slide 21

Which Percentage?
• Compute % in direction of explanatory variable
• Then, for example, if explanatory variable is in row, use row total
• In previous example, gender is explanatory variable
  • 'Explains' residence choice

Slide 22

Thinking Challenge Example
You're a marketing research analyst for Visa. You want to analyze data on credit card use & annual income. Using the following information, create a contingency table. In this example, use income as the explanatory variable.
Income (000):  12  20  32  45  72  46  18  55
Use:            Y   N   N   Y   Y   Y   N   Y
(Income categories: Under $25,000; $25,000 & over. Use categories: Y = Use credit cards; N = Don't use)

Slide 23

Solution

                        Use
Income          No        Yes       Total
Under $25K      2 (67)    1 (33)    3 (100)
$25K & Over     1 (20)    4 (80)    5 (100)
Total           3 (38)    5 (62)    8 (100)

Income is the explanatory variable and is in the rows, so the table shows row percentages; for example, (4/5)(100) = 80%

Slide 24

Crosstabulations and Scatter Diagrams
As we indicated, often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.
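As a rough illustration of the row, column, and total percentages worked through in the residence-by-gender example, the sketch below rebuilds the counts from the ten observations and converts them to each kind of percentage. It is not part of the original slides; the helper names `crosstab` and `percentages` and the `basis` argument are assumptions made for this example.

```python
from collections import Counter


def crosstab(row_values, col_values):
    """Count joint occurrences as {(row_label, col_label): count}."""
    return Counter(zip(row_values, col_values))


def percentages(counts, basis):
    """Express each cell as a percentage of its row, its column, or the grand total."""
    row_totals, col_totals = Counter(), Counter()
    for (row, col), n in counts.items():
        row_totals[row] += n
        col_totals[col] += n
    grand_total = sum(counts.values())

    result = {}
    for (row, col), n in counts.items():
        if basis == "row":
            denominator = row_totals[row]
        elif basis == "column":
            denominator = col_totals[col]
        elif basis == "total":
            denominator = grand_total
        else:
            raise ValueError("basis must be 'row', 'column', or 'total'")
        result[(row, col)] = 100 * n / denominator
    return result


# The ten observations from the example: C = On-Campus, O = Off-Campus; M = Male, F = Female.
residence = list("CCOOCCOOCO")
gender = list("MFFMMMFMMF")

counts = crosstab(residence, gender)
row_pct = percentages(counts, "row")
col_pct = percentages(counts, "column")
total_pct = percentages(counts, "total")

cell = ("O", "F")  # off-campus females
print(counts[cell], row_pct[cell], col_pct[cell], total_pct[cell])
```

For the off-campus female cell this reproduces the three worked figures from the slides: 3 of 5 in the row (60%), 3 of 4 in the column (75%), and 3 of 10 overall (30%).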
Crosstabulation (or Contingency Table) and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.

Slide 25

Crosstabulation Or Contingency Table
Remember that a crosstabulation is a tabular summary of data for two variables.
• Crosstabulation can be used when:
  • one variable is qualitative and the other is quantitative,
  • both variables are qualitative, or
  • both variables are quantitative.
As we said, the left and top margin labels define the classes for the two variables.

Slide 26

Cross-tabulations Using SWStat
STEPS:
1. From Data Area Set New Data Area
2. From Statistics Select Grouped Data
3. From the Window Select Cross-Tab
4. Click Calculate
SEE NEXT EXAMPLE

Slide 27

Cross-tabulations Using SWStat
DATA [figure not reproduced in this text]

Slide 28

Cross-tabulations Using SWStat
[figure not reproduced in this text]

Slide 29

Cross-tabulations Using SWStat: Solution
[figure not reproduced in this text]

Slide 30

Crosstabulation Or Contingency Table
• Example: Finger Lakes Homes
The number of Finger Lakes homes sold for each style and price for the past two years is shown below.

Price Range                    Home Style (qualitative variable)
(quantitative variable)    Colonial    Log    Split    A-Frame    Total
≤ $99,000                        18      6       19         12       55
> $99,000                        12     14       16          3       45
Total                            30     20       35         15      100

Slide 31

Crosstabulation Or Contingency Table

                                 Home Style
Price Range                Colonial    Log    Split    A-Frame    Total
≤ $99,000                        18      6       19         12       55
> $99,000                        12     14       16          3       45
Total                            30     20       35         15      100

The Total column is the frequency distribution for the price variable.
The Total row is the frequency distribution for the home style variable.

Slide 32

Crosstabulation Or Contingency Table
• Insights Gained from Preceding Crosstabulation
  • The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.
  • Only three homes in the sample are an A-Frame style and priced at more than $99,000.

Slide 33

Crosstabulation Row Or Column Percentages?
• As we said, converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.
See Next Two Slides

Slide 34

Crosstabulation or Contingency Table (Row %)

                                 Home Style
Price Range                Colonial      Log    Split    A-Frame    Total
≤ $99,000                     32.73    10.91    34.55      21.82      100
> $99,000                     26.67    31.11    35.56       6.67      100

Note: row totals are actually 100.01 due to rounding.
Row % = (Cell Count / Row Total) x 100
Example: (Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100

Slide 35

Crosstabulation or Contingency Table (Column %)

                                 Home Style
Price Range                Colonial      Log    Split    A-Frame
≤ $99,000                     60.00    30.00    54.29      80.00
> $99,000                     40.00    70.00    45.71      20.00
Total                           100      100      100        100

Column % = (Cell Count / Column Total) x 100
Example: (Colonial and > $99K)/(All Colonial) x 100 = (12/30) x 100

Slide 36

Scatter Diagram and Trendline
A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
The general pattern of the plotted points suggests the overall relationship between the variables.
A trendline is a line that provides an approximation of the relationship.

Slide 37

Scatter Diagram and Trendline
• A Positive Relationship
[figure: y plotted against x]

Slide 38

Scatter Diagram and Trendline
• A Negative Relationship
[figure: y plotted against x]

Slide 39

Scatter Diagram and Trendline
• No Apparent Relationship
[figure: y plotted against x]

Slide 40

Example: Panthers Football Team
• Scatter Diagram
The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

x = Number of Interceptions    y = Number of Points Scored
            1                              14
            3                              24
            2                              18
            1                              17
            3                              30

Slide 41

Scatter Diagram
[figure: Number of Points Scored (y axis, 0 to 35) plotted against Number of Interceptions (x axis, 0 to 4)]

Slide 42

Panthers Football Team Example (Continued):
• Insights Gained from the Preceding Scatter Diagram
  • The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.
  • Higher points scored are associated with a higher number of interceptions.
  • The relationship is not perfect; not all plotted points in the scatter diagram are on a straight line.

Slide 43

Tabular and Graphical Procedures

Data
• Qualitative Data
  • Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • Percent Freq. Distribution
    • Crosstabulation (Contingency Table)
  • Graphical Methods
    • Bar Graph
    • Pie Chart
• Quantitative Data
  • Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • Cum. Freq. Dist.
    • Cum. Rel. Freq. Distribution
    • Stem-and-Leaf Display
    • Crosstabulation (Contingency Table)
  • Graphical Methods
    • Dot Plot
    • Histogram
    • Ogive
    • Scatter Diagram

Slide 44

End of Chapter 2 & 3, Part B

Slide 45

...
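The slides describe a trendline only as a line that approximates the relationship and do not say how it is fitted. As one possible sketch, the code below fits an ordinary least-squares line to the Panthers data; the least-squares method and the function name `fit_trendline` are assumptions of this example, not something stated in the deck.

```python
def fit_trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Panthers football team data from the scatter diagram example.
interceptions = [1, 3, 2, 1, 3]
points_scored = [14, 24, 18, 17, 30]

slope, intercept = fit_trendline(interceptions, points_scored)
direction = "positive" if slope > 0 else "negative" if slope < 0 else "no apparent"
print(f"trendline: y = {intercept:.2f} + {slope:.2f}x ({direction} relationship)")
```

Under this assumption the fitted slope is positive, about 5.75 points per interception, which agrees with the positive relationship read from the scatter diagram. The points do not all lie on the fitted line, matching the observation that the relationship is not perfect.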

This document was uploaded on 11/25/2011.
