Bstat8 - Lessons in Business Statistics Prepared By P.K...

Lessons in Business Statistics
Prepared By P.K. Viswanathan

Chapter 8: Hypothesis Testing

Introduction

Managers have to make decisions with minimum risk in an environment characterized by uncertainty. Whether a decision is accepted or rejected depends on whether a hypothesis is accepted or rejected. For example, a marketing manager must decide whether or not to introduce a new product in the market. If the company could get a market share of 15 percent or more, the new product would be introduced. Formulating a suitable hypothesis and testing it would help the manager make the right decision. This chapter covers the various tests of hypothesis that are useful in making sound decisions.

1) Statistical Hypothesis-A Conceptual Framework

What is a Statistical Hypothesis?

A statistical hypothesis is a statement about a population parameter. It may or may not be true. The manager has to ascertain the truth of the hypothesis.

Example

Statement 1: Not more than 20% of adults watch children's programs on television.
Statement 2: More than 20% of adults watch children's programs on television.

First, note that these two hypotheses cannot both be true, and they cannot both be false. Exactly one of them is true. The acceptance or rejection of a particular hypothesis leads to the acceptance or rejection of a particular decision. Statement 1 is called the Null Hypothesis H0. Statement 2 is called the Alternative Hypothesis H1.
Type I Error and Type II Error

Null Hypothesis    Reject              Accept
True               Type I Error (α)    No Error
False              No Error            Type II Error (β)

The probability of making a Type I error is called the level of significance of the test. It is designated by the Greek letter alpha (α). (1 − α) is the confidence level: when the null hypothesis is true, the test reaches the right assessment 100(1 − α)% of the time. If you set α = 0.05, you accept a 5% probability of rejecting the null hypothesis when it is in fact true. Equivalently, when the null hypothesis is true, there is a 95% probability that the test will accept it.

The probability of making a Type II error is symbolized by the Greek letter beta (β). (1 − β) is called the power of the test. The power of the test is the probability of rejecting the null hypothesis when in fact it is false. Suppose you keep β = 10%. This means that the power of the test is 90%. That is, the probability of rejecting the null hypothesis when it is false is 90%, and only 10% of the time do you commit the error of accepting the null hypothesis when it is false.

It is desirable to keep both α and β at a minimum level. However, for a given sample size, a decrease in α will lead to an increase in β, and an increase in α will lead to a decrease in β. It is general practice to fix α and let β vary. The convention is to set α = 0.05, which corresponds to a confidence level of 95%. Some practitioners at times also use α = 0.01.

A hypothesis could be directional or non-directional. A directional hypothesis is one in which the population parameter is structured to be greater than or equal to, or less than or equal to, a specified value. This is known as a one-tailed test (one-sided test) in the parlance of statistical hypothesis.
A nondirectional hypothesis is one in which the population parameter is structured to be equal to a specified value. This is known as a two-tailed test (two-sided test).

2) Hypothesis Testing–Univariate Case (One Sample)

Hypothesis Test for a Single Mean

Primary Purpose of the Test: To test hypotheses that compare the population mean of interest to a specified value.

Illustration: Is the average waiting time for the customers of Smart Supermarket at the checkouts greater than 15 minutes?

Hypothesis Structure for the Illustration

H0: μ ≤ 15
H1: μ > 15

Population Mean: Example

The marketing manager of a large restaurant has been asked to conduct a survey of its customers belonging to a particular income class. The president of the restaurant is interested in the mean income of its customers. He is further interested in comparing this mean income with that of a recently concluded census study by the government. The government study shows a mean income of Rs. 300000 per year for this class of customers, with a standard deviation of Rs. 30000. The president wants to find out whether the population mean income of the restaurant's customers in this category is Rs. 300000 per year or not. The marketing manager has picked a random sample of 100 customers of this class from the customer database. The sample data show a mean income of Rs. 293000 per year. Perform a comprehensive statistical hypothesis testing procedure and state your conclusions.

Solution to the Example

Step 1: Formulate the null and the alternative hypothesis.

H0: μ = 300000 (the mean income of the population is equal to Rs. 300000)
H1: μ ≠ 300000 (the mean income of the population is not equal to Rs. 300000)

Step 2: Select the right test statistic. The correct test statistic to be used here is the Z test. Why? Because the sample size is large.
The formula for the Z test is given below:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Please note that Z follows a standard normal distribution with mean 0 and standard deviation 1.

Step 3: Decide on the level of significance α. When the value of the level of significance is not specified in a problem, it is a convention to set the value equal to 0.05. What we are saying is that only 5% of the time do we make the mistake of rejecting the null hypothesis when it is true.

Step 4: Compute the test statistic based on sample data. The formula to be used is $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$. Under the assumption of the null hypothesis being true, you can substitute the value Rs. 300000 in the place of μ. Here $\bar{X}$ = 293000, n = 100 and σ = 30000. Upon substitution of these values in the formula, we have

$$Z = \frac{293000 - 300000}{30000 / \sqrt{100}} = -2.33$$

Step 5: Determine the critical value for the chosen level of significance. Here, α = 0.05. The critical value corresponding to the two-tailed test, where each tail contains an area of α/2, can be easily worked out by using Microsoft Excel. The methodology is already covered in the previous chapters. Here, α/2 = 0.025. The critical value of Z is 1.96 for positive Z and –1.96 for negative Z. Since the normal distribution is symmetrical, we can ignore the sign of Z and just take the positive value of Z. That is, the critical value of Z is 1.96. Incidentally, if you choose α = 0.01, then the critical value of Z is 2.58.

Step 6: Compare the computed test statistic with the critical value. Here, computed Z = -2.33. Since the normal distribution is symmetrical, take the positive value of Z and compare it with the critical Z = 1.96. If you take the negative computed Z, then compare it with –1.96.
Take the positive Z. Simple is best. Why bother?

Step 7: Decision. If the computed Z (positive value) is greater than the table Z, reject the null hypothesis H0 and accept H1. Otherwise, accept H0. This is the same as finding out whether the computed Z falls in the acceptance region or the rejection region. In our case, the computed value of Z (taking just the positive value), 2.33, is greater than the critical value of Z = 1.96. Hence, it falls in the rejection region. Reject H0 and accept H1. (See picture in next slide)

Interpretation of the Results for our example: We have rejected H0 and accepted H1. What does this mean? It means that, at the 5% level of significance, the population mean income of the category of interest to the president of the restaurant is not equal to Rs. 300000 per year. Is it more than or less than Rs. 300000? The sample mean suggests that it may be less than Rs. 300000 per year. Can you do this exercise now as a one-tailed hypothesis test? This is a Progressive Test Question for you.

2) Hypothesis Testing–Univariate Case (One Sample)-Population Proportion

Test for a Single Proportion

Primary Purpose of the Test: To test hypotheses that compare the population proportion of interest to a specified value.

Illustration: Is the proportion of households owning Color TVs in Chennai less than 0.4?

Hypothesis Structure for the Illustration

H0: P ≥ 0.40
H1: P < 0.40

Example Problem

A marketing manager of an enterprise is facing a decision whether to introduce a new product into the market or not. Consumer acceptance measured in a blind comparison test is agreed upon as an appropriate basis for evaluation.
Marketing of the new product will be pursued only if the acceptance rate exceeds 30%. Otherwise, the new product will not be introduced in the market. A random sample of 200 consumers reveals that the acceptance rate is 32%. Using a level of significance of 0.05, perform the hypothesis testing and recommend your action.

Solution: This is a classical hypothesis-testing problem of the population proportion. It is also a one-tailed test. It involves a large sample in which n = 200. The Z test for the proportion is the appropriate test.

H0: P ≤ 0.30 (The population proportion of consumer acceptance is less than or equal to 0.30)
H1: P > 0.30 (The population proportion of consumer acceptance is greater than 0.30)

The Z test for proportion is given by

$$Z = \frac{p - P}{\sqrt{\dfrac{P(1 - P)}{n}}}$$

Please note that Z follows a standard normal distribution with mean 0 and standard deviation 1.

Under the null hypothesis being true, P = 0.30. The sample proportion is p = 0.32. Substituting, we have

$$Z = \frac{0.32 - 0.30}{\sqrt{\dfrac{0.30(1 - 0.30)}{200}}} = 0.62$$

The critical value of Z for a one-tailed test at the 5% level of significance is 1.65. Since the computed Z is less than the critical Z, accept H0. What do you conclude? We have no evidence to reject the null hypothesis based on the sample data at the 5% level of significance. In this case, even at the 1% level of significance, we cannot reject H0. This implies that you accept H0: the sample gives no grounds to conclude that the population proportion of consumer acceptance exceeds 30%. Hence, the manager should not introduce the new product in the market.

You may wonder why, when the sample proportion is 32%, you say that the new product should not be introduced. Is 32% not better than the 30% stipulated? Yes, but statistically speaking, a 32% sample proportion could easily have arisen due to chance and need not reflect a real difference. This is why you say the result is statistically not significant.
As long as statistical significance is not achieved, you cannot reject the null hypothesis. This is the real beauty of hypothesis testing. Unless the manager wants to gamble, he should not venture to introduce the new product.

2) Hypothesis Testing–Univariate Case (Small Sample)

Test Statistic

$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

Example: An investigator took a random sample of eight pieces of aluminum die-castings and observed the sample mean strength to be 31.5. Before taking the measurement, the investigator knew that the population mean strength for an older type of aluminum die-casting was 33. The standard deviation of the sample measurements was 1.3. The investigator would like to know whether the population mean strength of the aluminum die-casting is 33. Set up the null and the alternative hypothesis, perform the test, and comment on the results.

Solution to Example

This is a small sample case with unknown population standard deviation. The appropriate test is the t test. Please note also that, from the wording of the problem, you need to perform a two-tailed test. Just like the normal distribution, t is also symmetrical, and it is enough to compare the positive value of the computed t with the critical t for n − 1 d.f. at the 5% level of significance.

H0: μ = 33
H1: μ ≠ 33

$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{31.5 - 33}{1.3 / \sqrt{8}} = -3.26$$

The positive value of the computed t is 3.26. The critical t value for 7 d.f. (n − 1 = 8 − 1) at the 5% level of significance, from the Excel paste function, is 2.36 (to 2 places of decimal). Since the calculated t value is greater than the critical t, reject H0 and accept H1. The conclusion is that the mean strength of aluminum die-casting of the population is not 33 at the 5% level of significance.
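The arithmetic of the die-casting example can be checked with a short Python sketch. The function names are illustrative and not part of the original lesson, and the critical value 2.36 is simply the figure quoted in the text for 7 d.f. rather than something the code derives.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return t = (x_bar - mu) / (s / sqrt(n)) for a single sample."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


def two_tailed_decision(statistic, critical_value):
    """Reject H0 when the absolute statistic exceeds the critical value."""
    return "reject H0" if abs(statistic) > critical_value else "do not reject H0"


# Figures from the die-casting example: mean 31.5, S = 1.3, n = 8, mu0 = 33.
t = one_sample_t(sample_mean=31.5, hypothesized_mean=33, sample_sd=1.3, n=8)

# 2.36 is the critical t for 7 d.f. at the 5% level, as quoted in the text.
decision = two_tailed_decision(t, critical_value=2.36)

print(f"t = {t:.2f}, decision: {decision}")
```

Taking the absolute value inside `two_tailed_decision` mirrors the lesson's advice to compare only the positive value of the statistic with the critical value.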
3) Hypothesis Testing–Bivariate Case Population Mean

Test of Two Population Means

Primary Purpose of the Test: To test hypotheses that compare the population mean of interest for two separate populations. (Samples are independent.)

Example: Is the average expenditure per household on eating out significantly higher in Bangalore than in Calcutta?

3) Hypothesis Testing–Bivariate Case Difference Between Means (Large Sample)

Example: A test in a computer course was conducted for a group of students consisting of 70 boys and 60 girls. The marks scored by the students are summarized below:

                              Boys                    Girls
Sample size                   n1 = 70                 n2 = 60
Sample mean                   X̄1 = 70                 X̄2 = 65
Sum of squared deviations     Σ(X1 − X̄1)² = 7,500     Σ(X2 − X̄2)² = 7,800

Is there a significant difference between the performance of the boys and the girls?

Solution: From the wording of the question, it is clear that the problem could be structured as a two-tailed test. The sample sizes for both the populations are large and hence the Z test is appropriate to use. Let us take a level of significance of 5%.

H0: μ1 = μ2 (The population mean score of boys = the population mean score of girls)
H1: μ1 ≠ μ2 (The population mean score of boys ≠ the population mean score of girls)

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

The standard deviations of the two populations are not given. We can use the sample standard deviations in the place of the population standard deviations. That is, use $\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}$ in the place of $\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$. In the data for the problem, we are given $\sum (X_1 - \bar{X}_1)^2 = 7500$ and $\sum (X_2 - \bar{X}_2)^2 = 7800$. By definition,

$$S_1^2 = \frac{\sum (X_1 - \bar{X}_1)^2}{n_1 - 1}, \qquad S_2^2 = \frac{\sum (X_2 - \bar{X}_2)^2}{n_2 - 1}$$

Substituting and simplifying, we have $S_1^2 = 7500/69 = 108.70$ and $S_2^2 = 7800/59 = 132.20$.

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{70 - 65}{\sqrt{\dfrac{108.70}{70} + \dfrac{132.20}{60}}} = 2.58$$

Conclusion: In our example, the calculated Z (2.58) is greater than the table Z (1.96) at the 5% level of significance. Reject H0 and accept H1. That is, the performance of the boys and the girls is not identical. The null hypothesis of equally good performance is rejected. The difference in the mean scores between boys and girls is significant.

3) Hypothesis Testing–Bivariate Case Difference Between Means (Small Sample)

Example: An aptitude test was conducted for two groups of executives. Group 1 consists of engineers and group 2 consists of accountants. The scores obtained by the candidates are given below:

Engineers      125  115  119   85   97  107  125  125  118
Accountants    112   98  109   96   77   70  114  100

Do you find any significant difference between the scores of these two groups?

Now compute

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

Substituting the values from the spreadsheet above, we have

$$t = \frac{112.89 - 97}{\sqrt{224.73 \left( \dfrac{1}{9} + \dfrac{1}{8} \right)}} = 2.18$$

As the calculated t value exceeds the critical t for 15 degrees of freedom at the 5% level of significance (2.13 from the t table in Appendix E), reject the null hypothesis and accept the alternative hypothesis.

3) Hypothesis Testing–Bivariate Case Difference Between Means (Small Sample Paired)

Test of Two Means: Small Sample

Primary Purpose of the Test: To test hypotheses that compare the population mean of the same variable when the data are collected in a before-and-after scenario (dependent samples).
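Looking back at the two independent-sample examples (boys versus girls, engineers versus accountants), the following Python sketch reproduces both statistics. The function names are illustrative. The two score lists are a reading of the flattened table, consistent with the sample means 112.89 and 97 and the pooled variance 224.73 quoted in the text, and should be treated as an assumption about the original layout.

```python
import math
from statistics import mean, variance


def large_sample_z(mean1, ss1, n1, mean2, ss2, n2):
    """Z for two means, with S^2 = SS / (n - 1) standing in for sigma^2."""
    var1 = ss1 / (n1 - 1)
    var2 = ss2 / (n2 - 1)
    return (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)


def pooled_t(sample1, sample2):
    """Return (t, pooled variance, degrees of freedom) for two small samples."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # statistics.variance uses the n - 1 divisor, so (n - 1) * variance
    # recovers each sample's sum of squared deviations.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    t_value = (mean(sample1) - mean(sample2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_value, pooled_var, df


# Boys versus girls: summary figures from the large-sample example.
z = large_sample_z(mean1=70, ss1=7500, n1=70, mean2=65, ss2=7800, n2=60)

# Engineers versus accountants: scores as read from the flattened table.
engineers = [125, 115, 119, 85, 97, 107, 125, 125, 118]
accountants = [112, 98, 109, 96, 77, 70, 114, 100]
t, pooled_var, df = pooled_t(engineers, accountants)

print(f"Z = {z:.2f}")
print(f"t = {t:.2f}, pooled variance = {pooled_var:.2f}, d.f. = {df}")
```

Both statistics are then compared with the critical values quoted in the text (1.96 for Z, 2.13 for t with 15 d.f.).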
Structure of the Null and Alternative Hypothesis

H0: μ1 − μ2 = 0
H1: μ1 − μ2 ≠ 0

Test Statistic

$$t = \frac{\bar{D}}{S / \sqrt{n}}, \qquad \text{where } D = X_A - X_B$$

Here $\bar{D}$ is the mean of the differences D, and S is their standard deviation.

Example: A company conducted a promotional campaign in 10 randomly chosen retail outlets. Monthly sales in 1000 units are shown before and after the campaign. Is there any significant difference in sales before and after the campaign?

Outlet No              1    2    3    4    5    6    7    8    9    10
Before the Campaign    240  225  250  280  200  150  165  100  130  170
After the Campaign     270  245  260  290  190  160  160  130  135  175

Solution: This is a two-tailed hypothesis problem. Let us take a level of significance of 0.05. The appropriate test statistic is a paired t for the dependent sample.

H0: μ1 = μ2 (The mean sales are the same before and after the campaign)
H1: μ1 ≠ μ2 (The mean sales are not the same before and after the campaign)

Performing Calculation for the Paired t Test

Outlet No   Before the Campaign (XB)   After the Campaign (XA)   D = XA − XB   D − D̄
1           240                        270                        30            19.5
2           225                        245                        20             9.5
3           250                        260                        10            -0.5
4           280                        290                        10            -0.5
5           200                        190                       -10           -20.5
6           150                        160                        10            -0.5
7           165                        160                        -5           -15.5
8           100                        130                        30            19.5
9           130                        135                         5            -5.5
10          170                        175                         5            -5.5

D̄ = 10.5
S² = 174.7222
S = 13.21825
t = D̄ / (S / √n) = 2.5119745

The table value of t from Appendix E for 9 degrees of freedom at the 5% level of significance is 2.26. Since the calculated t is greater than the critical t, reject the null hypothesis and accept the alternative.
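The paired calculation above can be reproduced with a few lines of Python using only the standard library. This is a sketch under the lesson's own setup: the function name is illustrative, and the critical value 2.26 is taken from the text rather than computed.

```python
import math
from statistics import mean, stdev

# Monthly sales (in 1000 units) for the 10 outlets in the example.
before = [240, 225, 250, 280, 200, 150, 165, 100, 130, 170]
after = [270, 245, 260, 290, 190, 160, 160, 130, 135, 175]


def paired_t(after_values, before_values):
    """Return (mean difference, S of differences, t) with D = after - before."""
    diffs = [a - b for a, b in zip(after_values, before_values)]
    d_bar = mean(diffs)
    s = stdev(diffs)  # sample standard deviation, n - 1 divisor
    t_value = d_bar / (s / math.sqrt(len(diffs)))
    return d_bar, s, t_value


d_bar, s, t = paired_t(after, before)

# 2.26 is the critical t for 9 d.f. at the 5% level, as quoted in the text.
reject_h0 = abs(t) > 2.26

print(f"D-bar = {d_bar}, S = {s:.5f}, t = {t:.4f}, reject H0: {reject_h0}")
```

Working on the differences rather than on the two columns separately is what distinguishes the paired test from the independent-sample t test: each outlet acts as its own control.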
3) Hypothesis Testing–Bivariate Case Difference Between Proportions

Test of Two Proportions

Primary Purpose of the Test: To test hypotheses that compare the population proportion of interest for two separate populations.

Example: Two random sample surveys, conducted two months apart, assessed public opinion on the outcome of an election. The question posed was “If the general election was going to take place tomorrow, would you cast your vote for or against the ruling party?” The results of the two surveys are tabulated below:

                            1st Survey    2nd Survey
Sample size                 1000          800
For the ruling party        520           380
Against the ruling party    480           420

Set up the appropriate hypotheses, test and draw your conclusions.

Solution: This requires structuring the null hypothesis as no change in the public's pattern of voting between the two months. Symbolically:

H0: P1 = P2 (The population proportion favoring the ruling party is the same in the two surveys)
H1: P1 ≠ P2 (No. It is not the same)

Let us take a level of significance of 5%. It is a two-tailed test and therefore the critical value of Z is 1.96. The test statistic is

$$Z = \frac{p_1 - p_2}{\sqrt{p(1 - p)\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \qquad \text{where } p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$

First let us calculate p.

$$p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{1000(520/1000) + 800(380/800)}{1000 + 800} = 0.50$$

$$Z = \frac{p_1 - p_2}{\sqrt{p(1 - p)\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{0.52 - 0.475}{\sqrt{0.5(1 - 0.5)\left( \dfrac{1}{1000} + \dfrac{1}{800} \right)}} = 1.90$$

The calculated value of Z is less than the table value of Z. Accept H0. The inference is that, at a level of significance of 5%, there is no evidence that the population proportion of the public favoring the ruling party changed over the two months.
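The two-proportion calculation can be sketched in Python as follows. The function name is illustrative, and the critical value is obtained here from the standard library's `statistics.NormalDist` rather than from Excel, which is a substitution for the tool the lesson uses.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Z for the difference of two proportions using the pooled estimate p."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # same as (n1*p1 + n2*p2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Survey counts from the example: 520 of 1000, then 380 of 800.
z = two_proportion_z(x1=520, n1=1000, x2=380, n2=800)

# Two-tailed critical value at alpha = 0.05 (about 1.96).
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)

decision = "reject H0" if abs(z) > critical else "do not reject H0"
print(f"Z = {z:.2f}, critical Z = {critical:.2f}, decision: {decision}")
```

Changing `alpha` to 0.01 gives the 2.58 critical value mentioned earlier in the chapter for the two-tailed case.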