additional estimation topics ppt


Additional Estimation Topics
Sections 8.7, 8.5 and elsewhere

Intervals for Population Mean

In our previous lecture, we looked at intervals for the population mean. Our estimate came from:

\[ \bar{X} \pm t \, \frac{s}{\sqrt{n}} \]

where \(t\) is a value from the t-distribution with \(n-1\) degrees of freedom and \(s/\sqrt{n}\) is the approximate standard error.

Assumptions

In forming this interval, we made some implicit assumptions:
1. We wanted to estimate the average value in the population of interest, and not some function of it;
2. The population was infinite in size or the sample was only a small fraction of the population; and
3. The observations in this population were normally distributed.

Deviation from assumptions

Assumption (2): The population was infinite in size or the sample was only a small fraction of the population.
If we sample a large portion of a population with known size N, theory shows that the usual standard error estimate is too large. There is a "finite population correction" (FPC) we can apply to get a narrower interval.

The Finite Population Correction

Theory is in Section 8.7, which is an "Online Topic" in our textbook. You can download this from the Chapter 8 materials at the textbook companion website: http://wps.prenhall.com/bp_levine_statsexcel_6/
Use a t-distribution interval with the corrected standard error:

\[ \hat{\sigma}_{\bar{X}} = \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \]

where the ^ on \(\sigma\) means "estimated".

Example

In 2005, the Survey of Current Business (U.S. Bureau of the Census) contains 492 cities of at least 50,000 residents. We have a sample of 41 cities and want to estimate the average percentage of individuals living under the poverty level. Data are in CitySample.xls, which is on the course E-learning site.

Example (continued)

n = 41 and N = 492
Sample average is 15.63
Sample s.d. is 6.295

In PhStat

Notes on FPC

Some texts say to apply it only when the sampling fraction f = n/N is at least .05 (when you sample at least 5% of the population). With f smaller than that, it really won't make much difference.
N=10632, n=328, s=39.5, f=_____ , FPC=______, SE=_______

FPC in a proportion problem

We have a similar situation if we are estimating the proportion of elements in the population that have a particular characteristic. Our estimate of the population proportion \(\pi\) is p = X/n. The standard error is:

\[ \hat{\sigma}_{p} = \sqrt{\frac{p(1-p)}{n}} \sqrt{\frac{N-n}{N-1}} \]

Example

In the sample of cities, 11 of the 41 had a per-capita income (PCI) less than $20,000.
1. Use this to make an estimate of the proportion across the US with PCI this low.
2. Estimate the number of the 492 cities that are in this category.

8.5 Applications in auditing

Auditing is one area in business that makes widespread use of sampling and interval estimation. An auditor often needs conclusions about a large population, but it is impractical to examine the "whole thing". Here we look at three applications. Theory is in Section 8.5 (from the e-book).

Working with large populations

Example: We are estimating the average size of prescriptions dispensed under the Medicare program. In one pharmacy, there are 12,000 such prescriptions filled over a three-year period. What role does N=12000 play in the statistical calculations?

Example

A sample of 225 Medicaid prescriptions was examined. The average size was $55.12 and the standard deviation was $22.17.
1. What is a 95% interval estimate for the population mean if N=12000?
2. What would it be if N=4200?

Estimating a total

Suppose we need to estimate the total amount of Medicare reimbursement over the three years.
We can first estimate the average of this variable, then multiply by the population size. The point estimate is thus \(N\bar{X}\).

The interval for the total

Essentially, just take the interval for \(\mu\) and multiply it by N:

\[ N\bar{X} \pm N \, t_{n-1} \, \hat{\sigma}_{\bar{X}} \]

This is available in PhStat and in the workbook called CIE Total.xls.

Example

Using the results of the previous sample of 225 Medicaid prescriptions, what was the total reimbursement to this pharmacy?
n = 225, X-bar = $55.12 and s = $22.17
(1) N = 12000
(2) N = 4200

Estimating a total error

Suppose the items in the sample are examined for billing errors. Some have errors, some do not. The auditor makes a "difference estimate" for the whole population, based on the sample. We compute a new average, \(\bar{D}\), for the average error among the n items in the sample. Many items have no error, so they are recorded as 0. The estimate for the population is \(N\bar{D}\).

Data in RxAudit.xls

The 225 prescriptions were examined for non-Medicaid coverage, documentation errors and possible fraud. 12 of the claims were disallowed and the pharmacy was asked to reimburse the state. Extrapolate this to the entire population of Medicaid prescriptions at this pharmacy.

RxDisAllowed: 11.69  61.00  21.60  18.89  9.50  19.20  5.36  38.43  58.65  36.35  46.43  207.09

Need to factor in the "0"s

The 12 disallowed prescriptions had an average of $44.52 and a standard deviation of $54.50. But we also need to factor in the 213 other prescriptions that had no "error". It is not too hard to get \(\bar{D}\), but the SD is best left to PhStat or the workbook named CIE Total Difference.xls.

CIE Total Difference.xls

Copy the data for the 12 disallowed values to the "DIFFERENCES" sheet. On "COMPUTE", fill in the population and sample sizes.

Total Difference In Actual and Entered

Data
  Population Size                          12000
  Sample Size                                225
  Confidence Level                           95%

Intermediate Calculations
  Sum of Differences                      534.19
  Average Difference in Sample        2.37417778
  Total Difference                    28490.1333
  Standard Deviation of Differences      15.6953
  FPC Factor                              0.9906
  Standard Error of the Total Diff.   12438.4551
  Degrees of Freedom                         224
  t Value                                 1.9706
  Interval Half Width                 24511.3554

Confidence Interval
  Interval Lower Limit                   3978.78
  Interval Upper Limit                  53001.49

Estimating the rate of noncompliance

A related problem is estimating the proportion of elements in the population that have compliance problems. We can use our previous interval for \(\pi\), except we only care about the upper limit. That is, what is the maximum rate?

This is a one-sided interval

For a two-sided interval the multiplier is Z=1.96 if it is a 95% procedure. Here we only want the upper bound, so use Z=1.645:

\[ p + 1.645 \sqrt{\frac{p(1-p)}{n}} \sqrt{\frac{N-n}{N-1}} \]

Our example

We had N=12000 Medicaid prescriptions over 3 years. The audit sample of n=225 had X=12 (5.3%) prescriptions not in compliance.
1. What is our estimate for the population?
2. How many of the 12000 does this imply?

Estimating the population median

Assumption (3): The observations in the population are normally distributed.
Suppose the data are badly skewed or there are a fair number of outliers. If we are trying to estimate a "typical value" in the population, we might be better off estimating the median instead of the mean. We will look at a simple procedure for doing this.

The ".4n - 2" rule

In MedianEstimation.doc, Walsh outlines a quick-and-easy method for constructing an interval for the population median. Multiply the sample size by four-tenths, then subtract two. Let r denote the integer obtained after rounding this figure. The r-th smallest and the r-th largest observations in the sample will form the endpoints of the confidence interval.
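The ".4n - 2" rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rule_rank` and `median_interval` are made up here and do not come from MedianEstimation.doc, and the sample is the ten-observation example used elsewhere in this deck.

```python
def rule_rank(n: int) -> int:
    """Rank r from the '.4n - 2' rule: 0.4 * n - 2, rounded to an integer."""
    # (4n - 20) / 10 equals 0.4n - 2. For a whole-number n its fractional
    # part is a multiple of 0.2, so it is never exactly .5 and the rounding
    # direction is unambiguous.
    return round((4 * n - 20) / 10)


def median_interval(sample):
    """Return (r, r-th smallest, r-th largest) for the given sample."""
    n = len(sample)
    r = rule_rank(n)
    if r < 1:
        # The slides describe the rule for n between 10 and 50; for very
        # small n the computed rank is zero or negative and is unusable.
        raise ValueError("sample too small for the .4n - 2 rule")
    ordered = sorted(sample)
    return r, ordered[r - 1], ordered[n - r]


# Illustration with the sorted n = 10 sample from the slides.
sample = [34, 37, 45, 52, 53, 58, 67, 78, 101, 123]
r, lower, upper = median_interval(sample)
print(f"r = {r}, interval for the median: {lower} to {upper}")
```

The function returns only the endpoints. The actual coverage of the interval (for example .979 at n = 10) comes from the table in the source document and is not computed here.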
Example, n = 10

Suppose you have n=10 observations in your sample, and they are (in sorted order):
34 37 45 52 53 58 67 78 101 123
So, r = .4(10) - 2 = 4 - 2 = 2
Your interval is from the 2nd smallest to the 2nd largest observation.
The 2nd smallest is ___, and the 2nd largest is ___, so the interval is __________.

Approximately 95%

The table at the end of the document shows that for n between 10 and 50, this method gives approximately a 95% interval.
For n = 10, the exact probability content of the interval is .979, so you actually have a 97.9% interval.
If you had a sample size of 23, your interval would use the 7th smallest and 7th largest data values, and it would be a 96.5% interval.

Perceived restaurant quality

Does spending more in a restaurant lead to greater customer satisfaction?
Readers of a consumer magazine rated 29 chain restaurants for satisfaction (scale of 0 to 100) based on food quality. The restaurants were categorized as high-priced or low-priced depending on the average expenditure per person. Data are in RestRating.xls.

Look at low-price restaurants

Do higher-priced restaurants have higher quality?

Customer satisfaction about taste of food, by price of meal
  Low:   59 62 73 76 77 77 78 79 80 80 80 81 81 83 83
  High:  78 78 80 82 82 83 84 84 85 85 85 85 86 86

[Figure: Rating for Low-Price Restaurants; axis from 50 to 90]

Estimates

For the low-price restaurants: X-bar = 76.6, s = 7.07
95% interval for the mean is 72.7 to 80.5
With n=15, r = __________.
Interval for the median is _______________.

...
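As a check on the difference estimate from the auditing slides, the figures in the CIE Total Difference.xls output can be reproduced with a short Python sketch. The function name `total_difference_interval` and its dictionary keys are hypothetical. The t value 1.9706 (224 degrees of freedom) is copied from the worksheet rather than computed, and the formula assumed is \(N\bar{D} \pm t \cdot N \cdot (s_D/\sqrt{n}) \cdot \sqrt{(N-n)/(N-1)}\), which is consistent with the worksheet figures.

```python
import math
import statistics


def total_difference_interval(differences, n, N, t_value):
    """Difference estimate of a population total, with the FPC applied.

    differences: the nonzero errors found in the sample
    n: sample size (items without an error count as 0)
    N: population size
    t_value: t multiplier for n - 1 degrees of freedom (supplied by caller)
    """
    # Factor in the "0"s: every sampled item without an error is a zero.
    padded = list(differences) + [0.0] * (n - len(differences))
    d_bar = sum(padded) / n                      # average difference, D-bar
    s_d = statistics.stdev(padded)               # sample SD over all n items
    fpc = math.sqrt((N - n) / (N - 1))           # finite population correction
    total = N * d_bar                            # point estimate of the total
    se_total = N * s_d / math.sqrt(n) * fpc      # standard error of the total
    half_width = t_value * se_total
    return {
        "total": total,
        "s_d": s_d,
        "fpc": fpc,
        "se_total": se_total,
        "lower": total - half_width,
        "upper": total + half_width,
    }


# The 12 disallowed amounts listed on the RxAudit.xls slide.
disallowed = [11.69, 61.00, 21.60, 18.89, 9.50, 19.20,
              5.36, 38.43, 58.65, 36.35, 46.43, 207.09]

result = total_difference_interval(disallowed, n=225, N=12000, t_value=1.9706)
for name, value in result.items():
    print(f"{name:>9}: {value:12.4f}")
```

Because the t value is rounded to four decimals, the computed limits can differ from the worksheet's 3978.78 and 53001.49 by a few tenths of a dollar.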

This note was uploaded on 02/14/2011 for the course QMB 3250 taught by Professor Thompson during the Spring '08 term at University of Florida.
