Again, the standard error is less than that for $\bar{y}$:
$$\mathrm{SE}[\bar{y}] = \sqrt{\left(1 - \frac{25}{100}\right)\frac{s_y^2}{25}} = 0.522.$$
We expect regression estimation to increase the precision in this example because the
variables photo and field are positively correlated (r = 0.62). To estimate the total
number of dead trees, use
$$\hat{t}_{y\mathrm{reg}} = (100)(11.99) = 1199;$$
$$\mathrm{SE}[\hat{t}_{y\mathrm{reg}}] = (100)(0.408) = 40.8.$$
An approximate 95% confidence interval for the total number of dead trees is given
by
$$1199 \pm (2.07)(40.8) = [1114, 1283].$$
Because of the relatively small sample size, we used the t-distribution percentile (with
n - 2 = 23 degrees of freedom) of 2.07 rather than the normal distribution percentile
of 1.96.
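The arithmetic behind the total and its interval can be retraced with a short Python sketch. It uses only the figures quoted in this example (N = 100, n = 25, the regression estimate 11.99 of the mean, its standard error 0.408, and the t percentile 2.07); the variable names are illustrative and not part of the text.

```python
# Figures quoted in the example; variable names are illustrative only.
N = 100               # number of units in the population
n = 25                # number of units in the sample
ybar_reg = 11.99      # regression estimate of the population mean
se_ybar_reg = 0.408   # standard error of that estimate
t_crit = 2.07         # t percentile with n - 2 = 23 degrees of freedom (from the text)

# Scale the mean and its standard error up to the population total.
t_hat = N * ybar_reg
se_t_hat = N * se_ybar_reg

# Approximate 95% confidence interval for the total.
half_width = t_crit * se_t_hat
ci = (t_hat - half_width, t_hat + half_width)

print(f"estimated total: {t_hat:.0f}")
print(f"standard error:  {se_t_hat:.1f}")
print(f"95% CI:          [{ci[0]:.1f}, {ci[1]:.1f}]")
```

The half-width is (2.07)(40.8) = 84.456, so the unrounded endpoints are about 1114.5 and 1283.5, in line with the interval reported above.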
3.2.2 Difference Estimation
Difference estimation is a special case of regression estimation, used when the
investigator "knows" that the slope $B_1$ is 1. Difference estimation is often recommended
in accounting when an SRS is taken. A list of accounts receivable consists of the
book value for each account (the company's listing of how much is owed on each
account). In the simplest sampling scheme, the auditor scrutinizes a random sample
of the accounts to determine the audited value (the actual amount owed) in order
to estimate the error in the total accounts receivable. The quantities considered are
$y_i$ = audited value for company $i$
$x_i$ = book value for company $i$.
Then, $\bar{y} - \bar{x}$ is the mean difference for the audited accounts.
The estimated total difference is $\hat{t}_y - \hat{t}_x = N(\bar{y} - \bar{x})$; the estimated audited value
for accounts receivable is
$$\hat{t}_{y\mathrm{diff}} = t_x + (\hat{t}_y - \hat{t}_x).$$
Again, define the residuals from this model: Here, $e_i = y_i - x_i$. The variance of $\hat{t}_{y\mathrm{diff}}$
is
$$V(\hat{t}_{y\mathrm{diff}}) = V[t_x + (\hat{t}_y - \hat{t}_x)] = V(\hat{t}_e),$$
where $\hat{t}_e = (N/n) \sum_{i \in S} e_i$. If the variability in the residuals $e_i$ is smaller than the
variability among the $y_i$'s, then difference estimation will increase precision.
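A minimal Python sketch of the difference estimator follows. The function name `difference_estimate`, the account values, and the population figures are hypothetical and chosen only for illustration. The standard error uses the usual SRS variance estimate with the finite population correction, applied to the residuals.

```python
import math
from statistics import mean, variance

def difference_estimate(book_total, N, audited, book):
    """Difference estimate of the audited total from an SRS of accounts.

    book_total -- known population total of book values, t_x
    N          -- number of accounts in the population
    audited    -- audited values y_i for the sampled accounts
    book       -- book values x_i for the same sampled accounts
    """
    n = len(audited)
    if n != len(book) or n < 2:
        raise ValueError("need at least two paired observations")
    e = [y - x for y, x in zip(audited, book)]  # residuals e_i = y_i - x_i
    t_e = N * mean(e)                           # (N/n) * sum of the e_i
    t_ydiff = book_total + t_e                  # t_x + (t_y-hat - t_x-hat)
    # SRS standard error of t_e, with finite population correction
    se = N * math.sqrt((1 - n / N) * variance(e) / n)
    return t_ydiff, se

# Hypothetical data: 50 accounts with total book value 10000; 5 are audited.
book = [200.0, 150.0, 310.0, 95.0, 240.0]
audited = [195.0, 150.0, 320.0, 90.0, 235.0]
t_ydiff, se = difference_estimate(10000.0, 50, audited, book)
print(f"estimated audited total: {t_ydiff:.1f}, SE: {se:.2f}")
```

Only the residuals enter the standard error, which is why the estimator gains precision when the $e_i$ vary less than the $y_i$.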
Difference estimation works best if the population and sample have a large fraction
of nonzero differences that are roughly equally divided between overstatements and
understatements, and if the sample is large enough so that the sampling distribution
of $(\bar{y} - \bar{x})$ is approximately normal.
In auditing, it is possible that all audited values in the sample are the same as
the corresponding book values. Then $y_i = x_i$ for every sampled account, and the standard error of $\hat{t}_{y\mathrm{diff}}$ would be
calculated as zero. In such a situation, where most of the differences are zero, more
sophisticated modeling is needed.
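The degenerate case can be seen in a few lines of Python. The book values and population size here are hypothetical; the point is only that when every sampled difference is zero, the sample variance of the residuals, and hence the computed standard error, is exactly zero.

```python
import math
from statistics import variance

N = 50                               # hypothetical number of accounts
book = [200.0, 150.0, 310.0, 95.0]   # hypothetical sampled book values
audited = list(book)                 # the audit finds no discrepancies

e = [y - x for y, x in zip(audited, book)]  # every residual is 0
n = len(e)

# SRS standard error of the estimated total difference
se = N * math.sqrt((1 - n / N) * variance(e) / n)
print(se)
```

A standard error of zero would suggest the book total is known to be exactly right, which a sample of the accounts cannot establish.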
3.3 Estimation in Domains
Often we want separate estimates for subpopulations; the subpopulations are called
domains or subdomains. We may want to take an SRS of visitors who fly to New
York City on September 18 and to estimate the proportion of out-of-state visitors who
intend to stay longer than 1 week. For that survey, there are two domains of study:
visitors from in-state and visitors from out-of-state. We do not know which persons
in the population belong to which domain until they are sampled, though. Thus, the
number of persons in an SRS who fall into each domain is a random variable, with
value unknown at the time the survey is designed.
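A small simulation makes the randomness of the domain sample size concrete. Everything in this Python sketch is hypothetical: a population of 1000 visitors, 400 of them from out of state, and repeated SRSs of size 50. The seed is fixed only so that a run can be repeated.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 1000 visitors, 400 of them from out of state.
population = ["out"] * 400 + ["in"] * 600
n = 50  # SRS size

# Draw several SRSs (without replacement) and count the out-of-state visitors.
domain_counts = []
for _ in range(5):
    sample = random.sample(population, n)
    domain_counts.append(sample.count("out"))

print(domain_counts)  # domain sample sizes, one per SRS
```

The overall sample size n is fixed by design, but the number of sampled visitors falling in the out-of-state domain can change from one draw to the next.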
