6.896 Sublinear Time Algorithms
February 27, 2007
Lecture 7
Lecturer: Ronitt Rubinfeld
Scribe: Brendan Juba
1 Recap
Recall from last time that we were considering the boolean hypercube, $\{\pm 1\}^n$. We had defined the partial ordering, for $x, y \in \{\pm 1\}^n$: $x \le y$ if $\forall i,\ x_i \le y_i$. We said that a probability distribution $p$ was monotone over $\{\pm 1\}^n$ if $\forall x \le y,\ p_x \le p_y$. Recall also that we'd defined $\mathrm{bias}(X) = \sum_i X_i$.
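As a quick illustration of these definitions, here is a minimal sketch in Python (the example vectors and helper names are our own, not from the notes):

```python
def leq(x, y):
    """Partial order on {-1,+1}^n: x <= y iff x_i <= y_i for every i."""
    return all(xi <= yi for xi, yi in zip(x, y))

def bias(x):
    """bias(X) = sum_i X_i."""
    return sum(x)

# Two comparable points of the hypercube {-1,+1}^3:
x = (-1, -1, 1)
y = (-1, 1, 1)
print(leq(x, y), bias(x), bias(y))  # -> True -1 1
```

Note that whenever $x \le y$, flipping coordinates from $-1$ to $+1$ only increases the bias, which is why the bias statistic interacts well with monotone distributions.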
Additive Chernoff Bound (Hoeffding): For $X_1, \ldots, X_m$ i.i.d. random variables with range $[-a, a]$, $\hat{\mu} = \frac{1}{m}\sum_{i=1}^m X_i$, and $\mu = E[\hat{\mu}]$,
$$\forall \rho > 0 \qquad \Pr[\,|\hat{\mu} - \mu| > \rho\,] \le 2\exp\left(-\frac{\rho^2 m}{2a^2}\right).$$
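To see the bound in action, here is a small Monte Carlo sanity check (a sketch under parameter choices of our own; `empirical_tail` is a hypothetical helper, not from the notes):

```python
import math
import random

def empirical_tail(m, a, rho, trials=20000):
    """Estimate Pr[|mu_hat - mu| > rho] for X_i uniform on {-a, +a} (so mu = 0)."""
    exceed = 0
    for _ in range(trials):
        mu_hat = sum(random.choice((-a, a)) for _ in range(m)) / m
        if abs(mu_hat) > rho:
            exceed += 1
    return exceed / trials

m, a, rho = 100, 1.0, 0.3
bound = 2 * math.exp(-rho**2 * m / (2 * a**2))  # the Hoeffding bound, ~0.022
print(empirical_tail(m, a, rho) <= bound)  # the empirical tail stays below it
```

The empirical tail here is roughly $0.003$, comfortably below the bound of about $0.022$; Hoeffding is not tight, but it holds for any bounded i.i.d. variables, which is all the analysis below needs.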
We were in the middle of the analysis of the following algorithm:

Algorithm: Test Uniform
1. Pick $s = \Theta\left(\frac{n}{\varepsilon^2}\log\frac{n}{\varepsilon}\right)$ samples $X^{(1)}, \ldots, X^{(s)} \in p$.
2. If any $X^{(i)}$ has $|\mathrm{bias}(X^{(i)})| > \sqrt{2n\log(20s)}$, stop and output ‘nonuniform.’
3. Let $\bar{\mu} = \frac{1}{s}\sum_{i=1}^s \mathrm{bias}(X^{(i)})$.
4. If $\bar{\mu} \le \varepsilon/4$, output ‘uniform.’ Otherwise, output ‘nonuniform.’
2 Analysis of Uniformity Test
Last time, we saw the analysis of the case where $p = U_D$. Furthermore, in the case where $\|p - U_D\| > \varepsilon$, we proved the following claim:

Claim 1: If $\|p - U_D\| > \varepsilon$, then $E_p[\mathrm{bias}(X)] \ge \varepsilon$.

Today we will complete the analysis of this second case, which is broken into two subcases. We wish to show that in both subcases, the algorithm outputs ‘nonuniform’ with probability at least $2/3$.
Case 2a: $\Pr_p[\,|\mathrm{bias}(X)| > \sqrt{2n\log(20s)}\,] \ge 10/s$.

Notice that on any single sample $X^{(i)}$,
$$\Pr[\text{step 2 does not output nonuniform on } X^{(i)}] \le 1 - \frac{10}{s},$$
so, since we have $s$ independent samples,
$$\Pr[\text{step 2 does not output nonuniform}] \le \left(1 - \frac{10}{s}\right)^s < \frac{1}{3}$$
(note that this probability is actually quite small, but $1/3$ is sufficient).
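Numerically, that failure probability is essentially $e^{-10} \approx 4.5 \times 10^{-5}$. A quick check, for a representative sample size $s$ of our own choosing:

```python
import math

s = 10_000  # a representative sample size (our own choice)
p_fail = (1 - 10 / s) ** s
# (1 - 10/s)^s <= e^{-10}, which is far below 1/3.
print(p_fail <= math.exp(-10), p_fail < 1 / 3)  # -> True True
```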
[Figure 1: The set $D'$: $D$ without the “tails” of the distribution.]
Case 2b: $\Pr_p[\,|\mathrm{bias}(X)| > \sqrt{2n\log(20s)}\,] < 10/s$.

In this case, we might stop in step 2 (and if we do, that’s great for us), but it will suffice to show that step 4 will output nonuniform with sufficiently high probability otherwise.

Notice that if step 2 passes, each $i$ satisfies $|\mathrm{bias}(X^{(i)})| \le \sqrt{2n\log(20s)}$, so our samples are from a distribution that has been conditioned on this event: letting $D = \{\pm 1\}^n$ be our sample space, the event we are concerned with is $D' = \{x \in D : |\mathrm{bias}(x)| \le \sqrt{2n\log(20s)}\}$. If we take $p' = p|_{D'}$, then after step 2, we know we have a sample drawn from $p'$.
Of course, since passing step 2 suggests that the probability of obtaining a sample from $D \setminus D'$ is very small, we anticipate that $p'$ is close to $p$. We will make this precise by bounding $E_{X \in p}[|\mathrm{bias}(X)|] - E_{X \in p'}[|\mathrm{bias}(X)|]$:

Claim 2: In this case, for sufficiently large $n$,
$$E_{X \in p}[|\mathrm{bias}(X)|] - E_{X \in p'}[|\mathrm{bias}(X)|] \le \varepsilon/8.$$
Proof: It is easy to see that
$$E_{X \in p}[|\mathrm{bias}(X)|] = E_{X \in p'}[|\mathrm{bias}(X)|] \cdot \Pr[X \in D'] + E_{X \in p}[\,|\mathrm{bias}(X)| \mid X \notin D'\,] \cdot \Pr[X \notin D'].$$
Let $\alpha = 10/s$. By assumption (in case 2b), $\Pr[X \in D'] \ge 1 - \alpha$ and $\Pr[X \notin D'] \le \alpha$. It is also immediate that
$$E_{X \in p'}[|\mathrm{bias}(X)|] \le \sqrt{2n\log(20s)}.$$
Outside $D'$, on the other hand, we can’t say anything remarkable about the bias. The best bound we get follows from the fact that for any $X$, $-n \le \mathrm{bias}(X) \le n$:
$$E_{X \in p}[\,|\mathrm{bias}(X)| \mid X \notin D'\,] \le n.$$
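The notes break off at this point. Combining the bounds just derived, one natural way to finish the proof of Claim 2 (our reconstruction, not the notes’ own text) is:

```latex
\begin{align*}
E_{X \in p}[|\mathrm{bias}(X)|] - E_{X \in p'}[|\mathrm{bias}(X)|]
  &= E_{X \in p'}[|\mathrm{bias}(X)|]\,\bigl(\Pr[X \in D'] - 1\bigr)
   + E_{X \in p}\bigl[\,|\mathrm{bias}(X)| \mid X \notin D'\,\bigr]\Pr[X \notin D'] \\
  &\le 0 + n \cdot \alpha = \frac{10n}{s},
\end{align*}
```

where the first term is nonpositive because $E_{X \in p'}[|\mathrm{bias}(X)|] \ge 0$ and $\Pr[X \in D'] \le 1$. Since $s = \Theta\!\left(\frac{n}{\varepsilon^2}\log\frac{n}{\varepsilon}\right)$, we have $10n/s = O\!\left(\varepsilon^2/\log\frac{n}{\varepsilon}\right)$, which is at most $\varepsilon/8$ for sufficiently large $n$, as claimed.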