
6.896 Sublinear Time Algorithms, December 2, 2004
Lecture 22
Lecturer: Eli Ben-Sasson
Scribe: Rafael Pass

1 Testing Proximity of Distributions

A distribution $p$ on $[n] = \{1, 2, \ldots, n\}$ is given by the probabilities $(p_1, \ldots, p_n)$, such that $\sum_{i=1}^{n} p_i = 1$ and $0 \le p_i \le 1$. We will consider algorithms that are given oracle access to a distribution $p$, i.e., each time we "press a button" we get a sample $i \in [n]$ with probability $p_i$. Our objective is to test whether two distributions $p, q$ over $[n]$ are "close".

1.1 Definitions of closeness of distributions

Ideally we would like to consider the $L_1$ distance between two distributions, defined as follows:

$$|p - q| = \sum_{i=1}^{n} |p_i - q_i|$$

Today we will instead focus on the (easier) $L_2$ distance (i.e., the Euclidean norm), defined as follows:

$$\|p - q\| = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$

Later we will use this to estimate the $L_1$ distance. (A well-known fact is that $|p - q| \le \sqrt{n} \, \|p - q\|$. This fact will, however, not be enough to get a good estimate of the $L_1$ distance.)

1.2 The Theorem

We will prove the following theorem:

Theorem 1 (Batu, Fortnow, Rubinfeld, Smith, White [1]) For all constants $\delta, \epsilon > 0$ and all distributions $p, q$ over $[n]$, there exists a test that runs in time $O(\delta^{-4} \log(1/\epsilon))$ such that:

If $\|p - q\| < \delta/2$, then $\Pr[\text{test accepts}] \ge 1 - \epsilon$.
If $\|p - q\| > \delta$, then $\Pr[\text{test accepts}] \le \epsilon$.

The query complexity of the tester (i.e., the number of samples) is at most the running time (which is constant for constant $\delta, \epsilon$). Next lecture we show a tester for $L_1$ distance which uses a query complexity of roughly $n^{2/3}$ (which is thus super-constant).

1.3 Why is it harder to test $L_1$ distance?

The following example shows why $L_1$ distance requires super-constant query complexity (even for constant $\delta, \epsilon$). Consider the following two cases:

$p, q$ are two uniform distributions on two equally large (unknown) disjoint subsets of $[n]$. Note that $|p - q| = 2$.
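As a rough illustration of the definitions in Sections 1.1 and 1.3, the following Python sketch models the sampling oracle and computes both distances for a pair of uniform distributions on disjoint halves of $[n]$. All function names here are hypothetical and not part of the lecture. The sketch also assumes that $n$ is even and that the two subsets are the first and second halves of $[n]$, whereas in the notes the subsets are unknown to the tester.

```python
import math
import random


def sample_oracle(p, rng):
    """One 'button press': return i in {1, ..., n} with probability p[i-1]."""
    return rng.choices(range(1, len(p) + 1), weights=p, k=1)[0]


def l1_distance(p, q):
    """L1 distance |p - q| = sum_i |p_i - q_i|."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def l2_distance(p, q):
    """L2 (Euclidean) distance ||p - q|| = sqrt(sum_i (p_i - q_i)^2)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def disjoint_uniform_pair(n):
    """Illustrative assumption: p is uniform on the first half of [n],
    q is uniform on the second half. Requires n to be even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    half = n // 2
    p = [1.0 / half] * half + [0.0] * half
    q = [0.0] * half + [1.0 / half] * half
    return p, q


if __name__ == "__main__":
    n = 100
    p, q = disjoint_uniform_pair(n)
    rng = random.Random(0)  # fixed seed so repeated runs draw the same samples

    print("samples from p:", [sample_oracle(p, rng) for _ in range(5)])
    print("L1 distance:", l1_distance(p, q))
    print("L2 distance:", l2_distance(p, q))
    print("sqrt(n) * L2:", math.sqrt(n) * l2_distance(p, q))
```

For this pair the $L_1$ distance is 2 while the $L_2$ distance is only $2/\sqrt{n}$, so the bound $|p - q| \le \sqrt{n} \, \|p - q\|$ holds with equality (up to floating-point rounding). This suggests why that bound alone is not enough: detecting an $L_2$ gap of order $1/\sqrt{n}$ with the tester of Theorem 1 would mean taking $\delta$ of order $1/\sqrt{n}$, which makes the $\delta^{-4}$ factor in the running time of order $n^2$.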