Derandomizing Approximation Algorithms for Hard Counting Problems

Michael Luby

International Computer Science Institute, Berkeley, CA
and University of California at Berkeley*

TR-95-069
December 1995

* Research supported in part by National Science Foundation operating grants CCR-9304722 and NCR-9416101, and United States-Israel Binational Science Foundation grant No. 92-00226.

1 Introduction

This paper is a (biased) survey of some work in derandomization of randomized algorithms. Perhaps the most famous open problem in Computer Science is whether or not NP is equal to P, i.e., are efficient non-deterministic algorithms more powerful than efficient deterministic algorithms? An analogous question is whether or not RP is equal to P, i.e., are efficient randomized algorithms more powerful than efficient deterministic algorithms?

These two questions are formally quite similar, as the only difference between the definitions of NP and RP is that for an NP language L there is only required to be one witness for each x ∈ L, whereas for RP a constant fraction of the strings are required to be witnesses. Nevertheless, it is my belief that NP is more powerful than P, and on the other hand that RP is equal to P. Paradoxically, there is some hope that an eventual proof that NP is not equal to P will be strong enough to also show that RP is equal to P.

Definition (function ensemble): Let f : {0,1}^{t(n)} -> {0,1}^{ℓ(n)} denote a function ensemble, where t(n) and ℓ(n) are polynomial in n and where f with respect to n is a function mapping {0,1}^{t(n)} to {0,1}^{ℓ(n)}.

Definition (P-time function ensemble): Call f : {0,1}^{t(n)} × {0,1}^{ℓ(n)} -> {0,1}^{m(n)} a T(n)-time function ensemble if f is a function ensemble and there is a Turing machine such that, for all x ∈ {0,1}^{t(n)} and all y ∈ {0,1}^{ℓ(n)}, f(x,y) is computable in time T(n). Call f a P-time function ensemble if T(n) = n^{O(1)}.

Definition (language): Let L : {0,1}^n -> {0,1} be a function ensemble.
One can view L as a language, where, for all x ∈ {0,1}^n, x ∈ L if L(x) = 1 and x ∉ L if L(x) = 0. In the following, definitions of a variety of complexity classes with respect to a language L as just defined are made.

Definition (P): Say that L ∈ P if L is a P-time function ensemble.

Definition (NP): Say that L ∈ NP if there is a P-time function ensemble f : {0,1}^n × {0,1}^{ℓ(n)} -> {0,1} such that for all x ∈ {0,1}^n,

    x ∈ L implies Pr_{Y ∈_U {0,1}^{ℓ(n)}}[f(x,Y) = 1] > 0;
    x ∉ L implies Pr_{Y ∈_U {0,1}^{ℓ(n)}}[f(x,Y) = 1] = 0.

Definition (RP): Say that L ∈ RP if there is a constant ε > 0 and a P-time function ensemble f : {0,1}^n × {0,1}^{ℓ(n)} -> {0,1} such that for all x ∈ {0,1}^n,

    x ∈ L implies Pr_{Y ∈_U {0,1}^{ℓ(n)}}[f(x,Y) = 1] ≥ ε;
    x ∉ L implies Pr_{Y ∈_U {0,1}^{ℓ(n)}}[f(x,Y) = 1] = 0.

There are a few philosophical reasons to want to show that RP = P, e.g., does randomization help in the general context of efficient computation? There are also some practical reasons, e.g., in practice if one uses a Monte Carlo algorithm then there should be great concern about where the source of random bits comes from. In practice, what one typically does is to take a very small number of random bits and stretch them into a large number using a simple pseudorandom generator, e.g., a linear congruential generator, and then use these bits as the source of randomness for the Monte Carlo simulation. The problem in practice is that there is no guarantee that these pseudorandom bits are random enough to make the Monte Carlo simulation behave as though they were truly random. In practice the answer sometimes turns out to be that this is good enough, sometimes it has turned out to be not good enough, and in many cases the answer is not known.
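The practice described above, stretching a short seed with a simple generator and feeding the output to a Monte Carlo simulation, can be illustrated with a minimal sketch. The generator parameters (Lehmer's classic constants) and the estimation task (the area of a quarter circle) are illustrative choices, not taken from this survey.

```python
# Stretch a small seed into many pseudorandom values with a linear
# congruential generator (LCG), then use them in a Monte Carlo estimate.

def lcg(seed, a=16807, m=2**31 - 1):
    """Yield an endless stream of pseudorandom floats in [0, 1)."""
    state = seed
    while True:
        state = (a * state) % m
        yield state / m

def monte_carlo_quarter_circle(samples, n=100_000):
    """Estimate pi/4 = Pr[x^2 + y^2 < 1] for uniform x, y in [0, 1)."""
    hits = 0
    for _ in range(n):
        x, y = next(samples), next(samples)
        if x * x + y * y < 1.0:
            hits += 1
    return hits / n

estimate = monte_carlo_quarter_circle(lcg(seed=12345))
print(estimate)  # close to pi/4 ~ 0.785 for this generator and task
```

For this particular generator and task the answer comes out fine; the point of the passage above is precisely that, in general, there is no a priori guarantee that it will.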
One of the motivations behind the work on derandomization of randomized algorithms is to put this entire question on a much firmer scientific footing, i.e., to be able to say authoritatively which pseudorandom generators can be provably used for which Monte Carlo algorithms.

1.1 Randomness and Pseudorandomness

The notion of randomness tests for a string evolved over time: from set-theoretic tests to enumerable [24, Kolmogorov], recursive, and finally limited time tests. There were some preliminary works that helped motivate the concept of a pseudorandom generator, including [38, Shamir]. [9, Blum and Micali] introduces the fundamental concept of a pseudorandom generator that is useful for cryptographic (and other) applications, and gave it the significance it has today by providing the first provable construction of a pseudorandom generator based on the conjectured difficulty of a well-known and well-studied computational problem, the discrete log problem. [41, Yao] introduces the now standard definition of a pseudorandom generator, and shows an equivalence between this definition and the next bit test introduced in [9, Blum and Micali]. The standard definition of a pseudorandom generator introduced by [41, Yao] is based on the fundamental concept of computational indistinguishability introduced previously in [18, Goldwasser and Micali]. [41, Yao] also shows how to construct a pseudorandom generator from any one-way permutation. Another important observation of [41, Yao] is that a pseudorandom generator can be used to reduce the number of random bits needed for any probabilistic polynomial time algorithm, and this shows how to perform a deterministic simulation of any polynomial time probabilistic algorithm in subexponential time based on a pseudorandom generator. The results on deterministic simulation were subsequently generalized in [10, Boppana and Hirschfeld].
The robust notion of a pseudorandom generator, due to [9, Blum and Micali] and [41, Yao], should be contrasted with the classical methods of generating random looking bits as described in, e.g., [23, Knuth]. In studies of classical methods, the output of the generator is considered good if it passes a particular set of standard statistical tests. The linear congruential generator is an example of a classical method for generating random looking bits that pass a variety of standard statistical tests. However, [11, Boyar] and [25, Krawczyk] show that there is a polynomial time statistical test which the output from this generator does not pass.

2 Circuits

A convenient way to view computation is via boolean circuits.

Definition (C_{ℓ(n)}^{d(n),s(n)}): For all n ∈ N, let C_{ℓ(n)}^{d(n),s(n)} be the set of all circuits with ℓ(n) boolean input variables x = (x_1, ..., x_{ℓ(n)}) of depth d(n) with a total of at most s(n) gates. A circuit C ∈ C_{ℓ(n)}^{d(n),s(n)} consists of "and"-gates and "or"-gates, where each gate is allowed unbounded fan-in. C consists of d(n) levels of gates, where all gates at a given level are of the same type. All the gates at level 1 have as inputs any mixture of variables and their negations. For all i ∈ {2, ..., d(n)}, all gates at level i receive their inputs from the gates at level i - 1. The gates at level d(n) are considered to be the output of C.

For example, let f : {0,1}^n × {0,1}^{ℓ(n)} -> {0,1} be the P-time function ensemble associated with an RP language L. One can think of f as a family of boolean circuits as follows. For each n, there are 2^n circuits in the family with ℓ(n)-bit inputs, one circuit for each of the 2^n possible values of x ∈ {0,1}^n. The circuit C_x associated with the input x computes the value of f(x,y) on input y ∈ {0,1}^{ℓ(n)}. The circuit C_x can be derived from the description of f and from x.
Since f is a P-time function ensemble, without loss of generality one can say that C_x consists of at most a polynomial number p(n) of alternating levels of "and" and "or" gates, with a single output gate at the bottom level. The property that C_x has is that if x ∈ L then Pr_Y[C_x(Y) = 1] ≥ 1/2 and if x ∉ L then Pr_Y[C_x(Y) = 1] = 0, where Y ∈_R {0,1}^{ℓ(n)}. One can view this circuit family for L as a subfamily of C_{ℓ(n)}^{p(n),p(n)}.

In terms of circuits, the approach to derandomization explored in this survey is to construct a pseudorandom generator g that can be used as the source of randomness for all C ∈ C. This approach was initiated by [9, Blum and Micali] and [41, Yao].

Definition (ε-pseudorandom generator for a circuit family C): Let g : {0,1}^{r(n)} -> {0,1}^n be a P-time function ensemble, where r(n) < n. Let C be an infinite family of circuits, and let C ∈ C be a circuit with n inputs. The distinguishing probability of C for g is

    δ = |Pr[C(Y) = 1] - Pr[C(g(Z)) = 1]|,

where Y ∈_R {0,1}^n and Z ∈_R {0,1}^{r(n)}, in which case g is said to ε-approximate C for any ε > δ. Say that g is an ε-pseudorandom generator for C if g ε-approximates C for all C ∈ C. Say that C has time-success ratio S(n) for g if the minimum over all circuits C ∈ C with n inputs of the ratio of the number of gates in C divided by the distinguishing probability of C is at least S(n).

Given g, the derandomization for the RP circuit family described above is straightforward. For any x ∈ {0,1}^n, one can approximate the fraction of the 2^{ℓ(n)} inputs on which C_x outputs 1 as follows. Let g : {0,1}^{r(n)} -> {0,1}^{ℓ(n)} be a 1/2-pseudorandom generator for C_{ℓ(n)}^{p(n),p(n)}. For all z ∈ {0,1}^{r(n)}, compute C_x(g(z)), and if any of the answers is 1 then conclude that x ∈ L, and otherwise conclude that x ∉ L. Note that by the properties of g this procedure is guaranteed to correctly classify x with respect to membership in L. The smaller r(n) is in relation to n, the stronger the pseudorandom generator g.
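In code, the seed-enumeration simulation just described looks as follows. The circuit C_x and the generator g are stood in for by toy placeholder functions of our own; the survey of course assumes a provably pseudorandom g, which this sketch does not construct.

```python
from itertools import product

def derandomized_rp_decision(C_x, g, r):
    """Decide membership by trying every seed z in {0,1}^r.

    C_x: predicate on bit tuples (the circuit for input x).
    g:   maps an r-bit seed to a longer bit tuple (the generator).
    Accepts iff C_x(g(z)) = 1 for some seed z -- a deterministic
    2^r-time simulation, polynomial when r = O(log n).
    """
    return any(C_x(g(z)) for z in product((0, 1), repeat=r))

# Toy stand-ins (not the survey's constructions): g repeats the seed
# to length 8, and C_x accepts strings whose first bit is 1.
g = lambda z: (z * 8)[:8]
C_x = lambda y: y[0] == 1
print(derandomized_rp_decision(C_x, g, r=3))  # True: e.g. seed (1, 0, 0) works
```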
For example, if r(n) is O(log(n)) then the entire procedure runs in deterministic polynomial time, showing that RP = P.

One way to view the above approach is to find a small set S of ℓ(n)-bit strings such that the average value of C_x(y) over y ∈ S is close to the average value of C_x(y) over all y ∈ {0,1}^{ℓ(n)}. In the above, S can be viewed as the set {g(z) : z ∈ {0,1}^{r(n)}}. The other important ingredient of the derandomization is that the set S can be efficiently enumerated. This is true in the above approach because g is a P-time function ensemble, and thus S can be enumerated simply by sequencing through all z ∈ {0,1}^{r(n)} and computing g(z).

Suppose one is only interested in the property that S be small, and drop the (crucial) requirement that S is easy to enumerate. Then it is not hard to see that there is such a small set S. To see this, suppose that one chooses randomly a set S of size n + 1. If x ∉ L then, independent of S, x is correctly classified. On the other hand, if x ∈ L, then x is incorrectly classified with probability at most 2^{-(n+1)}. Since there are at most 2^n such x, the probability that there is an x ∈ {0,1}^n that is incorrectly classified is at most 1/2. Thus, there is a set S of size n + 1 that correctly classifies all x ∈ {0,1}^n with respect to membership in L. This is the approach that [1, Adleman] took to show that RP ⊆ P/poly, i.e., in a non-uniform sense randomization is no more powerful than deterministic computation. The problem with this approach is that there is no clue of how to efficiently generate the set S in deterministic polynomial time. Thus, the uniform version of the "RP = P?" question is far from resolved.

3 Derandomizing Particular Algorithms

In some applications, the goal is to completely derandomize the algorithm to produce a deterministic solution.
However, in other cases the basic goal is only to drastically reduce the number of random bits used by the randomized algorithm, e.g., reduce from O(n) bits to O(log(n)) bits. The philosophical and practical import of this is that it is hard in practice to produce high quality independent random bits at a high rate.

3.1 Pairwise Independence

One of the key ideas in derandomizing randomized algorithms is the observation that sometimes the randomized algorithm works just as well if there is not full independence between the random variables. One special but important case is when pairwise independence between the random variables suffices.

Consider a set of random variables indexed by a set U, i.e., {Z_i : i ∈ U}, where Z_i ∈ T. For finite t = |T|, a uniform distribution assigns Pr[Z_x = α] = 1/t, for all x ∈ U, α ∈ T. If this distribution were furthermore pairwise independent, one would have: for all x ≠ y ∈ U, for all α, β ∈ T,

    Pr[Z_x = α, Z_y = β] = Pr[Z_x = α] · Pr[Z_y = β] = 1/t^2.

This is not the same as complete independence, as evidenced by the following set of three pairwise-independent variables (U = {1,2,3}, T = {0,1}, t = 2), where each of the four sample points s ∈ {0,1}^2 is equally likely and Z_3 = Z_1 ⊕ Z_2:

    s = 00:  Z_1 = 0, Z_2 = 0, Z_3 = 0
    s = 01:  Z_1 = 0, Z_2 = 1, Z_3 = 1
    s = 10:  Z_1 = 1, Z_2 = 0, Z_3 = 1
    s = 11:  Z_1 = 1, Z_2 = 1, Z_3 = 0

Each row s can be thought of as a function h_s : U -> T. Let S be the index set for these functions, where in this case S = {0,1}^2. For all x ≠ y ∈ U, for all α, β ∈ T,

    Pr_{s ∈_U S}[h_s(x) = α ∧ h_s(y) = β] = 1/4 = 1/t^2.

(Notice in particular that Pr_{s ∈_U S}[h_s(x) = h_s(y)] = 1/2 = 1/t.) Any set of functions satisfying this condition is a 2-universal family of hash functions. Definitions and explicit constructions of 2-universal hash functions were first given by [40, Carter and Wegman]. One simple way to construct a general family of hash functions mapping {0,1}^n -> {0,1}^n is to let S = {0,1}^n × {0,1}^n, and then for all s = (a,b) ∈ S, for all x ∈ {0,1}^n define h_s(x) = ax + b, where the arithmetic operations are with respect to the finite field GF[2^n].
Thus, each h_s maps {0,1}^n -> {0,1}^n and S is the index set of the hash functions. For each s = (a,b) ∈ S, one can write

    ( h_s(x) )   ( x  1 ) ( a )
    ( h_s(y) ) = ( y  1 ) ( b )

When x ≠ y, the matrix is non-singular, so that for any x, y ∈ {0,1}^n, the pair (h_s(x), h_s(y)) takes on all 2^{2n} possible values (as s varies over all S). Thus if s is chosen uniformly at random from S, then (h_s(x), h_s(y)) is also uniformly distributed. One can view S as the set of points in a sample space on the set of random variables {Z_x : x ∈ {0,1}^n}, where Z_x(s) = h_s(x) for all s ∈ S. With respect to the uniform distribution on S, these random variables are pairwise independent, i.e., for all x ≠ y ∈ {0,1}^n, for all α, β ∈ {0,1}^n,

    Pr_{s ∈_U S}[Z_x(s) = α ∧ Z_y(s) = β] = Pr_{s ∈_U S}[Z_x(s) = α] · Pr_{s ∈_U S}[Z_y(s) = β] = 1/2^{2n}.

To obtain a hash function that maps to k < n bits, we can still use S as the function index family: the value of the hash function indexed by s on input x is obtained by computing h_s(x) and using the first k bits. The important properties of these hash functions are:

  • Pairwise independence.
  • Succinctness: each function can be described as a 2n-bit string. Therefore, randomly picking a function index requires only 2n random bits.
  • The function h_s(x) can easily be computed (in LOGSPACE, for instance) given the function index s and the input x.

In the sequel, unless otherwise specified, references to pairwise independent hash functions refer to this construction, and S denotes the set of indices for the hash family.

3.2 Applications to Particular Algorithms

The original applications described in [40, Carter and Wegman] were applications to hashing. Subsequently 2-universal hashing has been applied in surprising ways to a rich variety of problems. For a survey of some of these applications, see [31, Luby and Wigderson].
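A minimal instance of the h_s(x) = ax + b construction can be worked over GF(2^3) rather than GF(2^n), small enough that the whole index set can be checked exhaustively; the irreducible polynomial x^3 + x + 1 is one standard choice, made here purely for illustration.

```python
from collections import Counter

K = 3             # the field GF(2^3); elements are the integers 0..7
IRRED = 0b1011    # x^3 + x + 1, an irreducible polynomial over GF(2)

def gf_mul(u, v):
    """Carry-less multiplication modulo IRRED in GF(2^K)."""
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if u >> K:       # reduce whenever the degree reaches K
            u ^= IRRED
    return r

def h(s, x):
    a, b = s
    return gf_mul(a, x) ^ b   # a*x + b; addition in GF(2^K) is XOR

S = [(a, b) for a in range(2 ** K) for b in range(2 ** K)]  # 2^{2K} indices

# Pairwise independence via the non-singularity argument above: for x != y,
# (h_s(x), h_s(y)) takes each of the 2^{2K} possible values exactly once.
for x in range(2 ** K):
    for y in range(2 ** K):
        if x != y:
            counts = Counter((h(s, x), h(s, y)) for s in S)
            assert len(counts) == 64 and set(counts.values()) == {1}
print("pairwise independent")
```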
Consider, for example, the MAXCUT problem: given a graph G = (V,E), find a two-coloring of the vertices χ : V -> {0,1} so as to maximize c(χ) = |{(x,y) ∈ E : χ(x) ≠ χ(y)}|. The following is a description of a solution to this problem that is guaranteed to produce a cut where at least half the edges cross the cut. If the vertices are colored randomly (0 or 1 with probability 1/2) by choosing χ uniformly from the set of all possible 2^{|V|} colorings, then

    E[c(χ)] = Σ_{(x,y) ∈ E} Pr[χ(x) ≠ χ(y)] = |E|/2.

Thus, there must always be a cut of size at least |E|/2. Let S be the index set for the hash family mapping V -> {0,1}. Since the summation above only requires the coloring of vertices to be pairwise-independent, it follows that E[c(h_s)] = |E|/2 when s ∈_R S. Since |S| = |V|^2, one can deterministically try h_s for all s ∈ S in polynomial time (even in the parallel complexity class NC), and for at least one s ∈ S, h_s defines a partition of the nodes where at least |E|/2 edges cross the partition.

This derandomization approach was developed and discussed in general terms in the series of papers [13, Chor and Goldreich], [27, Luby], [5, Alon, Babai and Itai]. There, the approach was applied to derandomize algorithms such as witness sampling, a fast parallel algorithm for finding a maximal independent set, and other graph algorithms. This work was extended to more efficient parallel approaches in [28, Luby], and generalized in [8, Berger and Rompel] and [32, Motwani, Naor and Naor] to apply to parallel solutions of other combinatorial problems. Other examples include the use of five-wise independent random variables to choose the pivot elements for quicksort while still maintaining the O(n log(n)) expected running time [22, Karloff and Raghavan]. Extending the work of [22, Karloff and Raghavan], [33, Mulmuley] reanalyzes many classical computational geometry problems, including trapezoidal triangulations, convex hulls, Voronoi diagrams, and hidden surface removal.
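Sticking with the same toy field, the deterministic MAXCUT procedure can be sketched as follows: color each vertex with one output bit of h_s(v) = av + b over GF(2^K) and try every index. GF(2^3) and the 5-cycle are illustrative choices of ours (vertex names must fit in K bits); the expectation argument above guarantees the best cut found has at least |E|/2 crossing edges.

```python
K = 3             # GF(2^3), as in the hash-family sketch
IRRED = 0b1011    # x^3 + x + 1

def gf_mul(u, v):
    """Carry-less multiplication modulo IRRED in GF(2^K)."""
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if u >> K:
            u ^= IRRED
    return r

def cut_size(edges, colour):
    return sum(1 for u, v in edges if colour(u) != colour(v))

def derandomized_maxcut(edges):
    """Try every hash index s = (a, b) and return the largest cut found.
    Since E_s[cut] = |E|/2 under the pairwise independent colouring
    colour(v) = low bit of a*v + b, the maximum is >= |E|/2."""
    return max(cut_size(edges, lambda v, a=a, b=b: (gf_mul(a, v) ^ b) & 1)
               for a in range(2 ** K) for b in range(2 ** K))

# Example: the 5-cycle on vertices 0..4.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(derandomized_maxcut(edges))  # at least ceil(len(edges)/2) = 3
```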
The new analysis shows that only limited independence between the random variables suffices for the randomized algorithm. Other ideas that have been used to derandomize particular algorithms include using expander graphs, and combining expander graphs with pairwise independence techniques. An example of this approach is [21, Karger and Motwani].

4 Derandomizing Depth Two Circuits

In this section depth two unbounded fan-in boolean circuits are considered. The circuit subfamily of C_n^{2,m+1} with a first layer of m = m(n) "and" gates all feeding into a single output "or" gate can be viewed as a formula F in disjunctive normal form (DNF) on n variables with m clauses. Let t be the maximum length of a clause in the formula, and let Pr[F] denote the probability that a random, independent and uniformly chosen truth assignment to the variables satisfies F. The problem of computing Pr[F] exactly is known to be #P-complete [39, Valiant]. On the other hand, for many applications a good estimate of Pr[F] is all that is needed.

[29, Luby and Veličković] introduces several ideas which can be combined with known results to obtain approximation algorithms for the DNF problem. The first idea is that of a sunflower reduction, used in a way similar to the way that [37, Razborov] uses it to prove exponential lower bounds on the size of the smallest monotone circuit for finding the biggest clique in a graph. Given a DNF formula F, one looks for a large collection of clauses which form a sunflower, i.e., the intersection of any two distinct clauses in this collection is the same. Then, all the clauses in this collection are replaced by their common pairwise intersection, thus obtaining another formula which has probability of being satisfied close to that of F. This procedure is repeated until no large sunflowers can be found, yielding a new formula F'. A theorem of [14, Erdős and Rado] implies that at the end of this procedure there are not too many clauses in F'.
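The replacement step can be sketched with a naive greedy search, seeded from pairs of clauses; this is purely illustrative and is not the algorithm of [29], and the sunflower size threshold p = 3 used below is an arbitrary choice (the Erdős–Rado bound governs what p must be in the real reduction).

```python
from itertools import combinations

def find_sunflower(clauses, p):
    """Greedy search for >= p clauses whose pairwise intersection is a
    single common core Y. Clauses are frozensets of literals.
    Returns (core, flower) or None."""
    for c1, c2 in combinations(clauses, 2):
        core = c1 & c2
        flower = [c1, c2]
        for c in clauses:
            if c not in flower and all(c & d == core for d in flower):
                flower.append(c)
        if len(flower) >= p:
            return core, flower
    return None

def sunflower_reduce(clauses, p):
    """Repeatedly replace any sunflower of >= p clauses by its core.
    The core is weaker than each petal, so Pr[F] can only grow."""
    clauses = set(clauses)
    while (found := find_sunflower(clauses, p)) is not None:
        core, flower = found
        clauses -= set(flower)
        clauses.add(core)
    return clauses

# Literals as signed ints: 1 means x1, -2 means not-x2, and so on.
F = [frozenset(s) for s in [{1, 2}, {1, 3}, {1, 4}, {5, 6}]]
reduced = sunflower_reduce(F, p=3)
print(sorted(sorted(c) for c in reduced))  # [[1], [5, 6]]
```

Here {1,2}, {1,3}, {1,4} form a sunflower with core {1} and are collapsed to the single clause {1}, while the unrelated clause {5,6} survives.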
Because F' is so small, it turns out to be easier to approximate Pr[F'], and because of the properties of the reduction this approximation is also a good approximation of Pr[F]. This reduction can be combined with the algorithm from [36, Nisan and Wigderson] (using the improvement based on an observation from [30, Luby, Veličković and Wigderson]) to produce a polynomial time ε-approximation for Pr[F] when t = O(log^{1.8}(nm)) and ε is not too small.

The second approach relies on special constructions of small probability distributions. Given a distribution D, let Pr_D[F] denote the probability that F is satisfied by a truth assignment chosen according to D. The goal is to find a distribution D with a small sample space such that Pr_D[F] is close to Pr[F]. Then, Pr_D[F] can be calculated by exhaustive consideration of the points in the space. An easy counting argument shows that such a distribution exists. The results of [29] can be viewed as progress towards finding a polynomial time construction of ...
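Both quantities in this paragraph, the exact Pr[F] and Pr_D[F] for an explicitly given small sample space D, can be computed directly for toy formulas. The encoding of literals as signed integers is our own convention; constructing a good small D in polynomial time is of course the hard part that [29] addresses.

```python
from itertools import product

def satisfies(clause, assignment):
    """Clause = frozenset of signed ints; literal v holds iff
    assignment[|v| - 1] == (v > 0)."""
    return all(assignment[abs(v) - 1] == (v > 0) for v in clause)

def pr_F(clauses, n):
    """Exact Pr[F] over all 2^n uniform assignments (feasible only for tiny n;
    in general this is #P-complete)."""
    assignments = list(product((False, True), repeat=n))
    return sum(any(satisfies(c, a) for c in clauses)
               for a in assignments) / len(assignments)

def pr_F_over_sample_space(clauses, points, weights):
    """Pr_D[F] for a distribution D given explicitly by its support:
    exhaustive consideration of the points in the space."""
    return sum(w for a, w in zip(points, weights)
               if any(satisfies(c, a) for c in clauses))

# F = (x1 and x2) or (not x3), on n = 3 variables.
F = [frozenset({1, 2}), frozenset({-3})]
print(pr_F(F, 3))  # exact: 5/8 = 0.625
```

Running `pr_F_over_sample_space` with the full uniform space as D recovers the exact value; the point of the construction sought above is to get close to it from a much smaller support.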