Lecture Notes on Complexity and NP-completeness

1. Reductions

Let A and B be two problems whose instances require as an answer either a "yes" or a "no" (3SAT and Hamilton cycle are two good examples). A reduction from A to B is a polynomial-time algorithm R which transforms inputs of A to equivalent inputs of B. That is, given an input x to problem A, R will produce an input R(x) to problem B, such that x is a "yes" input of A if and only if R(x) is a "yes" input of B.

A reduction from A to B, together with a polynomial-time algorithm for B, constitutes a polynomial-time algorithm for A (see Figure 1). For any input x of A of size n, the reduction R takes time p(n) (a polynomial) to produce an equivalent input R(x) of B. Now, this input R(x) can have size at most p(n), since this is the largest input R can conceivably construct in p(n) time. If we now submit this input to the assumed algorithm for B, running in time q(m) on inputs of size m, where q is another polynomial, then we get the right answer for x within a total number of steps at most p(n) + q(p(n)), which is also a polynomial!

[Figure 1: Reduction. An input x of A is fed to the reduction R, which produces R(x), an input of B; the algorithm for B then answers "yes" or "no". The two boxes together form the algorithm for A.]
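The composition in Figure 1 can be sketched in a few lines of Python. This is only an illustration: the helper name compose and the two toy problems (A: does a list contain a repeated value; B: does a sorted list contain two equal neighbours) are assumptions chosen because they are easy to check, not problems from these notes.

```python
from typing import Callable, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")

def compose(reduction: Callable[[X], Y],
            solve_b: Callable[[Y], bool]) -> Callable[[X], bool]:
    """Turn a reduction R from A to B plus a decider for B into a decider for A."""
    def solve_a(x: X) -> bool:
        return solve_b(reduction(x))
    return solve_a

# Toy problems (hypothetical, for illustration only).
def sort_list(xs):
    # The reduction R: runs in polynomial time.
    return sorted(xs)

def has_equal_neighbours(ys):
    # The assumed algorithm for B.
    return any(a == b for a, b in zip(ys, ys[1:]))

# x is a "yes" input of A exactly when R(x) is a "yes" input of B.
has_duplicate = compose(sort_list, has_equal_neighbours)
```

Calling has_duplicate([3, 1, 3]) runs the reduction first and then the algorithm for B, so its running time is bounded by the sum described above.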

We have seen many reductions so far, establishing that problems are easy (e.g., from matching to max-flow). In this part of the class we shall use reductions in a more sophisticated and counterintuitive context, in order to prove that certain problems are hard. If we reduce A to B, we are essentially establishing that, give or take a polynomial, A is no harder than B. We could write this as

    A ≤ B,

an inequality between the complexities of the two problems. If we know B is easy, this establishes that A is easy. If we know A is hard, this establishes that B is hard. It is this latter implication that we shall be using soon.

2. Problems, problems...

We have seen many problems that we can solve in polynomial time, and we know this is good. The class of all problems that are so solvable is denoted P. We review several problems from P below; in each case we also list another problem, bearing a superficial similarity to the one in P. The similarity is, indeed, superficial: the second problem in each pair is not known (or believed, or expected) to be solvable in polynomial time; only algorithms that are exponential in the worst case are known for these problems.

minimum spanning tree: Given a weighted graph and an integer K, is there a tree that connects all nodes of the graph whose total weight is K or less?

traveling salesman problem: Given a weighted graph and an integer K, is there a cycle that visits all nodes of the graph whose total weight is K or less?

Notice that we have converted each one of these familiar problems into a decision problem, a "yes-no" question, by supplying a goal K and asking if the goal can be met. Any optimization problem can be so converted (we shall soon see more examples). If we can solve the optimization problem, we can certainly solve the decision version (actually, the converse is in general also true). Therefore, proving a negative complexity result about the decision problem (for example, proving that it cannot be solved in polynomial time) immediately implies the same negative result for the optimization problem. By considering the decision versions, we can study optimization problems side by side with decision problems (see the next examples), and consider reductions between them. This is a great convenience in the theory of complexity which we are about to develop.
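One common way to see why the converse usually holds is binary search on the goal K. The sketch below assumes integer costs, a known range [lo, hi] that contains the optimum, and a decision procedure decide(K) answering "is there a solution of cost K or less?"; the name optimum_cost and these assumptions are illustrative, not from the notes. It recovers the optimal value (not the solution itself) with a number of calls that is logarithmic in hi - lo.

```python
from typing import Callable

def optimum_cost(decide: Callable[[int], bool], lo: int, hi: int) -> int:
    """Smallest K in [lo, hi] for which decide(K) is true.

    Assumes decide is monotone (false, ..., false, true, ..., true)
    and that decide(hi) is true.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(mid):
            hi = mid          # a solution of cost <= mid exists
        else:
            lo = mid + 1      # every solution costs more than mid
    return lo
```

Since the number of calls is about log2(hi - lo), it is polynomial in the number of bits needed to write the costs down.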

Eulerian graph: Given a directed graph, is there a closed path that visits each edge of the graph exactly once?

Hamiltonian graph: Given a directed graph, is there a closed path that visits each node of the graph exactly once?

A graph is Eulerian if and only if it is strongly connected and each node has equal indegree and outdegree; so the problem is squarely in P. There is no known such characterization (or algorithm) for the Hamilton problem (and notice its eerie similarity with the TSP).
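The Eulerian characterization translates directly into a polynomial-time test. The following sketch assumes the graph is given as a non-empty list of distinct nodes and a list of directed edges (u, v) over those nodes, with no isolated nodes; the function name and this encoding are illustrative choices.

```python
from collections import defaultdict

def is_eulerian(nodes, edges):
    """Strongly connected, and indegree == outdegree at every node."""
    out_adj = defaultdict(list)
    in_adj = defaultdict(list)
    for u, v in edges:
        out_adj[u].append(v)
        in_adj[v].append(u)
    if any(len(out_adj[v]) != len(in_adj[v]) for v in nodes):
        return False

    def reaches_all(adj):
        # Depth-first search from the first node over the given adjacency lists.
        start = nodes[0]
        seen = {start}
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == len(nodes)

    # Strongly connected iff every node is reachable from the start node
    # both along the edges and against them.
    return reaches_all(out_adj) and reaches_all(in_adj)
```

Both the degree count and the two searches take time linear in the size of the graph.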

circuit value: Given a Boolean circuit, and its inputs, is the output T?

circuit SAT: Given a Boolean circuit, and some of its inputs, is there a way to set the remaining inputs so that the output is T?

We know that circuit value is in P (we showed that it can be reduced to linear programming); also, the naive algorithm that evaluates all gates bottom-up is polynomial. How about circuit SAT? There is no obvious way to solve this problem, short of trying all input combinations for the unset inputs, and this is an exponential algorithm.
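Both algorithms just mentioned fit in a short sketch. The circuit encoding is an assumption made for illustration: a dict mapping each gate name to a tuple such as ("T",), ("F",), ("?",) for an unknown input, ("NOT", y), ("AND", y, z) or ("OR", y, z), listed so that every gate appears after the gates it reads.

```python
from itertools import product

def circuit_value(circuit, output, unknown_values=None):
    """Evaluate all gates bottom-up; polynomial in the size of the circuit."""
    unknown_values = unknown_values or {}
    val = {}
    for g, (kind, *args) in circuit.items():
        if kind == "T":
            val[g] = True
        elif kind == "F":
            val[g] = False
        elif kind == "?":
            val[g] = unknown_values[g]
        elif kind == "NOT":
            val[g] = not val[args[0]]
        elif kind == "AND":
            val[g] = val[args[0]] and val[args[1]]
        elif kind == "OR":
            val[g] = val[args[0]] or val[args[1]]
        else:
            raise ValueError(f"unknown gate kind: {kind}")
    return val[output]

def circuit_sat_brute_force(circuit, output):
    """Try all 2^k settings of the k unknown inputs; exponential in k."""
    unknowns = [g for g, (kind, *_) in circuit.items() if kind == "?"]
    for bits in product([False, True], repeat=len(unknowns)):
        if circuit_value(circuit, output, dict(zip(unknowns, bits))):
            return True
    return False
```

The first function does one pass over the gates; the second calls it once per setting of the unknown inputs, which is where the exponential cost comes from.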
2SAT: Given a Boolean formula in conjunctive normal form and with at most two literals
per clause, is there a satisfying truth assignment?
3SAT: Given a Boolean formula in conjunctive normal form and with at most three
literals per clause, is there a satisfying truth assignment?
We know from the homework that 2SAT can be solved by graph-theoretic techniques. For 3SAT, no such techniques are available, and the best algorithms known for this problem are exponential in the worst case.
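The notes do not say which graph-theoretic technique the homework used. One standard choice, assumed here, is the implication graph: each clause (a OR b) contributes the edges NOT a -> b and NOT b -> a, and the formula is satisfiable if and only if no variable lies in the same strongly connected component as its negation. The sketch below uses Kosaraju's two-pass algorithm; the encoding of literals as nonzero integers (k for x_k, -k for NOT x_k) is also an assumption.

```python
def two_sat(n, clauses):
    """Decide satisfiability of a 2-CNF formula over variables 1..n."""
    def idx(lit):
        # Node index of a literal in the implication graph.
        return 2 * (abs(lit) - 1) + (lit < 0)

    N = 2 * n
    adj = [[] for _ in range(N)]
    radj = [[] for _ in range(N)]
    for clause in clauses:
        # A one-literal clause (a) is treated as (a OR a).
        a, b = clause if len(clause) == 2 else (clause[0], clause[0])
        for p, q in ((-a, b), (-b, a)):
            adj[idx(p)].append(idx(q))
            radj[idx(q)].append(idx(p))

    # Pass 1: record nodes in order of DFS completion (iterative DFS).
    order, seen = [], [False] * N
    for s in range(N):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(adj[u]):
                stack.append((u, i + 1))
                w = adj[u][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(u)

    # Pass 2: sweep the reversed graph in decreasing completion order.
    comp = [-1] * N
    c = 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = c
        stack = [s]
        while stack:
            u = stack.pop()
            for w in radj[u]:
                if comp[w] == -1:
                    comp[w] = c
                    stack.append(w)
        c += 1

    # Unsatisfiable exactly when some x and NOT x share a component.
    return all(comp[2 * v] != comp[2 * v + 1] for v in range(n))
```

The running time is linear in the number of variables and clauses, in contrast with the exponential algorithms known for 3SAT.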

matching: Given a boys-girls compatibility graph, is there a complete matching?

3D matching: Given a boys-girls-homes compatibility relation (that is, a set of triangles), is there a complete matching (a set of disjoint triangles that covers all boys, all girls, and all homes)?

We know that matching can be solved by a reduction to max-flow, and then to linear programming. For 3D matching we shall see a reduction too. Unfortunately, the reduction is from 3SAT to 3D matching, and this is bad news for 3D matching...

unary knapsack: Given integers a1, ..., an, and another integer K in unary, is there a subset of these integers that sum exactly to K?

knapsack: Given integers a1, ..., an, and another integer K in binary, is there a subset of these integers that sum exactly to K?

unary knapsack is in P simply because the input is represented so wastefully, with about n + K bits, so that the O(n^2 K) dynamic programming algorithm, which would be exponential if K were represented in binary, is bounded by a polynomial in the length of the input. There is no polynomial algorithm known for knapsack.

linear programming: Given an m × n matrix A and an m-vector b, are there real numbers x1, ..., xn ≥ 0 satisfying Ax ≤ b?

integer linear programming: Given an m × n matrix A and an m-vector b, are there integers x1, ..., xn ≥ 0 satisfying Ax ≤ b?

Although there are algorithms that solve linear programming in polynomial time (simplex is exponential in the worst case, but certain more recently proposed alternative approaches work in polynomial time), the additional requirement that the solution consist of integers seems to make the problem impossible. Rounding the solutions of the linear program up or down is no help; even finding whether a rounding that remains feasible is possible is a hard problem (see the reduction from 3SAT to integer linear programming later in these notes).
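The dynamic programming algorithm for unary knapsack mentioned earlier in this section can be sketched as follows. The version below is one standard formulation, assuming positive integers and K ≥ 0; it keeps a table with K + 1 entries and sweeps it once per integer, so its running time is polynomial in n and K (and hence in the length of a unary input), but exponential in the number of bits of K.

```python
def unary_knapsack(a, K):
    """Is there a subset of the integers in `a` that sums exactly to K?"""
    reachable = [False] * (K + 1)
    reachable[0] = True                  # the empty subset sums to 0
    for ai in a:
        # Sweep downwards so that each a_i is used at most once.
        for s in range(K, ai - 1, -1):
            if reachable[s - ai]:
                reachable[s] = True
    return reachable[K]
```

For example, with the integers 3, 5, 7 the goal 12 is reachable (5 + 7) while the goal 11 is not.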

3. Certificates and the Class NP

Although some of the problems we saw in the previous section (TSP, 3SAT, circuit SAT, Hamilton cycle, 3D matching, knapsack) are not known or believed to be solvable by polynomial algorithms, they all have a positive common property: the certificate property. In each case, if a given input of the problem is a "yes" input (a satisfiable Boolean formula, a graph with a Hamilton cycle), then there is a short argument, a succinct certificate that may convince one about the fact that the input is indeed a "yes" input. In the case of 3SAT, the certificate would be a satisfying truth assignment. In the case of Hamilton cycle, it would be a closed path that visits each node once. In the case of the TSP, it would be a tour whose total cost is less than or equal to the given goal. And so on. These certificates have the following properties:

  - They are small. In each case the certificate would never have to be longer than a polynomial in the length of the input.
  - They are easily checkable. In each case there is a polynomial algorithm which takes as inputs the input of the problem and the alleged certificate, and checks whether the certificate is a valid one for this input. In the case of 3SAT, the algorithm would just check that the truth assignment indeed satisfies all clauses. In the case of Hamilton cycle, it would check whether the given closed path indeed visits every node once. And so on.
  - Every "yes" input to the problem has at least one certificate (possibly many), and each "no" input has none.
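The two checking algorithms named in the list above can be sketched as follows. The encodings are assumptions made for illustration: a formula is a list of clauses, each a list of nonzero integers (k for x_k, -k for NOT x_k); a truth assignment is a dict from variable number to a Boolean, covering every variable; a directed graph is a list of nodes and a list of edges (u, v).

```python
def check_3sat_certificate(clauses, assignment):
    """True iff the assignment makes at least one literal true in every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def check_hamilton_certificate(nodes, edges, cycle):
    """`cycle` lists the nodes in visiting order; the path closes from last to first."""
    # Every node must appear exactly once.
    if len(cycle) != len(nodes) or set(cycle) != set(nodes):
        return False
    edge_set = set(edges)
    # Every consecutive pair, including last -> first, must be an edge.
    return all(
        (cycle[i], cycle[(i + 1) % len(cycle)]) in edge_set
        for i in range(len(cycle))
    )
```

Both checks run in time polynomial in the size of the input and the certificate, although finding a certificate that passes them may be much harder.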

Not all problems have such certificates. Consider, for example, the problem non-Hamiltonian graph: Given a graph G, is it true that there is no Hamilton cycle in G? How would you prove to a suspicious person that a given large, dense, complex graph has no Hamilton cycle, short of listing all cycles and pointing out that none visits all nodes once (a certificate that is certainly not succinct)?

The problems that have this positive property comprise a class known as NP [1]. A "yes-no" problem is in NP if and only if it has the certificate property. That is, every "yes" instance has at least one concise certificate of its "yesness", and all "no" instances have no such certificate; and furthermore a certificate can be tested efficiently for validity.

Notice also that P is a subset of NP. To see why, suppose that a problem is in P, that is, it has a polynomial-time algorithm. But then a trace of this algorithm run on a given input, and returning "yes", is a good certificate for this input: it is concise, can be tested fast, "yes" inputs have one, "no" inputs don't.

Let us next consider the problem circuit SAT, defined above. It is of course in NP: a setting of the unknown input gate(s) that makes the whole circuit T serves well as a certificate of any "yes" input. It turns out that circuit SAT plays a very special and important role within NP:

    A problem is in NP if and only if it can be reduced to circuit SAT.

Let us argue why this statement (known as Cook's theorem, and considered one of the most important results in Computer Science) is true. One direction is easy: if a problem A can be reduced to circuit SAT then of course it is in NP: a certificate of any "yes" input would be a run of the reduction on this input, together with a certificate for the resulting input of circuit SAT (a satisfying setting of the unknown input gates).

The other direction is much more complicated, but here is a plausible explanation: Suppose that we have a problem A in NP; we want to show that it has a reduction to circuit SAT. The fact that A is in NP means that there is a polynomial algorithm that checks inputs of A and certificates for validity. But an algorithm runs on a computer, and, after all, a computer is nothing more than a huge Boolean circuit supplying the rules whereby the next state is computed from the current state and the input. If we superpose enough (polynomially many) such circuits, we get a circuit that describes the full run of the validity algorithm on a certificate and an input, where the bits in the input gates stand for the input and the certificate. Suppose now that we are given an input x of A. If we plug in the correct T-F values for x in the appropriate input gates of the circuit, and keep the input gates that correspond to the certificate unknown, we get an instance of circuit SAT that precisely captures the question whether a valid certificate for x exists, that is to say, whether x is a "yes" instance of A. Hence, the construction of the circuit we described is the sought reduction from A to circuit SAT!

If a "yes-no" problem has these two properties:

  - it is in NP;
  - all other problems in NP reduce to it;

then it is called NP-complete. The existence of such problems may seem a priori unlikely, but we already know that there is at least one: circuit SAT. In the next section we shall see many more examples of NP-complete problems.

[1] NP stands for nondeterministic polynomial, meaning that all problems in it can be "solved" in polynomial time by a "nondeterministic computer" that starts by guessing the right certificate, and then checking it.

4. NP-complete problems

To prove that a problem is NP-complete, we typically reduce a problem that is known to be NP-complete to it. Now that circuit SAT has provided a place to start, we shall prove many problems NP-complete by the reductions pictured in the figure.

From circuit SAT to 3SAT. Suppose that we are given a circuit C with some input gates unknown, such as the one pictured in the figure. We must construct from it an equivalent input to 3SAT, that is, a formula R(C) that is satisfiable if and only if there is a satisfying setting of the unknown input gates of C. The construction is shown in the figure. R(C) has a variable for each gate of C, and also for each gate of C it has certain clauses. The precise set of the clauses depends on the nature of the gate considered.

  - If x is a T input gate, then we simply have the clause (x).
  - If x is an F input gate, then we have the clause (¬x).
  - If x is an unknown input gate, then no clauses are added for it (intuitively, such gates are free to be whatever they want, as long as they succeed in making the output gate T).
  - If gate x is the OR of the gates y and z, then we add the clauses (¬y ∨ x), (¬z ∨ x), (¬x ∨ y ∨ z). It is easy to see that the conjunction of these clauses is equivalent to the statement x = (y ∨ z).
  - Similarly, if gate x is the AND of the gates y and z, then we add the clauses (¬x ∨ y), (¬x ∨ z), (¬y ∨ ¬z ∨ x), which is x = (y ∧ z).
  - Next, if gate x is the NOT of gate y, then we add the clauses (x ∨ y), (¬x ∨ ¬y), which is x = ¬y.
  - Finally, if gate x also happens to be the output gate, then we add the clause (x), expressing the condition that the output gate be T.

The conjunction of all these clauses is the sought formula R(C). To show that the reduction R works, we must establish the following statement:

    C has a setting of the unknown input gates that makes the output gate T if and only if R(C) is satisfiable.

Suppose that C does have such a setting of the unknown input gates. Then R(C) can be satisfied by the following truth assignment: set all variables corresponding to unknown input gates to T or F, depending on their value in the given setting of the input gates. Compute now the values of all gates in the circuit; this, we claim, is the truth assignment that satisfies R(C). This truth assignment must satisfy all clauses of R(C), because it corresponds to a legal assignment of values to the gates of C, and all these clauses require is that the values be legal. Finally, since the given setting makes the output gate T, the last clause is also satisfied.

Conversely (this is the subtler direction in proofs of reductions), suppose that R(C) has a satisfying truth assignment. Then consider the setting of the unknown input gates of C suggested by this truth assignment. Since the truth assignment satisfies R(C), it assigns to all other variables of R(C) the value of the corresponding gate (since R(C) requires that all gates be properly computed). But, since R(C) also insists that the output be T, this means that the setting of the unknown inputs succeeds in satisfying C.
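The clause-by-gate construction can be sketched in Python as follows. The circuit encoding is an assumption made for illustration: a dict mapping each gate name to ("T",), ("F",), ("?",) for an unknown input, ("NOT", y), ("AND", y, z) or ("OR", y, z). A literal is a pair (gate name, polarity), where polarity False means negated. The brute-force satisfiable helper is included only so that tiny examples can be checked; it is exponential and is not part of the reduction.

```python
from itertools import product

def circuit_to_cnf(circuit, output):
    """Build the clauses of R(C): one variable per gate, clauses per gate kind."""
    def pos(g):
        return (g, True)

    def neg(g):
        return (g, False)

    clauses = []
    for x, (kind, *args) in circuit.items():
        if kind == "T":
            clauses.append([pos(x)])
        elif kind == "F":
            clauses.append([neg(x)])
        elif kind == "?":
            pass                                  # unknown input: no clauses
        elif kind == "OR":
            y, z = args
            clauses += [[neg(y), pos(x)], [neg(z), pos(x)],
                        [neg(x), pos(y), pos(z)]]
        elif kind == "AND":
            y, z = args
            clauses += [[neg(x), pos(y)], [neg(x), pos(z)],
                        [neg(y), neg(z), pos(x)]]
        elif kind == "NOT":
            (y,) = args
            clauses += [[pos(x), pos(y)], [neg(x), neg(y)]]
        else:
            raise ValueError(f"unknown gate kind: {kind}")
    clauses.append([pos(output)])                 # the output gate must be T
    return clauses

def satisfiable(clauses):
    """Exponential brute force, only for checking tiny examples."""
    variables = sorted({g for clause in clauses for g, _ in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[g] == pol for g, pol in clause) for clause in clauses):
            return True
    return False
```

Each gate contributes at most three clauses of at most three literals, so the formula has size linear in the size of the circuit and the reduction runs in polynomial time.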

From 3SAT to integer linear programming. This reduction is easy, since any clause such as (x ∨ ¬y ∨ z) can be rewritten as the integer linear program

    x + (1 − y) + z ≥ 1;  0 ≤ x, y, z ≤ 1.

Repeating for all clauses, we have an input of integer linear programming that is equivalent to the given input to 3SAT.
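As a small sketch of this rewriting, the function below turns one clause into a single linear inequality over 0-1 variables. The encoding is assumed for illustration: variables are x_1, ..., x_n, a literal is a nonzero integer (k for x_k, -k for NOT x_k), and the result (coeffs, rhs) stands for coeffs[0]*x_1 + ... + coeffs[n-1]*x_n ≥ rhs, with the constant 1 contributed by each negated literal moved to the right-hand side.

```python
def clause_to_inequality(clause, n):
    """Translate one clause into (coeffs, rhs), meaning sum(coeffs[i] * x_{i+1}) >= rhs."""
    coeffs = [0] * n
    rhs = 1
    for lit in clause:
        if lit > 0:
            coeffs[lit - 1] += 1      # positive literal contributes x_k
        else:
            coeffs[-lit - 1] -= 1     # negated literal contributes (1 - x_k)
            rhs -= 1                  # move its constant 1 to the right-hand side
    return coeffs, rhs
```

For the clause (x ∨ ¬y ∨ z) with x, y, z numbered 1, 2, 3, this gives x − y + z ≥ 0, which is the inequality in the text with the constant moved across. The bounds 0 ≤ x_i ≤ 1 are added once per variable.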

From 3SAT to independent set. In a typical input to independent set we are given a graph G = (V, E) and an integer K. We are asked whether there is a set I ⊆ V with |I| ≥ K such that if u, v ∈ I then (u, v) ∉ E.

We must reduce 3SAT to independent set. That is, given any Boolean formula φ with at most 3 literals in each clause, we must produce a graph G = (V, E) and an integer K such that G has an independent set of size K or more if and only if φ is satisfiable.

The reduction is illustrated in the figure. K is the number of clauses. For each clause we have a group of nodes, one for each literal in the clause, connected by edges in all possible ways. Also, any two nodes from different groups corresponding to contradictory literals (like x and ¬x) are connected by an edge. This concludes the description of G, and of the reduction.

Suppose that G has an independent set I of size K or more. Since there are K groups, and all nodes in each group are connected in all ways, I cannot contain two nodes from any group. So, it must have one node from each group. Think of the node from a group as the literal that satisfies this clause. Since contradictory literals are connected with an edge, no nodes in I are contradictory, and hence these literals together comprise a satisfying truth assignment for the formula φ. If a particular variable was not used, this means that we can take it to be either T or F; both alternatives would satisfy φ.

Conversely, suppose that we have a satisfying truth assignment of φ. Each clause must have at least one T literal; fix one of them for each clause. Consider the corresponding set I of K nodes. Since these literals come from a single truth assignment, they are not contradictory, and so I is an independent set, completing the proof.
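The graph construction can be sketched as follows. The encoding is an assumption for illustration: a formula is a list of clauses, each a list of nonzero integers (k for x_k, -k for NOT x_k) with no literal repeated inside a clause, and a node of G is a pair (clause index, literal). The brute-force has_independent_set helper is exponential and serves only to check tiny instances.

```python
from itertools import combinations

def sat_to_independent_set(clauses):
    """Return (nodes, edges, K) for the graph described in the text."""
    nodes = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for u, v in combinations(nodes, 2):
        same_group = u[0] == v[0]           # two literals of the same clause
        contradictory = u[1] == -v[1]       # a literal and its negation
        if same_group or contradictory:
            edges.add((u, v))
    return nodes, edges, len(clauses)

def has_independent_set(nodes, edges, K):
    """Exponential brute force, only for checking tiny examples."""
    for subset in combinations(nodes, K):
        chosen = set(subset)
        if not any(u in chosen and v in chosen for u, v in edges):
            return True
    return False
```

The graph has one node per literal occurrence and at most quadratically many edges, so the reduction itself runs in polynomial time.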

From independent set to vertex cover and clique. Let G = (V, E) be a graph. A vertex cover of G is a set C ⊆ V such that all edges in E have at least one endpoint in C. The vertex cover problem is this: Given a graph G and a number K, does G have a vertex cover of size at most K?

The reduction from independent set to vertex cover is very easy, and based on this observation: C is a vertex cover of G = (V, E) if and only if V − C is an independent set! This is because any two nodes not in a vertex cover cannot have an edge between them, because this edge would not have an endpoint in the vertex cover. So, here is the reduction: Given an instance (G = (V, E), K) of independent set, we produce the instance (G = (V, E), |V| − K) of vertex cover. There is an independent set with K nodes or more if and only if there is a vertex cover of size |V| − K or less.

A clique in a graph is a fully connected set of nodes. The clique problem asks whether there is a clique of size K or larger in the graph. The reduction from independent set to clique is very simple: We go from the instance (G, K) of independent set to the equivalent instance (G', K), where G' is the complement of G, the graph with the same nodes as G, and with precisely all edges that are missing from G.
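Both reductions are short enough to write out. In this sketch a graph is a list of nodes and a list of undirected edges given as pairs of distinct nodes; the function names and this encoding are illustrative assumptions. The two small checkers are included so that the observation about complements of vertex covers can be tested on small graphs.

```python
from itertools import combinations

def independent_set_to_vertex_cover(nodes, edges, K):
    """(G, K) -> (G, |V| - K): same graph, complementary target size."""
    return nodes, edges, len(nodes) - K

def independent_set_to_clique(nodes, edges, K):
    """(G, K) -> (complement of G, K)."""
    present = {frozenset(e) for e in edges}
    complement = [
        (u, v) for u, v in combinations(nodes, 2)
        if frozenset((u, v)) not in present
    ]
    return nodes, complement, K

def is_independent_set(edges, s):
    """No edge has both endpoints in s."""
    return not any(u in s and v in s for u, v in edges)

def is_vertex_cover(edges, c):
    """Every edge has at least one endpoint in c."""
    return all(u in c or v in c for u, v in edges)
```

On any small graph one can enumerate all subsets S of V and confirm that S is independent exactly when V − S is a vertex cover.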

From vertex cover to dominating set. A set of nodes D is a dominating set if each node either is in D, or is adjacent to a node in D. This twist on vertex cover is also NP-complete. To reduce an input (G, K) of vertex cover to it, we simply add to G, for each edge (a, b) ∈ E, a new node ab, and two new edges (a, ab), (b, ab). It is clear that any vertex cover of G is a dominating set of the new graph. And any dominating set of the new graph can be made into a vertex cover of G by replacing any new vertex by one of its adjacent vertices.

5. An Epilogue: Undecidability

Are there problems that are not even in NP? The answer is, "yes, but they rarely appear to come up in practice." In fact, there are problems for which there are no algorithms at all!

Consider the following situation. You want to write a Boolean function term(P,X) which takes two inputs, P and X. P is a program in the same language, and X is a data file. term(P,X) returns true if program P with input file X eventually terminates. If program P on file X loops forever, then term(P,X) returns false.

It can be proved that such a program is impossible to write! That is to say, there is no algorithm, however inefficient, that solves the problem "given a program P and its input X, will P terminate on X?" Here is the proof:

Suppose, for the sake of contradiction, that we have written such a Boolean function term(P,X). Using it, we can write the following simple program:

    Boolean function diag(P): if term(P,P) then loop forever

And now the contradiction: Does diag(diag) terminate? It is easy to see that it does if and only if it does not! This is a contradiction, to which we were led by assuming that term(P,X) can be written. We must conclude that there is no program that can be written to solve the termination (or halting) problem above.

[Figure 2: P and NP. Labels: NP-complete, NP, P.]
[Figure 3. Labels: circuit SAT, 3SAT, integer linear programming, 3D matching, independent set, Hamilton cycle, clique, knapsack, set cover, vertex cover, undirected Hamilton cycle, dominating set, TSP.]
[Figure 4. Labels: (X+Y+Z), (X+Y+Z), X, Y, (X+Y), X, X, Z, Y, Z.]
[Further figure labels: (X+Y+Z), X, Y, Y, Z, ...]
This note was uploaded on 01/28/2012 for the course CS 220 taught by Professor Ibarra,o during the Winter '08 term at UCSB.