Chapter 4 Decidability and Undecidability

4.1 What is Computability?

From an informal or intuitive perspective, what might we mean by computability? One natural interpretation is that something is computable if it can be calculated by a systematic procedure; we might think of this as a process that can be described by one person and carried out by another.

We will now assert that any such process can be written as a computer program. This is known as the Church-Turing thesis, because Church and Turing were the first to give formal mathematical specifications of what constitutes an algorithm, in the form of the lambda calculus and Turing Machines, respectively. Note that the claim is a thesis; it is inherently not something provable, for the notion of a systematic process is unavoidably imprecise.

We remark that an instruction such as “guess the correct answer” does not seem to be systematic. An instruction such as “try all possible answers” is less clear cut: it depends on whether the possible answers are finite or infinite in number. For specificity, henceforth we will imagine working with a specific programming language, and using this as the definition of reasonable instructions (you can think of this as Java, C, or whatever is your preferred programming language). Actually, in practice we will describe algorithms in pseudo-code and English, but with enough detail that their programmability should be clear.

We define computability in terms of language recognition.

Definition 4.1.1. $L \subseteq \Sigma^*$ is computable if there is a program $P$ with two possible outputs, “Recognize” and “Reject”, such that on input $x \in \Sigma^*$, $P$ outputs “Recognize” if $x \in L$ and “Reject” if $x \notin L$. We will also say that $P$ decides $L$ (in the sense of deciding set membership).

Remark. Often, one wants to compute some function $f$ of the input $x$. By allowing 2-input programs we can define computable functions $f$ as follows: $f$ is computable if there is a 2-input program $P$ such that on input $(x, y)$ (say), $P$ outputs “Recognize” if $f(x) = y$ and “Reject” if $f(x) \ne y$.

4.2 Encodings

Often, we will want to treat the text of one program as a possible input to another program.
To this end, we take the conventional approach that the program symbols are encoded in ASCII or, more simply, in binary. We do the same for the input alphabet. Of course, if we are not trying to treat a program as a possible input to another program, we can have distinct alphabets for programs and inputs.

As it turns out, not all languages are computable (indeed, we have already seen that the halting problem is not computable). First, however, we are going to see some computable languages.

The first languages we are going to look at capture properties of automata. To this end, we need an agreed method for writing the description of an automaton. This is similar to a standard input format for a graph, for example. To describe a DFA we specify it as a sequence of five components: $\Sigma$, the input alphabet; $V$, the set of vertices; $start \in V$, the start vertex; $F \subseteq V$, the set of recognizing vertices; and $\delta$, a description of the edges and their labels. If $\Sigma = \{a_1, a_2, \cdots, a_k\}$ and $V = \{v_1, \cdots, v_r\}$, we could write this as a string.
$$(\{a_1, a_2, \cdots, a_k\}, \{v_1, v_2, \cdots, v_r\}, start, \{v_{i_1}, v_{i_2}, \cdots, v_{i_j}\}, \{(v_1, a_1) \to v_{j_1}, (v_1, a_2) \to v_{j_2}, \cdots, (v_1, a_k) \to v_{j_k}, (v_2, a_1) \to v_{j_{k+1}}, \cdots, (v_r, a_k) \to v_{j_{rk}}\}),$$

where $F = \{v_{i_1}, v_{i_2}, \cdots, v_{i_j}\}$ and $\delta(v_h, a_i) = v_{j_{(h-1)k+i}}$ for $1 \le h \le r$, $1 \le i \le k$.

There is a further difficulty: we would like to use one alphabet to describe many different machines. One way of doing this is to write the vertices and the alphabet characters in binary. We then see that the above DFA description uses just 8 distinct characters: {, }, (, ), 0, 1, →, and the comma. We could reduce this to two characters, 0 and 1 say, by encoding each of these characters using three 0s and 1s.

We will view an input string as a descriptor of a DFA only if it has the above format, and in addition:
• $a_i \ne a_j$ for $i \ne j$, i.e. the relevant binary strings are not equal.
• $v_i \ne v_j$ for $i \ne j$.
• $start = v_i$ for some $i$, $1 \le i \le r$.
• For each $h$, $1 \le h \le j$, $v_{i_h} = v_l$ for some $l$, $1 \le l \le r$.
• In each expression $(v_h, a_i) \to v_{j_{(h-1)k+i}}$, $v_{j_{(h-1)k+i}} = v_l$ for some $l$, $1 \le l \le r$.

As this is rather tedious, henceforth we leave the description of correct encodings to the reader (to imagine). We will simply use angle brackets to indicate a suitable encoding in a convenient alphabet (binary say). So $\langle M \rangle$ denotes an encoding of machine $M$, and $\langle w \rangle$ an encoding of string $w$ (note that $w$ is over some alphabet $\Sigma$, while $\langle w \rangle$ may be in binary). We also use $\langle M, w \rangle$ to indicate the encoding of the pair $M$ and $w$. But even this can prove somewhat elaborate, so sometimes we will simply write $M$ when we mean $\langle M \rangle$. As a rule, when $M$ is the input to another program, then what we intend is $\langle M \rangle$, a proper encoding of the description of $M$.

We are now ready to look at the decidability of some languages.

4.3 Decidability of Regular Language Properties

Example 4.3.1. Rec-DFA $= \{\langle M, w \rangle \mid M$ is a DFA and $M$ recognizes input $w\}$.
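To make membership in Rec-DFA concrete, here is a minimal Python sketch of a decider. It rests on assumptions that are not part of these notes: the DFA is passed as Python sets and a transition dictionary rather than as an encoded string, and the names `rec_dfa` and `EVEN_ONES` are purely illustrative. The sketch checks that the input is legitimate and then tracks the current vertex while reading $w$, which is the simulation that the proof of Claim 4.3.2 describes in prose.

```python
def rec_dfa(alphabet, vertices, start, finals, delta, w):
    """Return "Recognize" if the DFA recognizes w, and "Reject" otherwise.

    delta maps (vertex, character) pairs to vertices (an assumed format).
    """
    # Legitimacy check: a well-formed DFA and a string over its alphabet.
    if start not in vertices or not finals.issubset(vertices):
        return "Reject"
    for v in vertices:
        for a in alphabet:
            if delta.get((v, a)) not in vertices:
                return "Reject"
    if any(c not in alphabet for c in w):
        return "Reject"

    # Simulation: follow one edge per input character.
    current = start
    for c in w:
        current = delta[(current, c)]
    return "Recognize" if current in finals else "Reject"


# Hypothetical example: binary strings containing an even number of 1s.
EVEN_ONES = dict(
    alphabet={"0", "1"},
    vertices={"even", "odd"},
    start="even",
    finals={"even"},
    delta={("even", "0"): "even", ("even", "1"): "odd",
           ("odd", "0"): "odd", ("odd", "1"): "even"},
)

print(rec_dfa(w="1001", **EVEN_ONES))  # Recognize (two 1s)
print(rec_dfa(w="1011", **EVEN_ONES))  # Reject (three 1s)
```

Scanning a dictionary stands in for scanning the edge descriptions of an encoded machine; a decider working on $\langle M, w \rangle$ would first parse the encoding into this form.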
Claim 4.3.2. Rec-DFA is decidable.

Proof. To prove the claim we simply need to give an algorithm $\mathcal{A}$ that on input $\langle M, w \rangle$ determines whether $M$ recognizes $w$. Such an algorithm is easily seen at a high level: $\mathcal{A}$ first checks whether the input is legitimate and if not it rejects. By legitimate we mean that $\langle M, w \rangle$ is the encoding of a DFA, followed by the encoding of a string $w$ over the input alphabet for $M$. If the input is legitimate, $\mathcal{A}$ continues by simulating $M$ on input $w$: $\mathcal{A}$ keeps track of the vertex $M$ has reached as $M$ reads its input $w$. When (in $\mathcal{A}$’s simulation) $M$ has read all of $w$, $\mathcal{A}$ checks whether $M$ has reached a final vertex and outputs accordingly: $\mathcal{A}$ outputs “Recognize” if $M$ has reached a final vertex, and $\mathcal{A}$ outputs “Reject” otherwise.

$$\mathcal{A}(\langle M, w \rangle) = \begin{cases} \text{“Recognize”} & \text{if } M \text{ recognizes } w \\ \text{“Reject”} & \text{if } M \text{ does not recognize } w \end{cases}$$

The details are a bit more painstaking: given the current vertex reached by $M$ and the next character in $w$, $\mathcal{A}$ looks up the vertex reached by scanning the edge descriptions (the entries $(p, a) \to q$). In yet more detail, $\mathcal{A}$ stores the current vertex in a variable, the input $w$ in a linked list, a pointer to the next character of $w$ to be read, and the DFA in adjacency list format.

The details of how to implement algorithm $\mathcal{A}$ should be clear at this point. As they are not illuminating we are not going to spell them out further. □

Note. Henceforth, a description at the level of detail of the first paragraph of the above proof will suffice. Further, we will take it as given that there is an initial step in our algorithms to check that the inputs are legitimate.

Let $P_{\text{Rec-DFA}}$ be the program implementing the just described algorithm $\mathcal{A}$ deciding the language of Example 4.3.1. So $P_{\text{Rec-DFA}}$ takes inputs $(M, x)$ (strictly, $\langle M, x \rangle$) and outputs “Recognize” if $M$ recognizes $x$ and outputs “Reject” if $M$ does not recognize $x$. This is a notation we will use repeatedly. If Prop is a decidable property (i.e. the language $L = \{w \mid \text{Prop}(w) \text{ is true}\}$ is decidable) then $P_{\text{Prop}}$ will be a program that decides $L$; we will also say that $P_{\text{Prop}}$ decides Prop.

Example 4.3.3. Rec-NFA $= \{\langle M, w \rangle \mid M$ is a description of an NFA and $M$ recognizes $w\}$.

Claim 4.3.4. Rec-NFA is decidable.

Proof. The following algorithm $\mathcal{A}_{\text{Rec-NFA}}$ decides Rec-NFA. Given input $\langle M, w \rangle$, it simulates $M$ on input $w$ by keeping track of all vertices reachable on reading the first $i$ characters of $w$, for $i = 0, 1, 2, \cdots$ in turn. $M$ recognizes $w$ exactly if a final vertex is reachable on reading all of $w$, and consequently $\mathcal{A}_{\text{Rec-NFA}}$ outputs “Recognize” if it finds $M$ can reach a final vertex on input $w$ and outputs “Reject” otherwise. □

Example 4.3.5. Rec-RegExp $= \{\langle r, w \rangle \mid r$ is a regular expression that generates $w\}$.
Claim 4.3.6. Rec-RegExp is decidable.

Proof. The following algorithm $\mathcal{A}_{\text{Rec-RegExp}}$ decides Rec-RegExp. Given input $\langle r, w \rangle$, it begins by building an NFA $M_r$ recognizing the language $L(r)$ described by $r$, using the procedure from Chapter 2, Lemma 2.4.1. $\mathcal{A}_{\text{Rec-RegExp}}$ then forms the encoding $\langle M_r, w \rangle$ and simulates the program $P_{\text{Rec-NFA}}$ from Example 4.3.3 on input $\langle M_r, w \rangle$. $\mathcal{A}_{\text{Rec-RegExp}}$’s output (“Recognize” or “Reject”) is the same as the one given by $P_{\text{Rec-NFA}}$ on input $\langle M_r, w \rangle$.

This is correct, for $M_r$ recognizes $w$ if and only if $w$ is one of the strings described by $r$. □

This procedure is taking advantage of an already constructed program and using it as a subroutine. This is a powerful tool which we are going to be using repeatedly.

Example 4.3.7. Empty-DFA $= \{\langle M \rangle \mid M$ is a DFA and $L(M) = \emptyset\}$.

Note that $L(M) = \emptyset$ exactly if no final vertex of $M$ is reachable from its start vertex. It is easy to give a graph search algorithm to test this.

Claim 4.3.8. Empty-DFA is decidable.

Proof. The following algorithm $\mathcal{A}_{\text{Empty-DFA}}$ decides Empty-DFA. Given input $\langle M \rangle$, it determines the collection of vertices reachable from $M$’s start vertex. If this collection includes a final vertex then the algorithm outputs “Reject” and otherwise it outputs “Recognize”. □
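The reachability test in this proof can be sketched in Python as a breadth-first search. This is only an illustration under assumed conventions: the DFA is given by its start vertex, a set of final vertices, and a transition dictionary keyed by (vertex, character) pairs, and the name `empty_dfa` is hypothetical.

```python
from collections import deque


def empty_dfa(start, finals, delta):
    """Return "Recognize" if L(M) is empty, and "Reject" otherwise."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v in finals:
            # A final vertex is reachable, so some string is recognized.
            return "Reject"
        # Scan the edge descriptions for edges leaving v.
        for (p, _char), q in delta.items():
            if p == v and q not in seen:
                seen.add(q)
                queue.append(q)
    return "Recognize"


# Hypothetical machine: the only final vertex "t" cannot be reached from "s".
print(empty_dfa("s", {"t"}, {("s", "a"): "s", ("t", "a"): "t"}))  # Recognize
# Redirecting the edge out of "s" makes "t" reachable.
print(empty_dfa("s", {"t"}, {("s", "a"): "t", ("t", "a"): "t"}))  # Reject
```

Note the inverted sense of the outputs: “Recognize” here means that the encoding belongs to Empty-DFA, that is, the machine recognizes nothing.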
Example 4.3.9. Equal-DFA $= \{\langle M_A, M_B \rangle \mid M_A$ and $M_B$ are DFAs and $L(M_A) = L(M_B)\}$.

Claim 4.3.10. Equal-DFA is decidable.

Proof. Let $A = L(M_A)$ and $B = L(M_B)$. We begin by observing that there is a DFA $M_{AB}$ such that $L(M_{AB}) = \emptyset$ exactly if $A = B$. For let $C = (A \cap \overline{B}) \cup (\overline{A} \cap B)$. Clearly, if $A = B$, then $C = \emptyset$. Conversely, if $C = \emptyset$, then $A \cap \overline{B} = \emptyset$, so $A \subseteq \overline{\overline{B}} = B$; similarly $\overline{A} \cap B = \emptyset$, so $B \subseteq A$; together, these imply $A = B$.

But given DFAs $M_A$ and $M_B$, we can construct DFAs $\overline{M_A}$ and $\overline{M_B}$ to recognize $\overline{A}$ and $\overline{B}$ respectively. Then from these four machines we can construct a DFA $M_{A\overline{B}}$ to recognize $A \cap \overline{B}$ and a DFA $M_{\overline{A}B}$ to recognize $\overline{A} \cap B$. Given $M_{A\overline{B}}$ and $M_{\overline{A}B}$ we can construct $M_{AB}$ to recognize $(A \cap \overline{B}) \cup (\overline{A} \cap B)$.

So the algorithm $\mathcal{A}_{\text{Equal-DFA}}$ to decide Equal-DFA, given input $\langle M_A, M_B \rangle$, constructs $M_{AB}$ and forms the encoding $\langle M_{AB} \rangle$. $\mathcal{A}_{\text{Equal-DFA}}$ then simulates program $P_{\text{Empty-DFA}}$ from the preceding example on input $\langle M_{AB} \rangle$. $\mathcal{A}_{\text{Equal-DFA}}$ outputs the result of the simulation of $P_{\text{Empty-DFA}}$ on input $\langle M_{AB} \rangle$.

This is correct, for $P_{\text{Empty-DFA}}$ outputs “Recognize” exactly if $L(M_{AB}) = \emptyset$, which is the case exactly if $A = B$. □

We have now given two examples of a use of a subroutine in a very particular form. More specifically, program $P$ has used program $Q$ as a subroutine and then used the answer of $Q$ to compute its own output. In these two examples, the calculation of $P$’s output has been the simplest possible: the output of $Q$ has become the output of $P$.

We call this form of algorithm design a reduction. If we have a program (or algorithm) $Q$ that decides a language $A$ (e.g. Empty-DFA), and we give a program to decide language $B$ (e.g. Equal-DFA) using $Q$ as a subroutine, then we say we have reduced language $B$ to language $A$. What this means is that if we know how to decide language $A$ we have now also demonstrated how to decide language $B$.

Example 4.3.11. Inf-DFA $= \{\langle M \rangle \mid M$ is a DFA and $L(M)$ is infinite$\}$.
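As a concrete preview of how Inf-DFA might be decided, here is a hedged Python sketch. It deliberately swaps in a simpler technique than the strong-component construction used in the proof of Claim 4.3.12: it keeps only the vertices that are reachable from the start vertex and from which a final vertex is reachable, and then tests whether those vertices contain a cycle by repeatedly peeling off vertices with no incoming edges. The input format (a transition dictionary) and the names `reachable` and `inf_dfa` are assumptions made for illustration.

```python
def reachable(sources, edges):
    """Return the set of vertices reachable from sources via edges."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        u = stack.pop()
        for v in edges.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def inf_dfa(start, finals, delta):
    """Return "Recognize" if L(M) is infinite, and "Reject" otherwise."""
    fwd, bwd = {}, {}
    for (p, _char), q in delta.items():
        fwd.setdefault(p, set()).add(q)
        bwd.setdefault(q, set()).add(p)

    # Vertices lying on some path from the start vertex to a final vertex.
    useful = reachable([start], fwd) & reachable(finals, bwd)

    # Peel off vertices with no incoming edge from the remaining useful
    # vertices; anything left over lies on or behind a cycle.
    indegree = {u: 0 for u in useful}
    for u in useful:
        for v in fwd.get(u, ()):
            if v in useful:
                indegree[v] += 1
    ready = [u for u in useful if indegree[u] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in fwd.get(u, ()):
            if v in useful:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return "Recognize" if len(useful) > removed else "Reject"


# Hypothetical machine for the finite language {a}: the cycle at "d" is a dead end.
finite = {("s", "a"): "f", ("f", "a"): "d", ("d", "a"): "d"}
print(inf_dfa("s", {"f"}, finite))  # Reject
# Hypothetical machine for a*: a self-loop on a final start vertex.
print(inf_dfa("s", {"s"}, {("s", "a"): "s"}))  # Recognize
```

The restriction to useful vertices matters: a cycle that cannot be continued to a final vertex, like the one at "d" above, contributes no strings to the language.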
Claim 4.3.12. Inf-DFA is decidable.

Proof. Note that $L(M)$ is infinite exactly if there is a path which includes a cycle and which goes from $M$’s start vertex to some final vertex. This property is readily tested by the following algorithm $\mathcal{A}_{\text{Inf-DFA}}$.

Step 1. $\mathcal{A}_{\text{Inf-DFA}}$ identifies the non-trivial strong components¹ of $M$’s graph, that is those that contain at least one edge (so any vertex with a self-loop will be in a non-trivial strong component).

Step 2. $\mathcal{A}_{\text{Inf-DFA}}$ forms the reduced graph, in which every strong component is replaced by a single vertex, and in addition it marks each non-trivial strong component (or rather the corresponding vertex). Further, there is an edge in the reduced graph from component $C$ to component $C'$ exactly if in the original graph there is an edge $(u, v)$ where $u \in C$ and $v \in C'$.

Step 3. $\mathcal{A}_{\text{Inf-DFA}}$ checks whether any path from the start vertex to a final vertex includes a marked vertex (and thus can contain a cycle drawn from the corresponding strong component).

Step 3.1. By means of any graph traversal procedure (e.g. DFS, BFS), $\mathcal{A}_{\text{Inf-DFA}}$ determines which marked vertices are reachable from the start vertex. It doubly marks these vertices.

Step 3.2. By means of a second traversal, $\mathcal{A}_{\text{Inf-DFA}}$ determines whether any final vertices can be reached from the doubly marked vertices. If so, there is a path with a cycle from the start vertex to a final vertex, and then $\mathcal{A}_{\text{Inf-DFA}}$ outputs “Recognize”; otherwise $\mathcal{A}_{\text{Inf-DFA}}$ outputs “Reject.” □

¹ Recall that a strong component is a maximal set $C$ of vertices, such that for every pair of vertices $u, v \in C$, there are directed paths from $u$ to $v$ and from $v$ to $u$.

Example 4.3.13. All-DFA $= \{\langle M \rangle \mid L(M) = \Sigma^*$ where $\Sigma$ is $M$’s input alphabet$\}$.

$L(M) = \Sigma^*$ exactly if $\overline{L(M)} = \emptyset$. So to test if $L(M) = \Sigma^*$, the decision algorithm $\mathcal{A}_{\text{All-DFA}}$ simply constructs the encoding $\langle \overline{M} \rangle$, where $\overline{M}$ is the DFA recognizing $\overline{L(M)}$, and then uses the program $P_{\text{Empty-DFA}}$ to test if $L(\overline{M}) = \emptyset$. It then outputs the same answer, namely:

$$\mathcal{A}_{\text{All-DFA}}(\langle M \rangle) = \begin{cases} \text{“Reject”} & \text{if } P_{\text{Empty-DFA}}(\langle \overline{M} \rangle) = \text{“Reject”} \\ \text{“Recognize”} & \text{if } P_{\text{Empty-DFA}}(\langle \overline{M} \rangle) = \text{“Recognize”} \end{cases}$$

4.4 Decidability of Context Free Language Properties

Example 4.4.1. Rec-CFG $= \{\langle G, w \rangle \mid G$ is a CFG which can generate $w\}$.
Claim 4.4.2. Rec-CFG is decidable.

Proof. The following algorithm $\mathcal{A}_{\text{Rec-CFG}}$ decides Rec-CFG.

Step 1. $\mathcal{A}_{\text{Rec-CFG}}$ converts $G$ to a CNF grammar $\widetilde{G}$, with start symbol $S$.

Step 2. If $w = \lambda$, $\mathcal{A}_{\text{Rec-CFG}}$ checks if $S \to \lambda$ is a rule of $\widetilde{G}$ and if so outputs “Recognize” and otherwise outputs “Reject.”

Step 3. If $w \ne \lambda$, the derivation of $w$ in $\widetilde{G}$, if there is one, would take $2|w| - 1$ steps. $\mathcal{A}_{\text{Rec-CFG}}$ simply generates, one by one, all possible derivations in $\widetilde{G}$ of length $2|w| - 1$. If any of them yield $w$ then it outputs “Recognize” and otherwise it outputs “Reject.” (This is not intended to be an efficient algorithm.) □

Example 4.4.3. Empty-CFG $= \{\langle G \rangle \mid G$ is a CFG and $L(G) = \emptyset\}$.
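As a preview of the marking idea behind the decidability of Empty-CFG, here is a small Python sketch. It makes assumptions beyond the notes: it applies the marking directly to an arbitrary CFG instead of first converting to CNF, marking a variable once some rule's right-hand side consists only of terminals and already marked variables. The rule format (pairs of a left-hand variable and a tuple of right-hand symbols, with the empty tuple for the empty string) and the name `empty_cfg` are hypothetical.

```python
def empty_cfg(variables, start, rules):
    """Return "Recognize" if L(G) is empty, and "Reject" otherwise.

    rules is a list of (lhs, rhs) pairs; rhs is a tuple of symbols, and any
    symbol not in variables is treated as a terminal (assumed format).
    """
    marked = set()  # variables known to generate a string of terminals
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in marked:
                continue
            # lhs can be marked if every symbol on the right-hand side is
            # either a terminal or an already marked variable.
            if all(s in marked or s not in variables for s in rhs):
                marked.add(lhs)
                changed = True
    return "Reject" if start in marked else "Recognize"


# Hypothetical grammar: B can never finish deriving, so S generates nothing.
variables = {"S", "A", "B"}
rules = [("S", ("A", "B")), ("A", ("a",)), ("B", ("B", "B"))]
print(empty_cfg(variables, "S", rules))  # Recognize
# Adding B -> b lets S generate the string ab.
print(empty_cfg(variables, "S", rules + [("B", ("b",))]))  # Reject
```

Each pass over the rules either marks a new variable or stops, so the loop runs at most one pass per variable plus one final pass; this straightforward version makes no attempt at efficiency.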
Claim 4.4.4. Empty-CFG is decidable.

Proof. Note that $L(G) \ne \emptyset$ if and only if $G$’s start variable can generate a string in $T^*$, where $T$ is $G$’s terminal alphabet. We simply determine this property for each variable $A$ in $G$: can $A$ generate a string in $T^*$? This can be done by means of the following algorithm $\mathcal{A}_{\text{Empty-CFG}}$, which marks each such variable.

Step 1. $\mathcal{A}_{\text{Empty-CFG}}$ converts the grammar to CNF form (this just simplifies the rest of the description).

Step 2. $\mathcal{A}_{\text{Empty-CFG}}$ marks each variable $A$ for which there is a rule $A \to a$ or $A \to \lambda$ (the latter could apply only to $S$, the start variable).

Step 3. Iteratively, $\mathcal{A}_{\text{Empty-CFG}}$ marks each variable $A$ such that there is a rule $A \to BC$ and $B$ and $C$ are already marked. (We leave an efficient implementation to the reader.) $\mathcal{A}_{\text{Empty-CFG}}$ stops when no more variables can be marked.

Step 4. $\mathcal{A}_{\text{Empty-CFG}}$ outputs “Reject” if $S$ is marked and “Recognize” otherwise. □

Next, we describe a more efficient algorithm Eff-$\mathcal{A}_{\text{Rec-CFG}}$ for determining if a CNF grammar $G$ can generate a string $w$. It runs in time $O(mn^3)$, where $m$ is the number of rules in $G$ and $n = |w|$.

First, we introduce a little notation. Let $w = w_1 w_2 \cdots w_n$, where each $w_i \in T$, the terminal alphabet, for $1 \le i \le n$. $w_i^l$ denotes the length $l$ substring of $w$ beginning at $w_i$: $w_i^l = w_i w_{i+1} \cdots w_{i+l-1}$.

Eff-$\mathcal{A}_{\text{Rec-CFG}}$ uses dynamic programming. Specifically, in turn, for $l = 1, 2, \cdots, n$, it determines, for each variable $A$, whether $A$ can generate $w_i^l$, for each possible value of $i$, i.e. for $1 \le i \le n - l + 1$. This information suffices, for $G$ can generate $w$ exactly if $S \Rightarrow^* w_1^n$, where $S$ is $G$’s start variable.

For $l = 1$, the test amounts to asking whether $A \to w_i$ is a rule. For $l > 1$, the test amounts to the following question:

    Is there a rule $A \to BC$, and a length $k$, with $1 \le k \le l - 1$, such that $B$ generates the length $k$ substring of $w$ beginning at $w_i$ and such that $C$ generates the remainder of $w_i^l$ (i.e. $B \Rightarrow^* w_i^k$ and $C \Rightarrow^* w_{i+k}^{l-k}$)?

Note that the results of the tests involving $B$ and $C$ have already been computed, so for a single rule and a single value of $k$, this test runs in $O(1)$ time. Summing the running times over all possible values of $i$, $k$, $l$, and all $m$ rules yields the overall running time of $O(mn^3)$.
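The dynamic program just described can be sketched in Python as follows. This is an illustrative version under assumptions not fixed by the notes: the CNF grammar is supplied as a set of terminal rules `(A, a)` and a set of binary rules `(A, B, C)`, positions are 0-indexed, the empty string is handled by a separate flag, and the name `eff_rec_cfg` is hypothetical.

```python
def eff_rec_cfg(start, terminal_rules, binary_rules, w,
                start_derives_empty=False):
    """Return "Recognize" if the CNF grammar generates w, else "Reject".

    terminal_rules holds pairs (A, a) for rules A -> a; binary_rules holds
    triples (A, B, C) for rules A -> BC (an assumed input format).
    """
    n = len(w)
    if n == 0:
        return "Recognize" if start_derives_empty else "Reject"

    # gen[(i, l)] is the set of variables generating the length-l
    # substring of w that begins at 0-indexed position i.
    gen = {}
    for i in range(n):
        gen[(i, 1)] = {A for (A, a) in terminal_rules if a == w[i]}

    for l in range(2, n + 1):              # substring length
        for i in range(n - l + 1):         # start position
            cell = set()
            for k in range(1, l):          # length of the part generated by B
                left = gen[(i, k)]
                right = gen[(i + k, l - k)]
                for (A, B, C) in binary_rules:
                    if B in left and C in right:
                        cell.add(A)
            gen[(i, l)] = cell

    return "Recognize" if start in gen[(0, n)] else "Reject"


# Hypothetical CNF grammar for strings a^n b^n with n >= 1:
# S -> AB | AT, T -> SB, A -> a, B -> b.
terminal_rules = {("A", "a"), ("B", "b")}
binary_rules = {("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")}
print(eff_rec_cfg("S", terminal_rules, binary_rules, "aabb"))  # Recognize
print(eff_rec_cfg("S", terminal_rules, binary_rules, "aab"))   # Reject
```

The three nested loops over $l$, $i$, and $k$ contribute the factor $n^3$, and the innermost loop over the binary rules contributes the factor $m$, with set membership tests assumed to take constant time.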
The running-time analysis above shows:

Lemma 4.4.5. The decision procedure for language Rec-CFG runs in time $O(mn^3)$ on input $\langle G, w \rangle$, where $n = |w|$ and $m$ is the number of rules in $G$.

Example 4.4.6. Inf-CFG $= \{\langle G \rangle \mid G$ is a CNF grammar and $L(G)$ is infinite$\}$.
Claim 4.4.7. Inf-CFG is decidable.

Proof. Note that $L(G)$ is infinite exactly if there is a path in a derivation tree with a repeated variable. The following algorithm $\mathcal{A}_{\text{Inf-CFG}}$ identifies the variables that can be repeated in this way; $L(G)$ is infinite exactly if there is at least one such variable. $\mathcal{A}_{\text{Inf-CFG}}$ proceeds in several steps.

Step 1. This step identifies useful variables, that is, variables that generate nonempty strings of terminals.
This can be done using a marking procedure. First, $\mathcal{A}_{\text{Inf-CFG}}$ marks the variables $A$ for which there is a rule of the form $A \to a$. Then, iteratively, for each rule $A \to BC$, where both $B$ and $C$ are marked, it also marks $A$, continuing until no additional variables can be marked. The marked variables are exactly the useful variables.
If $S$ is not useful, the algorithm stops, answering “Reject”. Otherwise, it continues with Step 2.

Step 2. $\mathcal{A}_{\text{Inf-CFG}}$ now identifies the reachable useful variables, i.e. those useful variables $A$ for which there is a derivation $S \Rightarrow^* \sigma A \tau$, where $\sigma, \tau \in U^*$, with $U$ being the set of $G$’s useful variables and $S$ its start variable. This is done via the following marking process.
Step 2.1. $\mathcal{A}_{\text{Inf-CFG}}$ marks $S$.
Step 2.2. For each marked but unprocessed variable $A$, $\mathcal{A}_{\text{Inf-CFG}}$ marks all variables on the RHS of each rule with $A$ on the LHS, considering only rules in which every variable on the RHS is useful (so that the derivation stays within $U^*$).
When this process terminates, the marked variables are exactly the reachable useful variables.

Step 3. Finally, $\mathcal{A}_{\text{Inf-CFG}}$ identifies the repeating, reachable useful variables, namely the variables that can repeat on a derivation tree path.
To do this, $\mathcal{A}_{\text{Inf-CFG}}$ uses a procedure analogous to the one used in Step 2: for each reachable useful variable $A$, $\mathcal{A}_{\text{Inf-CFG}}$ determines the variables reachable from $A$; if this collection includes $A$, then $A$ is repeating. $\mathcal{A}_{\text{Inf-CFG}}$ answers “Recognize” exactly if some variable is repeating. □

4.5 Undecidability

The Barber of Seville is a classic puzzle. The barber of Seville is said to shave all men who do not shave themselves. So who shaves the barber of Seville? To make this into a puzzle the words have to be treated unduly logically. In particular, one has to interpret it to mean that anyone shaved by the barber of Seville does not shave himself. Then if the barber of Seville shaves himself it is because he does not shave himself, and in turn this is because he does shave himself.

One way out of this conundrum occurs if the Barber of Seville is a woman. But our purpose here is to look at how to set up this type of conundrum, or contradiction.

Let us form a table, on one side listing people in their role as people who are shaved, on the other as potential shavers (or barbers). For simplicity, we just name the people $1, 2, \cdots$. So the entries in row $i$ show who is shaved by person $i$, with entry $(i, j)$ being Y (“yes”) if person $i$ shaves person $j$ and N (“no”) otherwise. Let row $b$ be the row for the barber of Seville. Then, for every $j$, entries $(b, j)$ and $(j, j)$ are opposite (one Y and one N). This leads to a contradiction, for entry $(b, b)$ cannot be opposite to itself. See Figure 4.1.

[Figure 4.1: Who Shaves the Barber of Seville?]

Now we are ready to show that the halting problem is undecidable by means of a similar argument. We define the halting problem function $H$ as follows.

$$H(P, w) = \begin{cases} \text{“Recognize”} & \text{if program } P \text{ halts on input } w \\ \text{“Reject”} & \text{if program } P \text{ does not halt on input } w \end{cases}$$

By halt, we mean that a program completes its computation and stops.

Recall that the program $P$ and its input are encoded as binary strings. Let 1' denote
the ith such string in lexicog...
 Winter '08
 Ibarra,O
 Halting problem, Seville, standard input format
