2 Alphabets, Words, Languages, and Algorithmic Problems

Proof. First we show $\{a\}^*\{b\}^* \subseteq \{a^i b^j \mid i,j \in \mathbb{N}\}$.

Let $x \in \{a\}^*\{b\}^*$. Then

$x = yz$, where $y \in \{a\}^* \wedge z \in \{b\}^*$   {definition of concatenation}
$\iff x = yz$, where $(\exists k \in \mathbb{N}: y = a^k) \wedge (\exists m \in \mathbb{N}: z = b^m)$   {definition of the Kleene star}
$\iff \exists k, m \in \mathbb{N}$ such that $x = a^k b^m$
$\Rightarrow x \in \{a^i b^j \mid i,j \in \mathbb{N}\}$.

Next, we show $\{a^i b^j \mid i,j \in \mathbb{N}\} \subseteq \{a\}^*\{b\}^*$.

Let $x \in \{a^i b^j \mid i,j \in \mathbb{N}\}$. Then $x = a^r b^l$ for some integers $r, l \in \mathbb{N}$
$\Rightarrow x \in \{a\}^*\{b\}^*$, since $a^r \in \{a\}^*$ and $b^l \in \{b\}^*$.   □

Exercise 2.26. Prove or disprove the truth of the equality $(\{a\}^*\{b\}^*)^* = \{a, b\}^*$.
Definition 2.27. Let $\Sigma_1$ and $\Sigma_2$ be two arbitrary alphabets. A homomorphism from $\Sigma_1$ to $\Sigma_2$ is any function $h : \Sigma_1^* \to \Sigma_2^*$ that satisfies the following conditions:

(i) $h(\lambda) = \lambda$, and
(ii) $h(uv) = h(u) \cdot h(v)$ for all $u, v \in \Sigma_1^*$.

One can easily observe that to specify a homomorphism it is sufficient to fix $h(a)$ for all symbols $a \in \Sigma_1$.

Exercise 2.28. Let $h$ be a homomorphism from $\Sigma_1$ to $\Sigma_2$. Prove by induction that for all words $x = x_1 x_2 \ldots x_m$ with $x_i \in \Sigma_1$ for $i = 1, \ldots, m$,

$h(x) = h(x_1) h(x_2) \ldots h(x_m)$.
Consider a mapping $h$ given by

$h(\#) = 10$, $h(0) = 00$, and $h(1) = 11$.

Clearly $h$ specifies a homomorphism from $\{0, 1, \#\}$ to $\Sigma_{\mathrm{bool}}$. For instance,

$h(011\#101\#) = h(0)h(1)h(1)h(\#)h(1)h(0)h(1)h(\#) = 0011111011001110$.

One can use $h$ to transfer a representation of some objects over $\{0, 1, \#\}$ into a new representation of these objects over $\Sigma_{\mathrm{bool}}$.
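Because a homomorphism is fixed by the images of the single symbols (see Exercise 2.28), applying it amounts to symbol-by-symbol substitution followed by concatenation. The following minimal sketch is our own illustration, not part of the text; the helper name apply_homomorphism is hypothetical.

    # A minimal sketch (not from the book): a homomorphism is given by the
    # images of the single symbols; h(w) is obtained by replacing every
    # symbol of w by its image and concatenating the results.
    h = {"#": "10", "0": "00", "1": "11"}   # the mapping h from the example above

    def apply_homomorphism(images, word):
        # h(x1 x2 ... xm) = h(x1) h(x2) ... h(xm)
        return "".join(images[symbol] for symbol in word)

    assert apply_homomorphism(h, "011#101#") == "0011111011001110"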
Exercise 2.29. Define a homomorphism from $\{0, 1, \#\}$ to $\Sigma_{\mathrm{bool}}$ that maps infinitely many words over $\{0, 1, \#\}$ to one word from $(\Sigma_{\mathrm{bool}})^*$.

Exercise 2.30. Define an injective homomorphism from $\Sigma_{\mathrm{logic}}$ to $\Sigma_{\mathrm{bool}}$ that provides an unambiguous representation of Boolean formulae over $\Sigma_{\mathrm{bool}}$.

Exercise 2.31. Let $\Sigma_1$ and $\Sigma_2$ be alphabets. Let $h$ be a homomorphism from $\Sigma_1$ to $\Sigma_2$. For any language $L \subseteq \Sigma_1^*$ we define $h(L) = \{h(w) \mid w \in L\}$. Let $L_1, L_2 \subseteq \Sigma_1^*$. Prove or disprove the following equality: $h(L_1 L_2) = h(L_1) \cdot h(L_2)$.

2.3 Algorithmic Problems
Before giving a formal definition of the notion of an "algorithm" via the Turing-machine model in Chapter 4, we view algorithms as programs. We assume that the reader knows what a program is. For our purposes, the specific programming language is irrelevant. When using the synonym "program" for "algorithm" we require that the program compute a correct output for each feasible input. This means that an algorithm is considered to be a program that halts on every input (i.e., does not have any infinite computation) and solves the given problem.
Given this assumption, a program (an algorithm) $A$ performs a mapping $A : \Sigma_1^* \to \Sigma_2^*$ for some alphabets $\Sigma_1$ and $\Sigma_2$. This means that

(i) the inputs are represented as words over an alphabet $\Sigma_1$,
(ii) the outputs are represented as words over an alphabet $\Sigma_2$, and
(iii) $A$ unambiguously assigns an output to every input.

For any algorithm $A$ and any input $x$ we denote by $A(x)$ the output of the algorithm $A$ for the input $x$. We say that two algorithms (programs) $A$ and $B$ are equivalent if they work over the same alphabet $\Sigma$ and $A(x) = B(x)$ for all $x \in \Sigma^*$.

Definition 2.32. The decision problem $(\Sigma, L)$ for a given alphabet $\Sigma$ and a given language $L \subseteq \Sigma^*$ is to decide, for any $x \in \Sigma^*$, whether $x \in L$ or $x \notin L$.

An algorithm $A$ solves the decision problem $(\Sigma, L)$ if, for all $x \in \Sigma^*$,

$A(x) = \begin{cases} 1 & \text{if } x \in L, \\ 0 & \text{if } x \notin L. \end{cases}$

We also say that $A$ recognizes $L$.
If, for a language $L$, there exists an algorithm that recognizes $L$, we say that the language $L$ is recursive⁵. We often use a language $L \subseteq \Sigma^*$ to specify a concrete property of words from $\Sigma^*$ (or of the objects that are represented by the words). Words in $L$ satisfy this property, and words from $L^{\mathrm{C}} = \Sigma^* - L$ do not have this property.
Usually we describe a decision problem $(\Sigma, L)$ in the following way:

$(\Sigma, L)$
Input: $x \in \Sigma^*$.
Output: $A(x) \in \Sigma_{\mathrm{bool}} = \{0, 1\}$, where

$A(x) = \begin{cases} 1 & \text{if } x \in L \text{ (Yes, } x \text{ has the property),} \\ 0 & \text{if } x \notin L \text{ (No, } x \text{ does not have the property).} \end{cases}$

For instance, $(\{a, b\}, \{a^n b^n \mid n \in \mathbb{N}\})$ is a decision problem that can also be specified as follows:

$(\{a, b\}, \{a^n b^n \mid n \in \mathbb{N}\})$
Input: $x \in \{a, b\}^*$.
Output: Yes, if $x = a^n b^n$ for some $n \in \mathbb{N}$.
        No, otherwise.
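As a concrete sanity check of this specification, here is a small sketch of our own (the book fixes no concrete program) of an algorithm $A$ recognizing $\{a^n b^n \mid n \in \mathbb{N}\}$; it outputs 1 exactly for the words of this form.

    # Sketch of an algorithm A with A(x) = 1 iff x = a^n b^n for some n >= 0.
    # Our own illustration; any equivalent program would do.
    def recognizes_anbn(x: str) -> int:
        n = len(x) // 2
        if len(x) % 2 == 0 and x == "a" * n + "b" * n:
            return 1
        return 0

    assert recognizes_anbn("") == 1        # n = 0
    assert recognizes_anbn("aabb") == 1    # n = 2
    assert recognizes_anbn("aab") == 0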
Example 2.33. A well-known decision problem of large practical importance is primality testing, $(\Sigma_{\mathrm{bool}}, \{x \in (\Sigma_{\mathrm{bool}})^* \mid \mathrm{Number}(x) \text{ is a prime}\})$. The usual representation is

$(\Sigma_{\mathrm{bool}}, \{x \in (\Sigma_{\mathrm{bool}})^* \mid \mathrm{Number}(x) \text{ is a prime}\})$
Input: $x \in (\Sigma_{\mathrm{bool}})^*$.
Output: Yes, if $\mathrm{Number}(x)$ is a prime,
        No, otherwise.
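A sketch of one possible (naive) decision procedure for this problem, assuming that Number(x) denotes the nonnegative integer whose binary representation is x; the trial-division test below only illustrates the input/output convention and is not meant as an efficient algorithm.

    # Sketch only: Number(x) = integer with binary representation x,
    # decided by naive trial division (not an efficient primality test).
    def number(x: str) -> int:
        return int(x, 2) if x else 0

    def is_prime_word(x: str) -> str:
        n = number(x)
        if n < 2:
            return "No"
        d = 2
        while d * d <= n:
            if n % d == 0:
                return "No"
            d += 1
        return "Yes"

    assert is_prime_word("101") == "Yes"   # Number(101) = 5
    assert is_prime_word("1001") == "No"   # Number(1001) = 9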
Example 2.34. Let $L = \{x \in (\Sigma_{\mathrm{keyboard}})^* \mid x \text{ is a syntactically correct program in C++}\}$. We consider the following problem, which is a subproblem of any compiler for C++.

Input: $x \in (\Sigma_{\mathrm{keyboard}})^*$.
Output: Yes, if $x \in L$,
        No, otherwise.

⁵ Recursion is one of the fundamental terms of computer science. Therefore we later use a formal model of computation (an algorithm) in order to give a formally precise definition of this term.
Example 2.35. The Hamiltonian Cycle Problem (HC) is $(\Sigma, \mathrm{HC})$, where $\Sigma = \{0, 1, \#\}$ and $\mathrm{HC} = \{x \in \Sigma^* \mid x \text{ represents an undirected graph that contains a Hamiltonian cycle}^6\}$.

Example 2.36. The Satisfiability Problem (SAT) is $(\Sigma_{\mathrm{logic}}, \mathrm{SAT})$ with $\mathrm{SAT} = \{x \in (\Sigma_{\mathrm{logic}})^* \mid x \text{ represents a satisfiable Boolean formula}\}$.
An important subclass of decision problems is the class of equivalence problems. For instance, the equivalence problem for programs is to decide whether two given programs $A$ and $B$ of the same programming language (i.e., the input is $(A, B) \in (\Sigma_{\mathrm{keyboard}})^*$) are equivalent. Another example of an equivalence problem is to decide whether two given Boolean formulae represent the same Boolean function.
Definition 2.37. Let $\Sigma$ and $\Gamma$ be two alphabets. We say that an algorithm $A$ computes a function $f : \Sigma^* \to \Gamma^*$ if, for all $x \in \Sigma^*$,

$A(x) = f(x)$.

Decision problems are a special case of computing functions, because solving a decision problem is equivalent to computing the characteristic function of a language.⁷

At first glance one may think that computing functions is the most general representation of algorithmic problems. The following definition shows that this is not true.
Definition 2.38. Let $\Sigma$ and $\Gamma$ be alphabets and let $R \subseteq \Sigma^* \times \Gamma^*$ be a relation in $\Sigma^*$ and $\Gamma^*$. An algorithm $A$ computes $R$ (or $A$ solves the relation problem $R$) if, for every $x \in \Sigma^*$,

$(x, A(x)) \in R$.

From Definition 2.38 we see that, to solve a relation problem $R$ for a given input $x$, it suffices to find one $y$ from a possibly infinite set of $y$s with the property $(x, y) \in R$. The following examples show that relation problems are not only an abstract generalization of computing functions,⁸ but that many practical problems are problems about computing a relation.

⁶ A Hamiltonian cycle of a graph $G$ is a cycle (a closed path) that contains every vertex of $G$ exactly once.
⁷ The characteristic function $f_L$ of a language $L \subseteq \Sigma^*$ is a function from $\Sigma^*$ to $\{0, 1\}$ with $f_L(x) = 1$ iff $x \in L$.
⁸ Remember that functions are special relations with the property that for every $x$ there exists exactly one $y$ with $(x, y) \in R$.
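To make the difference from computing a function concrete, here is a small toy sketch of our own (not an example from the text): for the relation $R = \{(x, y) \mid x \text{ is a proper prefix of } y\}$ over $\Sigma_{\mathrm{bool}}$, every input $x$ has infinitely many admissible outputs, and an algorithm solving the relation problem may return any one of them.

    # Our own toy relation problem (not from the book):
    # R = { (x, y) : y extends x by at least one symbol }.
    # Any algorithm that outputs some proper extension of x solves R.
    def solves_R(x: str) -> str:
        return x + "0"          # one admissible answer among infinitely many

    def in_R(x: str, y: str) -> bool:
        return y.startswith(x) and len(y) > len(x)

    assert in_R("101", solves_R("101"))   # (x, A(x)) lies in R for every x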
ILP (integer linear programming)

Input: An $m \times n$ matrix $A = [a_{ij}]_{i=1,\ldots,m,\ j=1,\ldots,n}$ and two vectors $b = (b_1, \ldots, b_m)^{\mathrm{T}}$ and $c = (c_1, \ldots, c_n)$ for $m, n \in \mathbb{N} - \{0\}$, where $a_{ij}$, $b_i$, $c_j$ are integers for $i = 1, \ldots, m$ and $j = 1, \ldots, n$.
Constraints: $M(A, b, c) = \{X = (x_1, \ldots, x_n)^{\mathrm{T}} \in \mathbb{N}^n \mid AX = b\}$.
{$M(A, b, c)$ is the set of all solutions (vectors) that satisfy the system $AX = b$ of linear equations determined by $A$ and $b$.}
Costs: For every $X = (x_1, \ldots, x_n)^{\mathrm{T}} \in M(A, b, c)$,

$\mathrm{cost}(X, (A, b, c)) = c \cdot X = \sum_{i=1}^{n} c_i x_i$.

Goal: minimum.
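The feasibility constraint and the cost function in this specification are easy to evaluate for a concrete candidate solution. The sketch below is our own illustration with made-up data: it checks whether a vector $X$ lies in $M(A, b, c)$ and computes its cost.

    # Sketch with made-up data: check feasibility A X = b over X in N^n
    # and evaluate cost(X, (A, b, c)) = c . X.
    def is_feasible(A, b, X):
        if any(x < 0 for x in X):
            return False
        return all(sum(row[j] * X[j] for j in range(len(X))) == bi
                   for row, bi in zip(A, b))

    def cost(c, X):
        return sum(ci * xi for ci, xi in zip(c, X))

    A = [[1, 1], [0, 2]]       # hypothetical 2 x 2 instance
    b = [3, 2]
    c = [5, 1]
    X = [2, 1]                 # candidate solution
    assert is_feasible(A, b, X) and cost(c, X) == 11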
Aside from the decision problems and optimization problems introduced above, we also consider algorithmic problems of another nature. These problems require no input. Their only task is to generate a word or an infinite sequence of symbols.

Definition 2.47. Let $\Sigma$ be an alphabet and let $x \in \Sigma^*$. We say that an algorithm $A$ generates the word $x$ if $A$ outputs $x$ for the input $\lambda$.

The following program $A$ generates the word 100111.

A: begin
     write(100111);
   end
For every positive integer $n$, the following program $A_n$ generates the word $(01)^n$.

A_n: begin
       for i = 1 to n do
         write(01);
     end

A program that generates a word $x$ can be viewed as an alternative representation of $x$. Thereby one can save some words in memory as programs that generate these words.
Definition 2.48. Let $\Sigma$ be an alphabet, and let $L \subseteq \Sigma^*$. $A$ is an algorithm enumerating $L$ if, for every positive integer $n$, $A$ outputs $x_1, x_2, \ldots, x_n$, where $x_1, x_2, \ldots, x_n$ are the first $n$ words of $L$ with respect to the canonical order.

Example 2.49. Let $\Sigma = \{0\}$ and let $L = \{0^p \mid p \text{ is a prime}\}$.

Input: $n$.
Output: $0^2, 0^3, 0^5, 0^7, \ldots, 0^{p_n}$, where $p_n$ is the $n$-th smallest prime.
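A minimal sketch of such an enumerator (our own illustration, not from the text): given $n$, it outputs the first $n$ words of $L$ in canonical order, i.e., $0^p$ for the $n$ smallest primes.

    # Sketch of an algorithm enumerating L = { 0^p : p prime }:
    # on input n it outputs the first n words of L in canonical order.
    def enumerate_L(n: int):
        words, candidate = [], 2
        while len(words) < n:
            if all(candidate % d != 0 for d in range(2, int(candidate ** 0.5) + 1)):
                words.append("0" * candidate)
            candidate += 1
        return words

    assert enumerate_L(4) == ["00", "000", "00000", "0000000"]   # p = 2, 3, 5, 7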
Exercise 2.50. Prove that a language $L$ is recursive if and only if there exists an algorithm that enumerates $L$.

2.4 Kolmogorov Complexity

In this section we consider words as information carriers and we focus on finding a reasonable way to measure the information content of words. We do this only for words over the basic alphabet $\Sigma_{\mathrm{bool}}$. An informal idea could be to say that a word $w$ has a small information content if there is a short representation of this word (i.e., if it is compressible), and that a word $w$ has a large information content if there does not exist any short representation of $w$ (i.e., there is no representation shorter than $|w|$). The intuition behind this is that a word with small information content is regular and hence easy to describe, whereas a word with high information content is irregular¹³, so the only way of representing it is to write it down completely, bit by bit.

Based on this idea, the word

011011011011011011011011

with its short representation $(011)^8$ has a smaller information content than the word

0101101000101101001110110010.
The process of producing a representation of $w$ that is shorter than $|w|$, or shorter than the previous representation of $w$, is called the compression¹⁴ of $w$.

The next idea could be to fix a convenient compression method and use the length of the resulting compressed representation of $w$ as a measure of its information content. Clearly, one has to assert that the compressed representation of $w$ is again a word over $\Sigma_{\mathrm{bool}}$, because the use of a larger alphabet to obtain a shorter representation of a given word yields no true compression.

Exercise 2.51. Find an injective mapping $H$ from $(\Sigma_{\mathrm{bool}})^*$ to $\{0, 1, 2, 3, 4\}^* = \Sigma_5^*$ such that $|x| \ge 2 \cdot |H(x)|$ for every $x \in (\Sigma_{\mathrm{bool}})^*$ with $|x| \ge 4$. Which compression factor can be achieved if, instead of $\Sigma_5$, one uses the alphabet $\Sigma_m$ for an integer $m > 5$?

¹³ A chaotic distribution of 0s and 1s.
¹⁴ Formally, a compression can be viewed as an injective mapping from $(\Sigma_{\mathrm{bool}})^*$ to $(\Sigma_{\mathrm{bool}})^*$.
If one uses subword repetitions for the compression of words, then one can start with the alphabet $\{0, 1, (, )\}$ and represent $w^n$ by $(w)\mathrm{Bin}(n)$ for any $w \in (\Sigma_{\mathrm{bool}})^*$. Thus, for instance, $(011)1000$ stands for $(011)^8$, and $(0)1010(010)1(01)1101$ stands for $(0)^{10}(010)^{1}(01)^{13}$. To finally get a representation over $\Sigma_{\mathrm{bool}}$, one can use the homomorphism from $\{0, 1, (, )\}$ to $\Sigma_{\mathrm{bool}}$ defined by

h(0) = 00, h(1) = 11, h(() = 10, and h()) = 01.

In this way, the compressed representation of $(011)^8$ becomes

100011110111000000.

Note that this compression method is correct because each compressed representation of $w$ unambiguously determines the original word $w$.
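To see why this encoding is uniquely decodable, note that every symbol of $\{0, 1, (, )\}$ is mapped to a distinct two-bit block, so the binary word can be split back into blocks and inverted. A small sketch of our own illustrating the encoding and its inverse:

    # Sketch: encode a word over {0,1,(,)} into Sigma_bool with the
    # homomorphism h above, and decode it again two bits at a time.
    ENC = {"0": "00", "1": "11", "(": "10", ")": "01"}
    DEC = {code: sym for sym, code in ENC.items()}

    def encode(word: str) -> str:
        return "".join(ENC[s] for s in word)

    def decode(bits: str) -> str:
        return "".join(DEC[bits[i:i + 2]] for i in range(0, len(bits), 2))

    compressed = "(011)1000"                       # stands for (011)^8
    assert encode(compressed) == "100011110111000000"
    assert decode(encode(compressed)) == compressed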
The problem is that one can propose infinitely many different compression mechanisms. Which is the right one? For instance, one can further improve the compression method introduced above by also compressing the representations of the powers. Thus, $(011)2^{20}$ can be used as a shorter representation of $(011)1048576$. Using this strategy, the compression can be improved again and again by generating representations such as

$(01)^{2^{2^n}}, (01)^{2^{2^{2^n}}}, \ldots$

for regular words over $\Sigma_{\mathrm{bool}}$. This means that for any of our compression methods $M$, there exists another compression method that is better than $M$ for infinitely many words over $\Sigma_{\mathrm{bool}}$. Therefore, if one wants to have an objective and robust measure of the information content of words, one cannot take any of the above strategies.

Matters can get even worse. Let us consider the following compression method.
For every $x \in (\Sigma_{\mathrm{bool}})^*$, the nonnegative integer $\mathrm{Number}(x)$ can be represented by its factorization

$p_1^{i_1} p_2^{i_2} \cdots p_k^{i_k}$

for primes $p_1 < p_2 < \cdots < p_k$ and $i_j \in \mathbb{N} - \{0\}$ for $j = 1, 2, \ldots, k$. A possible representation of $p_1^{i_1} p_2^{i_2} \cdots p_k^{i_k}$ over $\{0, 1, (, )\}$ is

$\mathrm{Bin}(p_1)(\mathrm{Bin}(i_1))\,\mathrm{Bin}(p_2)(\mathrm{Bin}(i_2)) \ldots \mathrm{Bin}(p_k)(\mathrm{Bin}(i_k))$.

Applying the above-introduced homomorphism $h$, one obtains a binary representation of $x$. The bad news is that this compression method is incomparable with the method based on subword repetitions, in the sense that compression by subword repetitions can perform better than compression by factorization and vice versa.
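A sketch of the factorization-based representation over $\{0, 1, (, )\}$ (our own illustration); combined with the homomorphism $h$ from above, it yields a binary representation of $x$.

    # Sketch: represent Number(x) by Bin(p1)(Bin(i1))...Bin(pk)(Bin(ik)),
    # where p1 < p2 < ... < pk are the prime factors of Number(x).
    def factorization_representation(n: int) -> str:
        rep, p = "", 2
        while n > 1:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            if exponent > 0:
                rep += format(p, "b") + "(" + format(exponent, "b") + ")"
            p += 1
        return rep

    # 720 = 2^4 * 3^2 * 5^1
    assert factorization_representation(720) == "10(100)11(10)101(1)"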
Exercise 2.52. Find two words $x, y \in (\Sigma_{\mathrm{bool}})^*$ such that

1. the compression by subword repetitions provides a substantially shorter representation of $x$ than the compression method by factorization, and
2. the compression by factorization results in a substantially shorter representation of $y$ than the compression by repetitions.
A definition of a complexity measure must be robust in the sense that the measured complexity has a broad validity, so that it is applicable in a general framework. Considering how strongly the size of a word representation depends on the particular compression method used, it is obvious that fixing one compression method for defining the size of the information content of a word leaves no possibility of formulating generally valid assertions about the information content of words. The following definition by Kolmogorov provides a way out of this apparent deadlock. It is important to observe that the introduction of the algorithm (program) is the crucial point that makes a measurement of information content possible.

Definition 2.53. For any word $x \in (\Sigma_{\mathrm{bool}})^*$, the Kolmogorov complexity $K(x)$ of the word $x$ is the binary length of the shortest Pascal program¹⁵ that generates $x$.
We know that any compiler for Pascal generates the machine code of every syntactically correct program written in Pascal, and that the machine code of a program is nothing other than a word over $(\Sigma_{\mathrm{bool}})^*$. Hence, for every word $x \in (\Sigma_{\mathrm{bool}})^*$, we consider all (infinitely many) machine codes of programs that generate $x$, and the length of the shortest one is the Kolmogorov complexity of $x$.
Is $K(x)$ a good candidate for the size of the information content of $x$? If one wants to include all compression methods, then surely yes. For every compression method $M$ that computes a compression $M(x)$ of any word $x$, one can write a program that contains $M(x)$ as a parameter (a constant of the program) and generates $x$ from $M(x)$.
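A sketch of this argument (our own illustration): if decompress is the decoding procedure of some fixed compression method $M$, then for every word $x$ the program text built below generates $x$, and its length exceeds $|M(x)|$ only by a constant that depends on $M$ but not on $x$.

    # Sketch (our own): for a fixed decompression procedure, the text of a
    # program that generates x from the constant M(x) is only a constant
    # number of symbols longer than M(x) itself.
    def program_for(compressed_x: str) -> str:
        return "print(decompress(" + repr(compressed_x) + "), end='')"

    source = program_for("100011110111000000")      # M(x) for x = (011)^8
    overhead = len(source) - len("100011110111000000")
    print(source)        # print(decompress('100011110111000000'), end='')
    print(overhead)      # the additive constant, independent of x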
But before analyzing the definition of $K(x)$ in depth, we present some basic properties of Kolmogorov complexity in order to deepen our understanding of this concept.

The first property of $K$ guarantees that $K(x)$ cannot be substantially larger than $|x|$. Obviously, this property is highly desirable.
¹⁵ More precisely, the length of the shortest binary code of a program generating $x$.

Lemma 2.54. There exists a constant $d$ such that, for every $x \in (\Sigma_{\mathrm{bool}})^*$,

$K(x) \le |x| + d$.

Proof. For any $x \in (\Sigma_{\mathrm{bool}})^*$ we consider the following program¹⁶ $A_x$:

A_x: begin
       write(x);
     end

The parts begin, write, and end of the program $A_x$, together with the remaining punctuation, are the same for every $x \in (\Sigma_{\mathrm{bool}})^*$, and the length of their representation in the machine code is bounded by a small constant $d$ independent of $x$. The word $x$ is written in $A_x$ bit by bit, and therefore its contribution to the length of the (binary) machine code of $A_x$ is exactly $|x|$.   □

¹⁶ For simplicity we use a Pascal-like programming language, where, for instance, the declaration of variables is omitted.
Obviously, regular words with many subword repetitions have small Kolmogorov complexity. Let $y_n = 0^n \in \{0, 1\}^*$ for any $n \in \mathbb{N} - \{0\}$. The following program $Y_n$ generates $y_n$.

Y_n: begin
       for i = 1 to n do
         write(0);
     end

All parts of the programs $Y_n$ are the same, except for the number $n$. The length of $\mathrm{Bin}(n)$ is exactly $\lceil \log_2(n+1) \rceil$, and thus the contribution of $n$ to the binary representation of $Y_n$ is at most $\lceil \log_2 n \rceil + 1$. Hence, there exists a constant $c$ such that

$K(y_n) \le \lceil \log_2 n \rceil + c = \lceil \log_2 |y_n| \rceil + c$

for all $n \in \mathbb{N} - \{0\}$.
Now, consider $z_n = 0^{n^2} \in \{0, 1\}^*$ for any positive integer $n$. The following program $Z_n$ generates the word $z_n$.

Z_n: begin
       m := n;
       m := m × m;
       for i = 1 to m do
         write(0);
     end

All programs $Z_n$ are similar, except for the integer $n$. If the length of the binary code of all parts of $Z_n$ except $n$ is $d$, then

$K(z_n) \le \lceil \log_2(n+1) \rceil + d \le \big\lceil \log_2 \sqrt{|z_n|} \big\rceil + d + 1$.
Note that the crucial point in our calculation is that the program $Z_n$ does not need to store the value $n^2$ as a constant. Instead, $Z_n$ stores the smaller value $n$, and the value $n^2$ is computed during the execution of the program. Since the size of the memory used during the computation of $Z_n$ is not included in the length of the description of $Z_n$, we have saved approximately $\lceil \log_2 n \rceil$ representation bits in this way.
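A quick numerical illustration of this saving (our own, not from the text): storing $n$ instead of $n^2$ roughly halves the number of bits needed for the constant, because $|\mathrm{Bin}(n^2)| \approx 2 \cdot |\mathrm{Bin}(n)|$.

    # Our own illustration: the constant n needs about half as many bits
    # as the constant n^2, which is the ~ceil(log2 n) bits saved by Z_n.
    for n in (10, 1000, 10**6):
        bits_n = len(format(n, "b"))            # |Bin(n)|
        bits_n_squared = len(format(n * n, "b"))
        print(n, bits_n, bits_n_squared, bits_n_squared - bits_n)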
Exercise 2.55. Prove the following assertion: there exists a constant $c$ such that, for every positive integer $n$,

$K\big((01)^{2^n}\big) \le \lceil \log_2(n+1) \rceil + c = \big\lceil \log_2 \log_2 \big(|(01)^{2^n}|/2\big) \big\rceil + c$.
Exercise 2.56. Find an infinite sequence of words $y_1, y_2, y_3, \ldots$ over $\Sigma_{\mathrm{bool}}$ that satisfies the following conditions:

(i) $|y_i| < |y_{i+1}|$ for all $i \in \mathbb{N} - \{0\}$, and
(ii) there exists a constant $c$ such that, for all $i \in \mathbb{N} - \{0\}$,

$K(y_i) \le \lceil \log_2 \log_2 \log_2 |y_i| \rceil + c$.
Exercise 2.57. Prove that, for each positive integer $m$, there exists a word $w_m$ such that

$|w_m| - K(w_m) > m$.

One can also measure the information content of positive integers by simply taking the Kolmogorov complexity of their binary representations.

Definition 2.58. The Kolmogorov complexity of a positive integer $n$ is $K(n) = K(\mathrm{Bin}(n))$.
Exercise 2.59. Let $n = p \cdot q$ be a positive integer. Prove that

$K(n) \le K(p) + K(q) + c$

for a constant $c$ that does not depend on $n$, $p$, and $q$.

The next fundamental result shows that there exist words that are not compressible with respect to Kolmogorov complexity.

Lemma 2.60. For every positive integer $n$, there exists a word $w_n \in (\Sigma_{\mathrm{bool}})^n$ such that

$K(w_n) \ge |w_n| = n$,

i.e., for any positive integer $n$ there is a noncompressible word of length $n$.
Proof. The proof is based on the following simple combinatorial idea. We have exactly $2^n$ words $x_1, \ldots, x_{2^n}$ in $(\Sigma_{\mathrm{bool}})^n$. Let, for $i = 1, 2, \ldots, 2^n$, $C\text{-}Prog(x_i) \in \{0, 1\}^*$ be the machine code of the program $Prog(x_i)$ that generates $x_i$ with $K(x_i) = |C\text{-}Prog(x_i)|$, i.e., $Prog(x_i)$ is one of the shortest¹⁷ programs that generate $x_i$.

Clearly, for $i \ne j$, $C\text{-}Prog(x_i)$ and $C\text{-}Prog(x_j)$ must be different, because $x_i$ and $x_j$ are different. This means that we have $2^n$ different machine codes

$C\text{-}Prog(x_1), C\text{-}Prog(x_2), \ldots, C\text{-}Prog(x_{2^n})$

of shortest programs for $x_1, x_2, \ldots, x_{2^n}$. It is sufficient to show that at least one of these machine codes has length at least $n$.

Our combinatorial argument simply says that one cannot have $2^n$ different machine codes (words) that are all shorter than $n$. Each machine code is a word over $(\Sigma_{\mathrm{bool}})^*$. The number of words of length $i$ over $\Sigma_{\mathrm{bool}}$ is exactly $2^i$. Hence, the number of all nonempty words over $\Sigma_{\mathrm{bool}}$ of length at most $n - 1$ is

$\sum_{i=1}^{n-1} 2^i = 2^n - 2 < 2^n$.

Therefore, there is at least one word among $C\text{-}Prog(x_1), \ldots, C\text{-}Prog(x_{2^n})$ with a length of at least $n$. Let $C\text{-}Prog(x_j)$ be such a word...