Finite Automata, Palindromes, Powers, and Patterns

Terry Anderson, Narad Rampersad, Nicolae Santean*, and Jeffrey Shallit

David R. Cheriton School of Computer Science
University of Waterloo
Waterloo, Ontario N2L 3G1, Canada
tanderson@uwaterloo.ca, nrampersad@cs.uwaterloo.ca
nsantean@iusb.edu, shallit@graceland.uwaterloo.ca

* Author's current address: Department of Computer and Information Sciences, Indiana University South Bend, 1700 Mishawaka Ave., P.O. Box 7111, South Bend, IN 46634, USA.

C. Martín-Vide, F. Otto, and H. Fernau (Eds.): LATA 2008, LNCS 5196, pp. 52–63, 2008. © Springer-Verlag Berlin Heidelberg 2008

Abstract. Given a language $L$ and a nondeterministic finite automaton $M$, we consider whether we can determine efficiently (in the size of $M$) if $M$ accepts at least one word in $L$, or infinitely many words. Given that $M$ accepts at least one word in $L$, we consider how long the shortest word can be. The languages $L$ that we examine include the palindromes, the non-palindromes, the $k$-powers, the non-$k$-powers, the powers, the non-powers (also called primitive words), and words matching a general pattern.

1 Introduction

Let $L \subseteq \Sigma^*$ be a fixed language, and let $M$ be a deterministic finite automaton (DFA) or nondeterministic finite automaton (NFA) with input alphabet $\Sigma$. In this paper we are interested in three questions:

1. Whether we can efficiently decide (in terms of the size of $M$) if $L(M)$ contains at least one element of $L$, that is, if $L(M) \cap L \neq \emptyset$;
2. Whether we can efficiently decide if $L(M)$ contains infinitely many elements of $L$, that is, if $L(M) \cap L$ is infinite;
3. Given that $L(M)$ contains at least one element of $L$, what is a good upper bound on the shortest element of $L(M) \cap L$?

As an example, consider the case where $\Sigma = \{a\}$, $L$ is the set of primes written in unary, that is, $\{a^i : i \text{ is prime}\}$, and $M$ is an NFA with $n$ states.

To answer questions (1) and (2), we first rewrite $M$ in Chrobak normal form [5]. Chrobak normal form consists of an NFA $M'$ with a "tail" of $O(n^2)$ states, followed by a single nondeterministic choice to a set of disjoint cycles containing at most $n$ states. Computing this normal form can be achieved in $O(n^5)$ steps by a result of Martinez [17].

Now we examine each of the cycles produced by this transformation. Each cycle accepts a finite union of sets of the form $(a^t)^* a^c$, where $t$ is the size of the cycle and $c \le n^2 + n$; both $t$ and $c$ are given explicitly from $M'$. Now, by Dirichlet's theorem on primes in arithmetic progressions, $\gcd(t, c) = 1$ for at least one pair $(t, c)$ induced by $M'$ if and only if $M$ accepts infinitely many elements of $L$. This can be checked in $O(n^2)$ steps, and so we get a solution to question (2) in polynomial time.

Question (1) requires a little more work. From our answer to question (2), we may assume that $\gcd(t, c) > 1$ for all pairs $(t, c)$, for otherwise $M$ accepts infinitely many elements of $L$ and hence at least one element. Each element in such a set is of length $kt + c$ for some $k \ge 0$. Let $d = \gcd(t, c) \ge 2$. Then $kt + c = (kt/d + c/d)\,d$. If $k > 1$, this quantity is at least $2d$ and hence composite.
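The two observations made so far (the gcd test for question (2) and the compositeness of $kt + c$ when $k > 1$) can be checked numerically. The following is only an illustrative sketch: the function names and the representation of the cycles of $M'$ as a list of hypothetical $(t, c)$ pairs are assumptions made here, not part of the paper.

```python
from math import gcd

def is_prime(m: int) -> bool:
    """Trial-division primality test; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def accepts_infinitely_many_primes(pairs) -> bool:
    """Question (2): by Dirichlet's theorem, the progression kt + c (k >= 0)
    contains infinitely many primes exactly when gcd(t, c) == 1."""
    return any(gcd(t, c) == 1 for t, c in pairs)

def composite_for_k_above_one(t: int, c: int, k_max: int = 50) -> bool:
    """Spot-check the claim in the text: if d = gcd(t, c) >= 2, then kt + c
    is composite for every k > 1 (tested here only up to k_max)."""
    assert gcd(t, c) >= 2
    return all(not is_prime(k * t + c) for k in range(2, k_max + 1))
```

For instance, with the hypothetical pairs `[(4, 2), (6, 3)]` every gcd exceeds 1, so only finitely many primes can be accepted, whereas adding the pair `(5, 3)` makes the answer to question (2) positive.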
Thus it suffices to check the primality of $c$ and $t + c$, both of which are at most $n^2 + 2n$. We can precompute the primes $< n^2 + 2n$ in linear time using a modification of the sieve of Eratosthenes [18], and check if any of them are accepted. This gives a solution to question (1) in polynomial time.

On the other hand, answering question (3) essentially amounts to estimating the size of the least prime in an arithmetic progression, an extremely difficult question that is still not fully resolved [9], although it is known that there is a polynomial upper bound.

Thus we see that asking these questions, even for relatively simple languages $L$, can quickly take us to the limits of what is known in formal languages and number theory.

In this paper we examine questions (1)-(3) in the case where $M$ is an NFA and $L$ is either the set of palindromes, the set of $k$-powers, the set of powers, the set of words matching a general pattern, or their complements.

In some of these cases, there is previous work. For example, Ito et al. [12] studied several circumstances in which primitive words (non-powers) may appear in regular languages. As a typical result in [12], we mention: "A DFA over an alphabet of 2 or more letters accepts a primitive word iff it accepts one of length $\le 3n - 3$, where $n$ is the number of states of the DFA". Horváth, Karhumäki and Kleijn [11] addressed the decidability problem of whether a language accepted by an NFA is palindromic (i.e., every element is a palindrome). They showed that the language accepted by an NFA with $n$ states is palindromic if and only if all its words of length shorter than $3n$ are palindromes.

A preliminary version of the full version of this paper is available online.

2 Notions and Notation

Let $\Sigma$ be an alphabet, i.e., a nonempty, finite set of symbols (letters). By $\Sigma^*$ we denote the set of all finite words over $\Sigma$, and by $\varepsilon$, the empty word. For $w \in \Sigma^*$, we denote by $w^R$ the word obtained by reversing the order of symbols in $w$. A palindrome is a word $w$ such that $w = w^R$. If $L$ is a language over $\Sigma$, i.e., $L \subseteq \Sigma^*$, we say that $L$ is palindromic if every word $w \in L$ is a palindrome.
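As a small illustration of these definitions, the sketch below tests a single word and a finite language given explicitly as a collection of strings. The function names are chosen here for illustration; nothing in the sketch addresses infinite regular languages, which are the subject of the later sections.

```python
def reverse(w: str) -> str:
    """Return w^R, the word w with its symbols in reverse order."""
    return w[::-1]

def is_palindrome(w: str) -> bool:
    """A word is a palindrome when it equals its own reversal."""
    return w == reverse(w)

def is_palindromic(language) -> bool:
    """A (finite, explicitly listed) language is palindromic when
    every one of its words is a palindrome."""
    return all(is_palindrome(w) for w in language)
```

Note that the empty word counts as a palindrome under this definition, since it equals its own reversal.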
Let $k \ge 2$ be an integer. A word $y$ is a $k$-power if $y$ can be written as $y = x^k$ for some nonempty word $x$. If $y$ cannot be so written for any $k \ge 2$, then $y$ is primitive. A 2-power is typically referred to as a square, and a 3-power as a cube.

Patterns are a generalization of powers. A pattern is a nonempty word $p$ over a pattern alphabet $\Delta$. The letters of $\Delta$ are called variables. A pattern $p$ matches a word $w \in \Sigma^*$ if there exists a nonerasing morphism $h : \Delta^* \to \Sigma^*$ such that $h(p) = w$. Thus, a word $w$ is a $k$-power if it matches the pattern $a^k$.

We define an NFA (or DFA) as the usual 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$. The size of $M$ is the total number $N$ of its states and transitions. When we want to emphasize the components of $M$, we say $M$ has $n$ states and $t$ transitions, and define $N := n + t$.

We note that if $M$ is an NFA or NFA-$\varepsilon$, we can remove all states that either cannot be reached from the start state or cannot reach a final state (the latter are called dead states) in linear time (in the number of states and transitions) using depth-first search. We observe that $L(M) \neq \emptyset$ if and only if any states remain after this process, which can be tested in linear time. Similarly, if $M$ is an NFA, then $L(M)$ is infinite if and only if the corresponding digraph has a directed cycle. This can also be tested in linear time.

We will also need the following well-known results [10]:

Theorem 1. Let $M$ be an NFA with $n$ states. Then
(a) $L(M) \neq \emptyset$ if and only if $M$ accepts a word of length $< n$.
(b) $L(M)$ is infinite if and only if $M$ accepts a word of length $\ell$, $n \le \ell < 2n$.

A language $L$ is called slender if there is a constant $C$ such that, for all $n \ge 0$, the number of words of length $n$ in $L$ is less than $C$. The following characterization of slender regular languages has been independently rediscovered several times in the past [14,24,19].

Theorem 2. Let $L \subseteq \Sigma^*$ be a regular language. Then $L$ is slender if and only if it can be written as a finite union of languages of the form $uv^*w$, where $u, v, w \in \Sigma^*$.

For further background on finite automata and regular languages we refer the reader to Yu [26].

3 Testing If an NFA Accepts at Least One Palindrome

Over a unary alphabet, every string is a palindrome, so problems (1)-(3) become trivial. Let us assume, then, that the alphabet $\Sigma$ contains at least two letters.
Although the palindromes over such an alphabet are not regular, the language

$L' = \{x \in \Sigma^* : xx^R \in L(M) \text{ or there exists } a \in \Sigma \text{ such that } x a x^R \in L(M)\}$

is, in fact, regular, as often shown in a beginning course in formal languages [10, p. 72, Exercise 3.4]. We can take advantage of this as follows:

Lemma 1. Let $M$ be an NFA with $n$ states and $t$ transitions. Then there exists an NFA $M'$ with $n^2 + 1$ states and $\le 2t^2$ transitions such that $L(M') = L'$.

Corollary 1. Given an NFA $M$ with $n$ states and $t$ transitions, we can determine if $M$ accepts a palindrome in $O(n^2 + t^2)$ time.

Corollary 2. Given an NFA $M$, we can determine if $L(M)$ contains infinitely many palindromes in quadratic time.

Corollary 3. If an NFA $M$ accepts at least one palindrome, it accepts a palindrome of length $\le 2n^2 - 1$.

Rosaz [21] also gave a proof of this last corollary. The quadratic bound is tight, up to a multiplicative constant, in the case of alphabets with at least two letters, and even for DFAs:

Proposition 1. For infinitely many $n$ there exists a DFA $M_n$ with $n$ states over a 2-letter alphabet such that the shortest palindrome accepted by $M_n$ is of length $\ge n^2/2 - 3n + 5$.

4 Testing If an NFA Accepts at Least One Nonpalindrome

In this section we consider the problem of deciding if an NFA accepts at least one nonpalindrome. Equivalently, we consider the problem: Given an NFA $M$, is $L(M)$ palindromic?

Again, the problem is trivial for a unary alphabet, so we assume $|\Sigma| \ge 2$.
Horváth, Karhumäki, and Kleijn [11] proved that the question is recursively solvable. In particular, they proved the following theorem:

Theorem 3. $L(M)$ is palindromic if and only if $\{x \in L(M) : |x| < 3n\}$ is palindromic, where $n$ is the number of states of $M$.

For an NFA over an alphabet of at least 2 symbols, the $3n$ bound is easily seen to be optimal; for a DFA, however, the bound of $3n$ can be improved to $3n - 3$, and this is optimal.

While a naive implementation of Theorem 3 would take exponential time, in this section we show how to test palindromicity in polynomial time.

The main idea is to construct a "small" NFA $M'_t$, for some integer $t > 1$, where no word in $L(M'_t)$ is a palindrome, and $M'_t$ accepts all nonpalindromes of length $< t$ (in addition to some other nonpalindromes). We omit the details of the construction (a similar construction appears in [25]).

Given an NFA $M$ with $n$ states, we now construct the cross-product with $M'_{3n}$, and obtain an NFA $A$ that accepts $L(M) \cap L(M'_{3n})$. By Theorem 3, $L(A) = \emptyset$ if and only if $L(M)$ is palindromic. We can determine if $L(A) = \emptyset$ in linear time. If $M$ has $n$ states and $t$ transitions, then $A$ has $O(n^2)$ states and $O(tn)$ transitions. Hence we have proved the following theorem.

Theorem 4. Let $M$ be an NFA with $n$ states and $t$ transitions. The algorithm sketched above determines whether $M$ accepts a palindromic language in $O(n^2 + tn)$ time.

In analogy with Corollary 2 and using a different construction than that of Theorem 4, we also have the following proposition.

Proposition 2. Given an NFA $M$ with $n$ states and $t$ transitions, we can determine in $O(n^2 + t^2)$ time if $M$ accepts infinitely many nonpalindromes.

5 Testing If an NFA Accepts a Word Matching a Pattern

In this section we consider the computational complexity of the decision problem:

NFA PATTERN ACCEPTANCE
INSTANCE: An NFA $M$ over the alphabet $\Sigma$ and a pattern $p$ over some alphabet $\Delta$.
QUESTION: Does there exist $x \in \Sigma^+$ such that $x \in L(M)$ and $x$ matches $p$?

Since the pattern $p$ is given as part of the input, this problem is actually somewhat more general than the sort of problem formulated as Question 1 of the introduction, where the language $L$ was fixed.

The following result was proved by Restivo and Salemi [20] (a more detailed proof appears in [...]).

Theorem 5 (Restivo and Salemi). Let $L$ be a regular language and let $\Delta$ be an alphabet. The set $P_\Delta$ of all nonempty patterns $p \in \Delta^*$ such that $p$ matches a word in $L$ is effectively regular.

Observe that Theorem 5 implies the decidability of the NFA PATTERN ACCEPTANCE problem. It is possible to give a boolean matrix based proof of Theorem 5 (see Zhang [27] for a study of this boolean matrix approach to automata theory) that provides an explicit description of an NFA accepting $P_\Delta$, but due to space constraints we omit this proof. However, the reader may perhaps deduce the argument from the proof of the following algorithmic result, which uses similar ideas.

Theorem 6. The NFA PATTERN ACCEPTANCE problem is PSPACE-complete.

Proof (sketch). We first show that the problem is in PSPACE. By Savitch's theorem [23] it suffices to give an NPSPACE algorithm. Let $M = (Q, \Sigma, \delta, q_0, F)$, where $Q = \{0, 1, \ldots, n - 1\}$. For $a \in \Sigma$, let $B_a$ be the $n \times n$ boolean matrix whose $(i, j)$ entry is 1 if $j \in \delta(i, a)$ and 0 otherwise. Let $\mathcal{B}$ denote the semigroup generated by the $B_a$'s. For $w = w_0 w_1 \cdots w_s \in \Sigma^*$, we write $B_w$ to denote the matrix product $B_{w_0} B_{w_1} \cdots B_{w_s}$.

Let $\Delta$ be the set of letters occurring in $p$. We may suppose that $\Delta = \{1, 2, \ldots, k\}$. First, nondeterministically guess $k$ boolean matrices $B_1, \ldots, B_k$. Next, for each $i$, verify that $B_i$ is in the semigroup $\mathcal{B}$ by nondeterministically guessing a word $w$ of length at most $2^{n^2}$ such that $B_i = B_w$. We guess $w$ symbol-by-symbol and reuse space after performing each matrix multiplication while computing $B_w$.
Then, if $p = p_0 p_1 \cdots p_r$, compute the matrix product $B = B_{p_0} B_{p_1} \cdots B_{p_r}$ and accept if and only if $B$ describes an accepting computation of $M$.

To show hardness we reduce from the following PSPACE-complete problem [7, Problem AL6]. We leave the details to the reader.

DFA INTERSECTION
INSTANCE: An integer $k \ge 1$ and $k$ DFAs $A_1, A_2, \ldots, A_k$, each over the alphabet $\Sigma$.
QUESTION: Does there exist $x \in \Sigma^*$ such that $x$ is accepted by each $A_i$, $1 \le i \le k$? □

We may define various variations or special cases of the NFA PATTERN ACCEPTANCE problem, such as: NFA ACCEPTS A $k$-POWER, NFA ACCEPTS INFINITELY MANY $k$-POWERS, where each of these problems is defined in the obvious way. When $k$ is part of the input (i.e., $k$ is not fixed), these problems can be shown to be PSPACE-complete by a variation on the proof of Theorem 6. However, if $k$ is fixed, both of these problems can be solved in polynomial time, as we now demonstrate.

Proposition 3. Let $M$ be an NFA with $n$ states and $t$ transitions, and set $N = n + t$, the size of $M$. For any fixed integer $k \ge 2$, there is an algorithm running in $O(n^{2k-1} t^k) = O(N^{2k-1})$ time to determine if $M$ accepts a $k$-power.

Proof (sketch). For a language $L \subseteq \Sigma^*$, we define $L^{1/k} = \{x \in \Sigma^* : x^k \in L\}$. It is well-known that if $L$ is regular, then so is $L^{1/k}$. We leave it to the reader to verify that an NFA-$\varepsilon$ $M'$ accepting $L^{1/k}$ can be constructed with $n^{2k-1} + 1$ states and at most $t^k$ distinct transitions. Testing whether or not $M'$ accepts a nonempty word can be done in linear time, so the running time of our algorithm is $O(n^{2k-1} t^k)$. □

Corollary 4. We can decide if an NFA $M$ with $n$ states and $t$ transitions accepts infinitely many $k$-powers in $O(n^{2k-1} t^k)$ time.

We may also consider the problems NFA ACCEPTS A $\ge k$-POWER and NFA ACCEPTS INFINITELY MANY $\ge k$-POWERS, again defined in the obvious way. Here, even for fixed $k$, these problems are both PSPACE-complete. Setting $k = 2$ corresponds to the problems NFA ACCEPTS A POWER and NFA ACCEPTS INFINITELY MANY POWERS, so we see that both these problems are PSPACE-complete as well.

To show PSPACE-hardness for the "infinitely many" problems, we reduce from the DFA INTERSECTION INFINITENESS problem, which is defined similarly to the DFA INTERSECTION problem, except that we now ask if there are infinitely many words $x$ such that $x$ is accepted by each $A_i$. This problem is easily seen to be PSPACE-complete as well.

6 Testing If an NFA Accepts a Non-$k$-Power

In the previous section we showed that it is computationally hard to test if an
NFA accepts a $k$-power (when $k$ is not fixed). In this section we show how to efficiently test if an NFA accepts a non-$k$-power. Again, we find it more congenial to discuss the opposite problem, which is whether an NFA accepts nothing but $k$-powers.

First, we need some classical results from combinatorics on words.

Theorem 7 (Lyndon and Schützenberger [15]). If $x$, $y$, and $z$ are words satisfying an equation $x^i y^j = z^k$, where $i, j, k \ge 2$, then they are all powers of a common word.

Theorem 8 (Lyndon and Schützenberger [15]). Let $u$ and $v$ be nonempty words. If $uv = vu$, then there exists a word $x$ and integers $i, j \ge 1$, such that $u = x^i$ and $v = x^j$. In other words, $u$ and $v$ are powers of a common word.

We include here the following combinatorial result, which, when applied to words in a regular language, gives a sort of "pumping lemma" for powers in a regular language.

Proposition 4. Let $u$, $v$, and $w$ be words, $v \neq \varepsilon$, and let $f, g \ge 1$ be integers, $f \neq g$. If $uv^f w$ and $uv^g w$ are nonprimitive, then $uv^n w$ is nonprimitive for all integers $n \ge 1$. Further, if $uvw$ and $uv^2w$ are $k$-powers for some integer $k \ge 2$, then $v$ and $uv^n w$ are $k$-powers for all integers $n \ge 1$.

The following result is an analogue of Theorem 3, from which we will derive an efficient algorithm for testing if a finite automaton accepts only $k$-powers.

Theorem 9. Let $L$ be accepted by an $n$-state NFA $M$ and let $k \ge 2$ be an integer.
1. Every word in $L$ is a $k$-power if and only if every word in the set $\{x \in L : |x| \le 3n\}$ is a $k$-power.
2. All but finitely many words in $L$ are $k$-powers if and only if every word in the set $\{x \in L : n \le |x| \le 3n\}$ is a $k$-power.
Further, if $M$ is a DFA over an alphabet of size $\ge 2$, then the bound $3n$ may be replaced by $3n - 3$.

Ito et al. [12] proved a similar result for primitive words: namely, that if $L$ is accepted by an $n$-state DFA over an alphabet of two or more letters and contains a primitive word, then it contains a primitive word of length $\le 3n - 3$.
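To make the length bound of Theorem 9(1) concrete, the sketch below applies it naively: it enumerates every accepted word of length at most $3n$ and tests each one. This brute-force reading takes exponential time in general and is not the efficient algorithm developed in this section. The dictionary encoding of the transition function and the function names are assumptions made here for illustration.

```python
def is_k_power(w: str, k: int) -> bool:
    """True if w = x^k for some nonempty word x."""
    if not w or len(w) % k:
        return False
    x = w[: len(w) // k]
    return x * k == w

def is_primitive(w: str) -> bool:
    """A nonempty word is primitive iff its only occurrences inside ww
    are the trivial ones at positions 0 and len(w)."""
    return bool(w) and (w + w).find(w, 1) == len(w)

def accepted_words(delta, start, finals, max_len):
    """Yield, in order of length, every word of length <= max_len accepted
    by the NFA; delta maps (state, letter) to a set of states."""
    alphabet = sorted({a for (_, a) in delta})
    level = {"": frozenset([start])}
    for _ in range(max_len + 1):
        nxt = {}
        for w, states in level.items():
            if states & finals:
                yield w
            for a in alphabet:
                tgt = frozenset(r for q in states for r in delta.get((q, a), ()))
                if tgt:
                    nxt[w + a] = tgt
        level = nxt

def only_k_powers(n, delta, start, finals, k) -> bool:
    """Naive use of Theorem 9(1): inspect accepted words of length <= 3n."""
    return all(is_k_power(w, k) for w in accepted_words(delta, start, finals, 3 * n))
```

For example, a 5-state automaton accepting $(abab)^+$ passes the test for $k = 2$ but fails it for $k = 3$, since abab is a square and not a cube.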
In other words, every word in $L$ is a power if and only if every word in the set $\{x \in L : |x| \le 3n - 3\}$ is a power.

The proof of Theorem 9 is similar to that of [12, Proposition 7], albeit with some additional complications. We shall give a complete proof in the full version of this paper.

The characterization due to Ito et al. [12, Proposition 10] (see also Dömösi, Horváth, and Ito [6, Theorem 3]) of the regular languages consisting only of powers, along with Theorem 2, implies that any such language is slender. A simple application of the Myhill-Nerode Theorem gives the following weaker result.

Proposition 5. Let $L$ be a regular language and let $k \ge 2$ be an integer. If all but finitely many words of $L$ are $k$-powers, then $L$ is slender. In particular, if $L$ is accepted by an $n$-state DFA and all words in $L$ of length $\ge \ell$ are $k$-powers, then for all $r \ge \ell$, the number of words in $L$ of length $r$ is at most $n$.

The following characterization is analogous to the characterization of palindromic regular languages given in [11, Theorem 8], and follows from Proposition 5, Theorem 2, and the (omitted) proof of Proposition 4.

Theorem 10. Let $L \subseteq \Sigma^*$ be a regular language and let $k \ge 2$ be an integer. The language $L$ consists only of $k$-powers if and only if it can be written as a finite union of languages of the form $uv^*w$, where $u, v, w \in \Sigma^*$ satisfy the following: there exists a primitive word $x \in \Sigma^*$ and integers $i, j \ge 0$ such that $v = x^{ik}$ and $wu = x^{jk}$.

Next we apply Theorem 9 to deduce the following algorithmic result.

Theorem 11. Let $k \ge 2$ be an integer. Given an NFA $M$ with $n$ states and $t$ transitions, it is possible to determine if every word in $L(M)$ is a $k$-power in $O(n^3 + tn^2)$ time.

Proof (sketch). We create an NFA, $M'_r$, for $r = 3n$, such that no word in $L(M'_r)$ is a $k$-power, and $M'_r$ accepts all non-$k$-powers of length $\le r$ (and perhaps some other non-$k$-powers).

Note that we may assume that $k \le r$. If $k > r$, then no word of length $\le r$ is a $k$-power. In this case, to obtain the desired answer it suffices to test if the set $\{x \in L(M) : |x| \le r\}$ is empty. However, this set is empty if and only if $L(M)$ is empty, and this is easily verified in linear time.

We now form a new NFA $A$ as the cross product of $M'_r$ with $M$. From Theorem 9, it follows that $L(A) = \emptyset$ iff every word in $L(M)$ is a $k$-power. Again, we can determine if $L(A) = \emptyset$ in linear time.

We omit the details of the construction of $M'_r$, noting only that $M'_r$ can be constructed to have at most $O(r^2)$ states and $O(r^2)$ transitions. After constructing the cross-product, this gives an $O(n^3 + tn^2)$ bound on the time required to determine if every word in $L(M)$ is a $k$-power. □

Theorem 9 suggests the following question: if $M$ is an NFA with $n$ states that accepts at least one non-$k$-power, how long can the shortest non-$k$-power be?
Theorem 9 proves an upper bound of $3n$. A lower bound of $2n - 1$ for infinitely many $n$ follows easily from the obvious $(n + 1)$-state NFA accepting $a^n (a^{n+1})^*$, where $n$ is divisible by $k$. However, Ito et al. [12] gave a very interesting example that improves this lower bound: if $x = ((ab)^n a)^2$ and $y = baxab$, then $x$ and $xyx$ are squares, but $xyxyx$ is not a power. Hence, the obvious $(8n + 8)$-state NFA that accepts $x(yx)^*$ has the property that the shortest non-$k$-power accepted is of length $20n + 18$. We generalize this lower bound by defining $x$ and $y$ as follows: let $u = (ab)^n a$, $x = u^k$, and $y = x^{-1}(x\,ba\,u^{-1}x)^k x^{-1}$. We leave it to the reader to deduce the following result.

Proposition 6. Let $k \ge 2$ be fixed. There exist infinitely many NFAs $M$ with the property that if $M$ has $r$ states, then the shortest non-$k$-power accepted is of length $\left(2 + \frac{1}{2k-2}\right) r - O(1)$.

We may also apply part (2) of Theorem 9 to obtain an algorithm to check if an NFA accepts infinitely many non-$k$-powers.

Theorem 12. Let $k \ge 2$ be an integer. Given an NFA $M$ with $n$ states and $t$ transitions, it is possible to determine if all but finitely many words in $L(M)$ are $k$-powers in $O(n^3 + tn^2)$ time.

7 Automata Accepting Only Powers

We now move from the problem of testing if an automaton accepts only $k$-powers to that of testing if it accepts only powers (of any kind). Just as Theorem 9 was the starting point for our algorithmic results in Section 6, the following theorem of Ito et al. [12] (stated here in a slightly stronger form than in the original) is the starting point for our algorithmic results in this section.

Theorem 13. Let $L$ be accepted by an $n$-state NFA $M$.
1. Every word in $L$ is a power if and only if every word in the set $\{x \in L : |x| \le 3n\}$ is a power.
2. All but finitely many words in $L$ are powers if and only if every word in the set $\{x \in L : n \le |x| \le 3n\}$ is a power.
Further, if $M$ is a DFA over an alphabet of size $\ge 2$, then the bound $3n$ may be replaced by $3n - 3$.

We next prove an analogue of Proposition 5. We need the following result, first
proved by Birget [3], and later, independently, in a weaker form, by Glaister and Shallit [8].

Theorem 14. Let $L \subseteq \Sigma^*$ be a regular language. Suppose there exists a set of pairs $S = \{(x_i, y_i) \in \Sigma^* \times \Sigma^* : 1 \le i \le n\}$ such that: (a) $x_i y_i \in L$ for $1 \le i \le n$; and (b) either $x_i y_j \notin L$ or $x_j y_i \notin L$ for $1 \le i, j \le n$, $i \neq j$. Then any NFA accepting $L$ has at least $n$ states.

Proposition 7. Let $M$ be an $n$-state NFA and let $\ell$ be a nonnegative integer such that every word in $L(M)$ of length $\ge \ell$ is a power. For all $r \ge \ell$, the number of words in $L(M)$ of length $r$ is at most $7n$.

Proof. We give the proof in full, as it illustrates an unusual and unexpected combination of techniques from both the theory of nondeterministic state complexity and the theory of combinatorics on words.

Let $r \ge \ell$ be an arbitrary integer. The proof consists of three steps.

Step 1. We consider the set $A$ of words $w$ in $L(M)$ such that $|w| = r$ and $w$ is a $k$-power for some $k \ge 4$. For each such $w$, write $w = x^i$, where $x$ is a primitive word, and define a pair $(x^2, x^{i-2})$. Let $S_A$ denote the set of such pairs. Consider two pairs in $S_A$: $(x^2, x^{i-2})$ and $(y^2, y^{j-2})$. The word $x^2 y^{j-2}$ is primitive by Theorem 7 and hence is not in $L(M)$. The set $S_A$ thus satisfies the conditions of Theorem 14. Since $L(M)$ is accepted by an $n$-state NFA, we must have $|S_A| \le n$ and thus $|A| \le n$.

Step 2. Next we consider the set $B$ of cubes of length $r$ in $L(M)$. For each such cube $w = x^3$, we define a pair $(x, x^2)$. Let $S_B$ denote the set of such pairs. Consider two pairs in $S_B$: $(x, x^2)$ and $(y, y^2)$. Suppose that $xy^2$ and $yx^2$ are both in $L(M)$. The word $xy^2$ is certainly not a cube; we claim that it cannot be a square. Suppose it were. Then $|x|$ and $|y|$ are even, so we can write $x = x_1 x_2$ and $y = y_1 y_2$ where $|x_1| = |x_2| = |y_1| = |y_2|$. Now if $xy^2 = x_1 x_2 y_1 y_2 y_1 y_2$ is a square, then $x_1 x_2 y_1 = y_2 y_1 y_2$, and so $y_1 = y_2$. Thus $y$ is a square; write $y = z^2$. By Theorem 7, $yx^2 = z^2 x^2$ is primitive, contradicting our assumption that $yx^2 \in L(M)$. It must be the case then that $xy^2$ is a $k$-power for some $k \ge 4$. Thus, $xy^2 = u^k$ for some primitive $u$ uniquely determined by $x$ and $y$. With each pair of cubes $x^3$ and $y^3$ such that both $xy^2$ and $yx^2$ are in $L(M)$ we may therefore associate a $k$-power $u^k \in L(M)$, where $k \ge 4$. We have already established in Step 1 that the number of such $k$-powers is at most $n$. It follows that by deleting at most $n$ pairs from the set $S_B$ we obtain a set of pairs satisfying the conditions of Theorem 14. We must therefore have $|S_B| \le 2n$ and thus $|B| \le 2n$.

Step 3. Finally we consider the set $C$ of squares of length $r$ in $L(M)$. For each such square $w = x^2$, we define a pair $(x, x)$. Let $S_C$ denote the set of such pairs. Consider two pairs in $S_C$: $(x, x)$ and $(y, y)$. Suppose that $xy$ and $yx$ are both in $L(M)$. The word $xy$ is not a square and must therefore be a $k$-power for some $k \ge 3$. We write $xy = u^k$ for some primitive $u$ uniquely determined by $x$ and $y$.
In Steps 1 and 2 we established that the number of $k$-powers of length $r$, $k \ge 3$, is $|A| + |B| \le 3n$. It follows that by deleting at most $3n$ pairs from the set $S_C$ we obtain a set of pairs satisfying the conditions of Theorem 14. We must therefore have $|S_C| \le 4n$ and thus $|C| \le 4n$.

Putting everything together, we see that there are $|A| + |B| + |C| \le 7n$ words of length $r$ in $L(M)$, as required. □

The bound of $7n$ in Proposition 7 is almost certainly not optimal. We now prove the following algorithmic result.

Theorem 15. Given an NFA $M$ with $n$ states, it is possible to determine if every word in $L(M)$ is a power in $O(n^5)$ time.

Proof (sketch). Checking if a word is a power can be done in linear time using the Knuth-Morris-Pratt algorithm [13]. By Theorem 13 and Proposition 7 it suffices to enumerate the words in $L(M)$ of lengths $1, 2, \ldots, 3n$, stopping if the number of such words in any length exceeds $7n$. If all these words are powers, then every word is a power. Otherwise, if we find a non-power, or if the number of words in any length exceeds $7n$, then not every word is a power. By the work of Mäkinen [16] or Ackerman & Shallit [1], we can enumerate these words in $O(n^5)$ time. □

Using part (2) of Theorem 13 along with Proposition 7, one obtains the following in a similar manner.

Theorem 16. Given an NFA $M$ with $n$ states, we can decide if all but finitely many words in $L(M)$ are powers in $O(n^5)$ time.

8 Final Remarks

In this paper we examined the complexity of checking various properties of regular languages, such as consisting only of palindromes, containing at least one palindrome, consisting only of powers, or containing at least one power. In each case, we were able to provide an efficient algorithm or show that the problem is likely to be hard. Our results are summarized in the following table. We also report some upper and lower bounds on the length of a shortest palindrome, $k$-power, etc., accepted by an NFA; due to space constraints we must omit the proofs of these bounds. Here $M$ is an NFA with $n$ states and $t$ transitions.

L | decide if L(M) ∩ L = ∅ | decide if L(M) ∩ L infinite | upper bound on shortest element of L(M) ∩ L | worst-case lower bound known
palindromes | O(n^2 + t^2) | O(n^2 + t^2) | 2n^2 - 1 | n^2/2 - 3n + 5
non-palindromes | O(n^2 + tn) | O(n^2 + t^2) | 3n - 1 | 3n - 1
k-powers (k fixed) | O(n^(2k-1) t^k) | O(n^(2k-1) t^k) | k n^k | Ω(n^k)
k-powers (k part of input) | PSPACE-complete | PSPACE-complete | |
non-k-powers | O(n^3 + tn^2) | O(n^3 + tn^2) | 3n | (2 + 1/(2k-2)) n - O(1)
powers | PSPACE-complete | PSPACE-complete | (n + 1) n^(n+1) | e^Ω(√(n log n))
non-powers | O(n^5) | O(n^5) | 3n | (5/2) n - 2
References

1. Ackerman, M., Shallit, J.: Efficient enumeration of regular languages. In: Holub, J., Žďárek, J. (eds.) CIAA 2007. LNCS, vol. 4783, pp. 226–242. Springer, Heidelberg (2007)
2. Anderson, T., Rampersad, N., Santean, N., Shallit, J.: Finite automata, palindromes, patterns, and borders, http://www.arxiv.org/abs/0711.3183
3. Birget, J.-C.: Intersection and union of regular languages and state complexity. Inform. Process. Lett. 43, 185–190 (1992)
4. Castiglione, G., Restivo, A., Salemi, S.: Patterns in words and languages. Disc. Appl. Math. 144, 237–246 (2004)
5. Chrobak, M.: Finite automata and unary languages. Theoret. Comput. Sci. 47, 149–158 (1986); Errata 302, 497–498 (2003)
6. Dömösi, P., Horváth, G., Ito, M.: A small hierarchy of languages consisting of non-primitive words. Publ. Math. (Debrecen) 64, 261–267 (2004)
7. Garey, M., Johnson, D.: Computers and Intractability. Freeman, New York (1979)
8. Glaister, I., Shallit, J.: A lower bound technique for the size of nondeterministic finite automata. Inform. Process. Lett. 59, 75–77 (1996)
9. Heath-Brown, D.R.: Zero-free regions for Dirichlet L-functions, and the least prime in an arithmetic progression. Proc. Lond. Math. Soc. 64, 265–338 (1992)
10. Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading (1979)
11. Horváth, S., Karhumäki, J., Kleijn, J.: Results concerning palindromicity. J. Inf. Process. Cybern. EIK 23, 441–451 (1987)
12. Ito, M., Katsura, M., Shyr, H.J., Yu, S.S.: Automata accepting primitive words. Semigroup Forum 37, 45–52 (1988)
13. Knuth, D., Morris Jr., J., Pratt, V.: Fast pattern matching in strings. SIAM J. Computing 6, 323–350 (1977)
14. Kunze, M., Shyr, H.J., Thierrin, G.: h-bounded and semi-discrete languages. Information and Control 51, 147–187 (1981)
15. Lyndon, R.C., Schützenberger, M.P.: The equation $a^M = b^N c^P$ in a free group. Michigan Math. J. 9, 289–298 (1962)
16. Mäkinen, E.: On lexicographic enumeration of regular and context-free languages. Acta Cybernetica 13, 55–61 (1997)
17. Martinez, A.: Efficient computation of regular expressions from unary NFAs. In: DCFS 2002, pp. 174–187 (2002)
18. Pritchard, P.: Linear prime-number sieves: a family tree. Sci. Comput. Programming 9, 17–35 (1987)
19. Păun, G., Salomaa, A.: Thin and slender languages. Disc. Appl. Math. 61, 257–270 (1995)
20. Restivo, A., Salemi, S.: Words and patterns. In: Kuich, W., Rozenberg, G., Salomaa, A. (eds.) DLT 2001. LNCS, vol. 2295, pp. 215–218. Springer, Heidelberg (2002)
21. Rosaz, L.: Puzzle corner, #50. Bull. European Assoc. Theor. Comput. Sci. 76, 234 (February 2002); Solution 77, 261 (June 2002)
22. Rozenberg, G., Salomaa, A.: Handbook of Formal Languages. Springer, Berlin (1997)
23. Savitch, W.: Relationships between nondeterministic and deterministic tape complexities. J. Comput. System Sci. 4, 177–192 (1970)
24. Shallit, J.: Numeration systems, linear recurrences, and regular sets. Inform. Comput. 113, 331–347 (1994)
25. Shallit, J., Breitbart, Y.: Automaticity I: Properties of a measure of descriptional complexity. J. Comput. System Sci. 53, 10–25 (1996)
26. Yu, S.: Regular languages. In: Handbook of Formal Languages, Ch. 2, pp. 41–110 (1997)
27. Zhang, G.-Q.: Automata, Boolean matrices, and ultimate periodicity. Inform. Comput. 152, 138–154 (1999)