Module 5
Context-free grammars and languages
Making a computer that can’t think, but has infinite memory
CS 360: Introduction to the Theory of Computing
Daniel G. Brown, University of Waterloo
5.1
Topics for this module
• Context-free grammars and languages
• Parsing words in CFGs
• Ambiguity in CFGs
CFGs and CFLs are the second major topic of this course. Their main application is parsing, but they also pop up in bioinformatics quite a bit.
They have even had applications in (essentially) political theory.
• Yes, really.
5.2
1 Context-free grammars and languages
Context-free grammars: preliminaries
Context-free grammars:
• More complicated structure than regular expressions
• Can encode more interesting languages
• Still constrained in how one part of the input can relate to another part (essentially, it’s “context-free”: here, the name of the language class means something important)
CFGs have a corresponding class of automata, called pushdown automata, which we’ll talk about in Module 6.
5.3
Context-free grammars: the idea
Basic idea:
• Construct words in the language by applying rules to symbols.
• Finish when all of the symbols in the word are fleshed out.
• Hard to explain without an example.
Example:
• S → subject verb object
• S → subject verb
• subject → “billy” or “john” or “bob”
• verb → “wants” or “broke”
• object → “the toy” or “a dog”
Here, sentences have forms like “billy wants a dog” or “john broke”.
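Because this toy grammar has no recursion, every sentence it generates can be listed by brute force. The following is only a minimal sketch: the dictionary name RULES and the helper expand are illustrative choices rather than notation from the course, and any symbol that has no rule is assumed to be a finished phrase.

```python
from itertools import product

# Productions of the toy grammar, transcribed from the example.
# Each variable maps to a list of alternatives; each alternative is a
# list of symbols. Symbols without an entry are treated as finished phrases.
RULES = {
    "S": [["subject", "verb", "object"], ["subject", "verb"]],
    "subject": [["billy"], ["john"], ["bob"]],
    "verb": [["wants"], ["broke"]],
    "object": [["the toy"], ["a dog"]],
}

def expand(symbol):
    """Return every phrase that `symbol` can be fleshed out into."""
    if symbol not in RULES:
        return [symbol]
    phrases = []
    for alternative in RULES[symbol]:
        # Expand each symbol of the alternative, then take all combinations.
        parts = [expand(s) for s in alternative]
        phrases.extend(" ".join(choice) for choice in product(*parts))
    return phrases

sentences = expand("S")
print(len(sentences))                     # 3*2*2 + 3*2 = 18 sentences
print("billy wants a dog" in sentences)   # True
print("john broke" in sentences)          # True
```

This exhaustive approach only terminates because no variable can reach itself; a grammar with recursive rules needs a bound on word length instead.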
5.4
A more typical example
Another example (with the form we’ll use):
• S → AB
• A → 0A | ε
• B → B1 | ε
Here, S generates AB, and then we produce a word with A, and follow with a word generated by B.
• The word from A is from 0*, and the word generated from B is from 1*.
• Words in this CFG’s language: ε, 0000001111, ...
• In fact, its language is 0*1*.
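The claim that this grammar generates 0*1* can be sanity-checked on short words. The sketch below is an illustration, not a proof: it explores the grammar breadth-first, always rewriting the leftmost variable, and stops extending any string that already holds more than max_len terminals. The names RULES and words_up_to are hypothetical, and the empty Python string stands for ε.

```python
import re
from collections import deque

# S -> AB, A -> 0A | ε, B -> B1 | ε   ("" stands for ε)
RULES = {"S": ["AB"], "A": ["0A", ""], "B": ["B1", ""]}

def words_up_to(max_len):
    """Collect every terminal word of length <= max_len derivable from S."""
    seen, words = {"S"}, set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None if only terminals remain.
        pos = next((i for i, c in enumerate(form) if c in RULES), None)
        if pos is None:
            words.add(form)
            continue
        for rhs in RULES[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Terminals never disappear, so over-long strings can be dropped.
            terminals = sum(c not in RULES for c in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

words = words_up_to(4)
# Every generated word should match the regular expression 0*1*.
all_match = all(re.fullmatch(r"0*1*", w) for w in words)
print(sorted(words, key=lambda w: (len(w), w)))
print(all_match)
```

For max_len = 4 this yields the 15 words of the form 0^i 1^j with i + j ≤ 4, and nothing else.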
5.5
Formal definition of a CFG
Let’s give a fully formal definition. A context-free grammar (CFG) is a 4-tuple G = (V, T, P, S):
• V = finite set of variables. Usually capital letters.
• T = finite alphabet, called terminals. Disjoint from V. T is the alphabet for the CFG’s language.
• P = finite set of productions: definitions for the variables.
  – Productions are of the form A → α. Here, A is a variable, and α is a string of symbols from V and T (possibly empty, as in A → ε).
  – Notation: If we have A → α and A → β, we will write this as A → α | β.
• S = start variable: the symbol from V where the derivation of a word begins.
5.6
Formalizing the previous example
Our example from before, which generated 0*1*:
G = (V, T, P, S), where
• V = {A, B, S}
• T = {0, 1}
• P = {S → AB, A → 0A | ε, B → B1 | ε}
• S = S
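One possible way to hold such a 4-tuple in a program, with the conditions of the formal definition checked on construction, is sketched here. The class name CFG, its field names, and the choice to store each production as a (head, body) pair are assumptions made for illustration; the empty tuple stands for ε.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    variables: frozenset     # V
    terminals: frozenset     # T
    productions: frozenset   # P, as (head, body) pairs; body is a tuple of symbols
    start: str               # S

    def __post_init__(self):
        if self.variables & self.terminals:
            raise ValueError("V and T must be disjoint")
        if self.start not in self.variables:
            raise ValueError("the start variable must belong to V")
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            if head not in self.variables:
                raise ValueError(f"head {head!r} is not a variable")
            if not all(sym in symbols for sym in body):
                raise ValueError(f"body of {head!r} uses an unknown symbol")

# The grammar for 0*1*; the empty tuple () encodes ε.
G = CFG(
    variables=frozenset({"S", "A", "B"}),
    terminals=frozenset({"0", "1"}),
    productions=frozenset({
        ("S", ("A", "B")),
        ("A", ("0", "A")), ("A", ()),   # A -> 0A | ε
        ("B", ("B", "1")), ("B", ()),   # B -> B1 | ε
    }),
    start="S",
)
print(len(G.productions))  # the shorthand A -> 0A | ε counts as two productions
```

Writing out P this way makes it visible that the "|" notation is only shorthand: the set holds five separate productions.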
Next, we need to define what it means for a word to be in the CFG’s language.
Derivations: produce a string in T* by applying rules in P to strings we get from S.
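The notes break off before derivations are defined formally, so the following sketch relies on the usual reading as an assumption: one step replaces a single variable occurrence with the right-hand side of one of its productions. The helper name apply and the (position, right-hand side) encoding of each step are hypothetical; the empty string stands for ε.

```python
# S -> AB, A -> 0A | ε, B -> B1 | ε   ("" stands for ε)
RULES = {"S": ["AB"], "A": ["0A", ""], "B": ["B1", ""]}

def apply(form, position, rhs):
    """Rewrite the variable at `position` in `form` using variable -> rhs."""
    variable = form[position]
    if rhs not in RULES.get(variable, []):
        raise ValueError(f"{variable!r} -> {rhs!r} is not a production")
    return form[:position] + rhs + form[position + 1:]

# One derivation of the word 001, rewriting the leftmost variable each time.
steps = ["S"]
for position, rhs in [(0, "AB"), (0, "0A"), (1, "0A"), (2, ""), (2, "B1"), (2, "")]:
    steps.append(apply(steps[-1], position, rhs))

print(" => ".join(s or "ε" for s in steps))
# S => AB => 0AB => 00AB => 00B => 00B1 => 001
```

The last string contains only symbols from T, so under this reading 001 is a word produced from S.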
This note was uploaded on 01/14/2012 for the course CS 360 taught by Professor Johnwatrous during the Winter '08 term at Waterloo.