yacc: A Compiler Compiler

Programming Utilities Guide, November 1995, Chapter 3

yacc (yet another compiler compiler) provides a general tool for imposing structure on the input to a computer program. Before using yacc, you prepare a specification that includes:

• a set of rules to describe the elements of the input;
• code to be invoked when a rule is recognized;
• either a definition or declaration of a low-level scanner to examine the input.

yacc then turns the specification into a C-language function that examines the input stream. This function, called a parser, works by calling the low-level scanner. The scanner, called a lexical analyzer, picks up items from the input stream. The selected items are known as tokens. Tokens are compared to the input construct rules, called grammar rules. When one of the rules is recognized, the code you have supplied for the rule is invoked. This code is called an action. Actions are fragments of C-language code. They can return values and use values returned by other actions.

The heart of the yacc specification is the collection of grammar rules. Each rule describes a construct and gives it a name. For example, one grammar rule might be:

    date : month_name day ',' year ;

where date, month_name, day, and year represent constructs of interest; presumably, month_name, day, and year are defined in greater detail elsewhere. In the example, the comma is enclosed in single quotes. This means that the comma is to appear literally in the input. The colon and semicolon are punctuation in the rule and have no significance in evaluating the input. With proper definitions, the input:

    July 4, 1776

might be matched by the rule.

The lexical analyzer is an important part of the parsing function. This user-supplied routine reads the input stream, recognizes the lower-level constructs, and communicates these as tokens to the parser.
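The division of labor between parser and scanner can be sketched in C. This is a hand-written illustration only, not the table-driven code that yacc generates. The names scan and parse_date and the token codes are hypothetical, and day and year are simplified to a single NUMBER token each; the scanner here just hands out tokens from a prepared array instead of reading an input stream.

```c
/* Hypothetical token codes, chosen above the character range so that
   they cannot collide with literal characters such as ','. */
enum { END = 0, MONTH_NAME = 257, NUMBER = 258 };

/* Stand-in scanner: hands out tokens from a prepared array.
   A real lexical analyzer would read the input stream instead. */
static const int *next_token;

static int scan(void)
{
    int tok = *next_token;
    if (tok != END)
        next_token++;          /* stay on END once it is reached */
    return tok;
}

/* Checks the rule   date : month_name day ',' year ;
   with day and year each simplified to one NUMBER token.
   Returns 1 if the whole token sequence matches, 0 otherwise. */
int parse_date(const int *tokens)
{
    next_token = tokens;
    return scan() == MONTH_NAME
        && scan() == NUMBER
        && scan() == ','       /* the literal comma is its own token */
        && scan() == NUMBER
        && scan() == END;
}
```

Calling parse_date on the sequence MONTH_NAME, NUMBER, ',', NUMBER, END accepts it; dropping the comma makes the third comparison fail, so the sequence is rejected. The point of the sketch is only that the parser never looks at characters: it sees the input purely as the tokens the scanner returns.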
The lexical analyzer recognizes constructs of the input stream as terminal symbols; the parser recognizes constructs as nonterminal symbols. To avoid confusion, refer to terminal symbols as tokens.

There is considerable leeway in deciding whether to recognize constructs using the lexical analyzer or grammar rules. For example, the rules:

    month_name : 'J' 'a' 'n' ;
    month_name : 'F' 'e' 'b' ;
    ...
    month_name : 'D' 'e' 'c' ;

might be used in the above example. With these rules the lexical analyzer only needs to recognize individual letters, but such low-level rules tend to waste time and space and may complicate the specification beyond the ability of yacc to deal with it. Usually, the lexical analyzer recognizes the month names and returns an indication that a month_name is seen. In this case, month_name is a token and the detailed rules are not needed.

Literal characters such as a comma must also be passed through the lexical analyzer and are also considered tokens.
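A minimal sketch of a lexical analyzer that works at this higher level might look like the following. The function name next_token, its parameters, and the token codes are assumptions made for illustration; a lexical analyzer written for yacc is conventionally named yylex and passes token values through yylval, neither of which is modeled here. Full English month names are assumed as input.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes, chosen above the character range so that
   literal characters can be returned as their own codes. */
enum { END = 0, MONTH_NAME = 257, NUMBER = 258, UNKNOWN = 259 };

static const char *const months[12] = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"
};

/* Reads one token from *input and advances *input past it.
   For MONTH_NAME, *value is the month number (1-12);
   for NUMBER, *value is the decimal value. */
int next_token(const char **input, int *value)
{
    const char *p = *input;

    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0') {
        *input = p;
        return END;
    }
    if (isdigit((unsigned char)*p)) {
        char *end;
        *value = (int)strtol(p, &end, 10);
        *input = end;
        return NUMBER;
    }
    if (isalpha((unsigned char)*p)) {
        size_t len = 0;
        while (isalpha((unsigned char)p[len]))
            len++;
        *input = p + len;
        for (int m = 0; m < 12; m++) {
            if (strlen(months[m]) == len && strncmp(p, months[m], len) == 0) {
                *value = m + 1;
                return MONTH_NAME;   /* the whole word is one token */
            }
        }
        return UNKNOWN;              /* a word that is not a month name */
    }
    /* Any other character, such as ',', is passed through as a token
       whose code is the character itself. */
    *input = p + 1;
    return (unsigned char)*p;
}
```

Applied repeatedly to the string "July 4, 1776", this routine yields MONTH_NAME (value 7), NUMBER (value 4), the literal ',', NUMBER (value 1776), and then END. The grammar then needs no letter-by-letter rules for month names, because the word is already reduced to a single token before the parser sees it.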
This note was uploaded on 12/25/2010 for the course ALL 0204 taught by Professor 79979 during the Spring '10 term at National Chiao Tung University.