[go: up one dir, main page]

0% found this document useful (0 votes)
41 views7 pages

Yacc: Yet Another Compiler-Compiler

Yacc is a tool that generates a parser that structurally analyzes input based on a specification of grammar rules provided by the user. The user specifies the input structures, actions to take when rules are recognized, and a low-level input routine. Yacc generates a parser that uses the grammar rules to organize tokens from the input stream, invoking user-defined actions when rules are matched. Yacc specifications include declarations, grammar rules, and programs, separated by delimiters. Grammar rules define the structure of nonterminal symbols, while the lexical analyzer handles tokens.

Uploaded by

Arun Vadivel
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
41 views7 pages

Yacc: Yet Another Compiler-Compiler

Yacc is a tool that generates a parser that structurally analyzes input based on a specification of grammar rules provided by the user. The user specifies the input structures, actions to take when rules are recognized, and a low-level input routine. Yacc generates a parser that uses the grammar rules to organize tokens from the input stream, invoking user-defined actions when rules are matched. Yacc specifications include declarations, grammar rules, and programs, separated by delimiters. Grammar rules define the structure of nonterminal symbols, while the lexical analyzer handles tokens.

Uploaded by

Arun Vadivel
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 7

Yacc: Yet Another Compiler-Compiler

Introduction
 Yacc provides a general tool for describing the input to a
computer program.

 The Yacc user specifies the structures of input, together


with code to be invoked as each such structure is
recognized.

 Yacc turns a specification into a subroutine that handles


the input process.

 The input subroutine produced by Yacc calls a user-


supplied routine to return the next basic input item. Thus,
the user can specify input in terms of individual input
characters, or names or numbers.

 The user-supplied routine may also handle features such


as comment and conventions.

 Yacc is written in portable C.

 Yacc act as compilers for C, APL, Pascal, RATFOR, etc.,

 Yacc is also used for

1. Phototypesetter language,

2. Desk calculator languages,


3. Document retrieval system and

4. Fortran debugging system.

WORKING OF YACC
 Yacc provides a general tool for imposing structure on the
input to a computer program.

 The Yacc user prepares a specification of the input


process; this includes rules describing the input structure,
code to be invoked when these rules are recognized, and a
low-level routine to do the basic input.

 Yacc then generates a function to control the input


process. This function, called a parser, calls the user-
supplied low-level input routine (the lexical analyzer) to
pick up the basic items (called tokens) from the input
stream.

 These tokens are organized according to the input


structure rules, called grammar rules.

 When one of these rules has been recognized, then user


code supplied for this rule, an action, is invoked;

 Actions have the ability to return values and make use of


the values of other actions.
 The actions and output subroutines are in C.
 The input specification is a collection of grammar rules.
Each rule describes an allowable structure and gives it a
name. For example, one grammar rule might be
date : month_name day ',' year ;

Here, date, month_name, day, and year are input to the


program.The colon and semicolon merely serve as
punctuation in the rule, and have no significance in
controlling the input. Thus, with proper definitions, the
input

July 4, 1776
might be matched by the above rule.

An important part of the input process is carried out by the


lexical analyzer. This user routine reads the input stream,
recognizing the lower level structures, and communicates these
tokens to the parser.

 A structure recognized by the lexical analyzer is called a


terminal symbol or tokens.
 A structure recognized by the parser is called a
nonterminal symbol.

Literal characters such as ``,'' must also be passed through the


lexical analyzer, and are also considered tokens.

BASIC SPECIFICATIONS:
 Names refer to either tokens or nonterminal symbols.
 Every specification file consists of three sections:

 the declarations,

 (grammar) rules, and

 programs.

The sections are separated by double percent ``%%'' marks. ``


%'' is an escape character in Yaac specification.

A full specification file looks like

declarations
%%
rules
%%
programs

The declaration section may be empty.

The smallest legal Yacc specification is

%%
rules

Blanks, tabs, and newlines are ignored except that they may
not appear in names or multi-character reserved symbols.
Comments may appear wherever a name is legal; they are
enclosed in /* . . . */, as in C and PL/I

The rules section is made up of one or more grammar rules. A


grammar rule has the form:
A : BODY ;
A represents a nonterminal name, and BODY represents a
sequence of zero or more names and literals. The colon and the
semicolon are Yacc punctuation.

Names may be of arbitrary length, and may be made up of


letters, dot ``.'', underscore ``_'', and non-initial digits. Upper
and lower case letters are distinct. The names used in the body
of a grammar rule may represent tokens or nonterminal
symbols.

 A literal consists of a character enclosed in single quotes


``'''. As in C, the backslash ``\'' is an escape character
within literals, and all the C escapes are recognized. Thus

'\n' newline
'\r' return
'\'' single quote ``'''
'\\' backslash ``\''
'\t' tab
'\b' backspace
'\f' form feed
'\xxx' ``xxx'' in octal
The NULL character ('\0' or 0) should never be used in grammar
rules.

If a nonterminal symbol matches the empty string, this can be


indicated in the obvious way:

empty : ;
Names representing tokens must be declared as

%token name1 name2 . . .in the declarations section.

 Start is one of the nonterminal symbol.The start symbol is


taken to be the left hand side of the first grammar rule in
the rules section
syntax: %start symbol

 The vertical bar ``|'' can be used to avoid rewriting the left
hand side.

 The end of the input to the parser is signaled by a special


token, called the endmarker.The endmarker represents
some reasonably obvious I/O status, such as ``end-of-file''
or ``end-of-record''

The precedences and associativities are used by Yacc to


resolve parsing conflicts; The rules are:

1. The precedences and associativities are recorded for


those tokens and literals that have them.
2. A precedence and associativity is associated with each
grammar rule; it is the precedence and associativity of the
last token or literal in the body of the rule. If the %prec
construction is used, it overrides this default. Some
grammar rules may have no precedence and associativity
associated with them.
3. When there is a reduce/reduce conflict, or there is a
shift/reduce conflict and either the input symbol or the
grammar rule has no precedence and associativity, then
the two disambiguating rules given at the beginning of the
section are used, and the conflicts are reported.
4.If there is a shift/reduce conflict, and both the grammar
rule and the input character have precedence and
associativity associated with them, then the conflict is
resolved in favor of the action (shift or reduce) associated
with the higher precedence. If the precedences are the
same, then the associativity is used;
left associative => reduce,
right associative => shift, and
nonassociating => error.

You might also like