[go: up one dir, main page]

0% found this document useful (0 votes)
0 views59 pages

cd lab mannual (2)

Download as docx, pdf, or txt
Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1/ 59

lOMoARcPSD|41236959

Compiler Design Lab Manual

Compiler Construction (Rajasthan Technical


University)

Scan to open on Studocu

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Rajasthan College of Engineering for Women, JAIPUR

LAB MANUAL

Compiler Design Lab (5CS4-22) Semester: V


Branch: Computer Science & Engineering

Department of Computer Science & Engineering

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

INDEX

S.No. Content Page Number


1 Lab Ethics & Instructions 1-2
2 Marking/Assessment System 3
3 List of Experiments Prescribed by RTU 4
4 Lab Plan 5
5 A Brief about C, Lex and Yacc 6-16
Experiment 1: Introduction: Objective, scope and outcome of the
6 course. 17-19

Experiment 2: To identify whether given string is keyword or not.


7 20-21

8 Experiment 3: Count total no. of keywords in a file 22-23


9 Experiment 4: Count total no of operators in a file 24
Experiment 5 Count total occurrence of each character in a given
10 25-26
file
Experiment 6: a C program to insert, delete and display the entries
11 27-28
in Symbol Table.
Experiment 7: Write a LEX program to identify following: Valid
12 29-33
mobile number, Valid url, Valid identifier, Valid date, Valid time
13 Experiment 8: lex program to count blank spaces, words, lines 34
Experiment 9: lex program to count the no. of vowels and
14 35
consonants
Experiment 10 : YACC program to recognize strings aaab, abbb
15 36-37
using a^nb^n, where b>=0
Experiment 11: YACC program to evaluate an arithmetic
16 38-39
expression involving operators +,-,* and /
Experiment 12: YACC program to check validity of a strings
17 40-41
abcd, aabbcd using grammar a^nb^nc^md^m, where n , m>0
18 Experiment 13: C program to find first of any grammar 42-43
19 Multiple Choice Questions 44-48
20 Practical Exam Sample Paper 49
21 Text and Reference books 50

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

LAB ETHICS
Do's
1. Shut down the computers before leaving the lab.
2. Keep the bags outside in the racks.
3. Enter the lab on time and leave at proper time.
4. Maintain the decorum of the lab.
5. Utilize lab hours in the corresponding experiment.
6. Get your floppies checked by lab incharge before using it in the lab.

Don’ts
1. Don't bring any external material in the lab.
2. Don't make noise in the lab
3. Don't bring the mobile in the lab.
4. If extremely necessary then keep ringers off.
5. Don't enter in server room without permission of lab in charge.
6. Don't litter in the lab.
7. Don't delete or make any modification in system files.
8. Don't carry any lab equipment’s outside the lab.

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

INSTRUCTIONS
Before Entering in the Lab

1. All the students are supposed to prepare the theory regarding the next program.
2. Students are supposed to bring the practical file and the lab copy.
3. Previous program should be written in the practical file.
4. Algorithm of the current program should be written in the lab copy.
5. Any student not following these instructions will be denied entry in the lab.

While Working in the Lab

1. Adhere to experimental schedule as instructed by the lab incharge.


2. Get the previously executed program signed by the instructor.
3. Get the output of current program checked by the instructor in the lab copy.
4. Each student should work on his assigned computer at each turn of the lab.
5. Take responsibility of valuable accessories
6. Concentrate on the assigned practical and don't play games.
7. If anyone is caught red-handed carrying any equipment of the lab, then he/she will have to face
serious consequences

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

RAJASTHAN TECHNICAL UNIVERSITY


III Year-V Semester
5CS4-22: Compiler Design
Lab
List of Experiment

1. Introduction: Objective, scope and outcome of the course.


2. To identify whether given string is keyword or not.
3. Count total no. of keywords in a file. [Taking file from user]
4. Count total no of operators in a file. [Taking file from user]
5. Count total occurrence of each character in a given file. [Taking file from user]
6. Write a C program to insert, delete and display the entries in Symbol Table.
7. Write a LEX program to identify following:
a) Valid mobile number
b) Valid url
c) Valid identifier
d) Valid date (dd/mm/yyyy)
e) Valid time (hh:mm:ss)
8. Write a lex program to count blank spaces, words, lines in a given file.
9. Write a lex program to count the no. of vowels and consonants in a C file.
10. Write a YACC program to recognize strings aaab, abbb using a^nb^n, where b>=0.
11. Write a YACC program to evaluate an arithmetic expression involving operators +,-,* and /.
12. Write a YACC program to check validity of a strings abcd, aabbcd using grammar
a^nb^nc^md^m, where n , m>0
13. Write a C program to find first of any grammar.

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Lexical Elements of the C Language

Like any other High level language, C provides a collection of basic building blocks, symbolic words
called lexical elements of the language. Each lexical element may be a symbol, operator, symbolic
name or word, a special character, a label, expression, reserve word, etc. All these lexical elements
are arranged to form the statements using the syntax rules of the C language. Following are the
lexical elements of the C language.

C Character Set
Every language has its own character set. The character set of the C language consists of basic
symbols of the language. A character indicates any English alphabet, digit or special symbol
including arithmetic operators. The C language character set includes:
 Letter, Uppercase A ….. Z, Lower case a….z
 Digits, Decimal digits 0….9.
 Special Characters, such as comma, period. semicolon; colon: question mark?, apostrophe‘
quotation mark “ Exclamation mark ! vertical bar | slash / backslash \ tilde ~ underscore _
dollar $ percent % hash # ampersand & caret ^ asterisk * minus – plus + <, >, (, ), [,], {, }
 White spaces such as blank space, horizontal tab, carriage return, new line and form feed.

C Tokens
In a passage of text, individual words and punctuation marks are called tokens. Similarly, in C
program, the smallest individual units are known as C tokens. C has following tokens

 Keywords or Reserve words such as float, int, etc


 Constants such 1, 15,5 etc
 Identifiers such name, amount etc
 Operators such as +, -, * etc
 Separators such as :, ;, [, ] etc and special characters
 Strings

Keywords
Key words or Reserve words of the C language are the words whose meaning is already defined and
explained to the C language compiler. Therefore Reserve words can not be used as identifiers or
variable names. They should only be used to carry the pre-defined meaning. For example int is a
reserve word. It indicates the data type of the variable as integer. Therefore it is reserved to carry the
specific meaning. Any attempt to use it other than the intended purpose will generate a compile time
error. C language has 32 keywords. Following are some of them
auto break case char const continue
default
6

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

do double else enum extern float for


goto if int long register return
short
signed sizeof static struct switch typedef
union
unsigned void volatile while

Constants
A constant can be defined as a value or a quantity which does not change during the execution of a
program. Meaning and value of the constant remains unchanged throughout the execution of the
program. These are also called as literals. C supports following types of constant.

1. Integer Constants
An integer constant refers to a sequence of digits. There three types of integer constants, namely
decimal, octal and hexadecimal. Decimal integer constant consists of set of digits from 0 to 9
preceded by an optional + or – sign.
Ex: 123, -321, 0, 4567, + 78
Embedded spaces, commas and non-digit characters are not permitted between digits. An octal
integer constant consists of any combination of digits from 0 to 7 with a leading 0(zero). Ex : 037,
0435, 0567. A sequence of digits preceded by 0x or 0X is considered as hexadecimal digit. They may
also include alphabets A to F or a to f representing numbers from 10 to 15. The largest integer value
that can be stored is machine dependant; It is 32767 for 16 bit computers. It is also possible to store
larger integer constants by appending qualifiers such U, L and UL to the constants.

2. Floating point Constants or Real Constants


The quantities that are represented by numbers with fractional part are called floating point numbers.
Ex: 0.567, -0.76, 56.78, +247.60. These numbers are shown in decimal notation , having a whole
number followed by a decimal point and the fractional part. It is possible to omit digits before the
decimal point or digits after the decimal point. Ex: 215., .95, -.76 or +.5 A real number or a floating
point number can also expressed as in exponential notation. Ex: 2.15E2. The general form of
exponential notation is mantissa e exponent or mantissa E exponent The mantissa is either a real
number or an integer. The exponent is an integer with an optional plus or minus sign. Embedded
white is not allowed in this notation.

3. Single Character constants


A single character constant contains a any valid character enclosed within a pair of single quote
marks. Ex: ‘5’, ‘A’, ‘;’ ‘ ‘. The character constants have integer values associated with them known
as ASCII values. For ex: A is having the ASCII value of 65.

4. String constants
A string constant is a sequence of characters enclosed in double quotes. The characters may be
alphabets, numbers special characters and blank space. Ex: “Hello”, “2002”, “Wel Come”, “5+3”
7

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

5. Backslash character constants or Escape sequence characters.


C supports some special backslash character constants that are used in output functions. For ex: ‘\n’
stands for new line characters. Each one of them represents a single character even though it consists
of two characters.

\a Audible alert or bell \b back space


\f form feed \n new line
\r carriage return \t horizontal tab
\v vertical tab \’ single quote
\” double quote \? Question mark
\\ backslash \0 null

Identifiers
An identifier is a sequence of letters, digits and an underscore. Identifiers are used to identify or name
program elements such as variables, function names, etc. Identifiers give unique names to various
elements of the program. Some identifiers are reserved as special to the C language. They are called
keywords.

Variables
A variable is a data name that may be used to store data value. A value or a quantity which may vary
during the program execution can be called as a variable. Each variable has a specific memory
location in memory unit, where numerical values or characters can be stored. A variable is
represented by a symbolic name. Thus variable name refers to the location of the memory in which a
particular data can be stored. Variables names are also called as identifiers since they identify the
varying quantities. For Ex : sum = a+b. In this equation sum, a and b are the identifiers or variable
names representing the numbers stored in the memory locations.
Rules to be followed for constructing the Variable names(identifiers)

 They must begin with a letter and underscore is considered as a letter.


 It must consist of single letter or sequence of letters, digits or underscore character.
 Uppercase and lowercase are significant. For ex: Sum, SUM and sum are three distinct
variables.
 Keywords are not allowed in variable names.
 Special characters except the underscore are not allowed.
 White space is also not allowed.
 The variable name must be 8 characters long. But some recent compilers like ANSI C
supports 32 characters for the variable names and first 8 characters are significant in most
compilers.
8

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Data Types
Data refers to any information which is to be stored in a computer. For example, marks of a student,
salary of a person, name of a person etc. These data may be of different types. Computer allocates
memory to a variable depending on the data that is to be stored in the variable. So it becomes
necessary to define the type of the data which is to be stored in a variable while declaring a variable.
The different data types supported by C are as follows:

int Data Type


Integer refers to a whole number with a range of values supported by a particular machine. Generally
integers occupy one word of storage i.e 16 bits or 2 bytes. So its value can range from -32768 to
+32767.
Declaration: int variable name;
Ex: int qty;
The above declaration will allocate a memory location which can store only integers, both positive
and negative. If we try to store a fractional value in this location, the fractional data will be lost.
float Data Type
A floating point number consists of sequence of one or more digits of decimal number system along
with embedded decimal point and fractional part if any. Computer allocates 32 bits, i.e. 4 bytes of
memory for storing float type of variables. These numbers are stored with 6 digits of precision for
fractional part.
Declaration: float variableName; Ex : float amount;

double Data Type


It is similar to the float type. It is used whenever the accuracy required to represent the number is
more. In others words variables declared of type double can store floating point numbers with number
of significant digits is roughly twice or double than that of float type. It uses 64 bits i.e. 8 bytes of
memory giving a precision of 14 decimal digits.
Declaration : double variableName;

char Data Type


A single character can be defined as char data type. These are stored usually as 8 bits i.e 1 byte of
memory.
Declaration : char variableName ; Ex: char pass;

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

String refers to a series of characters. Strings are declared as array of char types. Ex: char name[20];
will reserve a memory location to store upto 20 characters.
Further, applying qualifiers to the above primary data types yield additional data types. A qualifier
alters the characteristics of the data type, such as its sign or size. There are two types of qualifiers
namely, sign qualifiers and size qualifiers. signed and unsigned are the sign qualifiers short and long
are the size qualifiers.

Size and range of data types on a 16 bit Machine

Type Size (bits) Range


char or signed char 8 -128 to 127
unsigned char 8 0 to 255
int 16 -32768 to 32767
unsigned int 16 0 to 65535
short int or signed short int 8 -128 to 127
unsigned short int 8 0 to 255
long int or signed long int 32 -2,147,483,648 to
2,147,483,647
unsigned long int 32 0 to 4,294,967,295
float 32 3.4e-38 to 3.4e+38
double 64 1.7e-308 to 1.7e+308
long double 80 3.4 e-4932 to 1.1e+4932

Declaring a variable as constant


We may want the value of the certain variable to remain constant during the execution of the
program. We can achieve this by declaring the variable with const qualifier at the time of
initialization.

Ex: const int tax_rate = 0.30;

10

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

The above statement tells the compiler that value of variable must not be modified during the
execution of the program. Any attempt change the value will generate a compile time error.
Declaring a variable as volatile
Declaring the variable volatile qualifier tells the compiler explicitly that the variable’s value may be
changed at any time by some external source and the compiler has to check the value of the variable
each time it is encountered.

Ex; volatile int date;

Defining Symbolic Constants


We often use certain unique constants in a program. These constants may appear repeatedly in
number of places in a program. Such constants can be defined and its value can be substituted during
the preprocessing stage itself.
Operators in C
An operator is a symbol which acts on operands to produce certain result as output. For example in
the expression a+b; + is an operator, a and b are operands. The operators are fundamental to any
mathematical computations.
Operators can be classified as follows:

 Based on the number of operands the operator acts upon:


o Unary operators: acts on a single operand. For example: unary minus(-5, -20, etc),
address of operator (&a)
o Binary operators: acts on two operands. Ex: +, -, %, /, *, etc
o Ternary operator: acts on three operands. The symbol ?: is called ternary operator in C
language. Usage: big= a>b?a:b; i.e if a>b, then big=a else big=b.
 Based on the functions
o Arithmetic operators
o Relational operators
o Logical Operators
o Increment and Decrement operators
o Assignment operators
o Bitwise operators
o Conditional Operators
o Special operators

Precedence and Associativity of operators:


Each operator in C has a precedence associated with it. This precedence is used to determine how an
expression involving more than one operator is evaluated. The operator at the higher level of
precedence is evaluated first. The operators of the same precedence are evaluated either from left to
right or from right to left depending on the level. This is known as the associativity property of an
operator.

11

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Operator Description Level Associativity

Parenthesis
()
Array index 1 L–R
[]

+ Unary plus
- Unary minus
++ Increment
-- Decrement
2 R–L
! Logical negation
~ One’s Complement
& Address of
sizeof(type) type cast conversion

* Multiplication
/ Division 3 L- R
% Modulus

+ Addition
4 L–R
- Subtraction
<< Left Shift
5 L–R
>> Right Shift
< Less than
<= Less than or equal to
6 L–R
> Greater than
>= Greater than or equal to

== is equal to
7 L–R
!= Not equal to
& Bitwise AND 8 L–R

12

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

^ Bitwise XOR 9 L–R


| Bitwise OR 10 L–R
&& Logical AND 11 L–R
|| Logical OR 12 L–R
?: Conditional Operator 13 R–L
=, Assignment operator
14 R–L
+=, -=, *=, /=, %= Short hand assignement
, Comma operator 15 R–L

Preprocessor directives:
There are different preprocessor directives. The table below shows the preprocessor directives.

Directive Function
#define defines a macro substitution
#undef Undefines a macro
#include Specifies the files to be included.
#ifdef Tests for a macro definition
#endif Specifies the end of #if
#ifndef Tests whether a macro is not defined
#if Tests a compile-time condition
#else Specifies alternatives when #if tests fails
Header files:
C language offers simpler way to simplify the use of library functions to the greatest extent possible.
This is done by placing the required library function declarations in special source files, called header
files. Most C compilers include several header files, each of which contains declarations that are
functionally related. stdio.h is a header file containing declarations for input/ouput routines; math.h
contains declarations for certain mathematical functions and so on. The header files also contain other
information related to the use of the library functions, such as symbolic constant definitions.
13

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

The required header files must be merged with the source program during the compilation process.
This is accomplished by placing one or more #include statements at the beginning of the source
program. The other header files are:
<ctype.h> character testing and conversion functions
<stdlib.h> utility functions such as string conversion routines , memory allocation routines, random
number generator etc
<string.h> String manipulations functions
<time.h> Time manipulation functions

Input and Output Functions


The C language consists of input-output statements to read the data to be processed as well as output
the computed results. C language provides a set of library functions or built in functions, in order to
carry out input and output operations. These library functions are available in a header file called
stdio.h. So for using these library functions the following preprocessor directive is essential.
The input and output functions in C language can be broadly categorized into two types:
 Unformatted Input Output functions : which provides the facility to read or output data as a
characters or sequence of characters. Ex: getch(), getche(), getchar(), gets(), putch(), purchar()
and puts().
 Formatted I/O functions : which allow the use format specifiers to specify the type of data to
be read or printed. Ex: scanf() and printf() functions.

14

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Introduction to Lex Programming

The unix utility lex parses a file of characters. It uses regular expression matching; typically it is used
to ‘tokenize’ the contents of the file. In that context, it is often used together with the yacc utility.
However, there are many other applications possible.

Structure of a lex file

A lex file looks like:

...definitions...
%%
...rules...
%%
...code...

Here is a simple example:


%{
int charcount=0,linecount=0;
%}

%%
. charcount++;
\n {linecount++; charcount++;}
%%

int main() {
yylex();
printf("There were %d characters in %d lines\n", charcount,linecount);
return 0; }

In the example you just saw, all three sections are present:
 definitions All code between %{ and %} is copied to the beginning of the resulting C file.
rules A number of combinations of pattern and action: if the action is more than a single
 command it needs to be in braces.
 code This can be very elaborate, but the main ingredient is the call to yylex, the lexical
analyser. If the code segment is left out, a default main is used which only calls yylex.

15

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Introduction to YACC Programming


Yacc provides a general tool for imposing structure on the input to a computer program. The Yacc
user prepares a specification of the input process; this includes rules describing the input structure,
code to be invoked when these rules are recognized, and a low-level routine to do the basic input.
Yacc then generates a function to control the input process. This function, called a parser, calls the
user-supplied low-level input routine (the lexical analyzer) to pick up the basic items (called tokens)
from the input stream. These tokens are organized according to the input structure rules, called
grammar rules; when one of these rules has been recognized, then user code supplied for this rule, an
action, is invoked; actions have the ability to return values and make use of the values of other
actions.

The heart of the input specification is a collection of grammar rules. Each rule describes an allowable
structure and gives it a name. For example, one grammar rule might be

date : month_name day ',' year ;

Here, date, month_name, day, and year represent structures of interest in the input process;
presumably, month_name, day, and year are defined elsewhere. The comma ``,'' is enclosed in single
quotes; this implies that the comma is to appear literally in the input. The colon and semicolon merely
serve as punctuation in the rule, and have no significance in controlling the input. Thus, with proper
definitions, the input

July 4, 1776

These are some points about YACC:

Input: A CFG- file.y

Output: A parser y.tab.c (yacc)

 The output file "file.output" contains the parsing tables.


 The file "file.tab.h" contains declarations.
 The parser called the yyparse ().
 Parser expects to use a function called yylex () to get tokens.

16

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-1
AIM:
Introduction: Objective, scope and outcome of the course.

OBJECTIVE:
The laboratory course is intended to make experiments on the basic techniques of compiler construction and
tools that can be used to perform syntax-directed translation of a high-level programming language into an
executable code. Students will design and implement language processors in C by using tools to automate parts
of the implementation process. This will provide deeper insights into the more advanced semantics aspects of
programming languages, code generation, machine independent optimizations, dynamic memory allocation,
and object orientation.

SCOPE:
The scope of this course is to explore the principle, algorithm and data structure involved in the design and
construction of compiler.

OUTCOMES:
Upon the completion of Compiler Design practical course, the student will be able to:
1. Understand the working of lex and yacc compiler for debugging of programs.
2. Understand and define the role of lexical analyzer, use of regular expression and transition diagrams.
3. Understand and use Context free grammar, and parse tree construction.
4. Learn & use the new tools and technologies used for designing a compiler.
5. Develop program for solving parser problems.
6. Learn how to write programs that execute faster.

17

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Introduction of Compiler Design

Compiler is a software which converts a program written in high level language (Source Language)
to low level language (Object/Target/Machine Language).

 Cross Compiler that runs on a machine ‘A’ and produces a code for another machine ‘B’. It
is capable of creating code for a platform other than the one on which the compiler is running.

Phases of a Compiler –
There are two major phases of compilation, which in turn have many parts. Each of them take input
from the output of the previous level and work in a coordinated way.

18

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Analysis Phase – An semantic tree representation is created from the give source code:
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
Lexical analyzer divides the program into “tokens”, Syntax analyzer recognizes “sentences” in the
program using syntax of language and Semantic analyzer checks static semantics of each
construct.

Synthesis Phase – It has three parts :


4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator
Intermediate Code Generator generates “abstract” code. Code Optimizer optimizes the abstract code,
and final Code Generator translates abstract intermediate code into specific machine instructions.
19

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-2
AIM:

Program to find whether given string is keyword or not

PROGRAM:

#include<stdio.h>
#include<conio.h>
#include<string.h>

void main()
{
char a[5][10]={"printf","scanf","if","else","break"}; char
str[10];
int i,flag;

clrscr();

puts("Enter the string :: ");


gets(str);

for(i=0;i<strlen(str);i++)
{
if(strcmp(str,a[i])==0)
{
flag=1;
break;
}
else
flag=0;
}

20

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

if(flag==1)
puts("Keyword");
else
puts("String");

getch();
}

21

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-3
AIM:

Count total no. of keywords in a file. [Taking file from user]

PROGRAM:

#include<stdio.h>

#include<stdlib.h>
#include<string.h>
#include<ctype.h>
static int count=0;
int isKeyword(char buffer[]){

char keywords[32][10] =
{"auto","break","case","char","const","continue","default","do","double","else","enum","extern","fl
oat","for","goto","if","int","long","register","return","short","signed","sizeof","static","struct","switc
h","typedef","union","unsigned","void","volatile","while"};
int i, flag = 0;

for(i = 0; i < 32; ++i){


if(strcmp(keywords[i], buffer) == 0){
flag = 1;
count++;
break;
}
}

return flag;
}

int main(){
char ch, buffer[15] ;
FILE *fp;
int i,j=0;

fp = fopen("KESHAV3.C","r");

if(fp == NULL){
printf("error while opening the file\n");
exit(0);
22

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

while((ch = fgetc(fp)) != EOF){


if(isalnum(ch)){
buffer[j++] = ch;
}
else if((ch == ' ' || ch == '\n') && (j != 0)){
buffer[j] = '\0';
j = 0;

if(isKeyword(buffer) == 1)
printf("%s is keyword\n", buffer);

}
printf("no of keywords= %d", count);
fclose(fp);

return 0;
}

23

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-4

AIM:

Count total no of operators in a file. [Taking file from user] #include<stdio.h>

PROGRAM:

#include<stdlib.h>

#include<string.h>
#include<ctype.h>
static int count=0;
int main(){
char ch, buffer[15], operators[] = "+-*/%=";
FILE *fp;
int i;
clrscr();

fp = fopen("KESHAV3.C","r");

if(fp == NULL){
printf("error while opening the file\n");
exit(0);
}

while((ch = fgetc(fp)) != EOF){


for(i = 0; i < 6; ++i){
if(ch == operators[i]) {
printf("%c is operator\n", ch);
count++;
}
}

}
printf("no of operators= %d", count);
fclose(fp);

return 0;
}

24

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-5

AIM:

Count total occurrence of each character in a given file. [Taking file from user]

PROGRAM:

#include <stdio.h>

#include <string.h>

#include<conio.h>

int main ()
{

FILE * fp;
char string[100];
int c = 0, count[26] = { 0 }, x;

fp = fopen ("deepa.txt", "r");


clrscr();

while (fscanf (fp, "%s", string) != EOF)

{ c=0;
while (string[c] != '\0')
{

/** Considering characters from 'a' to 'z' only and ignoring others. */

if (string[c] >= 'a' && string[c] <= 'z')


{

x = string[c] - 'a';

count[x]++;

25

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

c++;

}
}

for (c = 0; c <26 ; c++)


printf ("%c occurs %d times in the string.\n", c + 'a', count[c]);
return 0;

26

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-6

AIM:

Write a C program to insert, delete and display the entries in Symbol Table.

PROGRAM:
//Implementation of symbol table
#include<stdio.h>
#include<ctype.h>
#include<stdlib.h>
#include<string.h>
#include<math.h>
void main()
{
int i=0,j=0,x=0,n;
void *p,*add[5];
char ch,srch,b[15],d[15],c;
printf("Expression terminated by $:");
while((c=getchar())!='$')
{
b[i]=c;
i++;
}
n=i-1;
printf("Given Expression:");
i=0;
while(i<=n)
{
printf("%c",b[i]);
i++;
}
printf("\n Symbol Table\n");
printf("Symbol \t addr \t type");
while(j<=n)
{
c=b[j];
if(isalpha(toascii(c)))
{
p=malloc(c);
add[x]=p;
d[x]=c;
printf("\n%c \t %d \t identifier\n",c,p);
x++;
j++;
27

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

}
else
{
ch=c;
if(ch=='+'||ch=='-'||ch=='*'||ch=='=')
{
p=malloc(ch);
add[x]=p;
d[x]=ch;
printf("\n %c \t %d \t operator\n",ch,p);
x++;
j++;
}}}}

28

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-7

AIM:

Write a LEX program to identify following:

1. Valid mobile number


2. Valid url
3. Valid identifier
4. Valid date (dd/mm/yyyy)
5. Valid time (hh:mm:ss)

PROGRAM:

1. Valid mobile number


%{
/* Definition section */
%}

/* Rule Section */
%%

[1-9][0-9]{9} {printf("\nMobile Number Valid\n");}

.+ {printf("\nMobile Number Invalid\n");}

%%

// driver code
int main()
{
printf("\nEnter Mobile Number : ");
yylex();
printf("\n");
return 0;
}
int yywrap()
{
}

29

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

2. Valid url

%%

((http)|(ftp))s?:\/\/[a-zA-Z0-9]{2,}(\.[a-z]{2,})+(\/[a-zA-Z0-9+=?]*)* {printf("\nURL Valid\n");}

.+ {printf("\nURL Invalid\n");}

%%

void main() {

printf("\nEnter URL : ");

yylex();

printf("\n");

}
int yywrap()
{
}

30

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

3. Valid identifier

%%

^[a - z A - Z _][a - z A - Z 0 - 9 _] * printf("Valid Identifier");

// regex for invalid identifiers


^[^a - z A - Z _] printf("Invalid Identifier");
.;
%%

void main() {

printf("\nEnter Identifier: ");

yylex();

printf("\n");

}
int yywrap()
{
}

31

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

4. Valid date (dd/mm/yyyy)

%{
#include<stdio.h>
int i=0,yr=0,valid=0;
%}
%%
([0-2][0-9]|[3][0-1])\/((0(1|3|5|7|8))|(10|12))\/([1-2][0-9][0-9][-0-9]) {valid=1;}

([0-2][0-9]|30)\/((0(4|6|9))|11)\/([1-2][0-9][0-9][0-9]) {valid=1;}

([0-1][0-9]|2[0-8])\/02\/([1-2][0-9][0-9][0-9]) {valid=1;}

29\/02\/([1-2][0-9][0-9][0-9]) { while(yytext[i]!='/')i++; i++;while(yytext[i]!


='/')i++;i++;while(i<yyleng)yr=(10*yr)+(yytext[i++]-'0'); if(yr%4==0||(yr
%100==0&&yr%400!=0))valid=1;}

%%
void main()
{
yyin=fopen("new","r");
yylex();
if(valid==1) printf("It is a valid date\n");
else printf("It is not a valid date\n");
}
int yywrap()
{
return 1;
}

32

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

5. Valid time(hh:mm:ss)

%{
#include<stdio.h>
int i=0,yr=0,valid=0;
%}
%%
([0-2][0-9]:[0-6][0-9]\:[0-6][0-9]) {printf("%s It is a valid time\n",yytext);}

%%
void main()
{
yyin=fopen("new","r");
yylex();
}
int yywrap()
{
return 1;
}

33

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-8

AIM:

Write a lex program to count blank spaces,words,lines in a given file.

PROGRAM:

%{
#include<stdio.h>
int lines=0, words=0,s_letters=0,c_letters=0, num=0, spl_char=0,total=0;
%}
%%
\n { lines++; words++;}
[\t ' '] words++;
[A-Z] c_letters++;
[a-z] s_letters++;
[0-9] num++;
. spl_char++;
%%
void main(void)
{
FILE *fp;
char f[50];
printf("enterfile name \n");
scanf("%s",f);
yyin= fopen(f,"r");
yylex();
total=s_letters+c_letters+num+spl_char;
printf(" This File contains ..."); printf("\
n\t%d lines", lines); printf("\n\t%d
words",words); printf("\n\t%d small
letters", s_letters); printf("\n\t%d capital
letters",c_letters); printf("\n\t%d digits",
num);
printf("\n\t%d special characters",spl_char);
printf("\n\tIn total %d characters.\n",total);
}
int yywrap()
{
return(1);
}
34

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-9

AIM:

Write a lex program to count the no. of vowels and consonants in a C file.

PROGRAM:

%{

#include<stdio.h>
int vcount=0,ccount=0;
%}
%%
[a|i|e|o|u|E|A|I|O|U] {vcount++;}
[a-z A-Z (^a|i|e|o|u|E|A|I|O|U) ] {ccount++;}
%%
int main()
{
FILE *fp;
char f[50];
printf("enterfile name \n");
scanf("%s",f);
yyin= fopen(f,"r");
yylex();
printf("No. of Vowels :%d\n",vcount);
printf("No. of Consonants :%d\n",ccount);
return 0;
}
int yywrap()
{
}

35

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-10

AIM:

Write a YACC program to recognize strings aaab,abbb using a^nb^n, where b>=0.

PROGRAM:

Gm.l

%{
#include "y.tab.h"
%}
%%
"a"|"A" {return A;}
"b"|"B" {return B;}
[ \t] {;}
\n {return 0;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
Gm.y
%{
#include<stdio.h>
%}
%token A B
%%
stmt: S
;
S: A S B
|
;
%%
void main()
{
printf("enter \n");
yyparse();
printf("valid");
exit(0);
36

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

}
void yyerror()
{
printf("invalid ");
exit(0);
}

37

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-11
AIM:

Write a YACC program to evaluate an arithmetic expression involving operators +,-,* and /.

PROGRAM:

Expr.l

%{
#include "y.tab.h"
extern int yylval;
%}
%%
[0-9]+ {yylval=atoi(yytext);
return number;}
[\t] {;}
[\n] {return 0;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}

Expr.y
%{
#include<stdio.h>
int res=0;
%}
%token number
%left '+' '-'
%left '*' '/'
%%
stmt:expr {res=$$;}
;
expr:expr '+' expr {$$=$1+$3;}
|expr '-' expr {$$=$1-$3;}
|expr '*' expr {$$=$1*$3;}
|expr '/' expr
{if($3==0)
exit(0);
else $$=$1/$3;}
38

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

|number
;
%%
void main()
{
printf(" enter expr\n");
yyparse(); printf("valid=
%d",res); exit(0);
}

void yyerror()
{
printf("invalid\n");
exit(0);
}

39

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-12

AIM:

Write a YACC program to check validity of a strings abcd,aabbcd using grammar a^nb^nc^md^m,
where n , m>0

PROGRAM:

Grammer.y

%{
#include<stdio.h>
#include<stdlib.h>
int yyerror(char*);
int yylex();

%}
%token A B C D NEWLINE
%%
stmt: S NEWLINE { printf("valid\n");
return 1;
}
;
S: X Y
;

X: A X B
|
;
Y: C Y D
|
;
%%
extern FILE *yyin;
void main()
{
printf("enter \n");
do
{
yyparse();
}
while(!feof(yyin));

}
int yyerror(char* str)
{
40

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

printf("invalid ");
return 1;
}
Grammer.l

%{
#include"y.tab.h"
%}
%%
a|
A {return A;}
c|
C {return C;}
b|
B {return B;}
d|
D {return D;}
[ \t] {;}
"\n" {return NEWLINE;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}

41

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

EXPERIMENT-13
AIM:

Write a C program to find first of any grammar.

PROGRAM:

#include<stdio.h>

#include<ctype.h>
void FIRST(char
); int count,n=0;
char prodn[10][10], first[10];

void main()
{
int i,choice;
char c,ch;
printf("How many productions ? :");
scanf("%d",&count);
printf("Enter %d productions epsilon= $ :\n\n",count);
for(i=0;i<count;i++)
scanf("%s%c",prodn[i],&ch);
do
{ n=
0;
printf("Element :");
scanf("%c",&c);
FIRST(c);
printf("\n FIRST(%c)= { ",c);
for(i=0;i<n;i++)
printf("%c ",first[i]);
printf("}\n");

printf("press 1 to continue : ");


scanf("%d%c",&choice,&ch);
}
while(choice==1);
}

void FIRST(char c)
{
int j;
42

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

if(!(isupper(c)))first[n++]=c;
for(j=0;j<count;j++)
{
if(prodn[j][0]==c)
{
if(prodn[j][2]=='$') first[n++]='$';
else if(islower(prodn[j][2]))first[n++]=prodn[j][2];
else FIRST(prodn[j][2]);
}
}
}

43

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Multiple Choice Questions


Q1. What is a compiler? B. requires fixed format commands
A. A compiler does a conversion line by line as C. is a mnemonic form of machine language
the program is run D. is a quite different from the SCL interpreter
B. A compiler converts the whole of a higher Ans.: C
level program code into machine code in
one step Q7. Every symbolic references to a memory
C. A compiler is a general purpose language operand has to be assembled as
providing very efficient execution A. (offset, index base)
D. All of the Above B. (segment base, offset)
Ans.: B C. (index base, offset)
D. offset
Q2. What are the stages in the compilation Ans.: B
process?
A. Feasibility study, system design, and testing Q8. Type checking is normally done during
B. Implementation and documentation A. lexical analysis
C. Analysis Phase, Synthesis Phase B. syntax analysis
D. None of These C. syntax directed translation
Ans.: C D. code optimization
Ans.: C
Q3. What is the definition of an interpreter?
A. An interpreter does the conversion line by Q9. Which translator program converts
line as the program is run assembly language program to object
B. An interpreter is a representation of the program
system being designed A. Assembler
C. An interpreter is a general purpose language B. Compiler
providing very efficient execution C. Microprocessor
D. All of the Above D. Linker
Ans.: A Ans.: A

Q4. Symbol table can be used for Q10. A compiler for a high level language that
A. Checking type compatibility runs on one machine and produces code for
B. Storage allocation a different machine is called
C. Suppressing duplication of error massages A. Optimizing compiler
D. All of the Above B. One pass compiler
Ans.: D C. Cross compiler
D. Multipass Compiler
Q5. A basic block can be analyzed by Ans.: C
A. A DAG
B. A graph which may involve the cycles Q11. An intermediate code form is
C. Flow Graph A. Postfix notation
D. None of These B. Syntax trees
Ans.: A C. Three address code
D. All of these
Q6. Assembly language Ans.: D
A. is usually the primary user interface

44

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Q12. Which one of the following is not a syntax B. context free grammar
error C. context sensitive grammar
A. Semantic D. none of the above
B. Lexical Ans.: A
C. Arithmetic
D. Logical Q19. ‘Divide by 0’ is a
Ans.: A A. lexical error
B. syntactic error
Q13. Semantic errors can be detected C. semantic error
A. at compile time only D. internal error
B. at run time only Ans. C
C. both at compile and run time
D. none of these Q20. In which of the following has attribute
Ans.: C values at each node
A. Associated parse trees
Q14. The action of passing the source program B. Postfix parse tree
into the proper syntactic classes is known as C. Annotated parse tree
A. syntax analysis D. Prefix parse tree
B. lexical analysis Ans.: C
C. interpretation analysis
D. general syntax analysis Q21. CFG can be recognized by a
Ans.: B A. Push-down automata
B. 2-way linear bounded automata
Q15. Undeclared name is.....................error C. Both (a) and (b)
A. syntax D. None of these
B. lexical Ans. c
C. semantic
D. not an error Q22. The ambiguous grammar can have
Ans. C a. Only one parse tree
b. More than one parse tree
Q16. Intermediate code generator is used c. Parse trees with l-values
between d. None of the above
A. symbol table and code generator Ans. B
B. symbol table and code optimizer
C. code optimizer and code generator Q23. Which of the following suffices to convert
D. semantic analyzer and code optimizer an arbitrary CFG to an LL(1) grammar
Ans.: D a. Removing left recursion alone
b. Factoring the grammar alone
Q17. Grouping of characters into tokens is done c. Removing left recursion and factoring
by the the grammar
A. scanner d. None of the above
B. parser Ans. C
C. code generator
D. code optimizer Q24. The ‘k’ in LR(k) cannot be
Ans.: A a. 0
b. 1
Q18. Lexical analysis phase uses c. 2
A. regular grammar d. None of these
45

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Ans. d a. (LUD)+
Q25. Non backtracking form of the top-down b. L(LUD)*
parser are called c. (LD)*
d. L(LD)*
a. Recursive-descent parsers Note: Here U stands for Union
b. Predictive parsers
c. Shift reduce parsers Ans. b
d. None of these Q31. Backtracking is possible in
Ans. b
a. LR parsing
Q26. Parsing table in the LR parsing contains b. Predictive parsing
a. action, goto c. Recursive descent parsing
b. state, action d. None of the above
c. input, action Ans. c
d. state, goto Q32. Consider the following grammar
Ans. a
a. expr op expr
Q27. The condensed form of the parse tree is b. expr->id
called c. op-> + | *
a. L-attributed Which of the following is true?
b. Inherited attributed a. op and expr are start symbols
c. DAG b. op and id are terminals
d. Syntax tree c. expr is start symbol and op is terminal
Ans. d d. none of these
Q28. The post fix form of A-B/(C*D$E) is Ans. c

a. ABCDE$*/- Q33. To compute FOLLOW(A) for any


b. AB/C*DE$ grammar symbol A
c. ABCDE$-/* a. We must compute FIRST of some
d. ABCDE/-*$ grammar symbols
Ans. a b. No need of computing FIRST of some
Q29. Shift-reduce parsers are symbol
c. May compute FIRST of some symbol
a. top-down parsers d. None of these
b. bottom-up parsers Ans. a
c. both (a) and (b)
d. none of these 34. Which language is generated by the given
Ans. b grammar S-> 0S1 | 01

Q30. In some programming languages, an a. 0011


identifier is permitted to be a letter followed b. 00*11
by any number of letters or digits. If L and c. 001*1
D denote the sets of letter and digits d. 00*1*1
respectively, which expression defines an Ans. d
identifier? Q35. The space consuming but easy parsing is

46

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

a. LALR a. +-AB*C-D
b. SLR b. *+-ABCD
c. LR c. *+AB-CD
d. Predictive parser d. *AB+CD
Ans. a Ans. c
Q36. A top down parser generates Q42. Which of the following is true for the flow
of control among procedures during
a. left-most derivation
execution of program?
b. right-most derivation
c. right-most derivation in reverse a. Control flows randomly
d. left-most derivation in reverse b. Control flows line by line without
Ans. a jumping
c. Control flows sequentially
Q37. Synthesized attribute can easily be
d. None of these
simulated by an
Ans. c
a. LL grammar
Q43. In a syntax directed translation scheme, if
b. Ambiguous grammar
the value of an attribute of a node is a
c. LR grammar
function of the values of the attributes of its
d. None of the above
children, then it is called a
Ans. c
a. Synthesized attribute
Q38. Choose the false statement
b. Inherited attribute
a. LL(k) grammar has to be a CFG c. Canonical attribute
b. LL(k) grammar has to be unambiguous d. None of the above
c. There are LL(k) grammars that are Ans. a
not Context Free
Q44. Which of the following is not an
d. LL(k) grammars cannot have
intermediate code form?
left recursive non-terminals
Ans. c a. Postfix notation
b. Syntax trees
Q39. If a grammar is unambiguous then it is c. Three address code
surely be d. Quadruples
a. regular Ans. d
b. LL(1) Q45. Three address codes can be implemented
c. Both (a) and (b) by
d. Cannot say
Ans. d a. indirect triples
b. direct triples
Q40. Predictive parsing is a special case of c. both (a) and (b)
a. top down parsing d. none of the above
b. bottom up parsing Ans. a
c. recursive descent parsing Q46. In a bottom up evaluation of a syntax
d. none of the above directed definition, inherited attributes can
Ans. c be
Q41. The prefix form of (A+B)*(C-D) is
47

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

a. always be evaluated c. no unary operators


b. be evaluated only if the definition is L- d. none of the above
attributed Ans. a
c. be evaluated only if the definition has
synthesized attributes Q48. Inherited attribute is a natural choice in
d. none of the above
Ans. c a. keeping track of variable declaration
b. checking of the correct use of L-values
Q47. Three address code involves and R-values
c. both (a) and (b)
a. at the most 3 address d. none of the above
b. exactly 3 address Ans. c

48

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

ractical Exam Sample Paper: Compiler Design Lab


1. Write a program to develop a lexical analyzer to recognize pattern identifier in C

2. Write a program to develop a lexical analyzer to recognize pattern constants, comments in C

3. Write a program to develop a lexical analyzer to recognize pattern constants, operators in C

4. Write a program to implement recursive descent parser….

5. Write a program to convert an infix notation into prefix notation

6. Write a program to convert an infix notation into postfix notation

7. Write a program to calculate the value of a postfix notation

8. Explain all the phases of compiler

9. Write a program to find out whether a given expression is valid or not

10. Write a program to perform following operations on a link list: Creation, Insertion and Display

11. Write a program to perform following operations on a link list: Creation, Insertion and Deletion

12. Generate Lexical Analyzer using LEX

13. Write a program to recognize a valid arithmetic expression that uses operators +,-,*,/ using
YACC

14. Write a program to recognize a valid variable which starts with a letter followed by any number
of letters or digits using YACC

15. Write a program to recognize the grammar an where n>=10

16. Explain the concept of LEX and YACC

17. Write a program to find the Macro Statements in a given C file (Use C file as input file)

18. Write a program to find number of white space characters in a C file.

19. Write a program which will take two input strings. Find all the possible sub common strings
from the small String.

49

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Text and Reference Books


 Bennett, Jeremy Peter. Introduction to compiling techniques: a first course using ANSI C, LEX
and YACC. McGraw-Hill, Inc., 1996.
 Levine, John R., et al. Lex & yacc. " O'Reilly Media, Inc.", 1992.
 Aho, Alfred V., Jeffrey D. Ullman, and R. Sethi. "Principles of Compiler Construction." (1977).
 Louden, Kenneth C. "Compiler construction." Cengage Learning (1997).

50

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Downloaded by Surbhi Maheshwari


lOMoARcPSD|41236959

Downloaded by Surbhi Maheshwari (subhi.kabra@gmail.com)


lOMoARcPSD|41236959
lOMoARcPSD|41236959

ownloaded by Surbhi Maheshwari (subhi.kabra@gmail.com)


lOMoARcPSD|41236959

Downloaded by Surbhi Maheshwari (subhi.kabra@gmail.com)

You might also like