UNIT 1 Notes CD

The document provides an overview of compilers, assemblers, and cross compilers, detailing their functions, phases, and types. It explains the front-end and back-end processes of compilers, including lexical analysis, syntax analysis, semantic analysis, and code generation. Additionally, it discusses input buffering methods and compiler construction tools that facilitate the development of compilers.


Compiler:

• A compiler is a program that translates code from a source language to a target language.
• It converts a High-Level Language (HLL) program into a Machine-Level Language (MLL) program.
• It detects errors and reports them to the user.
• It automates the translation process for faster and more efficient execution.
• It is essential for executing programs written in high-level languages.
• It produces an optimized, executable version of the program.

LIST OF COMPILERS
1. Ada compilers
2. ALGOL compilers
3. BASIC compilers
4. C# compilers
5. C compilers
6. C++ compilers
7. COBOL compilers
8. Java compilers

Assembler
• Assembler is a program that converts an assembly language program into
machine language code.
• It automates the translation process for assembly language into machine
language.
• Assembly language uses mnemonics (symbols) to represent machine
instructions.
• Programmers use assembly language as it is easier to read and write than
machine code.
• The input to an assembler is called the source program.
• The output of an assembler is a machine language translation known as the
object program.
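The translation an assembler performs can be sketched in a few lines. The instruction set below is invented purely for illustration (a real assembler targets a specific CPU's encoding):

```python
# Minimal sketch of an assembler for a hypothetical two-operand ISA.
# Opcode table: mnemonic -> opcode bits (values invented for illustration).
OPCODES = {"LOAD": 0x1, "ADD": 0x2, "STORE": 0x3}

def assemble(source):
    """Translate each 'MNEMONIC operand' line of the source program into a
    one-byte machine word: high nibble = opcode, low nibble = operand."""
    program = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        word = (OPCODES[mnemonic] << 4) | int(operand)
        program.append(word)
    return program
```

For example, `assemble("LOAD 5\nADD 3\nSTORE 7")` returns the object program `[0x15, 0x23, 0x37]`: the source program goes in, the machine-language translation comes out.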
Cross Compiler
• A cross compiler creates programs for a different system than the one it runs
on.
• It helps developers write code on one device and use it on another.
• Commonly used for embedded systems, IoT devices, and operating systems.
• Useful for devices with limited power and resources.
• Helps in making software for different types of hardware.

The T-Diagram is a standard way to represent the three languages involved in a compiler:

• S (Source language): The language being compiled (e.g., C, C++).
• I (Implementation language): The language in which the compiler itself is written (e.g., C).
• T (Target language): Machine code or an intermediate language for the target platform.

Working of a Cross Compiler:

• Source code is written on the host machine.
• The cross compiler takes this source code and generates a target-compatible binary.
• This binary is then transferred to and run on the target machine.
Where Cross Compilers are Used
• Bootstrapping New Platforms: For building initial tools on a new or bare-metal system.
• Embedded Systems Development: Where targets have limited resources and cannot run full toolchains.
• Microcontrollers: They typically lack an operating system, so code is compiled off-device.
• Platform Separation: Keeps the development (host) and execution (target) environments cleanly separated.
• Paravirtualization: Enables compilation of code for various OSs from a single system.

Front End
• The front end handles source-language processing and is independent of the target machine.
• It includes lexical analysis, syntax analysis, semantic analysis, intermediate code generation, and creation of the symbol table.
• A certain amount of code optimization can also be done by the front end.

Phases of the Front End:

1. Lexical Analysis (Scanning)
o The lexical analyzer is the first phase of the compiler.
o Reads input characters and produces a sequence of tokens.
o Removes comments and unnecessary white space.
o Acts as a subroutine for the parser.
2. Syntax Analysis (Parsing)
o Syntax analysis is also called hierarchical analysis or parsing.
o Checks the code structure against grammar rules.
o Identifies syntax errors and generates a syntax tree if the code is correct.
3. Semantic Analysis
o Ensures the code is meaningful.
o The semantic analyzer determines the meaning of a source string.
o Verifies type compatibility, scope, and statement correctness (e.g., matching parentheses, if-else conditions).
4. Intermediate Code Generation
o Produces an intermediate representation, making translation easier.
o Uses a format like Three-Address Code for better optimization.
5. Symbol Table Creation
o Stores details about variables, functions, constants, class names, etc.
o Helps in later phases like optimization and code generation.
o Holds information about the following entities:
   Variable/identifier
   Procedure/function
   Keyword
   Constant
   Class name
   Label name
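At its simplest, a symbol table is a map from names to their attributes. The sketch below (entry fields are illustrative, not a fixed standard) shows how later phases can insert and look up the kinds of entities listed above:

```python
# Sketch of a symbol table as a dictionary keyed by name.
# The attribute fields (kind, type, scope, ...) are illustrative choices.
symbol_table = {}

def insert(name, kind, **attrs):
    """Record an entity (variable, function, constant, class, label, ...)."""
    symbol_table[name] = {"kind": kind, **attrs}

def lookup(name):
    """Return the entry for a name, or None if it is undeclared."""
    return symbol_table.get(name)

insert("count", kind="variable", type="int", scope="local")
insert("main", kind="function", return_type="int", params=[])
```

A lookup such as `lookup("count")` then supplies the type and scope information that semantic analysis and code generation need.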

Back End
• The back end consists of the phases that depend on the target machine and do not depend on the source program.
• It focuses on generating optimized, machine-dependent code.

Phases of the Back End:

1. Code Optimization
o Improves intermediate code for efficiency.
o Reduces execution time and memory usage.
o Example:

t1 = id3 * 2.0
t2 = id2 * t1
id1 = id1 + t2

2. Code Generation
o Converts intermediate code into machine instructions.
o Example:

MOV id3, R1
MUL #2.0, R1
MOV id2, R2
MUL R2, R1
MOV id1, R2
ADD R2, R1
MOV R1, id1

3. Error Handling & Symbol Table Operations
o Manages errors detected during compilation and reports meaningful error messages.
o Uses the symbol table for variable and function details.
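One common optimization the code optimizer applies is constant folding: computing constant subexpressions at compile time. The sketch below runs it over three-address code represented as tuples (the tuple layout and operand names are illustrative):

```python
import operator

# Sketch: constant folding over three-address code, where each statement
# is a (dest, left, op, right) tuple; the representation is illustrative.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(tac):
    """Replace operations whose operands are both constants with their value."""
    out = []
    for dest, left, op, right in tac:
        if isinstance(left, (int, float)) and isinstance(right, (int, float)):
            out.append((dest, OPS[op](left, right), None, None))  # folded
        else:
            out.append((dest, left, op, right))  # left for the code generator
    return out
```

For instance, `("t1", 2.0, "*", 3.0)` folds to `("t1", 6.0, None, None)`, while `("t2", "id2", "*", "t1")` is left unchanged because `id2` is only known at run time.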
Input Buffering
Overview

• The lexical analyzer scans the input left to right one character at a time.
• Two pointers are used:
o Begin Pointer (bp): Marks the start of a lexeme.
o Forward Pointer (fp): Moves ahead to find the end of a lexeme.
• Input buffering improves efficiency by reading input in larger chunks instead of one
character at a time.

Why Input Buffering?

• Reading one character at a time is slow and inefficient due to frequent system
calls.
• Input buffering reduces system calls, improving performance.
• It simplifies compiler design by managing input more efficiently.

Methods of Input Buffering

1. One Buffer Scheme

• Uses a single buffer to store input.
• If a lexeme is too long, it may cross the buffer boundary, requiring the buffer to be refilled.
• Issue: refilling may overwrite part of the lexeme.

2. Two Buffer Scheme

• Uses two alternating buffers to store input.
• Process:
o bp and fp start at the first character of the first buffer.
o fp moves right until it finds a blank space, marking the end of a lexeme.
o A special character (sentinel, eof) marks the end of each buffer.
o When fp reaches eof, the next buffer is loaded, ensuring smooth scanning.
• Issue: If a lexeme is longer than both buffers combined, scanning fails.
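The two-buffer process above can be simulated in a few lines. This is only a sketch (the buffer size and sentinel character are arbitrary choices), but it shows the forward pointer walking to the sentinel and the reload of the next buffer:

```python
EOF = "\0"    # sentinel marking the end of each buffer
BUF_SIZE = 8  # arbitrary buffer size for illustration

def scan_lexemes(text):
    """Sketch of two-buffer scanning: the input is loaded into fixed-size
    buffers, each terminated by a sentinel; the forward pointer advances
    until whitespace or the sentinel, and reaching the sentinel triggers
    loading the next buffer."""
    chunks = [text[i:i + BUF_SIZE] + EOF for i in range(0, len(text), BUF_SIZE)]
    lexemes, current = [], []
    for buf in chunks:            # loading the next buffer
        fp = 0
        while buf[fp] != EOF:     # forward pointer walks toward the sentinel
            ch = buf[fp]
            if ch.isspace():      # blank space marks the end of a lexeme
                if current:
                    lexemes.append("".join(current))
                    current = []
            else:
                current.append(ch)
            fp += 1
    if current:
        lexemes.append("".join(current))
    return lexemes
```

Running `scan_lexemes("int a = 5 ;")` yields `["int", "a", "=", "5", ";"]`; note that a lexeme crossing a buffer boundary is still assembled correctly because the partial lexeme is carried over the reload.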

Advantages of Input Buffering

✅ Reduces system calls, improving performance.
✅ Simplifies compiler design by making input management more efficient.

Disadvantages of Input Buffering

❌ Large buffers may consume excessive memory, slowing performance or causing crashes.
❌ Improper buffer management can cause errors in compilation.
Phases of Compiler:

1. Lexical Analyzer:
• Breaks the source code into tokens (words, symbols, numbers).
• Removes unnecessary spaces and comments.
2. Syntax Analyzer:
• Checks if the code follows grammar rules.
• Creates a tree structure (parse tree) for further processing.
3. Semantic Analyzer:
• Ensures meaning is correct (e.g., proper variable usage).
• Checks type compatibility and function calls.
4. Intermediate Code Generator:
• Converts the code into an easy-to-optimize intermediate format.
• Makes it easier to translate into machine code later.
5. Code Optimizer:
• Improves code efficiency by reducing unnecessary operations.
• Helps in making the program run faster.
6. Code Generator:
• Translates optimized intermediate code into machine code.
• Produces the final executable program.
Lexical Analyzer:
• The lexical analyzer is the first phase of a compiler that processes source code into
tokens.
• It reads the input source code character by character to analyze its structure.
• The code is broken into meaningful tokens such as keywords, identifiers, and
symbols.
• Whitespaces, comments, and unnecessary characters are ignored to improve
efficiency.
• It detects invalid characters and reports lexical errors during compilation.
• The lexical analyzer operates as an independent module within the compiler.
Functions of Lexical Analyzer:
• Reads Source Code → Takes program code as input.
• Breaks into Tokens → Splits code into meaningful elements (tokens).
• Removes Whitespaces & Comments → Ignores unnecessary spaces and comments.
• Identifies Keywords & Symbols → Recognizes reserved words, operators, and
identifiers.
• Generates Token Stream → Converts the code into a sequence of tokens for the next
compiler phase.
• Handles Errors → Detects errors like invalid characters.
Example:
int a = 5;
The lexer produces tokens like:
• int → Keyword
• a → Identifier
• = → Operator
• 5 → Number
• ; → Symbol
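The tokenization above can be sketched with regular expressions. The token categories follow the example; the exact patterns and category names are illustrative choices, not a fixed standard:

```python
import re

# Sketch of a lexer: regular expressions define the token patterns,
# and identifiers matching a reserved word are reclassified as keywords.
KEYWORDS = {"int", "if", "else", "while", "return"}
TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[=+\-*/]"),
    ("SYMBOL",     r"[;(){}]"),
    ("SKIP",       r"\s+"),          # whitespace is ignored
]
PATTERN = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(code):
    """Produce the (category, text) token stream for the next phase."""
    tokens = []
    for m in PATTERN.finditer(code):
        kind, text = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens
```

`tokenize("int a = 5;")` returns exactly the stream listed above: keyword, identifier, operator, number, symbol.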

Compiler Construction Tools


Compiler construction involves several complex phases, and using
specialized tools simplifies the development process.
Below are the tools, their working, and how they relate to the
diagrams you shared:
Parser Generator
• Creates a parser (syntax analyzer) for checking the structure of
code.
• Uses context-free grammar to define the rules of a
programming language.
• Examples: PIC, EQM
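The kind of code a parser generator derives from a grammar can be written by hand as a recursive-descent sketch. The toy grammar below is invented for illustration (expressions of numbers, `+`/`-`, and parentheses):

```python
# Toy grammar (illustrative):
#   expr -> term (('+'|'-') term)*
#   term -> NUMBER | '(' expr ')'
# A parser generator derives code with this shape from the grammar rules.
def parse(tokens):
    """Check that a token list conforms to the toy grammar; raise on error."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {peek()!r}")
        pos += 1

    def term():
        nonlocal pos
        tok = peek()
        if tok == "(":
            eat("(")
            expr()
            eat(")")
        elif isinstance(tok, int):
            pos += 1                 # consume a NUMBER token
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def expr():
        term()
        while peek() in ("+", "-"):
            eat(peek())
            term()

    expr()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return True
```

For example, `parse([1, "+", "(", 2, "-", 3, ")"])` succeeds, while `parse([1, "+"])` raises a syntax error, which is exactly the structure checking described above.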

Scanner Generator
• Creates a lexical analyzer (scanner) that breaks code into
tokens (like keywords, variables, symbols).
• Uses regular expressions to define the patterns of tokens.
• Example: Lex
Syntax Directed Translation Engines
• Convert a parse tree into intermediate code (often in three-address format).
• Work by moving through (traversing) each node of the parse tree.
• Each node has one or more translations associated with it.
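That traversal can be sketched directly: a postorder walk of an expression tree that emits one three-address statement per interior node. The tuple-based tree format and temporary naming are illustrative:

```python
import itertools

# Sketch: postorder traversal of an expression tree emitting three-address
# code; nodes are (op, left, right) tuples, leaves are operand name strings.
counter = itertools.count(1)

def gen_tac(node, code):
    """Return the name holding this node's value, appending TAC to `code`."""
    if isinstance(node, str):        # leaf: identifier or literal name
        return node
    op, left, right = node
    l = gen_tac(left, code)          # translate children first (postorder)
    r = gen_tac(right, code)
    temp = f"t{next(counter)}"       # fresh temporary for this node
    code.append(f"{temp} = {l} {op} {r}")
    return temp

code = []
result = gen_tac(("+", "id1", ("*", "id2", "id3")), code)
```

For `id1 + id2 * id3` this produces `t1 = id2 * id3` followed by `t2 = id1 + t1`, with `t2` as the result name: each tree node contributes its translation as it is visited.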

Automatic Code Generators
• Convert intermediate code into machine code for the target machine.
• Use a set of rules and templates to match operations.
• Help generate the final executable code.

Data-Flow Analysis Engines
• Used for code optimization.
• Track how values move through the program.
• Help detect and remove unnecessary code and improve performance.
• A key step in making the code run faster and more efficiently.

Compiler Construction Toolkits
• A set of tools used to build different parts of a compiler.
• Help automate and speed up the development of compilers.
• Useful both for building a full compiler and for building specific components.
