[go: up one dir, main page]

0% found this document useful (0 votes)
14 views17 pages

toc2

The document provides an overview of Context-Free Grammars (CFG), including definitions, production rules, derivation trees, and simplification techniques. It discusses ambiguity in CFGs, closure properties, and the Chomsky hierarchy classification of grammars. Additionally, it outlines methods for reducing CFGs, removing unit and null productions, and includes examples for better understanding.

Uploaded by

laleshpawar2025
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views17 pages

toc2

The document provides an overview of Context-Free Grammars (CFG), including definitions, production rules, derivation trees, and simplification techniques. It discusses ambiguity in CFGs, closure properties, and the Chomsky hierarchy classification of grammars. Additionally, it outlines methods for reducing CFGs, removing unit and null productions, and includes examples for better understanding.

Uploaded by

laleshpawar2025
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 17

An Education Empowered by industry . . .

FABTECH TECHNICAL CAMPUS


COLLEGE OF ENGINEERING AND RESEARCH
( Approved by AICTE, New Delhi, DTE., (M.S.) & affiliated to Solapur University & DBATU , Lonere)
Pandharpur Road, Gat No. 565/1, Sangola, Taluka:- Sangola, District:- Solapur. 413 307. P.O. Box No.04
M: 8408888657, 8408888612 Website: www.fabtecheducation.com E-mail : ftc.coer@gmail.com

BTCOC502 Theory of Computations

Unit 2: Context Free Grammars


Context Free Grammars: Definition, Production rules, Ambiguous grammar, Removal of

ambiguity, Chomsky hierarchy, Context Free Grammar (CFG) – definition, Simplification of

CFG.

Definition − A context-free grammar (CFG) consisting of a finite set of grammar rules


is a quadruple (N, T, P, S) where
 N is a set of non-terminal symbols.
 T is a set of terminals where N ∩ T = NULL.
 P is a set of rules, P: N → (N ∪ T)*, i.e., the left-hand side of the production
rule P does have any right context or left context.
 S is the start symbol.
Example

 The grammar ({A}, {a, b, c}, P, A), P : A → aA, A → abc.


 The grammar ({S, a, b}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε
 The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε

Generation of Derivation Tree


A derivation tree or parse tree is an ordered rooted tree that graphically represents the
semantic information a string derived from a context-free grammar.

Representation Technique
 Root vertex − Must be labeled by the start symbol.
 Vertex − Labeled by a non-terminal symbol.
 Leaves − Labeled by a terminal symbol or ε.
If S → x x …… x is a production rule in a CFG, then the parse tree / derivation tree will
1 2 n

be as follows −

There are two different approaches to draw a derivation tree −


Top-down Approach −
 Starts with the starting symbol S
 Goes down to tree leaves using productions
Bottom-up Approach −
 Starts from tree leaves
 Proceeds upward to the root which is the starting symbol S

Derivation or Yield of a Tree

The derivation or the yield of a parse tree is the final string obtained by concatenating
the labels of the leaves of the tree from left to right, ignoring the Nulls. However, if all
the leaves are Null, derivation is Null.
Example
Let a CFG {N,T,P,S} be
N = {S}, T = {a, b}, Starting symbol = S, P = S → SS | aSb | ε
One derivation from the above CFG is “abaabb”
S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
Sentential Form and Partial Derivation Tree

A partial derivation tree is a sub-tree of a derivation tree/parse tree such that either all
of its children are in the sub-tree or none of them are in the sub-tree.
Example
If in any CFG the productions are −
S → AB, A → aaA | ε, B → Bb| ε
the partial derivation tree can be the following −
If a partial derivation tree contains the root S, it is called a sentential form. The above
sub-tree is also in sentential form.

Leftmost and Rightmost Derivation of a String

 Leftmost derivation − A leftmost derivation is obtained by applying production


to the leftmost variable in each step.
 Rightmost derivation − A rightmost derivation is obtained by applying
production to the rightmost variable in each step.
Example
Let any set of production rules in a CFG be
X → X+X | X*X |X| a
over an alphabet {a}.
The leftmost derivation for the string "a+a*a" may be −
X → X+X → a+X → a + X*X → a+a*X → a+a*a
The stepwise derivation of the above string is shown as below −
The rightmost derivation for the above string "a+a*a" may be −
X → X*X → X*a → X+X*a → X+a*a → a+a*a
The stepwise derivation of the above string is shown as below −
Left and Right Recursive Grammars
In a context-free grammar G, if there is a production in the form X → Xa where X is a
non-terminal and ‘a’ is a string of terminals, it is called a left recursive production.
The grammar having a left recursive production is called a left recursive grammar.
And if in a context-free grammar G, if there is a production is in the form X →
aX where X is a non-terminal and ‘a’ is a string of terminals, it is called a right
recursive production. The grammar having a right recursive production is called
a right recursive grammar.

Ambiguity in Context-Free Grammars


If a context free grammar G has more than one derivation tree for some string w ∈
L(G), it is called an ambiguous grammar. There exist multiple right-most or left-most
derivations for some string generated from that grammar.
Problem
Check whether the grammar G with production rules −
X → X+X | X*X |X| a
is ambiguous or not.

Solution
Let’s find out the derivation tree for the string "a+a*a". It has two leftmost derivations.
Derivation 1 − X → X+X → a +X → a+ X*X → a+a*X → a+a*a
Parse tree 1 −

Derivation 2 − X → X*X → X+X*X → a+ X*X → a+a*X → a+a*a


Parse tree 2 −
Since there are two parse trees for a single string "a+a*a", the grammar G is
ambiguous.

CFL Closure Property


Context-free languages are closed under −

 Union
 Concatenation
 Kleene Star operation

Union
Let L and L be two context free languages. Then L ∪ L is also context free.
1 2 1 2

Example

Let L = { a b , n > 0}. Corresponding grammar G will have P: S1 → aAb|ab


1
n n
1

Let L = { c d , m ≥ 0}. Corresponding grammar G will have P: S2 → cBb| ε


2
m m
2

Union of L and L , L = L ∪ L = { a b } ∪ { c d }
1 2 1 2
n n m m

The corresponding grammar G will have the additional production S → S1 | S2

Concatenation
If L and L are context free languages, then L L is also context free.
1 2 1 2

Example
Union of the languages L and L , L = L L = { a b c d }
1 2 1 2
n n m m

The corresponding grammar G will have the additional production S → S1 S2

Kleene Star
If L is a context free language, then L* is also context free.

Example

Let L = { a b , n ≥ 0}. Corresponding grammar G will have P: S → aAb| ε


n n

Kleene Star L = { a b }*
1
n n

The corresponding grammar G will have additional productions S1 → SS | ε


1 1

Context-free languages are not closed under −


 Intersection − If L1 and L2 are context free languages, then L1 ∩ L2 is not
necessarily context free.
 Intersection with Regular Language − If L1 is a regular language and L2 is a
context free language, then L1 ∩ L2 is a context free language.
 Complement − If L1 is a context free language, then L1’ may not be context
free.

CFG Simplification
In a CFG, it may happen that all the production rules and symbols are not needed for
the derivation of strings. Besides, there may be some null productions and unit
productions. Elimination of these productions and symbols is called simplification of
CFGs. Simplification essentially comprises of the following steps −

 Reduction of CFG
 Removal of Unit Productions
 Removal of Null Productions

Reduction of CFG
CFGs are reduced in two phases −
Phase 1 − Derivation of an equivalent grammar, G’, from the CFG, G, such that each
variable derives some terminal string.
Derivation Procedure −
Step 1 − Include all symbols, W , that derive some terminal and initialize i=1.
1

Step 2 − Include all symbols, W , that derive W .


i+1 i
Step 3 − Increment i and repeat Step 2, until W = W .
i+1 i

Step 4 − Include all production rules that have W in it. i

Phase 2 − Derivation of an equivalent grammar, G”, from the CFG, G’, such that each
symbol appears in a sentential form.
Derivation Procedure −
Step 1 − Include the start symbol in Y and initialize i = 1.
1

Step 2 − Include all symbols, Y , that can be derived from Y and include all production
i+1 i

rules that have been applied.


Step 3 − Increment i and repeat Step 2, until Y = Y .
i+1 i

Problem

Find a reduced grammar equivalent to the grammar G, having production rules, P: S →


AC | B, A → a, C → c | BC, E → aA | e

Solution

Phase 1 −
T = { a, c, e }
W = { A, C, E } from rules A → a, C → c and E → aA
1

W = { A, C, E } U { S } from rule S → AC
2

W = { A, C, E, S } U ∅
3

Since W = W , we can derive G’ as −


2 3

G’ = { { A, C, E, S }, { a, c, e }, P, {S}}
where P: S → AC, A → a, C → c , E → aA | e
Phase 2 −
Y ={S}
1

Y = { S, A, C } from rule S → AC
2

Y = { S, A, C, a, c } from rules A → a and C → c


3

Y = { S, A, C, a, c }
4

Since Y = Y , we can derive G” as −


3 4

G” = { { A, C, S }, { a, c }, P, {S}}
where P: S → AC, A → a, C → c

Removal of Unit Productions


Any production rule in the form A → B where A, B ∈ Non-terminal is called unit
production..

Removal Procedure −

Step 1 − To remove A → B, add production A → x to the grammar rule whenever B →


x occurs in the grammar. [x ∈ Terminal, x can be Null]
Step 2 − Delete A → B from the grammar.
Step 3 − Repeat from step 1 until all unit productions are removed.
Problem
Remove unit production from the following −
S → XY, X → a, Y → Z | b, Z → M, M → N, N → a
Solution −
There are 3 unit productions in the grammar −
Y → Z, Z → M, and M → N
At first, we will remove M → N.
As N → a, we add M → a, and M → N is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → M, M → a, N → a
Now we will remove Z → M.
As M → a, we add Z→ a, and Z → M is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → a, M → a, N → a
Now we will remove Y → Z.
As Z → a, we add Y→ a, and Y → Z is removed.
The production set becomes
S → XY, X → a, Y → a | b, Z → a, M → a, N → a
Now Z, M, and N are unreachable, hence we can remove those.
The final CFG is unit production free −
S → XY, X → a, Y → a | b

Removal of Null Productions


In a CFG, a non-terminal symbol ‘A’ is a nullable variable if there is a production A →
ε or there is a derivation that starts at A and finally ends up with
ε: A → .......… → ε

Removal Procedure

Step 1 − Find out nullable non-terminal variables which derive ε.


Step 2 − For each production A → a, construct all productions A → x where x is
obtained from ‘a’ by removing one or multiple non-terminals from Step 1.
Step 3 − Combine the original productions with the result of step 2 and remove ε -
productions.
Problem
Remove null production from the following −
S → ASA | aB | b, A → B, B → b | ∈
Solution −
There are two nullable variables − A and B
At first, we will remove B → ε.
After removing B → ε, the production set becomes −
S→ASA | aB | b | a, A ε B| b | &epsilon, B → b
Now we will remove A → ε.
After removing A → ε, the production set becomes −
S→ASA | aB | b | a | SA | AS | S, A → B| b, B → b
This is the final production set without null transition.

Chomsky Classification of Grammars


According to Noam Chomosky, there are four types of grammars − Type 0, Type 1,
Type 2, and Type 3. The following table shows how they differ from each other −

Grammar Grammar Accepted Language Accepted Automaton


Type

Type 0 Unrestricted grammar Recursively enumerable Turing Machine


language

Type 1 Context-sensitive Context-sensitive language Linear-bounded


grammar automaton
Type 2 Context-free grammar Context-free language Pushdown automaton

Type 3 Regular grammar Regular language Finite state automaton

Take a look at the following illustration. It shows the scope of each type of grammar −

Type - 3 Grammar
Type-3 grammars generate regular languages. Type-3 grammars must have a single
non-terminal on the left-hand side and a right-hand side consisting of a single terminal
or single terminal followed by a single non-terminal.
The productions must be in the form X → a or X → aY
where X, Y ∈ N (Non terminal)
and a ∈ T (Terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule.

Example
X → ε
X → a | aY
Y → b

Type - 2 Grammar
Type-2 grammars generate context-free languages.
The productions must be in the form A → γ
where A ∈ N (Non terminal)
and γ ∈ (T ∪ N)* (String of terminals and non-terminals).
These languages generated by these grammars are be recognized by a non-
deterministic pushdown automaton.

Example
S → X a
X → a
X → aX
X → abc
X → ε

Type - 1 Grammar
Type-1 grammars generate context-sensitive languages. The productions must be in
the form
αAβ→αγβ
where A ∈ N (Non-terminal)
and α, β, γ ∈ (T ∪ N)* (Strings of terminals and non-terminals)
The strings α and β may be empty, but γ must be non-empty.
The rule S → ε is allowed if S does not appear on the right side of any rule. The
languages generated by these grammars are recognized by a linear bounded
automaton.

Example
AB → AbBc
A → bcA
B → b

Type - 0 Grammar
Type-0 grammars generate recursively enumerable languages. The productions have
no restrictions. They are any phase structure grammar including all formal grammars.
They generate the languages that are recognized by a Turing machine.
The productions can be in the form of α → β where α is a string of terminals and
nonterminals with at least one non-terminal and α cannot be null. β is a string of
terminals and non-terminals.

Example
S → ACaB
Bc → acB
CB → DB
aD → Db

Chomsky Normal Form


A CFG is in Chomsky Normal Form if the Productions are in the following forms −

 A→a
 A → BC
 S→ε
where A, B, and C are non-terminals and a is terminal.

Algorithm to Convert into Chomsky Normal Form −


Step 1 − If the start symbol S occurs on some right side, create a new start
symbol S’ and a new production S’→ S.
Step 2 − Remove Null productions. (Using the Null production removal algorithm
discussed earlier)
Step 3 − Remove unit productions. (Using the Unit production removal algorithm
discussed earlier)
Step 4 − Replace each production A → B …B where n > 2 with A → B C where C →
1 n 1

B …B . Repeat this step for all productions having two or more symbols in the right
2 n

side.
Step 5 − If the right side of any production is in the form A → aB where a is a terminal
and A, B are non-terminal, then the production is replaced by A → XB and X → a.
Repeat this step for every production which is in the form A → aB.
Problem

Convert the following CFG into CNF


S → ASA | aB, A → B | S, B → b | ε

Solution

(1) Since S appears in R.H.S, we add a new state S and S →S is added to the
0 0

production set and it becomes −


S →S, S→ ASA | aB, A → B | S, B → b | ∈
0

(2) Now we will remove the null productions −


B → ∈ and A → ∈
After removing B → ε, the production set becomes −
S →S, S→ ASA | aB | a, A → B | S | ∈, B → b
0

After removing A → ∈, the production set becomes −


S →S, S→ ASA | aB | a | AS | SA | S, A → B | S, B → b
0

(3) Now we will remove the unit productions.


After removing S → S, the production set becomes −
S →S, S→ ASA | aB | a | AS | SA, A → B | S, B → b
0

After removing S → S, the production set becomes −


0

S → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA


0

A → B | S, B → b
After removing A→ B, the production set becomes −
S → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
0

A→S|b
B→b
After removing A→ S, the production set becomes −
S → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
0

A → b |ASA | aB | a | AS | SA, B → b
(4) Now we will find out more than two variables in the R.H.S
Here, S → ASA, S → ASA, A→ ASA violates two Non-terminals in R.H.S.
0

Hence we will apply step 4 and step 5 to get the following final production set which is
in CNF −
S → AX | aB | a | AS | SA
0

S→ AX | aB | a | AS | SA
A → b |AX | aB | a | AS | SA
B→b
X → SA
(5) We have to change the productions S → aB, S→ aB, A→ aB
0

And the final production set becomes −


S → AX | YB | a | AS | SA
0

S→ AX | YB | a | AS | SA
A → b A → b |AX | YB | a | AS | SA
B→b
X → SA
Y→a

You might also like