[go: up one dir, main page]

0% found this document useful (0 votes)
254 views35 pages

Class 18 Context Free Grammar

Uploaded by

midhulasrikj3053
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
254 views35 pages

Class 18 Context Free Grammar

Uploaded by

midhulasrikj3053
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 35

UNIT III

Context Free Grammar (CFG)


UNIT III CONTEXT-FREE GRAMMAR AND LANGUAGES 9

Context Free Grammar (CFG) − Parse trees − Ambiguity in grammars and


languages − Definition of the pushdown automata − Languages of a pushdown
automata − Equivalence of pushdown automata and CFG − Deterministic
pushdown automata- closure properties and Pumping Lemma for CFG
CFG
A CFG is a way of describing language by
recursive rules or substitution rules called
productions.
A CFG consists of a set of variables, a set of
terminal symbols and a start variable as well
as the production.
Variable  represent by Capital letters
Terminal represent by lower case letters,numbers or
special symbols
Start variable  occurs on the LHS of the topmost rule.
Components of CFG
• Terminal finite set of symbols
• NTset of strings
• Start symbol
• ProductionT & NT can be combined to form
strings
Each production consists of a NT followed by
an arrow, followed by a string of NT and T.

CFG G=(V,T,P,S)
Example
The grammar ({A}, {a, b, c}, P, A)
P : A → aA, A → abc.

The grammar ({S}, {a, b}, P, S)


P: S → aSa, S → bSb, S → ε

The grammar ({S, F}, {0, 1}, P, S)


P: S → 00S | 11F, F → 00F | ε

S ⇾ aSb | ε
The implication is that V = {S}, Σ = {a,b}, the start
symbols is S and the rules are:
S ⇾ aSb
S⇾ε
Notational Conventions
1. These symbols are terminals:
i) Lower-case letters early in the alphabet: a, b, c
ii) Operator symbols: +, -
iii) Punctuation symbols: parentheses, comma
iv) Digits: 0, 1, …,9
v) Boldface strings: id, if
2. These symbols are nonterminal:
i) Upper-case letters early in alphabet: A, B, C
ii) The letter S: start symbol
iii) Lower-case italic names: expr, stmt
Notational Conventions (cont.)
3. Upper-case letters late in alphabet: X, Y, Z,
represent grammar symbols (terminals or
nonterminal)
4. Lower-case letters late in alphabet: u, v,…z,
represent string of terminals
5. Lower-case Greek letters : , , , represent
string of grammar symbols
6. A-productions (all productions): A  1| 2|…|
k
7. Start symbol: the left side of the first
production
Derivation:
A replacement according to a production is known
as derivation. (or)
A derivation is a sequence of steps which begins
with the start symbol, uses the rules to do
replacements, and ends with a terminal string

Reduction or recursive Inference:


Reduction is the process of replacing a string by an
NT according to a grammar production.
Generation of Derivation Tree
A derivation tree or parse tree is an ordered rooted tree that graphically represents
the semantic information a string derived from a context-free grammar.

Representation Technique
•Root vertex − Must be labeled by the start symbol.
•Vertex − Labeled by a non-terminal symbol.
•Leaves − Labeled by a terminal symbol or ε.

If S → x1x2 …… xn is a production rule in a CFG, then the parse tree / derivation tree
will be as follows −
Derivation
Expand the start symbol using one of its
productions and further expand the result
string by replacing one of the variable by
the body of one of its production, until we
derive a string consisting entirely of
terminals.
Derivations
E  E+E | E*E| (E) | - E | id
E derives -E
– E-E
The derivation of -(id) from E
− E  - E  - (E)  -(id)
A    , if A  

  : one step derivation


*
•  : zero or more steps derivations
•  : one or more steps derivations

Derivation or Yield of a Tree
The derivation or the yield of a parse tree is the final string
obtained by concatenating the labels of the leaves of the tree
from left to right, ignoring the Nulls. However, if all the leaves
are Null, derivation is Null.

Example
Let a CFG {N,T,P,S} be
N = {S}
T = {a, b}
Starting symbol = S
P = S → SS | aSb | ε
One derivation from the above CFG is “abaabb”
S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
Sentential Form and Partial Derivation Tree
A partial derivation tree is a sub-tree of a derivation tree/parse
tree such that either all of its children are in the sub-tree or none
of them are in the sub-tree.

Example
If in any CFG the productions are −
S → AB, A → aaA | ε, B → Bb| ε
the partial derivation tree can be the following

If a partial derivation tree contains the root S, it is called


a sentential form. The above sub-tree is also in sentential form.
Reduction
• It is used to recognize a valid string or sentence
w.
• Start with sentence w.
• Each substring of w should be replaced by the
left side of the production.
• This replacement should eventually reduce the
sentence to a start symbol s.
• Eg:
To recognize the sentence –(id + id) for the
arthimetic grammar
EE+E | E*E | (E) | -E | id
Consider the grammar EE*E | E+E | (E) |id
and the input string id+(id*id)

Derivation: Reduction:
EE+E Id+(id*id)
id +E E+(id*id)
id +(E) E+(E*id)
id+(E*E) E+(E*E)
id+(id*E) E+(E)
id+(id*id) E+E
E
Parse Tree
The parse tree for G are trees with the following
condition.
1.Each interior node is labeled by a variable in V.
2.Each leaf is labeled either by a variable, a
terminal or epsilon
3. If an interior node is labeled A, and its children
are labeled X1,X2,….Xk respectively, from the
left, then AX1,X2,….Xk is a production P.
Parse Tree and Derivations
E  E+E | E*E | (E) | - E | id

17
Parse Tree and Derivations
(cont.)
Left Most Derivation and Right
Most Derivation
Leftmost Derivation:
At each step, we replace leftmost variable by one
of its production bodies.
we indicate that a derivation is left most by,

Rightmost Derivation:
We replace rightmost variable is replaced by one
of its bodies.
Example
Let any set of production rules in a CFG be
X → X+X | X*X |X| a
over an alphabet {a}.
The leftmost derivation for the string "a+a*a" may be −
X → X+X → a+X → a + X*X → a+a*X → a+a*a

The stepwise derivation of the above string is shown as below −


The rightmost derivation for the above string "a+a*a" may be −
X → X*X → X*a → X+X*a → X+a*a → a+a*a
The stepwise derivation of the above string is shown as below −
S → aB / bA
A → aS / bAA / a
B → bS / aBB / b
string w = aaabbabbba

S → bB / aA
A → b / bS / aAA
B → a / aS / bBB
For the string w = bbaababa,
Ambiguity
• More than one parse tree for some sentences
• More than leftmost derivation for some
sentences
• More than rightmost derivation for some
sentences
Check whether the grammar G with production rules −
X → X+X | X*X |X| a is ambiguous or not.
Solution
Let’s find out the derivation tree for the string "a+a*a". It has two leftmost derivations.
Derivation 1 Derivation 2
X → X+X X → X*X
→ a +X → X+X*X
→ a+ X*X → a+ X*X
→ a+a*X → a+a*X
→ a+a*a → a+a*a
Parse tree 1 − Parse tree 2 −

Since there are two parse trees for a single string "a+a*a", the grammar G is
ambiguous.
Suppose we have a context free grammar G with production rules :
S -> aSb | bSa | SS | e
String : abab

The leftmost derivation of The rightmost derivation of string


string abab from grammar G abab from grammar G above is
above is done as : done as :
S => aSb S => SS
=> abSab => SaSb
=> abab => Sab
=> aSbab
=> abab
E→I
E→E+E
E→E*E
E → (E)
I → ε | 0 | 1 | 2 | ... | 9
String: "3 * 2 + 5"
Solution:
For the string "3 * 2 + 5", the above grammar can generate two parse trees by
leftmost derivation:

Since there are two parse trees for a single string "3 * 2 + 5", the grammar G
is ambiguous.
Check whether the given grammar G is ambiguous or not.
E→E+E
E→E-E
E → id
From the above grammar String "id + id - id"

First LMD Second LMD


E→E+E E→E-E
→ id + E →E+E-E
→ id + E - E → id + E - E
→ id + id - E → id + id - E
→ id + id- id → id + id - id

Check whether the given grammar G is ambiguous or not.


S → aSb | SS
S→ε
For the string "aabb"
A → AA
A → (A)
A→a
For the string "a(a)aa" the above grammar can generate two parse trees:
Regular Expression
vs. Context-free grammar
• Every language that can be described by a
regular expression can also be described by
a context-free grammar
– (a|b)*abb
– A0  aA0 | bA0 | aA1
A1 bA2
A2 bA3
A3 
• Every regular set is a context-free language
The Grammar was constructed from the NFA using
the following construction:
1.For each state i of NFA, create a nonterminal symbol Ai.
2.If state i has a transition to state j on symbol a, introduce the
production AiaAj.
3.If state i goes to state j on input Є introduce the production AiAj.

4.If it is an accepting state, introduce AiЄ


5.If i is the start state, make Ai be the start symbol of the grammer.
Resolving Problems/Difficulties – (2)
Since Reg. Lang.  Context Free Lang., it is possible
to go from reg. expr. to CFGs via NFA.

Recall: (a | b)*abb

a
start a b b
0 1 2 3

b
Resolving Problems/Difficulties – (3)
Construct CFG as follows:
1. Each State I has non-terminal Ai : A0, A1, A2, A3
i a j
2. If then Ai a Aj
i b j
3. If then Ai bAj

4. If I is an accepting state, Ai  : A3  

5. If I is a starting state, Ai is the start symbol : A0


T={a,b}, NT={A0, A1, A2, A3}, S = A0
PR ={ A0 aA0 | aA1 | bA0 ;
a
A1  bA2 ; start
0 a 1 b 2 b 3
A2  bA3 ;
A3   } b
How Does This CFG Derive Strings ?
a
start a b b
0 1 2 3

vs.
A0 aA0, A0 aA1
A0 bA0, A1 bA2
A2 bA3, A3 
Left Recursive Grammars
In a context-free grammar G, if there is a production in the
form X → Xa where X is a non-terminal and ‘a’ is a string of
terminals, it is called a left recursive production. The
grammar having a left recursive production is called a left
recursive grammar.

Right Recursive Grammars


In a context-free grammar G, if there is a production is in the
form X → aX where X is a non-terminal and ‘a’ is a string of
terminals, it is called a right recursive production. The
grammar having a right recursive production is called a right
recursive grammar.
Eliminating Immediate Left
Recursion
• A grammar is left recursive if it has a
production A+A
• Top-down parsing methods cannot handle
left-recursion grammars because top-down
parsing is corresponding to the leftmost
derivation.
A  A1 | A 2 | ... | A m | 1 |  2 | ... |  n
 A  1 A' |  2 A' | ... |  n A'
A'  1 A' |  2 A' | ... |  m A' | 
35

You might also like