Bachelor of Software Engineering (Hons)
Bachelor of Computer Science (Hons)
ITS64304: Theory of Computation
Tutorial – 1: Language and Grammar
Complete Discussion
Aim
The aim of this tutorial is to learn regular and context free grammar. By the end of this tutorial,
you should be able to analyze and construct grammars for a given specification and determine
the language for a given grammar. (Aligns with Module Learning Outcome 1)
Formal language
Taylor’s Graduate Capabilities (TGCs) developed
1.1 Grammars
Examples
1) Construct a grammar over {a, b} that recognizes the language {aib2i | i ≥ 1}
e.g., valid strings are: abb, aabbbb, aaabbbbbb, etc.
S → abb | aSbb
S => abb
S => aSbb => aabbbb
S => aSbb => aaSbbbb => aaabbbbbb
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 1
Solution
1) Start writing the shortest possible string followed by the next shortest and so on, until you
could see a pattern. The pattern seen could be then written as a production rule.
The shortest string possible in this case is abb, then aabbbb, then aaabbbbbb and so on. You
repeat a once and b twice every time. This could be written as S → aSbb, and we also need S
→ abb. Hence the grammar for the above question could be written as, S → aSbb | abb
Questions
1. For each of the following languages, construct a grammar that recognizes it.
a) L = {w {a, b} | w has even length}
E.g., valid strings are: λ, aa, ab, ba, bb, aaaa, aaab, aabb, etc.
//One valid grammar for this question is available in Lecture 1 slides. Please refer, go thru’
it to understand the thought process in arriving at the solution.
S → Oa | Ob | e
O → aS | bS
Validation:
S => Oa => aSa => aa
S => …..
S => …..
etc.
The above grammar is one possible right solution. There may be other possible solutions
with a different approach producing different set of production rules that recognizes the
given language.
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 2
b) L = {w {a, b}* | w has odd length}
* = This asterisk symbol must be read as Kleene star (named after the famous
mathematician, Stephen Cole Kleene (pronounced as /ˈkleɪni/ KLAY-nee; i). You all can
read it as Clean Star, no harm. * represents the length of the string, or repetition, or
iteration. They all are synonymous. * can take a value from 0 to ∞.
{a, b}* // * can take a value from 0 to ∞
* = 0; then the length of the string is zero. Meaning the string is an empty string which
we represent as λ
* = 1; the length of the string made-up of the alphabets a, b (read as a or b) is one
character in length. Meaning string can be a, b
* = 2; the length of the string made-up of any combination of a, b is two characters in
length. Meaning strings can be aa, ab, ba, bb
* = 3; the length of the string made-up of any combination of a, b is three characters in
length. Meaning strings can be aaa, bbb, aab, aba, abb, baa, bab, bba
And so on….
Example valid strings for the given language are:
a, b, aaa, bbb, aab, aba, abb, baa, bab, bba, aaaaa, babab, ……….., aaaaaaa, bababab etc.
S → a | b | SSS //lower-case letters are called as terminals. Usually, alphabets will be
// represented in lower-case. Strings will be made-up of alphabets.
//Upper-case letters are called as non-terminals. Meaning, they can be
// replaced or expanded.
The above grammar is one possible right solution. There may be other possible solutions
with a different approach producing different set of production rules that recognizes the
given language.
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 3
Validation:
Derivations must be done to confirm/prove/rationale to support that the grammar
constructed is the right solution for the given language specification. The grammar
constructed must accept/produce all pattern of strings that are valid in the given language
and must not accept/produce even one invalid string.
S => a
S => b
S => SSS => aSS => aaS => aaa
S => SSS => bSS => bbS => bbb
S => SSS => bSS => baS => bab
S => SSS => SSSSS =>
S => SSS => SSSSS => SSSSSSS => ………
c) L = {w {a, b}* | w has b’s followed by a’s}
//this language specification means, one or more b is followed by one or more a.
E.g., valid strings are: ba, baaaaaaaaaaaaaaa, bbbbbbbbbba,
baba, bbbaaaaaba bbbbaabababa These are some examples of invalid strings in the
given language specification.
Assignment 1 task:
Based on direction given in class, please do this problem as take-home exercise as part of
your Assignment 1. You are strongly encouraged to use the Padlet on myTIMeS under this
module site to share your answer to initiate appreciation or constructive criticism from
your peers on your solution. This would help to learn from each other among students as
peers in terms of collaborative learning.
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 4
d) L = {w {a, b}* | w has ≥ 3b’s}
E.g. valid strings are: bbb, abbb, babb, bbab, bbba, abbbaaaaababa, bbaba,
babababababa
S → AbAbAbA
A → λ | aA | bA
S => AbAbAbA => bAbAbA => bbAbA => bbbA => bbbbA => bbbbbA => bbbbbbA =>
bbbbbb
As a learning exercise discuss with your ‘learning groups’ in challenging yourself to
reconstruct the above grammar which is a Context Free Grammar (CFG) into a Regular
Grammar. You may share your answer in the Padlet on myTIMeS under this module site
to initiate appreciation or constructive criticism from your peers on your solution. This is
not part of Assignment 1.
e) L = {anbn | n ≥ 0}
Example valid strings are: λ, ab, aabb, aaabbb etc.
S → λ | ab | aSb S → λ | aSb
The above grammars are correct for the given language specification. However, the
second grammar is the simplest possible grammar as it has a smaller number of production
rules compared the 1st solution.
S => λ
S => aSb => ab
S => aSb => aaSbb => aabb
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 5
f) L = {anbn cm| n ≥ 0, m ≥ 1}
E.g., valid strings are: c, cc, ccc, and so on… abc, abcc, abccc, aabbcc, aaabbbccc, aabbc,
aaabbbc and so on.
S → KJ
K → e | aKb
J → c | Jc
The above grammar is one possible right solution given by one of the students in your
class during the tutorial session. There are other possible grammars or solutions.
S => KJ => J => Jc => cc
S => KJ => aKbJ => abJ => abc
S => KJ => aKbJ => aaKbbJ => aabbJ => aabbJc => aabbcc
g) L = {ambn cm| m ≥ 0, n ≥ 1}
e.g., strings are: b, bb, bbb etc. abc, aabcc, aaabccc, abbc, abbbc, aabbcc, aabbbcc, etc.
S -> J | aJc
J -> b | bJ
S => aJc => abJc => abbJc => abbbc
S => ………………. =>* aabcc
S -> aSc | J
J -> b | Jb
S → aSc | b | Sb
The above are possible solutions or grammar for the given language specification.
However, the 2nd solution given is the smallest possible grammar as it has lesser number
of production rules compared to the 1st solution.
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 6
S => aSc => aaScc => aaaSccc => aaaJccc => aaaJbaaa => aaabbaaa
2. For each of the following Regular Expressions, write a regular grammar that defines the same
language.
Regular Grammar: The left-hand side of the rule has only one letter and essentially it must
be a non-terminal (upper-case letters). On the right-hand side of the rule at the maximum it
can have only two characters.
Context Free Grammar (CFG): The left-hand side of the rule has only one letter and essentially
it must be a non-terminal. There are no restrictions on the right-hand side of the rule.
Meaning, the right-hand side of the rule can have any number of characters.
a) L = a*b*c*
E.g., valid strings are λ, a, b, c, ab, ac, bc, aaaabcccccc, abbbc, aabccccc, aaaaa, bbbbb,
ccccc, etc.
Invalid strings are cb, ca, cab, cba etc.
Assignment 1 task:
Based on direction given in class, Pls do this problem as take-home exercise as part of
your Assignment 1. You are strongly encouraged to use the Padlet on myTIMeS under this
module site to share your answer to initiate appreciation or constructive criticism from
your peers on your solution. This would help to learn from each other among students as
peers in terms of collaborative learning.
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 7
b) L = (a υ b υ c)* a(a υ b υ c)* b(a υ b υ c)* υ (a υ b υ c)* b(a υ b υ c)* a(a υ b υ c)*
If * = 0; then L = ab U ba
If * = 1; then possible strings are aabbc
If A= (a υ b υ c)*, then L = AaAbA U AbAaA
The valid strings possible for (a υ b υ c)* are λ, a, b, c, aa, ab, ac, bb, bc, ba, cc, ca, cb, aaa,
bbb, ccc, aab, aba, etc.,
S → A aA bA | A bA aA
A → e | aA | bA | cA
A => aA => abA => ab
Note: The above is a Context Free Grammar (CFG). The question given asks you to write
a regular grammar.
As a learning exercise discuss with your ‘learning groups’ in challenging yourself to
reconstruct the above CFG into a Regular Grammar. You may share your answer in the
Padlet on myTIMeS under this module site to initiate appreciation or constructive criticism
from your peers on your solution. This is not part of Assignment 1.
c) a(a υ b υ c)* b(a υ b υ c)*
S → aAbA
A → e | aA | bA | cA
Note: The above is a Context Free Grammar (CFG). The question given asks you to write
a regular grammar.
As a learning exercise discuss with your ‘learning groups’ in challenging yourself to
reconstruct the above CFG into a Regular Grammar. You may share your answer in the
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 8
Padlet on myTIMeS under this module site to initiate appreciation or constructive criticism
from your peers on your solution. This is not part of Assignment 1.
Reflection
Are Context Free Grammars more powerful than regular expressions? In the Padlet available on
myTIMeS reflect your thoughts.
-oo0oo-
i
https://en.wikipedia.org/wiki/Stephen_Cole_Kleene#cite_ref-3
ITS64304 Tutorial - 1: Language and Grammar – Complete discussion 9