COS 335-Automata Theory and Formal Languages
COS 335-Automata Theory and Formal Languages
Course Content
Basic Set theory; Function; Alphabet; String, String Concatenation;
Definition of Automaton; Classes of automata: Deterministic and Non-Deterministic
automata; Grammar; Types of Grammar: Context-Free and Context-Sensitive; Formal
Languages; Regular Expression; Turing Machine.
Mrs. Ezugwu E. A
Compiled by £instein
BASIC SET THEORY
The Union of the sets A and B, denoted by 𝐴 ∪ 𝐵 is the set of all objects that are a member of A or B
or both. For example, the union of {1, 2, 3} and {2, 3, 4} is the set {1, 2, 3, 4}.
Intersection of the sets A and B, denoted by 𝐴 ∩ 𝐵, is the set of all objects that are members of both
A and B. for example, the intersection of {1, 2, 3} and {2, 3, 4} is the set {2, 3}.
Then, the compliment of set A, will de-note all those elements which belong to the universal set but
not A.
𝐴′ = {8, 10}.
LANGUAGES
This is the body of words and method of combining words understood by a considerable
community. The languages we consider for our discussion is an abstraction of Natural Languages,
that is, our focus here is on formal languages that need precise and formal definitions. Programming
languages belong to this category.
SYMBOLS
Symbols are indivisible objects or entity that cannot be defined. That is, symbols are the
atoms of the world of languages. A symbol is any single objects such as [a, 0, 1, #].
ALPHABETS
The usual symbols used in programming languages are digits, letters and special characters.
The alphabet of a language is normally denoted by Σ. When more than one alphabets are considered
for discussion, then subscripts may be used. e.g. (Σ1, Σ2 etc.) or sometimes other symbols like G may
also be introduced.
LENGTH OF STRING
The number of symbols in a string 𝛼 is called its length denoted by|𝛼|.
Example: let 𝛼 = 010011
What is |𝛼| of 𝛼?
Solution: |𝛼| = 6
EMPTY STRING
A string whose length is zero and denoted by ∈ or 𝜆 or ∧ i.e epsilon or Lambda or cap.
STRING OPERATIONS
CONCATENATION OF STRINGS
The concatenation of 𝑋 𝑎𝑛𝑑 𝑌 denoted by 𝑋𝑌, is the string 𝑎1𝑎2𝑎3𝑎4…𝑎n 𝑏1𝑏2𝑏3𝑏4…𝑏w, that is, the
concatenation of 𝑋 𝑎𝑛𝑑 𝑌 denoted by 𝑋𝑌 is the string that has a copy of 𝑋 followed by a copy of 𝑌
without any intervening space between them. This is shown below:
𝑋𝑌 = 𝑎1𝑎2𝑎3𝑎4…𝑎n𝑏1𝑏2𝑏3𝑏4…𝑏w
REFLECTION/REVERSAL OF A STRING
Let 𝑋 = 𝑎1𝑎2𝑎3… 𝑎 n, the reflection of 𝑋 is written as 𝑋T. For any string 𝑋 = 𝑎1𝑎2𝑎3… 𝑎n, the
reflection of the string is:
Example: 𝑌 = 𝑎𝑏𝑐𝑑 :
𝑌T = 𝑑𝑐𝑏𝑎
POWER OF STRING
For any string 𝑋, the power of string (𝑋n) is defined as the number of times the string occurs.
-: 𝑋3 = 011011011
𝑋1 = 011
LANGUAGE
A language is a set of words and operations on those words. Σ* is the set of all strings on Σ. A
language over an alphabet is a set of strings over that alphabet. In general, a language is not the set
of all the sentences derivable from Σ, therefore, a language 𝐿 is any subset of Σ*. That is, any 𝐿 ⊆ Σ*
is a language. e.g. Σ = {𝑎, 𝑏, 𝑐}
𝐿1 ∩ 𝐿2 = {𝑎}
COMPLIMENTATION
Usually, Σ∗ is the universal set that the compliment is taken with respect to, Thus for a language 𝐿,
the compliment is 𝐿̅ = {𝑋 ∈ Σ∗ | 𝑋 ∉ 𝐿}.
Example: let 𝐿 = {𝑋 |𝑋| 𝑖𝑠 𝑒𝑣𝑒𝑛}, then its compliment is the language 𝐿̅ = { 𝑋 ∈ Σ ∗ | |𝑋| 𝑖𝑠 𝑜𝑑𝑑}.
REFLECTION OF A LANGUAGE
Example:
LANGUAGE CONCATENATION
𝐿1 𝐿2 ≠ 𝐿2 𝐿1
𝐿(Σ) = 𝐿 = (Σ)
POWER OF A LANGUAGE L
𝐿𝑛 = 𝐿. 𝐿. 𝐿 … 𝑛 𝑡𝑖𝑚𝑒𝑠 i.e. if 𝑥 ∈ 𝐿 𝑡ℎ𝑒𝑛 𝑥 𝑛 ∈ 𝐿. If the language is 𝐿, then for any language at all;
𝐿0 = Σ 𝑒𝑚𝑝𝑡𝑦,
𝐿1 = 𝐿. 𝐿 (𝑚𝑒𝑎𝑛𝑠 𝑚𝑢𝑙𝑡𝑖𝑝𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛)
𝐿3 = 𝐿. 𝐿. 𝐿
Example
𝐿∗ = (𝑈𝑛𝑖𝑜𝑛 𝑛 𝑖𝑛 𝑁)𝐿𝑛
𝐿∗ = 𝐿0 ∪ 𝐿1 ∪ 𝐿2 ∪ … ∪ 𝐿∞
Thus 𝐿∗ is the set of all strings derivable by any number of concatenation of strings in 𝐿. It is also
useful to define 𝐿+ = 𝐿. 𝐿∗ , i.e. all strings derivable by one or more concatenation of strings in 𝐿. That is:
𝐿∗ = 𝐿0 ∪ 𝐿1 ∪ 𝐿2 ∪ … ∪ 𝐿∞
𝐿∗ = 𝐿1 ∪ 𝐿2 ∪ 𝐿3 ∪ …
Note, 𝑒 is in 𝐿∗ for every Language 𝐿 including the previously introduced of Σ ∗ is an instance of Kleene Star.
AUTOMATA
An automaton is an abstract computing device (machine) which accepts some input (a string), and
processes it according to some stipulated rules and produces output (yes/no or string) and may also have a
stack or tape as its storage device.
Input String
Tape
Read/Write head
Finite Control
Unit
Output
Finite Memory
An Automaton operates in a discrete time frame just as it is obtainable in real computers. A particular
state of the automaton including the state of the control unit, input and storage is called configuration and the
transaction from a configuration to the next is a move. If the output of the automaton is yes/no, the
automaton is called an acceptor but of the output is a string, it is called a transducer.
TYPES OF AUTOMATA
An automaton without internal storage gives Yes/No output and as such is known as Accepters. It is
called deterministic finite automata because:
a a a
0 b 1 2
b
b
Fig. 2
The state pointed by the arrow is the initial state, double circles show the final states.
When processing a string, we should start from the initial state and follow through the edges
according to the symbols (strings) until the end of the string is reached.
Example 1:
Using the diagram (Fig. 2), determine if the string aab will be accepted or rejected by the system.
NOTE:
If a string leads to a final state (double circles), that string will be accepted i.e. the
automaton gives a “yes” answer or output but if the string leads to a state that is not final,
the string is rejected i.e. the automaton gives a “no” answer or output.
Thus, from the example: aab is accepted because it leads to a final state.
Example 2:
From the diagram (Fig. 2), Determine if the string aaba will be accepted or not.
Solution
aaba will be rejected because it leads to a non-final state.
For each state and each symbol, there is exactly one out arrow with the symbol.
a a
b a
Right, correct Wrong, not correct
The more symbols we have in the alphabet, the more outgoing arrows we will have:
a
TRANSITION FUNCTION
A transition function which is usually denoted by 𝛿 is a mapping from a (state, symbol) pair to the
corresponding next state:
State0 Symbol
State1
S0 S1 … Sn
State0 State1 Staten
𝑀 = (𝑄, Σ, δ, 𝑞𝑜 , 𝐹)
Where:
a,b
A B
a,b
𝛿(𝐴𝑖 , 𝑏) = 𝐵𝑗 , this says when the automata is in state 𝐴and input 𝑏 is given, it goes to state 𝐵.
𝛿(𝐵𝑖 , 𝑎) = 𝐴𝑗 , This says when the automata is in state 𝐵 and input 𝑎 is given, it goes to state 𝐴.
𝛿(𝐵𝑖 , 𝑏) = 𝐴𝑗 , This says when the automata is in state 𝐵and input 𝑏 is given, it goes to state 𝐴.
Therefore;
𝛿 ∗ (0, 𝑎𝑎𝑏) = 1
𝛿 ∗ (0, 𝑎𝑎𝑏) = 2
Solution
1.
𝒂 𝒃
𝟎 𝟎 𝟏
𝟏 𝟐 𝟐
𝟐 𝟐 𝟐
b a,b
0 1 2
b
The set of all strings accepted by a DFA is the language accepted by the DFE. Therefore every
automaton has a particular language it defines. The language that our automaton accepts is:
{𝑎𝑛 , 𝑎𝑛 𝑏: 𝑛 ≥ 0}
By interpretation, this means all strings of the form 𝑎 … 𝑎 𝑎𝑛𝑑 𝑎 … 𝑎𝑏 is accepted by the above
automaton. If a DFA can be constructed for a given language, then the language is known as regular
language.
Example:
𝑀 = (𝑄, Σ, 𝛿, 𝑞0 , 𝐹)
𝑄 = (𝑞0 , 𝑞1 )
𝑞0 𝑖𝑠 𝑡ℎ𝑒 𝑠𝑡𝑎𝑟𝑡 𝑠𝑡𝑎𝑡𝑒
𝐹𝑘 = {𝑞1 }
𝛿(𝑞0 , 0) = 𝑞0 , 𝛿(𝑞1 , 0) = 𝑞1
𝛿(𝑞0 , 1) = 𝑞1 , 𝛿(𝑞1 , 1) = 𝑞1
Draw the state diagram an the transition table, at what state did the system actually move/transit?
Solution
State Diagram
0 0
𝑞0 1 𝑞1
Transition Table
𝟎 𝟏
→ 𝒒𝟎 𝒒𝟎 𝒒𝟏
∗ 𝒒𝟏 𝒒𝟏 𝒒𝟏
Assignment
With the aid of a suitable diagram, determine if the language given below is regular or not.
𝐿 = {𝑎𝑛 𝑏𝑚 : 𝑛, 𝑚 ≥ 1}