Lecture 3: DFAs and Regular Languages
CSC 320: Foundations of Computer Science
Quinton Yong
quintonyong@uvic.ca
Is the set of all languages / decision problems countable?
• Firstly, let’s consider how many possible strings there are over an alphabet 𝚺
• i.e. how large is 𝚺 ∗ for any alphabet 𝚺
• Clearly, the alphabet 𝚺 is countable since it is a finite set
• Claim: 𝚺 ∗ is countably infinite
• Proof: We can enumerate Σ ∗ . Example, consider Σ = 0, 1 .
𝜺 𝟎 𝟏 𝟎𝟎 𝟎𝟏 𝟏𝟎 𝟏𝟏 𝟎𝟎𝟎 …
1 2 3 4 5 6 7 8 …
Is the set of all languages / decision problems countable?
• Now, how large is the set of all possible languages over an alphabet 𝚺?
• We know that the set of possible strings 𝚺 ∗ is countable (countably infinite)
• Languages are subsets of 𝚺 ∗ (select strings which are accepted)
• Every language is countably infinite or finite
• So, the set of all languages over 𝚺 is the set of all subsets of 𝚺 ∗ (powerset)
𝓟(𝚺 ∗ )
• Thus, the size of the set of all languages over alphabet 𝚺 is 𝓟 𝚺 ∗
Is the set of all languages / decision problems countable?
• We will prove that the powerset of any countably infinite set is uncountable
• This would imply that 𝓟(𝚺 ∗ ) is uncountable
• Since 𝚺 ∗ is countably infinite, it has a bijection with ℕ
• We can just prove that 𝓟(ℕ) is uncountably infinite
Barber Paradox
The barber shaves everyone who doesn’t shave themselves.
Who shaves the barber?
• If the barber shaves themself, then the barber doesn’t shave the barber…
• If the barber doesn’t shave themself, then the barber shaves the barber…
This is a paradox.
We will use this idea in the proof.
𝓟(ℕ) is uncountable
• Assume for a contradiction that 𝓟(ℕ) is countable
• We list every subset of ℕ as 𝑺𝟏 , 𝑺𝟐 , 𝑺𝟑 , … such that every possible subset of ℕ is
equal to a subset 𝑺𝒊 for some 𝒊
𝑺𝟏 = …
𝑺𝟐 = …
𝑺𝟑 = …
…
• Consider the subset 𝑫 ⊆ ℕ where 𝑫 = 𝒊 ∈ ℕ| 𝒊 ∉ 𝑺𝒊
• e.g. if 𝑺𝟏 does not contain the number 1, then 1 is in 𝑫
• e.g. if 𝑺𝟓 does not contain the number 5, then 5 is in 𝑫
• e.g. if 𝑺𝟖 contains the number 8, then 8 is not in 𝑫
𝓟(ℕ) is uncountable
• Consider the subset 𝑫 ⊆ ℕ where 𝑫 = 𝒊 ∈ ℕ| 𝒊 ∉ 𝑺𝒊
• For example, suppose the list of every subset of ℕ was as follows:
𝑺𝟏 = 𝟐, 𝟒, 𝟔, …
𝑺𝟐 = 𝟏, 𝟐, 𝟐𝟏, …
𝑺𝟑 = 𝟏, 𝟖, 𝟏𝟑, …
𝑺𝟒 = 𝟑, 𝟔𝟏, 𝟏𝟓𝟐, …
𝑺𝟓 = { 𝟓, 𝟏𝟎, 𝟏𝟓, … }
…
• 𝑫 would be {𝟏, 𝟑, 𝟒, … }
• 𝟏, 𝟑, 𝟒 ∈ 𝑫 since 𝟏 ∉ 𝑺𝟏 , 𝟑 ∉ 𝑺𝟑 , and 𝟒 ∉ 𝑺𝟒
• 𝟐, 𝟓 ∉ 𝑫 since 𝟐 ∈ 𝑺𝟐 and 𝟓 ∈ 𝑺𝟓
𝓟(ℕ) is uncountable
• Consider the subset 𝑫 ⊆ ℕ where 𝑫 = 𝒊 ∈ ℕ| 𝒊 ∉ 𝑺𝒊
• Since 𝑫 ⊆ ℕ, 𝑫 is in the list of subsets and there is an 𝑺𝒋 such that 𝑺𝒋 = 𝑫
𝑺𝟏 = …
𝑺𝟐 = …
…
𝑺𝒋 = 𝑫 = …
• Is the number 𝒋 in the set 𝑫 (which is 𝑺𝒋 )?
• If 𝑫 contains 𝒋, then by definition, 𝒋 is not in 𝑫
paradox
• If 𝑫 does not contains 𝒋, then by definition, 𝒋 is in 𝑫
• Thus, 𝑫 cannot exist and cannot be listed.
𝓟(ℕ) is uncountable
• The way we defined 𝑫 was a valid subset of ℕ
• However, we showed that it cannot exist, so we cannot list every subset of ℕ
• Thus, we have a contradiction
• Therefore, 𝓟(ℕ) is uncountable
Recap:
• By showing 𝓟(ℕ) is uncountable, we also have that 𝓟(𝚺 ∗ ) is uncountable
• Therefore, the set of all languages / decidable problems is uncountably infinite
• This means there are an uncountably infinite number of problems
Finite Automata
• In this course, we want to understand and evaluate computability and
computational limits
• To do that, we need to have model(s) of a computer which capture its
computational power
• The simplest model is a finite state machine or finite automaton
• Very little finite amount of memory independent of problem size
• Real world examples: Automatic doors, elevator, household appliances,
substring search
State Diagram
State diagrams are used to describe finite automata
0
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
This state diagram contains:
• States: {𝑞1 , 𝑞2 , 𝑞3 , 𝑞4 }
• Start state: 𝑞1 (indicated by start arrow)
• Accept state: {𝑞3 } (indicated by double circle)
• Transitions: arrow from state to state (according to received input)
• Inputs: labels on transitions (symbols from alphabet, in this case Σ = 0, 1 )
State Diagram
State diagrams are used to describe finite automata
0
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
What happens when we input string 11011?
State Diagram
State diagrams are used to describe finite automata
0
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
What happens when we input string 11011?
State Diagram
State diagrams are used to describe finite automata
0
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
What happens when we input string 11011?
State Diagram
State diagrams are used to describe finite automata
0
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
What happens when we input string 11011?
State Diagram
State diagrams are used to describe finite automata
0
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
What happens when we input string 11011?
State Diagram
State diagrams are used to describe finite automata
0
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
What happens when we input string 11011?
• Once we read each symbol of the input, we land on an accept state
• This finite state machine accepts the string
State Diagram
State diagrams are used to describe finite automata
0
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
What other strings are accepted by this finite automaton?
• Is the empty string 𝜺 accepted?
• Is the string 00?
• Is the string 000?
What language is recognized by this finite automaton (i.e. what are all the strings
which are accepted)?
Formal Definition: Deterministic Finite Automaton
A deterministic finite automaton (DFA) is a 5-tuple (𝑸, 𝚺, 𝜹, 𝒒𝟎 , 𝑭) where
• 𝑸 is a finite set called the states
• 𝚺 is a finite set called the alphabet
• 𝜹: 𝑸 × 𝚺 → 𝑸 is the transition function (i.e. 𝜹(𝐜𝐮𝐫𝐫_𝐬𝐭𝐚𝐭𝐞, 𝐬𝐲𝐦𝐛𝐨𝐥) = 𝐧𝐞𝐱𝐭_𝐬𝐭𝐚𝐭𝐞)
• 𝒒𝟎 ∈ 𝑸 is the start state
• 𝑭 ⊆ 𝑸 is the set of accept states
State Diagram DFA Definition
𝑞1 , 𝑞2 , 𝑞3 , 𝑞4 , 0, 1 , 𝛿, 𝑞1 , 𝑞3 with
0
𝛿: 𝑄 × Σ → 𝑄 defined by
1 𝑞2
𝑞4 0 𝜹 𝟎 𝟏 input
1 current 𝑞1 𝑞3 𝑞2 next
0 1
𝑞1 𝑞2 𝑞4 𝑞3
𝑞3 1
𝑞3 𝑞1 𝑞2
0
𝑞4 𝑞4 𝑞3
Language of a DFA
Let 𝑴 = (𝑸, 𝚺, 𝜹, 𝒒𝟎 , 𝑭) be a DFA and 𝑨 be the set of all strings that 𝑴 accepts
• 𝑨 is called the language of machine 𝑴
• 𝑳 𝑴 = 𝑨 (i.e. 𝐿 𝑀 denotes the language of 𝑴)
• 𝑴 recognizes the language 𝑨
Note: a machine that accepts no string recognizes the empty language ∅
Example: Language 𝑳 𝑴 of a DFA 𝑴
𝑴 𝑞1 , 𝑞2 , 𝑞3 , 0, 1 , 𝛿, 𝑞1 , 𝑞3 with
1 𝑞2 0, 1 𝛿: 𝑄 × Σ → 𝑄 defined by
𝜹 𝟎 𝟏
𝑞1 𝑞3 0, 1 𝑞1 𝑞3 𝑞2
𝑞2 𝑞3 𝑞3
0 𝑞3 𝑞1 𝑞2
What is the language of DFA 𝑴?
𝑳 𝑴 = 𝑤 ∈ Σ ∗ | 𝒘 ≥ 1, if 𝒘 starts with 1 then 𝒘 ≥ 2
DFA: Formal Definition of Computation
• Let 𝑴 = (𝑸, 𝚺, 𝜹, 𝒒𝟎 , 𝑭) be a DFA and let 𝒘 = 𝒘𝟏 𝒘𝟐 … 𝒘𝒏 be a string over 𝚺
• Then 𝑴 accepts 𝒘 if there is a sequence of states 𝒓𝟎 , 𝒓𝟏 , 𝒓𝟐 , … , 𝒓𝒏 in 𝑸 such that
1. 𝒓𝟎 = 𝒒𝟎
2. 𝜹 𝒓𝒊 , 𝒘𝒊+𝟏 = 𝒓𝒊+𝟏
3. 𝒓𝒏 ∈ 𝑭
𝒘𝟏 𝒘𝟐 … 𝒘𝒏
𝒘𝟏 𝒘𝟐 … 𝒘𝒏 𝒓𝟎 𝒓𝟏 𝒓𝟐 𝒓𝒏
• 𝑴 recognizes language 𝑳 if 𝑳 = 𝑳 𝑴 = 𝒘 ∈ 𝚺 ∗ | 𝑴 accepts 𝒘
Computation of DFA – Sequence of States
• What is the sequence of states when the following DFA 𝑴 computes
𝒘 = 𝟎𝟎𝟏𝟏𝟎𝟏
0
𝑴
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
• 𝑞1 , 𝑞3 , 𝑞1 , 𝑞2 , 𝑞3 , 𝑞1 , 𝑞2
• Since 𝑞2 is not a final state, 𝒘 is not accepted
• 𝒘∉𝑳 𝑴
Computation of DFA – Sequence of States
• What is the sequence of states when the following DFA 𝑴 computes
𝒘 = 𝟎𝟎𝟏𝟏𝟎𝟏𝟏
0
𝑴
1 𝑞2
𝑞4 0
1
0 1
𝑞1
𝑞3 1
0
• 𝑞1 , 𝑞3 , 𝑞1 , 𝑞2 , 𝑞3 , 𝑞1 , 𝑞2 , 𝑞3
• Since 𝑞3 is a final state, 𝒘 is accepted
•
• 𝒘∈𝑳 𝑴
Regular Languages
• A language 𝑳 is called a regular language if there exists a deterministic
finite automaton that recognizes 𝑳
• The set of all languages recognized by the set of all possible DFAs is
called the regular languages
Regular Languages
𝑴
1 𝑞2 0, 1
𝑞1 𝑞3 0, 1
• In this DFA 𝑴 which we saw previously, the language of 𝑴 is
𝑳 𝑴 = 𝑤 ∈ Σ ∗ | 𝒘 ≥ 1, if 𝒘 starts with 1 then 𝒘 ≥ 2
• Since there exists a DFA which recognizes 𝑳 𝑴 , 𝑳 𝑴 is a regular language
DFA Example 1
Is this the state diagram for a valid DFA?
𝑞4 0, 1, 2
𝑞3 0, 1
Try to write the formal DFA definition (i.e. the 5-tuple): 𝑴 = (𝑸, 𝚺, 𝜹, 𝒒𝟎 , 𝑭) where
• 𝑄 = {𝑞3 , 𝑞4 }
• Σ = 0, 1, 2
• 𝑞0 = 𝑞3
• 𝐹 = 𝑞3
• However, in a DFA, the transition function 𝜹: 𝑸 × 𝚺 → 𝑸 must have an outgoing
transition for each alphabet symbol
• This state diagram does not correspond to a valid DFA since 𝒒𝟑 does not have an
outgoing transition for 2
DFA Example 2
Is this the state diagram for a valid DFA?
𝑞4 0, 1
𝑞3 0, 1
Yes. The DFA is 𝑴 = (𝑸, 𝚺, 𝜹, 𝒒𝟎 , 𝑭) where
• 𝑄 = {𝑞3 , 𝑞4 }
• Σ = 0, 1
• 𝑞0 = 𝑞3
• 𝐹 = 𝑞3
• 𝛿 is as described by the state diagram
Note: When asked to provide a DFA, you must give the 5-tuple (the state diagram
by itself is not a formal DFA). For the transition function, unless specified, you can
provide the table or say “described by the state diagram”.
DFA Example 2
What is the language of the DFA corresponding to this state diagram?
𝑴 𝑞4 0, 1
𝑞3 0, 1
• 𝑳 𝑴 = 𝜺
• Note that this is not the same as writing 𝑳 𝑴 = ∅
• This DFA accepts the empty string
• 𝑳 𝑴 would be ∅ if the DFA doesn’t accept any string