Set Theory: Prof. Asim Tewari IIT Bombay
Set Theory: Prof. Asim Tewari IIT Bombay
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Set Theory: Basics
Set is a correlation of mathematical objects taken from a suitable
domain of disclosure.
Examples:
▪ N = {0, 1, 2, 3, 4. . .}
(set of natural numbers)
▪ E = {0, 2, 4, 6. . .}
(set of even natural numbers)
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Set Operators
▪ Union of two sets A and B is the set of all elements in either set A or B.
▪ Written A B.
▪ A B = {x | x A or x B}
▪ Intersection of two sets A and B is the set of all elements in both sets
A or B.
▪ Written A B.
▪ A B = {x | x A and x B}
▪ Difference of two sets A and B is the set of all elements in set A which
are not in set B.
• Written A - B.
• A - B = {x | x A and x B}
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Set Operators
▪ Symmetric Difference of two sets A and B is the set of all elements
which are either in “set A and not in B” or in “set B and not in A”
▪ Written A Δ B.
▪ A Δ B = (A \ B) (B \ A)
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Cartesain Product
▪ Cartesian Product: Given two sets A and B, the set of
– All ordered pairs of the form (a , b) where a is any
element of A and b any element of B, is called the
Cartesian product of A and B.
▪ Denoted as A x B
• A x B = {(a,b) | a A and b B}
• Example: Let A = {1,2,3}; B = {x,y}
– A x B = {(1,x),(1,y),(2,x),(2,y),(3,x),(3,y)}
– B x A = {(x,1),(y,1),(x,2),(y,2),(x,3),(y,3)}
– B x B = B2 = {(x,x),(x,y),(y,x),(y,y)}
▪ In general,
A1 x A2 x……x An = {(a1,a2………an) | a1 A1,……… an An}
Example:
𝑅 2 = 𝑅 𝑋 𝑅 2 − 𝐷 𝐸𝑢𝑐𝑙𝑖𝑑𝑒𝑎𝑛 𝑠𝑝𝑎𝑐𝑒
= { 𝒙𝟏 , 𝒙𝟐 : 𝒙𝟏 , 𝒙𝟐 ∈ 𝑹}
𝑅 𝑑 = 𝑅 𝑋 𝑅 … … … 𝑋 𝑅 𝑑 − 𝐷 𝐸𝑢𝑐𝑙𝑖𝑑𝑒𝑎𝑛 𝑠𝑝𝑎𝑐𝑒
= { 𝒙𝟏 , 𝒙𝟐 … … … 𝒙𝒅 : 𝒙𝟏 , 𝒙𝟐 … … … 𝒙𝒅 ∈ 𝑹}
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Topology in Euclidean Space
▪ Euclidean Metric – A measure of distance between two points
𝑥−𝑦 = 𝑥1 − 𝑦1 2 + … … … 𝑥𝑑 − 𝑦𝑑 2
▪ Closed Ball
𝑏 𝑎, 𝑟 = {𝑥 ∈ 𝑅𝑑 : 𝑥 − 𝑎 ≤ 𝑟}
▪ Open Ball
𝑏 𝑖𝑛𝑡 𝑎, 𝑟 = {𝑥 ∈ 𝑅𝑑 : 𝑥 − 𝑎 < 𝑟}
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Topology in Euclidean Space
▪ Open Set - A set is said to be open if for each 𝑥 ∈ 𝐴, a
positive number 𝜖 can be found depending on x, such that
𝑏 𝑥, 𝜖 ∁ 𝐴
▪ Example: In case of d = 1, (u,v) is a open set
▪ System of open sets of 𝑅𝑑 is denoted by O.
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Topology in Euclidean Space
▪ The Interior Aint of a general set A is the union of all the
open sets contained in A
▪ Aint is the largest open set contained in A.
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Operations on Subsets of Euclidean Space
▪ Addition: 𝑥 + 𝑦 = 𝑥1 + 𝑦1 , 𝑥2 + 𝑦2 , … … … 𝑥𝑑 + 𝑦𝑑
▪ Scalar Multiplication by c ∈ 𝑅,
𝑐. 𝑥 = 𝑐𝑥 = (𝑐𝑥1 , 𝑐𝑥2 , … … … 𝑐𝑥𝑑 )
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Operations on Subsets of Euclidean Space
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Operations on Subsets of Euclidean Space
▪ Minkowski Subtraction: 𝐴 ⊝ 𝐵 = 𝑦𝐴 𝐵∈𝑦ځ
▪ 𝑜𝑟 𝐴𝑐 ⊕ 𝐵 𝑐
▪ (A ⊝ Č) ⊕ 𝐶 ⊆ 𝐴 ⊆ (A ⊕ Č) ⊝ 𝐶
▪ Dilation: 𝑨 ⟼ 𝑨 ⊕ Č
▪ Erosion: 𝑨 ⟼ 𝑨 ⊝ Č
▪ Opening of A by C (Erosion followed by Dilation)
▪ Closing of A by C (Dilation followed by Erosion)
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Operations on Subsets of Euclidean Space
(a) The operations of erosion and opening
applied to a set. Components that overlap
are separated while small components and
roughnesses vanish or are reduced.
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Monty Hall Problem Puzzle
• The host of a game show, offers the guest a choice of three doors. Behind one is a
expensive car, but behind the other two are goats.
After you have chosen one door, he reveals one of the other two doors behind which is a
goat (he wouldn't reveal a car).
Now he gives you the chance to switch to the other unrevealed door or stay at your
initial choice. You will then get what is behind that door.
You cannot hear the goats from behind the doors, or in any way know which door has
the prize.
Should you stay, or switch, or doesn't it matter?
•
Your first choice has a 1/3 chance of having the car, and that does not change.
The other two doors HAD a combined chance of 2/3, but now a Goat has been revealed
behind one, all the 2/3 chance is with the other door.
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Probability Of Second Girl Child
• A family has two kids, one of them is a girl. Assume safely that the probability of each gender is 1/2.
What is the probability that the other kid is also a girl?
• ANS:
1/3
This is a famous question in understanding conditional probability, which simply means that given some information you might be
able to get a better estimate.
The following are possible combinations of two children that form a sample space in any earthly family:
Girl – Girl
Girl – Boy
Boy – Girl
Boy - Boy
Since we know one of the children is a girl, we will drop the Boy-Boy possibility from the sample space.
This leaves only three possibilities, one of which is two girls. Hence the probability is 1/3
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Hard Logic Probabilty Puzzle
• Bruna was first to arrive at a 100 seat theater.
She forgot her seat number and picks a random seat for herself.
After this, every single person who get to the theater sits on his/her seat if its
available else chooses any available seat at random.
Neymar is last to enter the theater and 99 seats were occupied.
Whats the probability what Neymar gets to sit in his own seat ?
• ANS: 1/2
one of two is the possibility
1. If any of the first 99 people sit in neymar seat, neymar will not get to sit in his
own seat.
2. If any of the first 99 people sit in Bruna's seat, neymar will get to sit in his seat.
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Probability Space
▪ Probability space, which has three components (Ω; F; P),
respectively the sample space, event space, and probability
function
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Probability Space (Sample Space and Event)
Ω : sample space. Set of all possible outcomes of an experiment in the sample space. An
element of Ω is denoted by ω.
The algebra of set theory carries over directly into probability theory. The union of two
events, A and B, is the event C that either A occurs or B occurs or both occur:
C = A ∪ B. For example, if A is the event that the commuter stops at the first light (listed
before), and if B is the event that she stops at the third light,
B = {sss, scs, ccs, css}
then C is the event that she stops at the first light or stops at the third light and consists
of the outcomes that are in A or in B or in both:
C = {sss, ssc, scc, scs, ccs, css}
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Probability Space (Sample Space and Event)
Similar to the union of two events, intersection of two events and complement of an
event is also defined.
The laws of set theory work for the events:
Commutative Laws:
A∪B=B∪A
A∩B=B∩A
Associative Laws:
(A ∪ B) ∪ C = A ∪ (B ∪ C)
(A ∩ B) ∩ C = A ∩ (B ∩ C)
Distributive Laws:
(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)
(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C)
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Probability Space (Event space F)
▪ In probability theory, the event space F is modelled as a σ-algebra (or σ – field) of
Ω, which is a collection of subsets of with the following properties:
(1) Ω ∈ F
(2) If an event A ∈ F, and B ∈ F, then A – B ∈ F and
(3) If Ai ∈ F, then ( 𝐹 ∈ 𝑖𝐴 ڂclosed under countable union). A countable
sequence can be indexed using the natural integers.
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Probability Measure
▪ Probability function P assigns a number (“probability") to each event in F. It is
a function mapping F ⟶ [0,1] satisfying:
1. P(A) ≥ 0, for all A ∈ F.
2. P(Ω) = 1
3. Countable additivity: If Ai ∈ F are pairwise disjoint (i.e., Ai ∩ Aj = 𝜙,
for all i ≠ j), then P(∞ڂ
𝑖=1 𝐴 𝑖 ) =σ∞
𝑖=1 𝑃(𝐴𝑖 ).
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Random Variable
▪ A random variable is a function from the sample space to the real
numbers, 𝑋 𝑟 , 𝑟 ∈ Ω, and measurable with respect to P, i.e for every
real number x, the set {𝑟: 𝑋 𝑟 < 𝑥} is an event in F
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining