Basic-St Lectures
Basic-St Lectures
ANUSH TSERUNYAN
This note is intended to quickly introduce basic notions in set theory to (undergraduate) students
in mathematics not necessarily specializing in logic, thus requiring no background. It uses the
sections on well-orderings, ordinals and cardinal arithmetic of [Kun83], as well as some parts from
[Mos06].
Contents
1. Zermelo–Fraenkel set theory 2
2. Well-orderings 6
3. Ordinals 7
4. Transfinite induction 9
5. The set of natural numbers 10
6. Equinumerosity 11
7. Finite/infinite sets and Axiom of Choice 14
8. Cardinals and cardinality 16
9. Uncountable cardinals and the Continuum Hypothesis 16
10. Cofinality and Kőnig’s theorem 17
11. Classes 18
Exercises 19
References 21
Definition 1.1. Recursively define the notion of a formula in the language of ZF as follows:
(i) For variables x, y, x = y is a formula;
(ii) For variables x, y, x ∈ y is a formula;
(iii) If φ is a formula, then so is ¬(φ);
(iv) If φ and ψ are formulas, then so are (φ) ∧ (ψ), (φ) ∨ (ψ) and (φ) → (ψ);
(v) If φ is a formula and x a variable, then ∀x(φ) and ∃x(φ) are formulas.
When writing a formula, one doesn’t have to religiously follow the rules about parentheses
described in the definition above as long as there is no ambiguity in reading the formula; for
example, ¬x = y is technically not a formula, but it’s ok to write it this way since there is no
ambiguity: it is clearly the same as ¬(x = y). Also, when we want to emphasize that a formula φ
says something about a set (variable) x, we write φ(x), although φ may also contain other variables;
for example, φ(x) may be x ∈ y. Lastly, we use abbreviations
Axioms of ZFC. We now list all of the axioms of ZFC for reference, although we discuss only
some of them here and leave the detailed discussion of the rest for later. Axioms 0–7 form the
system ZF and ZFC is just ZF together with Axiom 9 (Choice).
Axiom 2. Pairing:
∀u ∀v ∃p ∀w (
w∈p
↔
(w = u ∨ w = v)
).
This says that for any sets u, v, there is a set p that contains both u, v as its members and no other
members. We denote this set p by {u, v} and call it the (unordered) pairing of u, v.
Axiom 3. Union:
∀C ∃U ∀x (
x∈U
↔
∃S ∈ C (x ∈ S )
).
This says that for each set C (think of C as a collection of sets), there is a set U whose members
are exactly the elements that belong to some set S in this collection C . We denote this set U by C
S
and call it the union of the sets in C .
Axiom 4. Powerset:
∀X ∃P ∀Y (
Y∈P
↔
Y⊆X
),
where Y ⊆ X is an abbreviation for ∀z(z ∈ Y → z ∈ X). This says that for any set X there is a set P
(think of it as a collection of sets) whose elements are exactly the subsets of X. We denote this set
P by P(X) and call it the powerset of X.
This says that for each set X there is a set Y whose element are exactly those elements of X that
satisfy φ. We denote this set Y by {z ∈ X : φ(z)} and call it the subset of X defined by φ.
Note that this is not one axiom! This is an infinite collection of axioms, one for each φ. Further-
more, note that φ(z) may also contain variables other than z, X, Y and we think of them as parameters.
Axiom 6. Infinity:
∃X (
∅∈X
∧
∀x ∈ X (S (x) ∈ X)
),
where ∅ and S (x) are defined as follows. It can be deduced from the above axioms that there is a
set with no elements, which we denote by ∅ and call the emptyset. Furthermore, it can be deduced
that for every set x, there is a set y whose members are exactly the elements of x and x itself, i.e.,
y ..= x ∪ {x}. We denote this set y by S (x) and call it the successor of x.
Call a set X inductive if ∅ ∈ X and for each x ∈ X, its successor S (x) is also an element of X.
Infinity axiom states that there exists an inductive set.
Axiom 7. Foundation:
∀X (
X,∅
→
∃y ∈ X ¬∃z ∈ X (z ∈ y)
),
Axiom 9. Choice:
∀C [
∅<C
→
[
∃f : C → C ∀S ∈ C ( f (S ) ∈ S )
],
where for sets X, Y, the notation f : X → Y means that f is a function from X to Y. The notion of a
function, as well as that of an ordered pair (x, y) and Cartesian product X × Y, can be defined using
the axioms above, and we leave it as an exercise (see Exercise 1).
Choice axiom (more commonly referred to as Axiom of Choice, abbreviated, AC) states that for
any set C with ∅ < C (think of C as a collection of nonempty sets), there is a function f that assigns
to each set S ∈ C an element of S .
Notation 1.2. For a set X and sets x0 , x1 , . . . , xn , we write X = {x0 , x1 , . . . xn } to mean that for all sets
x,
x ∈ X ↔ (x = x0 ) ∨ (x = x1 ) ∨ · · · ∨ (x = xn ) .
Furthermore, for a formula φ, we write X = {x : φ(x)} to mean that for all sets x,
x ∈ X ↔ φ(x).
Caution 1.3. For a given formula φ, we don’t know a priori if there exists a set X such that
X = {x : φ(x)}. In fact, it doesn’t exist for some φ (see the discussion of Russel’s paradox in
Section 11). It is tempting to say that the existence of such a set X follows from Comprehension
axiom, but the latter only gives the existence of sets of the form {x ∈ Y : φ(x)} and not {x : φ(x)}.
To apply Comprehension, one has to first find (prove existence of) an appropriate set Y.
Below we abandon using the awkward formal (symbolic) language of ZFC and start writing our
definitions and theorems in ordinary (mathematical) English, with the understanding that if we had
to, we could translate everything to the formal language.
5
2. Well-orderings
Definition 2.1. A binary relation < on a set A is called an ordering (or a strict ordering) if for all
x, y, z ∈ A,
(i) (Irreflexivity) x ≮ x,
(ii) (Transitivity) x < y < z ⇒ x < z.
We will refer to the pair (A, <) as an ordering or an ordered set if < is an ordering of A.
Observation 2.2. Conditions (i) and (ii) imply that orderings are asymmetric, i.e.
x < y ⇒ y ≮ x.
Remark 2.3. The term ordering is also often used for a binary relation ⩽ on a set A that is reflexive,
anti-symmetric, and transitive. This is why, the notion in Definition 2.1 is often referred to as a
strict ordering. The two different notions are easily obtained from one another: indeed, given a
nonstrict ordering ⩽, one obtains its strict version by defining a < b ..⇔ a ⩽ b and a , b; conversely,
given a strict ordering, its nonstrict version is defined by a ⩽ b ..⇔ a < b or a = b. In this note, we
will only use strict orderings and hence we simply call it ordering throughout.
For an ordering <, we write x ⩽ y to mean “x < y or x = y”.
Definition 2.4. An ordering < on a set A is called linear (or total) if in addition we have:
(iii) (Totality) For all x, y ∈ A, either x = y or x < y or y < x.
A linear ordering < is called a well-ordering if
(iv) (Well-foundedness) every subset of A has a least element x, i.e. for all y ∈ A, x ⩽ y. Note that
such x is necessarily unique.
Definition 2.5. Let (A, <) be an ordering. For a ∈ A, let pred(a, A, <) denote the set of predecessors
of a, i.e.
pred(a, A, <) ..= {b ∈ A : b < a} .
We simply write pred(a), when (A, <) is clear from the context. A subset B ⊆ A is called an initial
segment if it is closed downward, i.e. for all x ∈ B, pred(x, A, <) ⊆ B. Call B a proper initial
segment if B , A.
Lemma 2.6. If (A, <) is a well-ordering, then any proper initial segment B is equal to pred(a) for
some a ∈ A; in other words, B has a supremum.
Proof. Take a to be the least element of A \ B. □
Definition 2.7. For ordered sets (A, <A ), (B, <B ), call a function f : A → B an isomorphism of
(A, <A ) with (B, <B ) (or just an order-isomorphism) if it is a bijection and for all a0 , a1 ∈ A,
a0 <A a1 ⇐⇒ f (a0 ) <B f (a1 ).
Say that (A, <A ) and (B, <B ) are isomorphic, and denote (A, <A ) ≃ (B, <B ), if there is an isomorphism
of (A, <A ) with (B, <B ).
Notation. If (A, <) is a well-ordering and B ⊆ A, then (B, <|B ) is also a well-ordering, where
<|B ..= < ∩ B × B. We will abuse the notation and write (B, <) below.
For well-orderings (A, <A ) and (B, <B ), write (A, <A ) ≺ (B, <B ) if
(A, <A ) ≃ (pred(b), <B ),
for some b ∈ B. Write (A, <A ) ⪯ (B, <B ) if (A, <A ) ≺ (B, <B ) or (A, <A ) ≃ (B, <B ).
6
Observation 2.8. (A, <A ) ⪯ (B, <B ) is equivalent to the existence of an isomorphism of (A, <A ) with
an initial segment of (B, <B ).
Lemma 2.9. For well-orderings (A, <A ), (B, <B ), there is at most one isomorphism of (A, <A ) with
an initial segment of (B, <B ).
Proof. Let B′ , B′′ be initial segments of B (possibly equal) and suppose towards a contradiction
that there are order-isomorphisms f : A → B′ , g : A → B′′ such that f , g. Let a be the <A -least
element in A for which f (a) , g(a). Without loss of generality, assume f (a) <B g(a). Then
f (a) ∈ B′′ , so we can apply g−1 to the last inequality and get a′ ..= g−1 ( f (a)) <A a. Hence, by the
choice of a, f (a′ ) = g(a′ ) = f (a), contradicting the injectivity of f . □
Corollary 2.10. (A, <) ⊀ (A, <) for any well-ordering (A, <).
Proof. The identity map a 7→ a is already an isomorphism of (A, <A ) with its initial segment, so
there cannot be any other, by Lemma 2.9. □
Observation 2.11. For any well-orderings (A, <A ), (B, <B ), (C, <C ),
(a) (A, <A ) ≺ (B, <B ) ⪯ (C, <C ) =⇒ (A, <A ) ≺ (C, <C );
(b) (A, <A ) ⪯ (B, <B ) ≺ (C, <C ) =⇒ (A, <A ) ≺ (C, <C ).
Remark 2.12. Recall that functions are simply sets of pairs satisfying a certain condition. More
precisely, a function f : A → B is a subset of A × B satisfying the following: for every a ∈ A
there is exactly one b ∈ B such that (a, b) ∈ f . (We commonly write f (a) = b instead of awkward
(a, b) ∈ f .) Thus, for example, the notation f ⊆ g makes sense for functions f, g.
Theorem 2.13. For well-orderings (A, <A ) and (B, <B ), exactly one of the following holds:
(i) (A, <A ) ≃ (B, <B );
(ii) (A, <A ) ≺ (B, <B );
(iii) (B, <B ) ≺ (A, <A ).
Proof. That these statements are mutually exclusive follows from Lemma 2.9 via Observation 2.11.
To show that one of them holds, let
f ..= (a, b) ∈ A × B : (pred(a), <A ) ≃ (pred(b), <B ) .
By uniqueness, the ordinal α in the definition of A′ is unique for each a ∈ A. Thus, by Axiom 8
(Replacement), there is a set C and a function f : A′ → C such that for all a ∈ A′ , f (a) is an
ordinal and pred(a, A, <A ) ≃ f (a). By shrinking C, we may assume that f is surjective. Note
that A′ is an initial segment of A and hence C is transitive. Thus, by (c) of Lemma 3.5, C is an
ordinal. It is also easy to see that f is an isomorphism between (A′ , <A ) and C. Now if A′ , A,
then A′ = pred(a, A, <A ) for some a ∈ A \ A′ . But then pred(a, A, <A ) ≃ C and hence a ∈ A′ , a
contradiction. Thus, A′ = A and we are done. □
Definition 3.13. For a well-ordering (A, <A ) the unique ordinal α with (A, <A ) ≃ α is called the
order type of (A, <A ) and denoted by tp(A, <A ).
4. Transfinite induction
The usual proof of the induction theorem for N given in an undergraduate course uses the fact
that the usual ordering on N is a well-ordering. The same proof works for any well-ordered set:
Theorem 4.1 (Transfinite induction). Let (A, <) be a well-ordered set and let P ⊆ A be a set such
that for any a ∈ A, pred(a, A, <) ⊆ P implies a ∈ P. Then P = A.
Proof. Assume for contradiction that P , A and let a be the least element of A \ P. By our choice,
pred(a, A, <) ⊆ P, and hence a ∈ P, a contradiction. □
9
Remark 4.2. In the proofs using transfinite induction on ordinals, the cases of 0, successor ordinals
and limit ordinals are usually considered separately. An example is the proof of Theorem 4.4 below.
It is common in mathematics to define sequences (indexed by N) inductively.2 Such an example
is the Fibonacci sequence. In set theory however, we often want to inductively define sequences
that are indexed by a larger well-ordered set (typically an ordinal). The next theorem shows that
this indeed can be done. For notational simplicity, we state it for ordinals, but the same proof would
work for arbitrary well-ordered sets.
First, note (realize?) that a sequence indexed by an ordinal κ, is nothing but a function defined
on κ. The idea of inductive definition of such function is that in order to define its value at a point
α ∈ κ, we use its values at the predecessors of α. To make this precise, we need the following:
Definition 4.3. For sets A, B, a partial function from A to B is a usual function from a subset of A
to B. We denote this by f : A ⇀ B, and we denote the domain of f by dom( f ). We denote the set
of partial functions from A to B by Partial(A, B).
Theorem 4.4 (Transfinite recursion). For every ordinal κ, set A, and map3 F : Partial(κ, A) × κ → A,
there is a unique function f : κ → A such that for every α ∈ κ,
f (α) = F( f |α , α).
Proof. We prove by transfinite induction on S (κ) that for all γ ⩽ κ, there is a unique function
fγ : γ → A satisfying
fγ (α) = F( fγ |α , α), (∗)
for every α < γ. Granted this, we would take f = fκ and be done.
For γ = 0, put fγ = ∅ and note that this is the only function defined on ∅ in general. For γ a
limit Sordinal, it follows from uniqueness that for all α < β < γ, fα ⊆ fβ (see Remark 2.12). Thus
fγ = β<γ fβ is a function from γ to A and it clearly satisfies (∗) as so do the functions fβ , β < γ.
The uniqueness of fγ also follows from the uniqueness of fβ for each β < γ.
For the successor case, let γ = S (β). For α ∈ γ, put
F( fβ , β) if α = β
(
fγ (α) = .
fβ (α) otherwise
It is clear that fγ satisfies (∗) and that it is the unique such function. □
Remark 4.5. The proof of Theorem 4.4 shows that its conclusion holds even if the map F is only
defined at points ( f, α) ∈ Partial(κ, A) × κ with α = dom( f ), which is what usually happens in
practice.
6. Equinumerosity
There are two ways to figure out whether the number of students is equal to the number of chairs
in the classroom:
(1) Count the students and count the chairs, and see if those numbers are equal.
(2) Ask the students to each pick a chair and sit on it. If there are standing students or empty chairs,
then the answer is no; otherwise, it is yes.
In set theory we take the second approach because that’s the one that works for infinite sets.
Definition 6.1. Two sets A, B are called equinumerous, noted A ≡ B, if there is a bijection between
them. We write A ⊑ B if A is equinumerous with a subset of B, i.e. A injects into B. Finally, we
write A ⊏ B if A ⊑ B but A . B.
4
We used Axiom 5 (Comprehension) here.
11
We leave it as an exercise5 to prove that
N≡Z≡Q
and
R ≡ (0, 1) ≡ [0, 1] ≡ 2N ≡ P(N).
Notation: For sets A, B, a function f : A → B and a subset C ⊆ A, let f ′′C or f ′′ (C) denote the
image of C under f , i.e.
f ′′C = {b ∈ B : ∃a ∈ C(b = f (a))} .
Here is a very easy but relevant lemma:
Lemma 6.2. Let A, B be nonempty sets. Any injection f : A → B has a surjective left inverse
g : B → A, i.e. g ◦ f = idA .
The following can be safely considered as the first theorem of set theory.
Proof. Part (b) follows from (a) by Lemma 6.2. For (a), assume that there is such surjection f and
let B = {a ∈ A : a < f (a)}. Since f is surjective, there is b ∈ A such that f (b) = B. Now either
b ∈ f (b) or b < f (b), and both cases yield a contradiction. □
Cantor’s theorem in particular implies that P(N) (and hence also R) is uncountable.
Proof. Prototypical case: Let A = B = N ∪ {∞}, f (n) ..= g(n) ..= n + 1, for n ∈ N, and f (∞) ..=
g(∞) ..= ∞. To distinguish the two copies of N ∪ {∞}, denote the elements n of A and B by nA and
nB , respectively. We define a bijection h : A → B as follows:
f (a)
if a = 2n or a = ∞
h(a) ..=
g (a) if a = 2n + 1.
−1
5
Read Theorem 6.5 first.
12
Schematically, from 0A 0B we obtain 0A 7 0B
f g f
w ' g−1 '
1A 1B 1A 1B
f g
w '
2A 2B 2A 7 2B
f g f
w ' g−1 '
3A 3B 3A 3B
f g
.. x .. & .. .. .. ..
. . . . . .
f
∞A o / ∞B ∞A f / ∞B
g
General case: Let f : A → B and g : B → A be injections. We prove by reducing this to the
prototypical case as follows: we will obtain partitions
G G
A= An ⊔ A∞ and B = Bn ⊔ B∞
n∈N n∈N
such that f ′′ An = Bn+1 and g′′ Bn = An+1 for each n ∈ N, as well as f ′′ A∞ = B∞ , so we define a
desired bijection h : A → B just like in the prototypical example, namely, for a ∈ A,
f (a)
if a ∈ A2n or a ∈ A∞
h(a) =
.
.
g−1 (a) if a ∈ A2n+1
from A0 B0 obtain A0 3; B0
f g f
s{ #+ g−1 #+
A1 B1 A1 B1
f g
s{ #+
A2 B2 A2 3; B2
f g f
s{ #+ g−1 #+
A3 B3 A3 B3
f g
.. t| .. "* .. .. .. ..
. . . . . .
f
+3 +3
A∞ ks B∞ A∞ f B∞
g
To carve out such partitions, we define recursively decreasing sequences (Ān )n∈N and ( B̄n )n∈N by
Ā0 ..= A, B̄0 ..= B,
B̄n+1 ..= f ′′ Ān , and Ān+1 ..= g′′ B̄n ,
and let An ..= Ān \ Ān+1 , Bn ..= B̄n \ B̄n+1 , as well as
\ \
A∞ ..= Ān and B∞ ..= B̄n .
n∈N n∈N
It remains to verify that the partitions A = n∈N An ⊔ A∞ and B = n∈N Bn ⊔ B∞ are as desired.
F F
The injectivity of f implies that f ′′ commutes with set-subtraction and intersections, so we have
13
f ′′ An = f ′′ (Ān \ Ān+1 ) = f ′′ Ān \ f ′′ Ān+1 = B̄n+1 \ B̄n+2 = Bn+1 and
\ \ \ \
f ′′ A∞ = f ′′ Ān = f ′′ Ān = B̄n+1 = B̄n = B∞ .
n∈N n∈N n∈N n∈N
Similarly, we have the analogous statements for g, which finishes the proof. □
The first condition must hold since otherwise f |n would be a bijection from n to A, contradicting
our hypothesis. Thus f is injective and we are done.
However, something is wrong with the above recursive definition. Namely, the right side is
′′
not a (predefined) function. The expression “some element from A \ ( f |n ) n ” does not define a
correspondence n 7→ an ∈ A \ ( f |n )′′ n for all n ∈ ω at once. What would fix our problem6 is if we
had a “choice” function c : P(A) → A such that for each nonempty B ⊆ A, c(B) ∈ B. Then we
would define our f by
f (n) = c(A \ ( f |n )′′ n),
for n ∈ ω, and this f would clearly work.
Similarly, if we had such a choice function c : P(A) → A, we could prove that (4)⇒(3) as
follows: if f : A → ω is a surjection, then define g : ω → A by a 7→ c( f −1 (a)). Clearly g is
injective.
Thus, the existence of a choice function c : P(A) → A would prove the equivalence of (1)-(4).
Does such a function always exist? This is not postulated by any of the axioms of ZF, and in fact,
the existence of such a choice function cannot be proven from ZF7. Hence, we need an extra axiom
(Axiom 9), called Axiom of Choice (AC), that states the following:
If A is a set whose elements are nonempty
S sets, then there exists a choice function; that is, a
function f : A → A such that for every x ∈ A, f (x) ∈ x.
The proof system ZF+AC is called the Zermelo–Fraenkel set theory with Choice and denoted by
ZFC. Below, all of the statements or exercises that use AC are marked with (AC).
Using the above attempts to prove (1)⇒(3) and (4)⇒(3), we now get
Proposition 7.5 (AC). For a set A, the following are equivalent:
(1) A is not equinumerous with a natural number.
(2) A is Dedekind infinite.
(3) ω ⊑ A.
(4) A ↠ ω.
For this and many other reasons, most mathematicians work in ZFC rather than in ZF.
Remark 7.6. To prove (1)⇒(3) and (4)⇒(3), one does not need the full power of AC: it is enough to
assume a weaker axiom, called Dependant Choice (DC), which (roughly speaking) gives a function
that makes a choice depending on the previously made choices. This is exactly what we used in the
attempt to prove (1)⇒(3).
6
I’m not saying this is absolutely necessary; one could get away with something a bit weaker.
7
This is a highly nontrivial theorem.
15
8. Cardinals and cardinality
Definition 8.1. An ordinal α is called a cardinal if there is no ordinal β < α such that β ≡ α.
Proposition 7.3 implies the following:
Corollary 8.2.
(a) Every natural number is a cardinal.
(b) ω is a cardinal.
Definition 8.3. Let A be a set and κ a cardinal. A set A is said to have cardinality κ if A ≡ κ. We
denote this by |A| = κ. We say that A has cardinality if it has cardinality λ for some cardinal λ.
Does every set have cardinality? The following lemma gives a reformulation of this question:
Lemma 8.4. A set A has cardinality if and only if it can be well-ordered.
Proof. ⇒: If there is a bijection f : A → κ for some cardinal κ, just copy the well-ordering of κ to
A via f .
⇐: If there is a well-ordering <A on A, then by Theorem 3.12, there is an ordinal α such that
(A, <A ) ≃ α. Put
κ = min {β ⩽ α : A ≡ β} .
It is clear that κ is a cardinal and |A| = κ. □
Thus, the real question is whether every set can be well-ordered or not. Think of R for example:
the natural ordering of R is a linear ordering but not a well-ordering. In fact, it cannot be proved
in ZF that R can be well-ordered. More generally, it turns out that the statement “every set can be
well-ordered” is equivalent to AC and is known as Zermelo’s theorem. We state this in Theorem 8.6
together with an equivalent formulation called Zorn’s lemma. First, we need the following:
Definition 8.5. Let A be a set and < be an ordering on it. An element a ∈ A is called an upper
bound for a subset B ⊆ A if for any b ∈ B, b ⩽ a.8 B is called a chain if < |B is a linear ordering on
B. An element a ∈ A is called maximal if there is no element b ∈ A with a < b.
Theorem 8.6. Each of the following statements is equivalent to AC:
1. Zorn’s Lemma: For every ordered set (A, <) whose every chain has an upper bound, there is a
maximal element in A.
2. Zermelo’s Theorem: Every set can be well-ordered.
Proof. The proof is outlined in Exercises 9 to 11. □
Corollary 8.7 (AC). Every set has cardinality.
For small cardinals such as ℵ0 , ℵ1 , ℵ2 , it is also common to use the notation ω0 , ω1 , ω2 , instead.
Continuum hypothesis. One of the most notorious questions in set theory (or in mathematics in
general) is whether ω1 = |P(N)|; equivalently, whether ω1 = |R|. The positive answer to this is
known as the Continuum Hypothesis (CH). In 1940, Gödel showed that it is consistent with ZFC
that CH holds. On the other hand, Cohen proved in 1963 that the failure of CH is also consistent
with ZFC, thus showing the independence of CH from ZFC. To do this, Cohen introduced his
famous method of forcing, which, together with its numerous variations, has since been the main
method of proving independence results.
We prove this theorem below, but first note that this is indeed a generalization of Cantor’s theorem
since applying it to I = ω, Ai = 1 and Bi = 2 (for all i ∈ ω), gives ω ⊏ 2ω . Moreover, we also get:
Corollary 10.5. κ ⊏ κcof(κ) , for every infinite cardinal κ. In particular, ℵω ⊏ ℵωω .
Proof. Put I = cof(κ) and let (Ai )i∈I be a cofinal sequence inSκ, and thus, = κ. Putting Bi = κ,
S
F i∈I AiQ
it is clear that Ai ⊏ Bi . Hence, Kőnig’s theorem implies κ = i∈I Ai ⊑ i∈I Ai ⊏ i∈I Bi = κcof(κ) . □
F Q
Proof of Theorem 10.4. First, we show that i∈I Ai ⊑ i∈I Bi . Using AC, we get a sequence ( fi )i∈I
of injections fi : Ai ,→ Bi . Note that because Bi @ Ai , Q none of these fi is surjective, so Bi \ fi′′ Ai , ∅.
Applying AC again, we F get a sequence
Q x = (xi )i∈I ∈ i∈I Bi such that xi ∈ Bi \ fi′′ Ai for each i ∈ I.
Finally, we define f : i∈I Ai → i∈I Bi by
fi (a) if j = i
f (i, a)( j) ..=
x j
otherwise.
F Q
It is easy to check that f is injective and thus i∈I Ai ⊑ i∈I Bi .
F Q
Now suppose
Q towards a contradiction that there is a surjection g : i∈I Ai ↠ i∈I Bi . We will
define x ∈ i∈I Bi that is not in the image of g.
The main point is that the map gi : Ai → Bi defined by a 7→ g(i, a)(i) cannot be surjective by the
′′
F so Bi \ gi AiF
hypothesis, , ∅ and we can choose a point from it. To this end, fix a choice function
π : P ( i∈I Bi ) \ {∅} → i∈I Bi . Now for each i ∈ I, we define
the value x(i) such that it ensures x
is not in the g-image of {i} × Ai , namely: x(i) ..= π Bi \ g′′i Ai . It is clear that x is not in the image
of g, for if x = g(i, a) for some i ∈ I and a ∈ Ai , then x(i) , gi (a) = g(i, a)(i). □
11. Classes
This section can be safely skipped by an impatient reader. It was mainly written to satisfy the
curiosity of those readers (in particular, the author’s significant other), who have heard the phrase
“It’s a proper class, not a set.” and have always wondered what it means.
Given a mathematical statement (formula) φ(x), one may wonder if there exists a set y such that
y = {x : φ(x)} ,
i.e. y contains all of the sets (in the world) that satisfy φ(x). The answer of course depends on φ.
For example, if φ(x) is the statement “x ∈ z”, for some set z, then such a set y exists, namely y = z.
Another such example is φ(x) ≡ “∀y ∈ x(y , y)” as {x : φ(x)} = ∅. On the other hand, if φ(x) is the
statement “x < x”, then such y doesn’t exist since if it did, then either y ∈ y or y < y, and both yield
a contradiction. (This is the famous Russell’s paradox.)
However, when writing proofs or theorems, it is often convenient to refer to {x : φ(x)} as if it was
a set (even when it isn’t) and call it a class. This is just abuse of language and is not part of formal
mathematics. The word class stands for the totality of all sets satisfying the formula φ, and it can be
eliminated from any statement that contains it, although this usually makes statements longer. For
example, for a class C = {x : φ(x)}, the statement “x ∈ C” really means “x satisfies φ”. A perhaps
more convincing example is the statement “The class ON of all ordinals is well-ordered by ∈.” as it
is equivalent to the statement of (b) of Lemma 3.4 together with (b) of Lemma 3.5.
18
The classes that are not equal to sets (like the class defined by “x < x”) are called proper classes.
The intuition is that proper classes are “too big” to be sets. The existence of such classes comes
from the fact that it is not postulated by ZF (or ZFC) that for a given formula φ(x), there must be a
set y such that y = {x : φ(x)}. What is postulated is Axiom Schema of Comprehension (Axiom 5),
which we recall here:
For every set y, there is a set z such that z = {x ∈ y : φ(x)}.
Thus, to avoid ending up with a proper class when trying to define a set, we make sure that all of
the elements we want to put in our set a priori belong to some (perhaps larger) set. In rare cases
when this is not possible, we may need to use Axiom 8 (Replacement) instead, as we did in the proof
of Theorem 3.12.
Proposition 11.1. The following classes are proper:
(a) R = {x : x < x};
(b) V = {x : x = x};
(c) ON = {x : x is an ordinal}.
Proof. Part (a) is left as an exercise. For (b), note that if V was a set, then by Axiom 5 (Compre-
hension), R = {x ∈ V : x < x} would also be a set, which is a contradiction. Part (c) is left as an
exercise. □
Exercises
1. Let x, y, A, B be sets.
(a) Show that {x} is a set.
(b) Show that (x, y) ..= {{x} , {x, y}} is a set and write a formula φ(z) that holds if and only if
z is an ordered pair. Moreover, write formulas φ0 (z, x) and φ1 (z, y) such that φ0 (z, x) and
φ1 (z, y) hold if and only if z = (x, y). In other words, φ0 defines the function z 7→ x and φ1
defines the function z 7→ y.
(c) Show that A × B ..= {(x, y) : x ∈ A ∧ y ∈ B} is a set.
(d) Define the notion of a function f : A → B as a certain subset of A × B, i.e. write down
which sets are called functions from A to B.
3. Prove part (d) of Lemma 3.3 and part (a) of Lemma 3.4. Do not use any of the later parts (b)-(c)
of Lemma 3.4 in your proofs.
4. Prove that there does NOT exist a set that contains all of the ordinals.
13. Prove the following version of Hartogs’ theorem that doesn’t use Axiom 8 (Replacement):
Theorem 11.2 (Hartogs). For every set A, there is a well-ordered set (H(A), <H(A) ) such that
H(A) @ A and (H(A), <H(A) ) is ⪯-least such well-ordering, i.e. if (B, <B ) is another well-ordering
with the property that B @ A, then (H(A), <H(A) ) ⪯ (B, <B ).
Outline: Let WO(A) be as in Theorem 9.1 and denote by H(A) the quotient of WO(A) by the
equivalence relation ≃, i.e. H(A) = WO(A)/ ≃. For (B, <B ) ∈ WO(A), let [(B, <B )] denote
the ≃-equivalence class of (B, <B ) in H(A). Define an ordering <H(A) on H(A) as follows: for
[(B, <B )], [(C, <C )] ∈ H(A),
[(B, <B )]<H(A) [(C, <C )] ⇐⇒ (B, <B ) ≺ (C, <C ).
Show that <H(A) is actually a well-ordering and verify that (H(A), <H(A) ) satisfies the conclusion
of the theorem.
14. Show that R ..= {x : x < x} is a proper class.
15. An open interval in ω1 is a set of the form (α, β) ..= {γ ∈ ω1 : α < γ < β} or [0, α) ..= α, for some
α < β < ω1 . The topology generated by open intervals is naturally called the open interval
topology.
(AC) Prove that the open interval topology on ω1 is sequentially compact (i.e. every sequence
has a convergent subsequence), but not compact (in the sense of open covers).
16. Let X be a second-countable topological space.
(a) Show that X has at most continuum many open subsets.
(b) Let α, β, γ denote ordinals. A sequence of sets (Aα )α<γ is called monotone if it is either
increasing (i.e. α < β ⇒ Aα ⊆ Aβ , for all α, β < γ) or decreasing (i.e. α < β ⇒ Aα ⊇ Aβ ,
for all α, β < γ); call it strictly monotone, if all of the inclusions are strict.
20
Prove that any strictly monotone sequence (Uα )α<γ of open subsets of X has countable
length, i.e. γ is countable.
Hint: Use the same idea as in the proof of (a).
(c) Show that every monotone sequence (Uα )α<ω1 open subsets of X eventually stabilizes, i.e.
there is γ < ω1 such that for all α < ω1 with α ⩾ γ, we have Uα = Uγ .
Hint: Use the regularity of ω1 .
(d) Conclude that parts (a), (b) and (c) are also true for closed sets.
References
[Kun83] K. Kunen, Set Theory: An Introduction to Independence Proofs, Studies in Logic and the Foundations of
Mathematics, vol. 102, Elsevier, 1983.
[Mos06] Y. N. Moschovakis, Notes on Set Theory, 2nd ed., Undergraduate Texts in Mathematics, Springer, 2006.
21