
The Probabilistic Method

Karen Ge

November 9, 2016

Abstract
Introduced by Paul Erdős, the probabilistic method is a powerful combinatorial tool for existence problems. A simple probabilistic argument goes like this: to show the existence of a mathematical object with certain properties, we construct an appropriate probability space and show that a randomly chosen object in this space has the desired properties with positive probability. Sometimes this simple approach does not work, so we modify the objects to obtain the desired properties. We survey common probabilistic method techniques.
Keywords: combinatorics, graph theory, expected value, probabilistic method, alteration, Lovász Local Lemma

1 Linearity of Expectation

Theorem 1 (Linearity of Expectation). For any two random variables X and Y, not necessarily independent, we have

E(X + Y) = E(X) + E(Y).

Linearity of Expectation is a cornerstone of the probabilistic method because it "localizes" the calculation of properties of a global object that might otherwise involve tangled cases. Our task becomes simpler when we only need to focus on a local component, as illustrated in our first example. Note that a fixed point x of a permutation f is a point that is not shifted by the permutation, that is, f(x) = x.

Example 1. Compute the expected number of fixed points in a random permutation of the numbers from 1 to n.

Solution. For each i ∈ {1, 2, ..., n}, let Xi = 1 if i is a fixed point of the permutation and Xi = 0 otherwise. The number of fixed points is X = X1 + X2 + · · · + Xn. Since the permutation is chosen uniformly at random, the probability that i is not shifted by the permutation is (n − 1)!/n! = 1/n. By linearity of expectation, the expected number of fixed points is

E(X) = E(X1 + X2 + · · · + Xn)
= E(X1) + E(X2) + · · · + E(Xn)
= 1 · P(1 is fixed) + 1 · P(2 is fixed) + · · · + 1 · P(n is fixed)
= n · (1/n) = 1.
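
As a quick numerical sanity check (not part of the original argument), here is a minimal Python sketch that estimates the expected number of fixed points of a uniform random permutation; the helper name avg_fixed_points is illustrative only. The estimate should be close to 1 for any n.

    import random

    def avg_fixed_points(n, trials=100_000):
        # Estimate E(X), where X counts fixed points of a uniform random permutation of 0..n-1.
        total = 0
        for _ in range(trials):
            perm = list(range(n))
            random.shuffle(perm)
            total += sum(1 for i, v in enumerate(perm) if i == v)
        return total / trials

    print(avg_fixed_points(10))  # prints a value close to 1.0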

Example 2. Let n be a positive integer. Let ak denote the number of permutations of n elements with k fixed points. Compute

a1 + 4a2 + 9a3 + · · · + n^2 an.

Solution. Let X be the number of fixed points of a random permutation f of n elements. From the definition of E(X^2), we get

E(X^2) = 0 · P(f has 0 fixed points) + 1 · P(f has 1 fixed point) + 2^2 · P(f has 2 fixed points) + · · · + n^2 · P(f has n fixed points)
= 1 · a1/n! + 2^2 · a2/n! + · · · + n^2 · an/n! = (a1 + 4a2 + · · · + n^2 an)/n!.

Since X^2 = 2 · C(X, 2) + X, where C(X, 2) = X(X − 1)/2 is the number of unordered pairs of fixed points, we only need to compute

E(X^2) = E(2 · C(X, 2) + X) = 2 · E(C(X, 2)) + E(X).

From the previous example, we know that E(X) = 1. Similarly, we can compute E(C(X, 2)) by summing over unordered pairs {i, j}:

E(C(X, 2)) = Σ_{i<j} P(both i and j are fixed) = C(n, 2) · (n − 2)!/n! = 1/2.

Therefore,

a1 + 4a2 + 9a3 + · · · + n^2 an = n! · E(X^2) = n! · (2 · 1/2 + 1) = 2 · n!.
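
For small n the identity can be double-checked by brute force. The following Python sketch (a check, not a proof; the function name weighted_sum is mine) enumerates all permutations and compares the weighted count with 2 · n!.

    from itertools import permutations
    from math import factorial

    def weighted_sum(n):
        # Compute the sum over k of k^2 * a_k, where a_k counts permutations of n elements with k fixed points.
        total = 0
        for perm in permutations(range(n)):
            k = sum(1 for i, v in enumerate(perm) if i == v)
            total += k * k
        return total

    for n in range(2, 8):
        print(n, weighted_sum(n), 2 * factorial(n))  # the last two columns agree for each n >= 2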

2 Basic Techniques
Proposition 1 (Union Bound). Given events A1, A2, ..., An, we have

P(A1 ∪ A2 ∪ · · · ∪ An) ≤ P(A1) + P(A2) + · · · + P(An).

From the Union Bound, we see that if the sum of the probabilities P(A1), ..., P(An) is less than 1, then the probability that none of the events A1, ..., An occur is positive.

Example 3. (Hong Kong) In each cell of a 100 × 100 matrix A = {aij }, one
of the integers 1, 2, ..., 5000 is written. Moreover, each integer appears in
the matrix exactly twice. Prove that one can choose 100 cells in the matrix
satisfying the three conditions below:

(a) Exactly one cell is chosen in each row.

(b) Exactly one cell is chosen in each column.

(c) The numbers in the cells chosen are pairwise distinct.

Proof. Let {b1 , b2 , ..., b100 } be a random permutation of {1, 2, ..., 100}. Let’s
consider the 100 cells

B = {a1,b1 , a2,b2 , ..., a100,b100 }.

Clearly B satisfies both (a) and (b). We compute the probability that B does not satisfy (c). Consider any n ∈ {1, 2, ..., 5000}. Two cells of A, say aij and apq, have the number n written in them. If i = p or j = q, then the probability that two cells of B both contain n is 0, because no two cells of B share a row or a column. Otherwise, we consider ai,bi and ap,bp. The probability that ai,bi = ap,bp = n is (1/100) · (1/99). Since there are 5000 numbers, the Union Bound tells us that the probability that some two of the numbers in B are equal is at most

5000 · (1/100) · (1/99) < 1.

Thus, for a random choice of 100 cells satisfying (a) and (b), the probability that the numbers in those cells are pairwise distinct is positive. Therefore, there exist 100 cells satisfying the three conditions (a), (b), and (c).
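
The argument can also be exercised numerically. The hedged Python sketch below builds a random matrix of the required form (each of 1, ..., 5000 appearing exactly twice), then samples random permutations until one satisfies (c); by the bound above, roughly half of the samples should already work, so it succeeds almost immediately. The helper names are illustrative only.

    import random

    def random_matrix():
        # Fill a 100 x 100 grid with each of the numbers 1..5000 appearing exactly twice.
        values = [v for v in range(1, 5001) for _ in range(2)]
        random.shuffle(values)
        return [values[100 * r:100 * (r + 1)] for r in range(100)]

    def find_transversal(A, max_tries=1000):
        # Sample random permutations b; return one whose selected cells carry distinct numbers.
        for _ in range(max_tries):
            b = list(range(100))
            random.shuffle(b)
            chosen = [A[i][b[i]] for i in range(100)]
            if len(set(chosen)) == 100:
                return b
        return None

    print(find_transversal(random_matrix()) is not None)  # expected output: True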

Proposition 2. If a random variable X has expectation E(X) = c, then
there must be a realization of X with value at least c, and a realization of X
with value at most c.
For example, if the scores of a test are integers from 0 to 10 and the class
average is 7.3, then we know for sure that at least one student’s score is 8
or higher, and at least one student’s score is 7 or below.

Example 4. (Moscow 1993) In a botanical classifier, a plant is identified by 100 features. Each feature can either be present or absent. A classifier is considered to be good if any two plants have less than half of their features in common. Prove that a good classifier cannot describe more than 50 plants.

Proof. Let's pick a pair of plants uniformly at random from the n plants. Let Xi = 1 if the two plants in the chosen pair have opposite ith features (one has the feature and the other does not) and Xi = 0 otherwise. Then let X be the total number of features for which the plants differ, so X = X1 + X2 + · · · + X100. Thus,

E(X) = Σ_{i=1}^{100} E(Xi) = Σ_{i=1}^{100} P(the chosen pair has opposite ith features).

For i ∈ {1, 2, ..., 100}, let ni be the number of plants with the ith feature present. Then the number of pairs of plants that have opposite ith features is ni(n − ni). Therefore,

E(X) = Σ_{i=1}^{100} ni(n − ni) / C(n, 2).
By AM-GM, ni(n − ni) ≤ n^2/4. So

E(X) = Σ_{i=1}^{100} ni(n − ni) / (n(n − 1)/2) ≤ Σ_{i=1}^{100} n^2 / (2n(n − 1)) = 100 · n / (2(n − 1)) = 50n/(n − 1).

When n = 51, we get E(X) ≤ 51. Note that ni(n − ni) = n^2/4 if and only if ni = n/2. Since ni is an integer, the equality cannot hold when n = 51. That is, when n = 51, E(X) < 51. Since X is an integer, when there are 51 plants there must exist a pair of plants that differ in at most 50 features. In other words, there exists a pair of plants that have 50 or more of their 100 features in common, so the classifier is not good. Therefore, a good classifier cannot describe 51 plants. Consequently, it cannot describe more than 50 plants.
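
As a small numerical illustration (not part of the proof), the sketch below evaluates E(X) = Σ ni(n − ni)/C(n, 2) for n = 51 and many arbitrary choices of the feature counts ni, confirming that it always stays strictly below the bound 50n/(n − 1) = 51. The function name is mine.

    import random
    from math import comb

    def expected_differing_features(counts, n):
        # E(X) = sum over features of n_i * (n - n_i) / C(n, 2), as in the proof above.
        return sum(c * (n - c) for c in counts) / comb(n, 2)

    n = 51
    worst = max(expected_differing_features([random.randint(0, n) for _ in range(100)], n)
                for _ in range(10_000))
    print(worst, 50 * n / (n - 1))  # the largest observed E(X) is strictly below 51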

3 Graph Theory
A complete oriented graph is a graph in which every pair of vertices is con-
nected by a single directed edge. We call such an orientation a tournament.
A Hamiltonian path in a tournament is a directed path that passes through
each vertex exactly once.

Example 5. (Szele) There is a tournament on n vertices that has at least n!/2^(n−1) Hamiltonian paths.

Proof. We construct a random tournament T on n vertices. For each pair of vertices, we flip a fair coin to decide the direction of the edge between them. Let X be the number of Hamiltonian paths in T. Note that each Hamiltonian path corresponds to a permutation σ of the n vertices. Let's consider the sequence (σ(1), σ(2), ..., σ(n)). Let Xσ = 1 if all the edges (σ(i), σ(i + 1)) are in T and Xσ = 0 otherwise. We have

E(Xσ) = P((σ(i), σ(i + 1)) ∈ T for all i ∈ {1, 2, ..., n − 1}) = 1/2^(n−1).

Since X = Σσ Xσ and there are n! permutations, we have

E(X) = E(Σσ Xσ) = Σσ E(Xσ) = n!/2^(n−1).

Therefore there is a tournament with at least n!/2^(n−1) Hamiltonian paths.
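
For small n this expectation can be checked exhaustively. The Python sketch below (illustrative helper names, factorial-time enumeration) averages the number of Hamiltonian paths over many random tournaments; the average should be close to n!/2^(n−1), which is 22.5 for n = 6.

    import random
    from itertools import permutations
    from math import factorial

    def count_hamiltonian_paths(n, beats):
        # beats[(u, v)] is True when the edge between u and v is oriented u -> v.
        return sum(all(beats[(p[i], p[i + 1])] for i in range(n - 1))
                   for p in permutations(range(n)))

    def average_paths(n, trials=2000):
        total = 0
        for _ in range(trials):
            beats = {}
            for u in range(n):
                for v in range(u + 1, n):
                    w = random.random() < 0.5  # fair coin for each pair
                    beats[(u, v)] = w
                    beats[(v, u)] = not w
            total += count_hamiltonian_paths(n, beats)
        return total / trials

    n = 6
    print(average_paths(n), factorial(n) / 2 ** (n - 1))  # both close to 22.5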

Example 6. Show that one can construct a round-robin tournament outcome with more than 1000 people such that for any set of 1000 people, some contestant outside that set beats all of them.

Proof. We flip a fair coin to decide the result of each game in the tournament. Let A be the set of all n people in the tournament. Call a set of contestants "unbeaten" if for every contestant outside the set, at least one person within the set beats that contestant. Let C be the collection of all unbeaten subsets B of A with |B| = 1000. We will calculate the expected value of |C|.
Fix a subset B of A with |B| = 1000. For B to be unbeaten, every x ∈ A \ B must lose to some y ∈ B. For each such x, the probability that someone in B beats x is 1 − (1/2)^1000, and these events are independent over the n − 1000 choices of x because they involve disjoint sets of games. Therefore,

E(|C|) = Σ_{|B|=1000} P(B is unbeaten) = Σ_{|B|=1000} (1 − (1/2)^1000)^(n−1000) = C(n, 1000) · (1 − 1/2^1000)^(n−1000).

Since the exponential decay of the second factor eventually overwhelms the polynomial growth of C(n, 1000), E(|C|) is less than 1 when n is big enough. Fix such an n. Since |C| is a nonnegative integer and E(|C|) < 1, there is a tournament outcome with |C| = 0. That is, we can construct a tournament outcome with n people, for some sufficiently large n, in which no set of 1000 people is unbeaten; equivalently, for any set of 1000 people, some contestant outside that set beats all of them.
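
The value 1000 makes the threshold astronomically large, but the shape of the argument is easy to see numerically with a smaller set size. The sketch below (purely illustrative; it replaces 1000 by k = 6) computes E(|C|) = C(n, k) · (1 − 2^(−k))^(n−k) via logarithms and reports the first n at which it drops below 1.

    from math import comb, exp, log

    def expected_unbeaten_sets(n, k):
        # E(|C|) = C(n, k) * (1 - 2**-k)**(n - k), computed via logarithms to avoid overflow.
        return exp(log(comb(n, k)) + (n - k) * log(1 - 2.0 ** -k))

    k = 6  # stand-in for 1000, so the crossover is small enough to find by direct search
    n = k + 1
    while expected_unbeaten_sets(n, k) >= 1:
        n += 1
    print(n, expected_unbeaten_sets(n, k))  # first n at which the expectation dips below 1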

4 Alteration
So far, all our examples have worked by randomly sampling an object from a probability space and hoping for the best. But this approach does not always work. In some cases, we need to alter the randomly chosen objects to obtain the desired properties.
In graph theory, a set of vertices is called independent if no two of them are connected by an edge.

Example 7. (weak Turán) A graph G has n vertices and average degree d. Prove that it is possible to select an independent vertex set of size at least n/(2d).

Proof. We assume d > 1. Let's go through a two-step process:

Step 1: Remove each vertex, together with its associated edges, independently with probability 1 − 1/d (so each vertex survives with probability 1/d).

Step 2: For each remaining edge, remove it by removing one of its endpoints at random, together with that vertex's associated edges.

The set A of vertices remaining after those two steps is independent. To find E(|A|), let's define B and C to be the numbers of remaining vertices and edges after Step 1, respectively. We have E(B) = n(1 − (1 − 1/d)) = n/d. We start with nd/2 edges. Note that every edge has two endpoints, each of which independently survives Step 1 with probability 1/d, so an edge survives Step 1 with probability (1/d)^2. Thus we have

E(C) = (nd/2) · (1/d) · (1/d) = n/(2d).

Note that |A| ≥ B − C, because in Step 2 we remove at most one vertex for each edge that survived Step 1. Thus,

E(|A|) ≥ E(B − C) = n/d − n/(2d) = n/(2d).

Therefore, it is possible to select an independent vertex set of size at least n/(2d).
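
Here is a hedged Python sketch of the two-step alteration, run on a random test graph (the G(n, p) instance and the helper names are my choices, not part of the statement): keep each vertex with probability 1/d, then delete one endpoint of every surviving edge. The resulting set is always independent, and its average size should be at least n/(2d).

    import random

    def alteration_independent_set(n, edges, d):
        # Step 1: keep each vertex independently with probability 1/d.
        kept = {v for v in range(n) if random.random() < 1.0 / d}
        # Step 2: for each edge with both endpoints still kept, delete one endpoint at random.
        for u, v in edges:
            if u in kept and v in kept:
                kept.discard(random.choice((u, v)))
        return kept

    n, p, trials = 200, 0.05, 500
    edges = [(u, v) for u in range(n) for v in range(u + 1, n) if random.random() < p]
    d = 2 * len(edges) / n  # average degree of this particular graph
    sizes = [len(alteration_independent_set(n, edges, d)) for _ in range(trials)]
    print(sum(sizes) / trials, n / (2 * d))  # observed average size vs. the n/(2d) guarantee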

5 Lovász Local Lemma


From the Union Bound, we see that if A1, A2, ..., An are "bad" events, and if Σ_{i=1}^{n} P(Ai) < 1, then there is a positive probability that none of them occur. However, Σ_{i=1}^{n} P(Ai) is not always a good estimate of P(A1 ∪ A2 ∪ · · · ∪ An).
Recall that if A1, A2, ..., An are independent events and each occurs with probability P(Ai) ≤ p < 1, then the probability that none of them occur is

P(Ā1 ∩ Ā2 ∩ · · · ∩ Ān) = Π_{i=1}^{n} P(Āi) ≥ (1 − p)^n > 0.

The Lovász Local Lemma extends this independence argument by allowing limited dependencies.

Theorem 2 (Symmetric Lovász Local Lemma). Let A1, A2, ..., An be events with P(Ai) ≤ p for all 1 ≤ i ≤ n. If each Ai is independent of all but at most d of the other Aj and

ep(d + 1) ≤ 1, where e = 2.71828...,

then the probability that none of them occur is positive.

Example 8. Suppose 11n points are placed around a circle and colored
with n different colors in such a way that each color is applied to exactly
11 points. Prove that in any such coloring, there must be a set of n points
each with a different color and no two adjacent.

Proof. Let's label the points around the circle a1, a2, ..., a11n, with the convention a11n+1 = a1 (indices are taken modulo 11n). Let's pick one point of each color uniformly at random to form a set B of n points. For each 1 ≤ i ≤ 11n, let Xi be the event that both ai and ai+1 are in B. So X = X1 ∪ X2 ∪ · · · ∪ X11n is the union of all the bad events that we would like to avoid. Note that P(Xi) = (1/11) · (1/11) = 1/121 if ai and ai+1 are of different colors, and P(Xi) = 0 otherwise. By the Union Bound,

P(X) ≤ Σ_{i=1}^{11n} P(Xi) ≤ 11n/11^2 = n/11.

So the Union Bound does not give us P(X) < 1 when n ≥ 11. However, we see that Xi is independent of Xj unless aj or aj+1 has the same color as ai or ai+1. Note that 10 other points are colored the same color as ai, and each point appears in two consecutive pairs of points. So there are at most 2 · 10 + 1 = 21 pairs (aj, aj+1) other than (ai, ai+1) containing a point of the same color as ai. Similarly, there are at most another 21 pairs containing a point of the same color as ai+1. Thus Xi and Xj are independent for all but at most 42 values of j.
Now we can apply the Lovász Local Lemma with p = 1/121 and d = 42. Since

ep(d + 1) = e · (1/121) · 43 < (28/10) · (43/121) < 1,

the Lovász Local Lemma tells us that the probability that none of the bad events occur is positive. Therefore, there must be a set of n points each with a different color and no two adjacent.
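
The Local Lemma condition used above is easy to confirm numerically (a one-line check, purely illustrative):

    from math import e

    p, d = 1 / 121, 42
    print(e * p * (d + 1))  # approximately 0.966, which is at most 1, so the lemma applies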

6 Exercises
1. (Engel) Let S be a collection of line segments in the unit interval I = [0, 1) whose total length is more than 1/2. Prove that there exist two points in S that are exactly distance 0.1 apart.
2. (Iran) Prove that in a tournament with 799 teams, there exist 14 teams that can be partitioned into two groups such that every team in the first group has beaten every team in the second group.
3. (Russia) In the Duma there are 1600 delegates, who have formed 16000 committees of 80 persons each. Use the probabilistic method to prove that one can find two committees having at least four common members.

4. (Turán's Theorem) A graph G has n vertices and average degree d. Prove that it is possible to select an independent vertex set of size at least n/(d + 1).

5. Let n ∈ N and m ∈ N with m > 0. Let A1, A2, ..., Am be m nonempty subsets of the set {1, 2, ..., n}. Prove that there exists a subset X of {1, 2, ..., n} such that |Aj ∩ X| is odd for more than m/2 different values of j ∈ {1, 2, ..., m}.

6. Prove that a good classifier in Example 4 cannot describe more than 34 plants.

7 References
1. Evan Chen, Expected Uses of Probability, http://www.mit.edu/~evanchen

2. Po-Shen Loh, Probabilistic Methods in Combinatorics, www.math.cmu.edu/~ploh/docs/math/mop2009/prob-comb.pdf

3. Matoušek and Vondrák, The Probabilistic Method, webbuild.knu.ac.kr/~trj/Combin/matousek-vondrak-prob-ln.pdf
