
Probability Pairs of random variables

Information Theory
Degree in Data Science and Engineering
Lesson 1: Discrete random variables

Jordi Quer, Josep Vidal

Mathematics Department, Signal Theory and Communications Department


{jordi.quer, josep.vidal}@upc.edu

2019/20 - Q1


Why probability?


Letters in an English text appear with different frequencies:


e is the most frequent: 12.702%
t is the second most frequent: 9.056%
z is the least frequent: 0.074%
The same happens with words: the 10 most frequent words in English are, in
the given order:

the, be, to, of, and, a, in, that, have, I.

Also for: instructions in a computer program, pixels in an image, samples in a
digitized sound wave, nucleotides in a DNA sequence, records in a database,
weather forecasts, etc.
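Empirical letter frequencies like the ones above can be estimated from any text. A minimal sketch in Python (the function name `letter_frequencies` is mine, for illustration):

```python
from collections import Counter

def letter_frequencies(text):
    """Return the relative frequency of each letter appearing in a text."""
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {c: n / total for c, n in counts.items()}

freqs = letter_frequencies("Letters in an English text appear with different frequencies")
# Even in this short sentence, 'e' comes out as the most frequent letter.
```

On a long English corpus the estimates approach the percentages quoted above.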


Probability

Probability provides a way to model random phenomena: random experiments
are associated with a value within a set of possible outcomes, each occurring with
some probability. For example:
Flip a coin. Two possible outcomes: heads and tails, each with the same
probability 1/2.
Throw a die. Six possible outcomes: 1 to 6 dots, each with the same
probability 1/6.
Throw two dice and take the sum. Eleven possible
outcomes, S = {2, . . . , 12}, with probabilities:

      s   |  2    3    4    5    6    7    8    9    10   11   12
    p(s)  | 1/36 1/18 1/12 1/9  5/36 1/6  5/36 1/9  1/12 1/18 1/36

Probabilities are numbers in [0, 1] and their sum must be = 1.
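The two-dice table can be checked by enumerating the 36 equally likely outcomes. A sketch in Python (variable names are mine):

```python
from fractions import Fraction
from collections import Counter

# Enumerate all 36 equally likely outcomes of two dice and count each sum.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

# Exact probabilities: number of favorable outcomes over 36.
p_sum = {s: Fraction(n, 36) for s, n in sorted(counts.items())}

for s, p in p_sum.items():
    print(s, p)
```

The probabilities sum to 1, as required.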


Random variables
Random variables
Mathematical formalization in terms of random variables: for us, a discrete
random variable X consists of
a discrete set X = {x1, x2, . . . , xq} of possible values xi,
each occurring with a given probability

    pi = p(xi) = Pr(X = xi) ∈ [0, 1],

with ∑_{i=1}^{q} p(xi) = 1.

If the xi are numbers, the expected value (or expectation) of X is

    µX = E[X] = ∑_{i=1}^{q} xi p(xi)

and the variance of X is

    Var(X) = σX² = E[ |X − E[X]|² ] = ∑_{i=1}^{q} |xi − E[X]|² p(xi)
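These two definitions translate directly into code. A minimal sketch, assuming a distribution given as a dict {xi: p(xi)} (helper names are mine):

```python
from fractions import Fraction

def expectation(dist):
    """E[X] = sum of x_i * p(x_i) over a dict {x_i: p(x_i)}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = sum of |x_i - E[X]|^2 * p(x_i)."""
    mu = expectation(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expectation(die))  # 7/2
print(variance(die))     # 35/12
```

For a fair die this gives the familiar values E[X] = 7/2 and Var(X) = 35/12.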


Random variables: examples

Bernoulli distribution. X has two outcomes, X = {1, 0},
with probabilities p(1) = p ∈ [0, 1] and p(0) = 1 − p:

    Pr(X = x) = p(x) = p^x (1 − p)^(1−x)

When flipping a coin, X = {head, tail}. For a fair coin p = 1/2.

Uniform distribution. Y has q outcomes, Y = {y1, . . . , yq}, each with
the same probability p(yi) = 1/q. Throwing a die corresponds to q = 6.

Binomial distribution. Z has n + 1 possible outcomes, Z = {0, 1, . . . , n},
with probabilities

    Pr(Z = z) = p(z) = (n choose z) p^z (1 − p)^(n−z),

which count the number of 1's in n independent samples of a random
variable X with Bernoulli distribution.
If p is the probability of heads in a coin flip, the average number of
heads in n flips is E[Z] = pn.
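The binomial pmf and the identity E[Z] = pn can be verified numerically. A sketch (the function name `binomial_pmf` is mine):

```python
from math import comb

def binomial_pmf(n, p, z):
    """Pr(Z = z) = (n choose z) p^z (1-p)^(n-z)."""
    return comb(n, z) * p**z * (1 - p)**(n - z)

n, p = 10, 0.5
pmf = [binomial_pmf(n, p, z) for z in range(n + 1)]

# The mean computed from the pmf matches p*n (up to floating point).
mean = sum(z * q for z, q in enumerate(pmf))
print(mean)
```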

A mathematical model for Information Theory

In information theory, a data source is a device that produces letters or
symbols that can be considered as observations of a random variable X taking
values in a finite alphabet X (|X| = q) with certain probabilities.
The concrete values xi are not important. The relevant information about the
variable X is given by the probability distribution, i.e. by the q numbers

    p1, p2, . . . , pq,   with pi ∈ [0, 1] and ∑_{i=1}^{q} pi = 1.

When X is sampled several consecutive times, it produces a text: a string of
letters of the alphabet.
We shall see more sophisticated models that consider strings produced by a
sequence X1 X2 X3 . . . of non-independent random variables.


Pairs of random variables

One may consider two or more random variables X and Y associated to the
same experiment. For example:

The experiment consists of rolling two different dice, and
X and Y are the outcomes of each die;
X is the outcome of the first die and Y is the sum;
X is the sum and Y says whether the two outcomes
are equal or different;
X is the sum and Y is the parity of the second die; etc.

In each situation there are several relevant probability distributions.
Let X = {x1, . . . , xq} and Y = {y1, . . . , yr} be their values.


Joint and marginal probabilities

Everything is governed by the joint probability distribution

    p(xi, yj) = Pr(X = xi, Y = yj),

which satisfies

    ∑_{i=1}^{q} ∑_{j=1}^{r} p(xi, yj) = 1.

The probabilities of each separate variable are called marginal probabilities.
They are computed from the joint distribution as

    p(xi) = Pr(X = xi) = ∑_{j=1}^{r} p(xi, yj),

    p(yj) = Pr(Y = yj) = ∑_{i=1}^{q} p(xi, yj).
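Marginalization is just summing the joint distribution over the other variable. A sketch, representing the joint distribution as a dict {(x, y): p} (the helper names are mine):

```python
# Joint distribution of two fair dice as a dict {(x, y): p(x, y)}.
joint = {(x, y): 1 / 36 for x in range(1, 7) for y in range(1, 7)}

def marginal_x(joint):
    """p(x) = sum over y of p(x, y)."""
    px = {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
    return px

def marginal_y(joint):
    """p(y) = sum over x of p(x, y)."""
    py = {}
    for (x, y), p in joint.items():
        py[y] = py.get(y, 0.0) + p
    return py
```

For the two-dice joint distribution, both marginals come out uniform with value 1/6.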


Conditional probabilities

Conditional probabilities can be defined as

    p(xi | yj) = Pr(X = xi | Y = yj) := p(xi, yj) / p(yj),   if p(yj) ≠ 0,

to be read as the probability of xi given yj, or under the condition yj.

It must be interpreted as the probability that, in a sample of the pair of
variables, X takes the value xi under the condition that Y takes the value yj.
Analogously,

    p(yj | xi) = Pr(Y = yj | X = xi) := p(xi, yj) / p(xi),   if p(xi) ≠ 0.
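The definition above can be computed directly from a joint distribution stored as a dict {(x, y): p}. A sketch (function and variable names are mine):

```python
def conditional_x_given_y(joint, yj):
    """p(x | y_j) = p(x, y_j) / p(y_j), defined only when p(y_j) > 0."""
    pyj = sum(p for (x, y), p in joint.items() if y == yj)
    if pyj == 0:
        raise ValueError("p(y_j) = 0: conditional probability undefined")
    return {x: p / pyj for (x, y), p in joint.items() if y == yj}

# Joint distribution of (first die X, sum S), used again in the examples below.
joint_xs = {(x, x + y): 1 / 36 for x in range(1, 7) for y in range(1, 7)}

# Given S = 4, the first die is 1, 2 or 3, each with probability 1/3.
print(conditional_x_given_y(joint_xs, 4))
```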


Marginal and conditional probability distributions

Theorem
The marginal probabilities defined from a joint probability distribution satisfy:

    ∑_{i=1}^{q} p(xi) = ∑_{i=1}^{q} ∑_{j=1}^{r} p(xi, yj) = 1,   ∑_{j=1}^{r} p(yj) = 1

Theorem
The conditional probabilities satisfy:

    ∀j  ∑_{i=1}^{q} p(xi | yj) = 1,   ∀i  ∑_{j=1}^{r} p(yj | xi) = 1


Independent variables

Two variables X and Y are independent if the joint probability is always the
product of the marginal probabilities:

p(xi , yj ) = p(xi )p(yj ) ∀i, j.

This property is equivalent to the fact that the conditional probabilities do not
really depend on the condition:

    p(xi | yj) = p(xi)   and   p(yj | xi) = p(yj)   ∀i, j.

The degree of dependency measures the correlation between the two variables.
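Independence can be checked mechanically by comparing the joint distribution with the product of its marginals. A sketch, assuming joints stored as dicts {(x, y): p} with zero entries possibly omitted (the function name `independent` is mine):

```python
def independent(joint, tol=1e-12):
    """Check p(x, y) == p(x) p(y) for every pair on the support grid."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    px = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    return all(abs(joint.get((x, y), 0.0) - px[x] * py[y]) <= tol
               for x in xs for y in ys)

# Two fair dice: independent.
two_dice = {(x, y): 1 / 36 for x in range(1, 7) for y in range(1, 7)}
# First die and the sum of both dice: clearly dependent.
die_and_sum = {(x, x + y): 1 / 36 for x in range(1, 7) for y in range(1, 7)}

print(independent(two_dice))    # True
print(independent(die_and_sum)) # False
```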


Pairs of random variables: examples

Experiment: Let X and Y be the outcomes of rolling two different dice,
X = Y = {1, 2, 3, 4, 5, 6}.

    p(xi, yj) = 1/36 = 1/6 · 1/6 = p(xi) p(yj)   ∀i, j.

They are independent, and their conditional distributions are uniform: every
entry of the tables p(yj | xi) and p(xi | yj) equals 1/6, for all i, j = 1, . . . , 6.


Pairs of random variables: examples

Let S be the sum of the two dice, S = {2, 3, . . . , 12}.
The joint distribution of X and S is:

    p(xi, sj) |  2    3    4    5    6    7    8    9    10   11   12  | p(xi)
        1     | 1/36 1/36 1/36 1/36 1/36 1/36  0    0    0    0    0  |  1/6
        2     |  0   1/36 1/36 1/36 1/36 1/36 1/36  0    0    0    0  |  1/6
        3     |  0    0   1/36 1/36 1/36 1/36 1/36 1/36  0    0    0  |  1/6
        4     |  0    0    0   1/36 1/36 1/36 1/36 1/36 1/36  0    0  |  1/6
        5     |  0    0    0    0   1/36 1/36 1/36 1/36 1/36 1/36  0  |  1/6
        6     |  0    0    0    0    0   1/36 1/36 1/36 1/36 1/36 1/36|  1/6
      p(sj)   | 1/36 1/18 1/12 1/9  5/36 1/6  5/36 1/9  1/12 1/18 1/36|   1

The sum of all table entries = 1 (joint probability distribution).
The marginal probabilities are the sums of rows and columns.


Pairs of random variables: examples

Conditional distribution of S with respect to X: knowing the first die, the
probability of each value of the sum.

    p(sj | xi) |  2    3    4    5    6    7    8    9    10   11   12
         1     | 1/6  1/6  1/6  1/6  1/6  1/6   0    0    0    0    0
         2     |  0   1/6  1/6  1/6  1/6  1/6  1/6   0    0    0    0
         3     |  0    0   1/6  1/6  1/6  1/6  1/6  1/6   0    0    0
         4     |  0    0    0   1/6  1/6  1/6  1/6  1/6  1/6   0    0
         5     |  0    0    0    0   1/6  1/6  1/6  1/6  1/6  1/6   0
         6     |  0    0    0    0    0   1/6  1/6  1/6  1/6  1/6  1/6

The sum of each row must be = 1 (conditional probability distributions).


Pairs of random variables: examples


Conditional distribution of X with respect to S: knowing the sum, the
probability of each value of the first die.

    p(xi | sj) |  1    2    3    4    5    6
         2     |  1    0    0    0    0    0
         3     | 1/2  1/2   0    0    0    0
         4     | 1/3  1/3  1/3   0    0    0
         5     | 1/4  1/4  1/4  1/4   0    0
         6     | 1/5  1/5  1/5  1/5  1/5   0
         7     | 1/6  1/6  1/6  1/6  1/6  1/6
         8     |  0   1/5  1/5  1/5  1/5  1/5
         9     |  0    0   1/4  1/4  1/4  1/4
        10     |  0    0    0   1/3  1/3  1/3
        11     |  0    0    0    0   1/2  1/2
        12     |  0    0    0    0    0    1


Bayes’ Theorem

What is the relation between p(xi | yj) and p(yj | xi)? It is given by Bayes'
theorem:

    p(xi | yj) = p(yj | xi) p(xi) / p(yj),

where

    p(yj) = ∑_{i=1}^{q} p(xi, yj) = ∑_{i=1}^{q} p(yj | xi) p(xi).
i=1 i=1

Example: When getting the result of a medical test on AIDS, what can be said
about our condition? Take as random variables:

Patient condition: X ∈ {healthy, sick}


Result of the test: Y ∈ {−, +}
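The computation asked for on the next slide can be sketched as follows. All three numbers below are hypothetical placeholders, not real epidemiological data; the exercise asks you to substitute actual Spanish prevalence figures:

```python
# Hypothetical values for illustration only (assumptions, not real data):
p_sick = 0.001              # prevalence p(sick), assumed
p_pos_given_sick = 0.99     # test sensitivity p(+ | sick), assumed
p_pos_given_healthy = 0.02  # false-positive rate p(+ | healthy), assumed

# Total probability: p(+) = p(+|sick) p(sick) + p(+|healthy) p(healthy)
p_pos = p_pos_given_sick * p_sick + p_pos_given_healthy * (1 - p_sick)

# Bayes' theorem: p(sick | +) = p(+|sick) p(sick) / p(+)
p_sick_given_pos = p_pos_given_sick * p_sick / p_pos
print(p_sick_given_pos)  # about 0.047 with these assumed numbers
```

With a low prevalence, even an accurate test yields a surprisingly small posterior probability of being sick after a positive result.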


Bayes’ Theorem

Use Bayes' theorem and check how the result depends on the prevalence p(sick).
Obtain a numerical result using AIDS prevalence data for Spain.
