Chapter 1
Description of an ITS and Basics of Information Theory
I. Global Scheme of an Information Transmission System (I.T.S.)
Global scheme (block diagram) of an I.T.S.:

SOURCE --> SOURCE ENCODER --> CHANNEL ENCODER --> MODULATION    [TRANSMITTER]
  (the source delivers the binary sequence b_i; the encoders produce the baseband signal)
MODULATION --> CHANNEL + NOISE --> DEMODULATION    (received signal y(t) + n(t))
DEMODULATION (filtering, etc.) --> CHANNEL DECODING --> SOURCE DECODING --> DESTINATION    [RECEIVER]
  (the receiver restores the received binary sequence b_i)
II. Description:
➢ The goal of source encoding is to produce at its output a baseband signal from the binary sequence fed to its input.
➢ The goal of the channel encoder (at the transmitter) and of its decoder (at the receiver) is to reproduce with high fidelity (reliability) the binary sequence sent from the source to the destination.
➢ The separation between source coding and channel coding does not limit the system performance.
III. Sources and Source Coding
1. Sources:
❑ Different sources: analog sources and discrete (digital) sources.
❑ In this course, we will address only discrete (digital) sources.
❑ Among discrete sources, we distinguish memory sources (Markov sources) and memoryless sources.
❑ In this course, we will primarily study memoryless sources (DMS = Discrete Memoryless Source).
2. Definition: A discrete source is said to be memoryless if and only if the symbols it delivers are statistically independent.
Example:
Let us consider the output of a D.M.S. which delivers a sequence of letters or symbols belonging to a finite alphabet. This can be noted:
A_S = source alphabet = \{a_1, a_2, a_3, \dots, a_n\} = \{a_i, i = 1, \dots, n\}
*- Each letter or symbol is chosen at random according to a well-defined probability law.
**- A sequence of letters or symbols constitutes a word, and several words form a message.
***- A word basis of large size (dimension) constitutes a DICTIONARY.
Particular case:
A simple example of a DMS is a binary source whose alphabet is simply A_S = \{0, 1\}; the probabilities attached to the occurrence of the two symbols are called a priori probabilities and are noted:
P(X = 1) = p
P(X = 0) = 1 - p = q
Observations:
▪ When both binary symbols are equally likely, P(X = 0) = P(X = 1) = 1/2, the symbols are called "bits".
▪ When the two probabilities are different, the symbols are called "digits".
▪ An M-ary source is defined by its own alphabet A_S = \{0, 1, \dots, M-1\} with M different symbols.
▪ If M = 2^k, then each symbol of A_S can be expressed by k binary symbols.
▪ Example: For k = 2, M = 4, A_S = \{0, 1, 2, 3\} contains four symbols, each represented by 2 binary symbols (bits, or digits). We get the following binary table:
Symbol Number    Coding
0                00
1                01
2                10
3                11
Example 2: For k = 3, M = 8, we get the following table:
Symbol Number Coding
0 000
1 001
2 010
3 011
4 100
5 101
6 110
7 111
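
The coding tables above can be generated programmatically. As an illustration only (this sketch is not part of the original notes), the following Python snippet builds the k-bit coding of an M-ary alphabet with M = 2^k:

# Minimal sketch (illustrative): generate the binary coding table of an M-ary
# source alphabet A_S = {0, 1, ..., M-1} with M = 2**k, each symbol expressed
# by k binary symbols.
def coding_table(k):
    M = 2 ** k
    return {symbol: format(symbol, f"0{k}b") for symbol in range(M)}

if __name__ == "__main__":
    for symbol, code in coding_table(3).items():
        print(symbol, code)   # reproduces the k = 3 table above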
▪ The most usual source model is that of a random process X_1, X_2, \dots, X_n of n random variables with respective alphabet A_S = \{x_1, x_2, \dots, x_n\}.
In such a process, the source is called memoryless when:
P(X_n = x_i / X_{n-1}, X_{n-2}, \dots, X_1) = P(X_n = x_i),
which means that this conditional probability does not depend on the previously emitted symbols (events).
▪ A discrete source is said to be stationary when the transmission probabilities of the symbols satisfy
P(x_{i,n}) = P(x_{i,n+k}) for all k,
that is to say, the statistical properties of the source do not depend on time.
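
As an illustrative sketch (the alphabet and probability law below are arbitrary example values, not taken from the notes), a DMS can be simulated by drawing symbols independently from a fixed probability law; memorylessness means each draw ignores the symbols emitted before it:

# Minimal sketch (illustrative): simulate a DMS by drawing i.i.d. symbols from
# a fixed probability law; each draw is independent of the previous symbols.
import random

def dms(alphabet, probabilities, length):
    return random.choices(alphabet, weights=probabilities, k=length)

if __name__ == "__main__":
    print(dms(["a1", "a2", "a3"], [0.5, 0.3, 0.2], 20))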
III. Information Measurement (Information Quantity)
1. Definitions:
❑ The information measure, or information quantity, that a source can deliver increases with the number of possible messages.
❑ The information quantity delivered by a message is tied to the occurrence probability of that message.
❑ For a given source, the more uncertain (less probable) a message is, the greater its quantity of information. Intuitively, we can deduce from this observation that the information measure carried by a message grows with the inverse of its probability of occurrence. We then obtain:
Let I denote the information measure, or quantity of information, linked to the message x_i; then:
I(X = x_i) = I(x_i) = I_i = \log_2 \frac{1}{P(X = x_i)} = \log_2 \frac{1}{P_i} = -\log_2 P_i, with P_i = P(X = x_i) > 0.
I(X = x_i) is called the self-information per source symbol.
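
As an illustrative sketch of the definition above (the probability values in the usage lines are arbitrary examples), the self-information in bits per symbol can be computed as follows:

# Minimal sketch (illustrative): self-information I(x_i) = -log2(P_i) in bits/symbol.
import math

def self_information(p):
    if not 0 < p <= 1:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(p)

if __name__ == "__main__":
    print(self_information(0.5))    # 1.0 bit: an equally likely binary symbol
    print(self_information(0.125))  # 3.0 bits: a rarer symbol carries more information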
2. Remarks:
➢ The base of the logarithm gives the name of the unit of the information measure.
▪ In base 2: if the events are equally likely, the unit of I is the bit/symbol; if the events are not equally likely, the unit is the digit/symbol.
▪ In the natural (Napierian) base e = 2.7182818..., the information unit is the natural unit, the nat (nit) of information per symbol.
▪ If the base of the logarithm is 10, the information unit is the "dit" of information per symbol, also called the hartley in honor of Hartley.
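
Changing the logarithm base only rescales the information measure. As a short derivation added here for clarity (not present in the original notes), the change-of-base rule gives:

I_b(x_i) = \log_b \frac{1}{P_i} = \frac{\log_2 (1/P_i)}{\log_2 b}
\quad\Longrightarrow\quad
1 \text{ bit} = \ln 2 \approx 0.693 \text{ nat} = \log_{10} 2 \approx 0.301 \text{ dit}.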
IV. Memoryless Source Entropy (1. Definition)
A source can transmit n symbols or messages with very different self-information values:
x_1 \mapsto I(x_1), \quad x_2 \mapsto I(x_2), \quad \dots, \quad x_n \mapsto I(x_n), with I(x_1) \neq I(x_2) \neq \dots \neq I(x_n).
To characterize the source, we only consider the statistical average, which in the discrete case is defined by:
\mu_X = E[X] = \sum_{x} x \, g_X(x)   (classical definition)
Considering here that the variable is the self-information I(x_i) instead of X, we can write:
\mu_I = E[I(x_i)] = \sum_{i=1}^{n} I(x_i) \, P_i = -\sum_{i=1}^{n} P_i \log_2 P_i.
This average represents the source entropy of a D.M.S., noted:
H(X) = -\sum_{i=1}^{n} P_i \log_2 P_i = source entropy of a D.M.S.
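
As an illustrative Python sketch of this definition (the probability vectors in the usage lines are arbitrary examples), the entropy of a DMS can be computed directly from its probability law:

# Minimal sketch (illustrative): entropy H(X) = -sum_i P_i * log2(P_i) of a DMS,
# in bits per symbol; terms with P_i = 0 contribute nothing by convention.
import math

def entropy(probabilities):
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

if __name__ == "__main__":
    print(entropy([0.5, 0.5]))                # 1.0 bit/symbol (equally likely binary source)
    print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits/symbol = log2(4)
    print(entropy([1.0, 0.0]))                # 0.0: the certain event carries no information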
2. Properties: The source entropy has several properties, namely:
❑ We can prove that 0 \le H(X) \le \log_2 n.
❑ We can also prove that if the symbol of index i has probability P(X = x_i) = 1, which means that x_i is the certain event, then H(X) = 0, because the certain event does not transmit any information: its information average is null.
❑ Maximum value of the source entropy:
This value is attained when all of the symbols transmitted by the source are equally likely; the mass function then follows a discrete uniform law:
P(X = x_i) = P_i = \frac{1}{n} \implies H(X) = -\sum_{i=1}^{n} \frac{1}{n} \log_2 \frac{1}{n} = \log_2 n = H_{max}(X),
which shows that the upper bound of the first property is attained.
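
As a quick numerical check (a binary example added here for illustration): for n = 2 with equally likely symbols,

P_0 = P_1 = \frac{1}{2} \implies H(X) = -\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2} = 1 \text{ bit/symbol} = \log_2 2 = H_{max}(X).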
3. Average source flowrate (throughput) and redundancy of a D.M.S.
a) Average source flowrate:
The average source flowrate, also called the information flowrate of the DMS, is the mean flowrate expressed in bits/s and given by:
\bar{D}_S = R = r_S \cdot H(X)   [bits/s],
where r_S is the source frequency or symbol rate (symbols/s).
b) Source redundancy:
The redundancy of a DMS is the average quantity of information that repeats itself in the source.
This quantity of information is defined by:
R_{red} = H_{max}(X) - H(X) = \log_2 n - H(X).
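
As an illustrative sketch combining both formulas (the symbol rate r_S and the probability law below are assumed example values, not from the notes):

# Minimal sketch (illustrative): information flowrate R = r_S * H(X) and
# redundancy R_red = log2(n) - H(X) of a DMS.
import math

def entropy(probabilities):
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

probabilities = [0.5, 0.25, 0.125, 0.125]   # n = 4 symbols (example values)
r_s = 1000                                  # symbols per second (assumed)

H = entropy(probabilities)                  # 1.75 bits/symbol
R = r_s * H                                 # 1750 bits/s
R_red = math.log2(len(probabilities)) - H   # 2 - 1.75 = 0.25 bit/symbol

print(H, R, R_red)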
V. MUTUAL INFORMATION, JOINT INFORMATION
1. Definitions
The mutual information is the quantity of information that a received signal delivers about the signal transmitted by the source. We then work in the joint space of the two variables X, Y, given the bivariate joint probability mass function g_{X,Y}(x, y) = P(X = x_i, Y = y_j).
➢ The random variables X and Y are respectively associated with the alphabets X = \{x_1, x_2, \dots, x_n\} and Y = \{y_1, y_2, \dots, y_m\}, where X is delivered by the source and Y is the one received by the user.
➢ The realization of the event Y = y_j (received) transforms the a priori probability P(X = x_i) into the a posteriori probability P(X = x_i / Y = y_j). The reception of y_j therefore allows us to get information about the source.
➢ To the n symbols that the source can transmit are associated n a posteriori conditional probabilities P(X = x_i / Y = y_j). Under these conditions, the mutual information associated with the joint event (X = x_i, Y = y_j), i.e. the mutual information on x_i when we receive y_j, is given by:
I(x_i, y_j) = \log_2 \frac{P(x_i / y_j)}{P(x_i)}.
We also know from Bayes' rule that:
P(x_i / y_j) = \frac{P(x_i, y_j)}{P(y_j)},
so we can substitute it into the previous equality to get the final expression of the mutual information:
I(x_i, y_j) = \log_2 \frac{P(x_i / y_j)}{P(x_i)} = I(x_i \cap y_j) = \log_2 \frac{P(x_i, y_j)}{P(x_i) \, P(y_j)}, with P(x_i) > 0 and P(y_j) > 0.
Remark:
We can easily show that this last expression of the mutual information is symmetric, so we get:
I(x_i, y_j) = I(y_j, x_i) = I(i \cap j) = I(j \cap i) = \log_2 \frac{P(i \cap j)}{P(i) \, P(j)}.
2. Properties:
➢ If I(x_i, y_j) = I(i \cap j) = 0, then x_i is independent of y_j, i.e. P(x_i, y_j) = P(x_i) P(y_j), or P(i \cap j) = P(i) P(j), and event i is independent of event j.
➢ If I(x_i, y_j) = I(i \cap j) > 0, the realization of one event raises the probability of occurrence of the other event.
➢ If I(x_i, y_j) = I(i \cap j) < 0, the realization of one event lowers the probability of occurrence of the other, making the mutual information negative.
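
As an illustrative Python sketch of these properties (the probabilities in the usage lines are arbitrary examples), the sign of the mutual information of a pair of events shows whether they are independent, attract each other, or repel each other:

# Minimal sketch (illustrative): pointwise mutual information
# I(x_i, y_j) = log2( P(x_i, y_j) / (P(x_i) * P(y_j)) ).
import math

def pointwise_mutual_information(p_xy, p_x, p_y):
    return math.log2(p_xy / (p_x * p_y))

if __name__ == "__main__":
    print(pointwise_mutual_information(0.25, 0.5, 0.5))  #  0.0   -> independent events
    print(pointwise_mutual_information(0.40, 0.5, 0.5))  # +0.678 -> one event makes the other more likely
    print(pointwise_mutual_information(0.10, 0.5, 0.5))  # -1.322 -> one event makes the other less likely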
VI. Self-Conditional Information:
In the same joint space, we can also define the self-conditional information of a DMS. With this information, we are in fact observing the information attached to the realization of the event x_i, knowing that the event y_j has already been realized:
I(x_i / y_j) = \log_2 \frac{1}{P(x_i / y_j)} = -\log_2 P(x_i / y_j).
VII. Bivariate Average Mutual Information:
The bivariate average mutual information is taken over the joint probability space of the two variables X, Y and is given by:
I(X \cap Y) = E[I(x_i, y_j)] = \sum_{i=1}^{n} \sum_{j=1}^{m} I(x_i, y_j) \, P(x_i, y_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) \log_2 \frac{P(x_i, y_j)}{P(x_i) \, P(y_j)}
This average mutual information verifies the following properties:
➢ I(X \cap Y) = 0 means that the two variables are independent of one another.
➢ I(X \cap Y) \ge 0 always.
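
As an illustrative Python sketch (the joint probability matrices are arbitrary examples), the average mutual information can be computed directly from a joint probability table, with rows indexed by x_i and columns by y_j:

# Minimal sketch (illustrative): average mutual information
# I(X;Y) = sum_ij P(x_i, y_j) * log2( P(x_i, y_j) / (P(x_i) P(y_j)) ).
import math

def mutual_information(joint):
    p_x = [sum(row) for row in joint]        # marginal P(x_i)
    p_y = [sum(col) for col in zip(*joint)]  # marginal P(y_j)
    return sum(p * math.log2(p / (p_x[i] * p_y[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

if __name__ == "__main__":
    independent = [[0.25, 0.25], [0.25, 0.25]]
    correlated  = [[0.40, 0.10], [0.10, 0.40]]
    print(mutual_information(independent))   # 0.0
    print(mutual_information(correlated))    # > 0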
VIII. Bivariate Joint Entropy and Joint Conditional Entropy:
1. Bivariate joint entropy, defined by:
H(X \cap Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) \log_2 \frac{1}{P(x_i, y_j)} = H(Y \cap X)
2. Joint conditional entropies:
These joint conditional entropies are given by:
H(X/Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i / y_j) \, P(y_j) \log_2 \frac{1}{P(x_i / y_j)} = -\sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) \log_2 P(x_i / y_j)
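
As an illustrative Python sketch of these two definitions (the joint probability matrix is an arbitrary example), rows are indexed by x_i and columns by y_j:

# Minimal sketch (illustrative): joint entropy H(X ∩ Y) and conditional entropy
# H(X/Y) = -sum_ij P(x_i, y_j) * log2( P(x_i / y_j) ).
import math

def joint_entropy(joint):
    return -sum(p * math.log2(p) for row in joint for p in row if p > 0)

def conditional_entropy_x_given_y(joint):
    p_y = [sum(col) for col in zip(*joint)]  # marginal P(y_j)
    return -sum(p * math.log2(p / p_y[j])    # P(x_i / y_j) = P(x_i, y_j) / P(y_j)
                for row in joint
                for j, p in enumerate(row) if p > 0)

if __name__ == "__main__":
    joint = [[0.40, 0.10], [0.10, 0.40]]
    print(joint_entropy(joint))                  # H(X ∩ Y)
    print(conditional_entropy_x_given_y(joint))  # H(X/Y) <= H(X)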
3. Properties:
Exercise 1: With the help of your instructor in the exercise sessions (TD), prove the following relations:
a) I(x_i, y_j) = I(x_i) - I(x_i / y_j)
b) Show that the average mutual information previously defined can be rewritten in terms of the source entropy and the conditional entropy, hence:
I(X, Y) = I(Y, X) = H(X) - H(X/Y) = H(Y) - H(Y/X)
Exercise 2
Show that:
1. I(X, Y) = I(Y, X) = H(X) + H(Y) - H(X \cap Y)
2. H(X \cap Y) = H(X) + H(Y) - I(X, Y)
3. H(X/Y) \le H(X) and H(X/Y) \le H(X) + H(Y)
4. 0 \le I(X, Y) \le H(X)
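
These relations are to be proved analytically; as a complement only (not a proof), the following sketch checks relations 1, 2 and 4 numerically on one arbitrary example joint distribution:

# Illustrative numerical check (not a proof): verify Exercise 2 relations on one
# example joint distribution P(x_i, y_j); rows = x_i, columns = y_j.
import math

def H(probabilities):                          # entropy of a probability vector
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

joint = [[0.40, 0.10], [0.10, 0.40]]           # arbitrary example P(x_i, y_j)
p_x = [sum(row) for row in joint]              # marginal P(x_i)
p_y = [sum(col) for col in zip(*joint)]        # marginal P(y_j)

H_X, H_Y = H(p_x), H(p_y)
H_XY = H([p for row in joint for p in row])    # joint entropy H(X ∩ Y)
I_XY = sum(p * math.log2(p / (p_x[i] * p_y[j]))   # average mutual information
           for i, row in enumerate(joint)
           for j, p in enumerate(row) if p > 0)

print(abs(I_XY - (H_X + H_Y - H_XY)) < 1e-12)  # relations 1 and 2: True
print(0 <= I_XY <= H_X)                        # relation 4: True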