
PKM 03-09-2021

The very basics of random variables

Roughly speaking, a random variable (in short, r.v.) is a numerical characteristic related to a
(chance related) experiment or phenomenon. Random variables (the characteristics themselves)
are usually denoted by capital letters, such as X, Y, Z, whereas the observed numerical values
of such a characteristic are denoted by small letters, such as x, y, z. When the “characteristic”
X is observed with the numerical value x, we say “X has taken the value x”.

Since the experiment is chance related, one cannot say, before the experiment actually takes
place, which numerical/characteristic value will appear. However, from the description of the
experiment one will usually know the totality, i.e., the set of all possible values that may occur.
In other words, one cannot say beforehand which particular value a r.v. X will take, but will
know the total range of values it may take. That is why we characterize a r.v. by the possible
values it may take and the corresponding probabilities. We call this the probability distribution.

Formally, the probability distribution of a r.v. X is a mathematical object which provides

(i) the set of possible values X can take and

(ii) (a way to calculate) the probability of any event related to X.

A probability distribution can be given/expressed in many different ways, graphically or in terms
of formulas, as long as it provides both (i) and (ii) above.

We often distinguish two types of random variables because the mathematical analysis related
to them (often) proceeds differently. A discrete r.v. is one which takes either finitely many or a
countably infinite number of values. Another way of saying this is that one can make a list of
these values (the first one, the second, the third, etc.). A continuous r.v., on the other hand,
takes an uncountably infinite number of values, e.g., all values in an interval.

Discrete Distribution: probability mass function


The most common way of expressing the probability distribution of a discrete r.v. is in the form
of a table or formula providing all possible values x that X may take and the corresponding
probabilities p(x) = P(X = x). For example:
    x      0     1     2     3
    p(x)   0.1   0.2   0.3   0.4

or equivalently, P(X = x) = 0.1 (1 + x), x = 0, 1, 2, 3.

To calculate a probability related to X, one adds up the individual probabilities. For example,

P(1 < X ≤ 3) = 0.3 + 0.4 = 0.7;   and in general,   P(X ∈ A) = Σ_{x∈A} p(x).

The formula p(x), together with the possible values of x, is called the probability mass function,
in short, p.m.f.
Note that, being a probability, p(x) ≥ 0, and the total probability must be one, i.e., Σ_x p(x) = 1.
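
As a quick numerical illustration (a rough Python sketch, not part of the handout; the dictionary representation of the p.m.f. is just one convenient choice):

    # The example p.m.f. above: p(x) = 0.1*(1 + x) for x = 0, 1, 2, 3.
    pmf = {x: 0.1 * (1 + x) for x in range(4)}

    # Total probability must be one.
    assert abs(sum(pmf.values()) - 1.0) < 1e-12

    # P(1 < X <= 3): add the individual probabilities over the event A = {2, 3}.
    prob = sum(p for x, p in pmf.items() if 1 < x <= 3)
    print(prob)   # 0.7 (up to floating-point rounding)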

Continuous Distribution: probability density function


For a continuous r.v., the probability distribution is given by a curve/graph, with the interpretation
that X takes values in the intervals where the graph is positive. Furthermore, probability
can be calculated by calculating an appropriate area under the curve. This is equivalent to
providing a formula/function f(x) with explicit mention of the domain, i.e., the interval on which
the formula holds. From the domain, it will be clear which values X takes. For example:

[Figure: graph of f(x), a horizontal line at height 1/3 over the interval 1 ≤ x ≤ 4, and 0 elsewhere.]

or equivalently,   f(x) = 1/3 for 1 ≤ x ≤ 4,   and f(x) = 0 otherwise.

From the above we deduce that X takes values in the interval [1, 4] and

P(0 < X ≤ 3) = area under the curve between 0 and 3 = area under the curve between 1 and 3 = (1/3) · (3 − 1) = 2/3,

or equivalently,

P(0 < X ≤ 3) = ∫_0^3 f(x) dx = ∫_0^1 0 dx + ∫_1^3 (1/3) dx = 2/3.
In general, one can obtain the probability by integrating the function over a suitable region:
P(X ∈ A) = ∫_{x∈A} f(x) dx.

The formula f (x), together with the range of x where this holds, is called the probability density
function, in short, p.d.f.
Note that we must have f(x) ≥ 0, i.e., the curve must lie above the x-axis, and the total
probability must be one, i.e., ∫_{−∞}^{∞} f(x) dx = 1.
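
A corresponding sketch for the continuous example (again in Python; the crude midpoint-rule integrator below is only for illustration, not a recommended numerical method):

    # The example p.d.f. above: f(x) = 1/3 on [1, 4], 0 otherwise.
    def f(x):
        return 1/3 if 1 <= x <= 4 else 0.0

    def integral(g, a, b, n=100000):
        # Midpoint rule: approximate the integral of g over [a, b].
        h = (b - a) / n
        return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

    print(round(integral(f, -10, 10), 4))   # ~1.0    (total probability)
    print(round(integral(f, 0, 3), 4))      # ~0.6667 = P(0 < X <= 3) = 2/3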

Cumulative Distribution Function


Another very useful mathematical object in the study of random variables is the cumulative
distribution function, in short, c.d.f. This quantity is applicable to both discrete and continuous
r.v.'s (distributions).
The c.d.f. of X is a function F(·) on the real line R, given by F(x) = P(X ≤ x). Usually,
statistical tables provide this sort of cumulative probability. Probabilities of the form
P (a < X ≤ b) are very easy to obtain from the c.d.f., namely,

P (a < X ≤ b) = P (X ≤ b) − P (X ≤ a) = F (b) − F (a).

One can also obtain the p.m.f. or p.d.f. easily from the c.d.f.:

• Discrete/pmf: possible values are where F(x) has jumps, and p(x) is then the jump-size.

• Continuous/pdf: f(x) = F′(x) [ = d/dx F(x)], and possible values are where f(x) is positive.
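
Continuing the discrete example in the same Python sketch, the c.d.f. can be tabulated directly from the p.m.f., and the p.m.f. recovered from the jumps of the c.d.f. (the step of size 1 below works here because the possible values are the integers 0, 1, 2, 3):

    # C.d.f. of the discrete example: F(x) = P(X <= x) = sum of p(v) over v <= x.
    pmf = {x: 0.1 * (1 + x) for x in range(4)}

    def F(x):
        return sum(p for v, p in pmf.items() if v <= x)

    # P(1 < X <= 3) = F(3) - F(1)
    print(round(F(3) - F(1), 1))   # 0.7

    # Recovering the p.m.f.: p(x) is the size of the jump of F at x.
    print([round(F(x) - F(x - 1), 1) for x in range(4)])   # [0.1, 0.2, 0.3, 0.4]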

Features of a random variable or probability distribution

To decide which r.v. (probability distribution) is suitable to model real data, one compares
different properties/characteristics of the data to the known properties/characteristics of different
r.v.'s and takes the one which fits the best. So, one needs to know/study different
properties/characteristics of r.v.'s (or probability distributions).

The most commonly studied characteristics are the so-called expectation (or mean) and variance.

Definition [Expectation]: Expected value of a r.v. X, denoted by E(X), is defined as


Discrete:     E(X) = Σ_x x p(x) = Σ_x x P(X = x)

Continuous:   E(X) = ∫_{−∞}^{∞} x f(x) dx,

provided the sum or integral exists. [There are situations when expectation does not exist.]

Definition [Variance]: Suppose X is a r.v. whose expectation exists and E(X) = µ, say.
Then the variance of X, denoted by Var(X), is defined as
Discrete:     Var(X) = Σ_x (x − µ)² P(X = x)

Continuous:   Var(X) = ∫_{−∞}^{∞} (x − µ)² f(x) dx,

provided the sum or integral exists.
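
For the discrete example above, the two definitions give E(X) = 2 and Var(X) = 1; a minimal Python check (the variable names are only for illustration):

    # Expectation and variance of the example p.m.f. p(x) = 0.1*(1 + x), x = 0..3.
    pmf = {x: 0.1 * (1 + x) for x in range(4)}

    mu = sum(x * p for x, p in pmf.items())                # E(X)   = 0.2 + 0.6 + 1.2 = 2.0
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # Var(X) = 0.4 + 0.2 + 0 + 0.4 = 1.0
    print(round(mu, 2), round(var, 2))                     # 2.0  1.0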

It can be shown that the expectation of a transformed random variable, g(X), can be calculated
by a similar formula as for E(X), provided the former exists, of course:

E[g(X)] = Σ_x g(x) P(X = x)   [discrete]      or      E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx   [continuous].

Note that, in general, E[g(X)] ≠ g(E[X]). For example, E(e^X) ≠ e^(E(X)).
Using the E[g(X)] formula, one can derive an alternative formula for variance, which is sometimes
easier to use than the definition:

Var(X) = E(X²) − (E(X))².

Furthermore, the following useful formulas can also be derived: for a, b ∈ R,

E(aX + b) = a E(X) + b   and   Var(aX + b) = a² Var(X).
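
These identities are easy to verify numerically on the discrete example (a Python sketch; the choices g(x) = e^x and a = 2, b = 5 are arbitrary, made only for illustration):

    import math

    pmf = {x: 0.1 * (1 + x) for x in range(4)}
    mu = sum(x * p for x, p in pmf.items())                   # E(X) = 2.0

    # E[g(X)] with g(x) = e^x, versus g(E(X)) = e^(E(X)): in general they differ.
    E_expX = sum(math.exp(x) * p for x, p in pmf.items())
    print(round(E_expX, 2), round(math.exp(mu), 2))           # 10.89 vs 7.39

    # Alternative variance formula: Var(X) = E(X^2) - (E(X))^2.
    EX2 = sum(x ** 2 * p for x, p in pmf.items())
    print(round(EX2 - mu ** 2, 2))                            # 1.0, matching the definition

    # Linearity: E(aX + b) = a*E(X) + b, and Var(aX + b) = a^2 * Var(X).
    a, b = 2, 5
    E_lin = sum((a * x + b) * p for x, p in pmf.items())
    var_lin = sum((a * x + b - (a * mu + b)) ** 2 * p for x, p in pmf.items())
    print(round(E_lin, 2), round(var_lin, 2))                 # 9.0 and 4.0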

Expected value indicates, in some sense, the “center” of all the values the r.v. takes. The variance
provides the “dispersion” or “spread” of the (probability) distribution around that central value.

Two other often-used characteristics of a r.v. are skewness, which measures the “asymmetry” of
the distribution, and kurtosis, which measures the “thickness of the tails” of the distribution.
They are defined as follows.
Definitions: Suppose X is a r.v. with E(X) = µ and Var(X) = σ². Then

Skewness = E[(X − µ)³] / σ³    and    Kurtosis = E[(X − µ)⁴] / σ⁴.
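
For completeness, the same Python sketch applied to the discrete example (the numerical values below are worked out here, not taken from the handout):

    # Skewness and kurtosis of the example p.m.f. p(x) = 0.1*(1 + x), x = 0..3.
    pmf = {x: 0.1 * (1 + x) for x in range(4)}
    mu = sum(x * p for x, p in pmf.items())
    sigma = sum((x - mu) ** 2 * p for x, p in pmf.items()) ** 0.5

    skew = sum((x - mu) ** 3 * p for x, p in pmf.items()) / sigma ** 3
    kurt = sum((x - mu) ** 4 * p for x, p in pmf.items()) / sigma ** 4
    # -0.6 and 2.2: negatively skewed, with tails lighter than the normal's (kurtosis 3).
    print(round(skew, 2), round(kurt, 2))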
