Handout 2: Basics of Random Variables
Roughly speaking, a random variable (in short, r.v.) is a numerical characteristic related to a (chance-related) experiment or phenomenon. Random variables (the characteristics themselves) are usually denoted by capital letters, such as X, Y, Z, whereas the observed numerical values of such a characteristic are denoted by small letters, such as x, y, z. When the “characteristic” X is observed with the numerical value x, we say “X has taken the value x”.
Since the experiment is chance-related, one cannot say, before the experiment actually takes place, which numerical value will appear. However, from the description of the experiment one will usually know the totality, i.e., the set of all possible values that may occur. In other words, one cannot say beforehand which particular value a r.v. X will take, but one will know the total range of values it may take. That is why we characterize a r.v. by the possible values it may take and the corresponding probabilities. We call this the probability distribution.
We often distinguish two types of random variables, because the mathematical analysis related to them (often) proceeds differently. A discrete r.v. is one which takes either finitely many or countably infinitely many values. Another way of saying this is that one can make a list of these values (the first one, the second, the third, etc.). A continuous r.v., on the other hand, takes uncountably many values, e.g., all values in an interval.
To calculate a probability related to X, one adds up the individual probabilities. For example, for a r.v. taking integer values with, say, p(2) = 0.3 and p(3) = 0.4,
\[
P(1 < X \le 3) = 0.3 + 0.4 = 0.7; \qquad \text{and in general,} \qquad P(X \in A) = \sum_{x \in A} p(x).
\]
The formula p(x), together with the possible values of x, is called the probability mass function,
in short, p.m.f.
Note that, being probabilities, $p(x) \ge 0$, and the total probability must be one, i.e., $\sum_x p(x) = 1$.
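A minimal sketch in Python of this bookkeeping. The p.m.f. below is hypothetical: only p(2) = 0.3 and p(3) = 0.4 match the example above, while p(1) and p(4) are made up so that the probabilities sum to one.

```python
# Hypothetical p.m.f.: p(1) and p(4) are invented; p(2), p(3) are from the example.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

# Total probability must be one.
assert abs(sum(pmf.values()) - 1.0) < 1e-12

def prob(event):
    """P(X in A): add up p(x) over all x in A."""
    return sum(p for x, p in pmf.items() if event(x))

print(prob(lambda x: 1 < x <= 3))  # p(2) + p(3) = 0.7
```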
For a continuous r.v., probabilities are described by a density: one specifies a formula f(x) together with the range of x where the formula holds. From the domain, it will be clear which values X takes. For example:
[Figure: graph of f(x), a horizontal line at height 1/3 over 1 ≤ x ≤ 4, and 0 elsewhere]
or equivalently,
\[
f(x) =
\begin{cases}
\tfrac{1}{3}, & 1 \le x \le 4,\\
0, & \text{otherwise.}
\end{cases}
\]
From the above we deduce that X takes values in the interval [1, 4] and
\[
P(0 < X \le 3) = \text{area under } f \text{ between } 0 \text{ and } 3 = \text{area between } 1 \text{ and } 3 = \tfrac{1}{3}\cdot(3 - 1) = \tfrac{2}{3},
\]
or equivalently,
\[
P(0 < X \le 3) = \int_0^3 f(x)\,dx = \int_0^1 0\,dx + \int_1^3 \tfrac{1}{3}\,dx = \tfrac{2}{3}.
\]
In general, one can obtain the probability by integrating the function over a suitable region:
\[
P(X \in A) = \int_{x \in A} f(x)\,dx.
\]
The formula f(x), together with the range of x where this holds, is called the probability density function, in short, p.d.f.
Note that we must have $f(x) \ge 0$, i.e., the curve must lie above the x-axis, and the total probability must be one, i.e., $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
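As a quick numerical check of the example above, here is a sketch using scipy.integrate.quad with the density f(x) = 1/3 on [1, 4] from before:

```python
from scipy.integrate import quad

def f(x):
    # Density from the example: 1/3 on [1, 4], 0 elsewhere.
    return 1/3 if 1 <= x <= 4 else 0.0

total, _ = quad(f, 1, 4)          # f vanishes outside [1, 4]; total is 1.0
p, _ = quad(f, 0, 3, points=[1])  # P(0 < X <= 3); 'points' flags the kink at 1
print(total, p)                   # 1.0, 0.666...
```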
One can also obtain the p.m.f. or p.d.f. easily from the c.d.f. F(x) = P(X ≤ x):
• Discrete/pmf : the possible values are where F(x) has jumps, and p(x) is then the jump size.
• Continuous/pdf : f(x) is obtained by differentiating, i.e., f(x) = F′(x), wherever the derivative exists.
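A minimal sketch of the discrete case; the step c.d.f. below is a made-up example, and p(x) is recovered as the jump size F(x) − F(x−):

```python
# Made-up step c.d.f. with jumps at 1, 2, 3, 4.
JUMPS = [(1, 0.1), (2, 0.3), (3, 0.4), (4, 0.2)]  # (value, jump size)

def F(x):
    """c.d.f.: F(x) = P(X <= x)."""
    return sum(p for v, p in JUMPS if v <= x)

# p(x) is the size of the jump of F at x, i.e. F(x) - F(x-).
eps = 1e-9
for x in [1, 2, 3, 4]:
    print(x, round(F(x) - F(x - eps), 6))  # recovers 0.1, 0.3, 0.4, 0.2
```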
Features of a random variable or probability distribution
To decide which r.v. (probability distribution) is suitable to model real data, one compares different properties/characteristics of the data to the known properties/characteristics of different r.v.’s and takes the one which fits best. So, one needs to know/study different properties/characteristics of r.v.’s (or probability distributions).
The most commonly studied characteristics are the so-called expectation (or mean) and variance.
Definition [Expectation]: Suppose X is a r.v. Then the expectation (or mean) of X, denoted by E(X), is defined as
\[
\text{Discrete: } E(X) = \sum_x x\, P(X = x), \qquad \text{Continuous: } E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx,
\]
provided the sum or integral exists. [There are situations when the expectation does not exist.]
Definition [Variance]: Suppose X is a r.v. whose expectation exists and E(X) = µ, say.
Then the variance of X, denoted by Var(X), is defined as
\[
\text{Discrete: } \operatorname{Var}(X) = \sum_x (x - \mu)^2\, P(X = x), \qquad
\text{Continuous: } \operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2\, f(x)\,dx,
\]
provided the sum or integral exists.
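A sketch checking both definitions numerically, reusing the hypothetical p.m.f. from before and the density f(x) = 1/3 on [1, 4] from the earlier example (for which E(X) = 2.5 and Var(X) = (4 − 1)²/12 = 0.75, a standard fact for uniform densities):

```python
from scipy.integrate import quad

# Discrete case: the hypothetical p.m.f. used earlier.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}
mu = sum(x * p for x, p in pmf.items())                  # E(X)
var = sum((x - mu) ** 2 * p for x, p in pmf.items())     # Var(X)
print(mu, var)                                           # 2.7, 0.81

# Continuous case: f(x) = 1/3 on [1, 4].
f = lambda x: 1/3
mu_c, _ = quad(lambda x: x * f(x), 1, 4)                 # E(X) = 2.5
var_c, _ = quad(lambda x: (x - mu_c) ** 2 * f(x), 1, 4)  # Var(X) = 0.75
print(mu_c, var_c)
```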
It can be shown that the expectation of a transformed random variable, g(X), can be calculated by a similar formula as for E(X), provided the former exists, of course:
\[
E\,g(X) = \sum_x g(x)\, P(X = x) \ [\text{discrete}] \qquad \text{or} \qquad E\,g(X) = \int_{-\infty}^{\infty} g(x)\, f(x)\,dx \ [\text{continuous}].
\]
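For instance, with g(x) = x² and the same hypothetical p.m.f.; the identity Var(X) = E(X²) − (E X)², standard though not derived in this handout, gives a useful cross-check:

```python
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}  # hypothetical p.m.f. from before
g = lambda x: x ** 2

e_g = sum(g(x) * p for x, p in pmf.items())  # E[g(X)] = sum of g(x) P(X = x)
mu = sum(x * p for x, p in pmf.items())
print(e_g)            # E(X^2) = 8.1
print(e_g - mu ** 2)  # E(X^2) - E(X)^2 = 0.81, which equals Var(X)
```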
The expected value indicates, in some sense, the “center” of all the values the r.v. takes. The variance provides the “dispersion” or “spread” of the (probability) distribution around that central value. Two other often-used characteristics of a r.v. are skewness, which measures the “asymmetry” of the distribution, and kurtosis, which measures the “thickness of the tails” of the distribution. They are defined as follows.
Definitions: Suppose X is a r.v. with E(X) = µ and Var(X) = σ². Then
\[
\text{Skewness} = \frac{E[(X - \mu)^3]}{\sigma^3} \qquad \text{and} \qquad \text{Kurtosis} = \frac{E[(X - \mu)^4]}{\sigma^4}.
\]
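A sketch computing both quantities for the density f(x) = 1/3 on [1, 4] from the earlier example; for a symmetric density the skewness is 0, and the kurtosis of a uniform density is 9/5 = 1.8, a standard fact:

```python
from scipy.integrate import quad

f = lambda x: 1/3                     # density on [1, 4]
mu = quad(lambda x: x * f(x), 1, 4)[0]

def central_moment(k):
    """E[(X - mu)^k], computed by numerical integration."""
    return quad(lambda x: (x - mu) ** k * f(x), 1, 4)[0]

sigma2 = central_moment(2)
skewness = central_moment(3) / sigma2 ** 1.5  # 0: the density is symmetric
kurtosis = central_moment(4) / sigma2 ** 2    # 1.8 for a uniform density
print(skewness, kurtosis)
```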