Statistical Theory
LECTURE NOTE
MR D.P SALAMA
MONDAY (0-12)
CHAPTER ONE:
Random Variable:
Remember that in the previous work, we defined the concept of an experiment and its associated experimental outcomes. A random variable provides a means of describing experimental outcomes using numerical values. Random variables must assume numerical values.
Definition:
Random variables will be denoted by upper case letters such as X, Y and Z. The actual numerical values that a random variable can assume will be denoted by lower case letters such as x, y and z.
A random variable can be classified as being either discrete or continuous depending on the numerical values it assumes.
A random variable that may assume only values such as 0, 1, 2, … is referred to as a discrete random variable.
Examples of continuous random variables include measured quantities such as the temperature of water in a given place (min 150 °F, max 212 °F).
Example 1:
Consider the experiment of tossing a coin two times. Let X be the random variable giving the number of tails obtained and Y be the number of heads obtained.
Solution:
The sample space is S = {HH, HT, TH, TT}; X takes the values 0, 1, 2, and so does Y.
Example 2:
Solution:
S = {1, 2, 3, 4, 5, 6}
Then we have:
X = {0, 1}
Discrete probability distributions:
The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable. For a discrete random variable X, the probability distribution is defined by a probability function, denoted by F_X(x). The probability function provides the probability for each value of the random variable.
The name probability mass function, sometimes used for the probability function of X, conveys the idea that a mass of probability is piled up at discrete points. It is often very convenient to list the probabilities for a discrete random variable in a table.
The probability mass function of a discrete random variable X is a function which associates a real number with each value of X. Hence F_X(x) is a function with domain the real line and counterdomain the interval [0, 1]. It satisfies:
1. 0 ≤ F_X(x) ≤ 1
2. Σ_{x_j} F_X(x_j) = 1
3. F_X(x) = 0, if x ≠ x_j for every j
Example 1:
Let X be a discrete random variable with probability function F_X(x) = Cx for x = 0, 1, 2. Find the value of the constant C.
Solution
Using property 2:
Σ_{x_j} F_X(x_j) = 1
Σ_{x_j} C x_j = 1
So that C(0 + 1 + 2) = 1
3C = 1
Hence: C = 1/3
Hence the P.M.F is: F_X(x) = x/3, x = 0, 1, 2.
Example 2:
An experiment consists of tossing a fair coin three times. Let the random variable X be the number of heads obtained. Find the probability mass function (P.M.F) of X.
Solution:
The sample space is S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
Then
X(TTT) = 0; X(HTT, THT, TTH) = 1; X(HHT, HTH, THH) = 2; X(HHH) = 3
so that F_X(0) = 1/8, F_X(1) = 3/8, F_X(2) = 3/8, F_X(3) = 1/8.

x        0    1    2    3
F_X(x)  1/8  3/8  3/8  1/8
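The table above can be checked by brute-force enumeration of the eight equally likely outcomes. Below is a minimal Python sketch (not part of the original notes; the names are illustrative):

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Enumerate the 8 equally likely outcomes of three fair coin tosses
outcomes = list(product("HT", repeat=3))

# X = number of heads in each outcome
counts = Counter(outcome.count("H") for outcome in outcomes)

# P.M.F: divide each count by the total number of outcomes
pmf = {x: Fraction(c, len(outcomes)) for x, c in counts.items()}
for x, pr in sorted(pmf.items()):
    print(x, pr)   # 0 1/8, 1 3/8, 2 3/8, 3 1/8
```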
Some common discrete probability distributions are:
i. Bernoulli Distribution
ii. Binomial Distribution
iii. Geometric Distribution
iv. Poisson Distribution
v. Discrete Uniform Distribution
vi. Negative Binomial Distribution
Now:
Let X be a random variable which takes the value 1 when the outcome is a success (with probability p) and the value 0 when the outcome is a failure (with probability q = 1 − p). Then X is called a Bernoulli random variable with P.M.F:
F(x; p) = p^x q^(1−x), if x = 0 or 1
0, otherwise (elsewhere)
or P(x) = p^x (1 − p)^(1−x), x = 0, 1
Consider n independent Bernoulli trials, and let X_i = 1 (i = 1, 2, …, n) if the ith experiment is a success and X_i = 0 if the experiment results in a failure. Let X be the total number of successes.
The p.m.f of X is called the binomial distribution. If the random variable follows the binomial distribution with parameters n and p, we write X ~ B(n, p). The probability of getting exactly x successes out of n trials is given by the probability mass function (P.M.F) as:
F(x; n, p) = C(n, x) p^x q^(n−x)
where the binomial coefficient C(n, x) is also written nCx.
When n = 1, the binomial distribution becomes the Bernoulli distribution.
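As a small sketch (added for illustration; not part of the original notes), the binomial P.M.F can be evaluated directly from this formula in Python:

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p): C(n, x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# n = 3 tosses of a fair coin, as in the example that follows
print([binom_pmf(x, 3, 0.5) for x in range(4)])  # [0.125, 0.375, 0.375, 0.125]
```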
Example 1:
A fair coin is tossed three times. Find the probability distribution of the number of heads X.
Solution
n = 3, p = 1/2
Then:
P(X = x) = C(n, x) p^x q^(n−x), where C(n, x) = nCx
Thus: P(X = 0) = 3C0 (1/2)^0 (1/2)^3 = 1/8
P(X = 1) = 3C1 (1/2)^1 (1/2)^2 = 3/8
P(X = 2) = 3C2 (1/2)^2 (1/2)^1 = 3/8
P(X = 3) = 3C3 (1/2)^3 (1/2)^0 = 1/8

X      0    1    2    3
P(x)  1/8  3/8  3/8  1/8
Example 2
Given that there are five children in a family, find the probability of each possible number of boys.
Solution
Here n = 5, and p = q = 1/2.
Suppose a Bernoulli trial is repeated until r successes are observed. Let the random variable X be the number of failures before the rth success occurs. Then r + x denotes the total number of trials needed to get exactly r successes and x failures. If the probability of success on each trial is p and of failure q = 1 − p, then X is called a negative binomial random variable. This distribution is also called the Pascal distribution when r is restricted to be an integer.
Suppose a Bernoulli trial is repeated several independent times. This gives rise to an infinite sequence of Bernoulli trials. Let the random variable X be the number of trials needed to get the first success. If the probability of a success on each trial is p and of a failure is q = 1 − p, then X is called a geometric random variable.
The probability mass function (P.M.F) of X for this first kind is:
P(X = x) = p q^(x−1), for x = 1, 2, 3, …
If instead X is defined as the number of failures before the first success, then the P.M.F of this second kind is:
F(x) = p q^x, for x = 0, 1, 2, …
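The two forms describe the same experiment with a shifted count. A minimal Python sketch (added for illustration) showing that they assign the same probability to the same event:

```python
def geom_pmf_trials(x: int, p: float) -> float:
    """First kind: P(X = x) = p * q^(x-1), X = trial on which first success occurs."""
    return p * (1 - p)**(x - 1)

def geom_pmf_failures(x: int, p: float) -> float:
    """Second kind: P(X = x) = p * q^x, X = failures before first success."""
    return p * (1 - p)**x

p = 1/6  # e.g. rolling a die until the first six
print(geom_pmf_trials(3, p))    # first six on roll 3
print(geom_pmf_failures(2, p))  # same event, counted as 2 failures first
```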
6- THE HYPERGEOMETRIC DISTRIBUTION
This is a discrete probability distribution that describes the number of successes in a sequence of n draws from a finite population N without replacement.
Suppose a batch of N objects contains D defectives and N − D non-defectives. Suppose a sample of n distinct objects is drawn from the batch and interest is on the probability of selecting exactly x defectives from the D defectives and n − x non-defectives from the N − D non-defectives. This is known as a hypergeometric experiment. The probability distribution of the hypergeometric random variable X is called the hypergeometric distribution. It will be denoted by F(x; N, D, n), where the parameters are N, D and n.
The probability mass function of the hypergeometric random variable is given by:
F(x; N, D, n) = [C(D, x) C(N−D, n−x)] / C(N, n), for x = 0, 1, 2, …
0, otherwise.
This distribution possesses a unique property in that:
F(x; N, D, n) = F(n − x; N, N−D, n)
i.e. the probability of getting x successes out of n samples drawn equals the probability of getting (n − x) failures out of the same number of samples.
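A short Python sketch (added for illustration) of the P.M.F and of the symmetry property just stated:

```python
from math import comb

def hypergeom_pmf(x: int, N: int, D: int, n: int) -> float:
    """P(X = x) = C(D, x) * C(N-D, n-x) / C(N, n)."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

# Symmetry: x successes in n draws vs n-x "failures" with roles swapped
print(hypergeom_pmf(1, N=8, D=3, n=2))          # 15/28
print(hypergeom_pmf(2 - 1, N=8, D=8 - 3, n=2))  # same value
```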
7- THE POISSON DISTRIBUTION
If an experiment involves the number of occurrences of an event in a given time interval or in a specified region, then it is called a Poisson experiment. The given time interval may be of any length, such as a minute, an hour, a day, a week, a month, or even a year, while the specified region could be a line segment, an area, or a volume. Let the random variable X be the number of occurrences in a Poisson experiment; then X is called a Poisson random variable.
The Poisson distribution is one of the simplest and perhaps most frequently used probability distributions to model the time instants at which events occur. It is used in calculating the probability of a number of events occurring in a fixed period of time, if these events occur with a known average rate and independently of the time since the last event occurred.
The probability that there are exactly x occurrences of an event during such a time interval is given by the probability mass function:
F(x; λ) = e^(−λ) λ^x / x!, x = 0, 1, 2, …
where λ is the average number of occurrences in the given time interval or specified region.
If events in a Poisson process occur at an average rate of λ per unit, then the number of occurrences X in an interval of length t has the Poisson p.m.f:
F(x; λt) = e^(−λt) (λt)^x / x!, x = 0, 1, 2, …
Some properties of the Poisson distribution (a short numerical sketch follows this list):
1. The average number of successes λ, called the rate of events, is known.
2. The numbers of successes occurring in non-overlapping intervals are independent.
3. The probability of exactly one success in a sufficiently short interval is proportional to the length of the interval.
4. The probability of two or more successes in a sufficiently short interval is negligible (tends to zero).
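A minimal Python sketch (added for illustration) of the Poisson P.M.F, with a numerical check that the probabilities sum to one:

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

# e.g. events at rate 2 per hour: P(exactly 3 events in one hour)
print(poisson_pmf(3, 2.0))                           # ~0.180
print(sum(poisson_pmf(x, 2.0) for x in range(50)))   # ~1.0 (normalization)
```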
8- THE DISCRETE UNIFORM DISTRIBUTION
Suppose a random variable X can assume finitely many values and all the possible values are equally probable. Then X is said to have a discrete uniform distribution. It is the simplest of all discrete probability distributions. If a random variable has k possible outcomes, where the probability of any outcome is 1/k, then it has a discrete uniform distribution with probability mass function (P.M.F):
F(x) = 1/k for x = 1, 2, …, k
0 otherwise
It has the single parameter k.
In sampling contexts, the value of k in the P.M.F of a uniform distribution is given by C(N, n), the number of equally likely samples.
Example 2:
A basketball player makes repeated shots from the free-throw line until the second basket is hit. If the probability of hitting a basket is 0.3, find the probability of making the second basket on the twelfth throw. (Apply the negative binomial.)
Solution
Let X be the number of misses, and r the number of hits.
Recall the negative binomial distribution and its P.M.F:
F(x; r, p) = C(r + x − 1, x) p^r q^x, x = 0, 1, 2, …
where r = 2, x = 10, p = 0.3, and q = 0.7.
By substitution:
F(10; 2, 0.3) = C(11, 10) (0.3)^2 (0.7)^10
= 11 × 0.09 × 0.0282
≈ 0.0280
Example 3:
A fair die is rolled until the second six appears. Find the probability that the second six occurs on the tenth roll.
Solution
With r = 2, x = 8, p = 1/6 and q = 5/6:
F(x; r, p) = C(r + x − 1, x) p^r q^x
F(8; 2, 1/6) = C(9, 8) (1/6)^2 (5/6)^8
= 9 (0.1667)^2 (0.8333)^8
= 9 (0.0278) (0.2326)
= 0.0582
≈ 0.06
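Both computations can be checked with a few lines of Python (added for illustration; X counts failures before the rth success, as in the notes):

```python
from math import comb

def nbinom_pmf(x: int, r: int, p: float) -> float:
    """P(X = x) = C(r+x-1, x) * p^r * q^x, X = failures before rth success."""
    return comb(r + x - 1, x) * p**r * (1 - p)**x

print(nbinom_pmf(10, 2, 0.3))   # second basket on the 12th throw, ~0.0280
print(nbinom_pmf(8, 2, 1/6))    # second six on the 10th roll, ~0.0582
```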
Example 4:
A bag contains three green balls and five white balls. Two balls are drawn at random without replacement from the bag. If X denotes the number of green balls in the sample, then:
i. Find the P.M.F of X.
ii. Find the probability distribution of X.
iii. Find P(X = 1 or 2).
(Apply the hypergeometric distribution concept.)
Solution
X follows a hypergeometric distribution with:
N = 8, D = 3, n = 2.
Recall the P.M.F of the hypergeometric distribution:
F(x; N, D, n) = [C(D, x) C(N−D, n−x)] / C(N, n)
Therefore:
i. F(x; 8, 3, 2) = [C(3, x) C(5, 2−x)] / C(8, 2), x = 0, 1, 2.
ii. P(X = 0) = F(0; 8, 3, 2) = C(3, 0) C(5, 2) / C(8, 2) = 10/28 = 5/14
P(X = 1) = F(1; 8, 3, 2) = C(3, 1) C(5, 1) / C(8, 2) = 15/28
P(X = 2) = F(2; 8, 3, 2) = C(3, 2) C(5, 0) / C(8, 2) = 3/28
iii. P(X = 1 or 2) = F(1; 8, 3, 2) + F(2; 8, 3, 2)
= 15/28 + 3/28
= 18/28 = 9/14
Example 5:
In a large hospital, the probability of giving birth to a male child is 0.03. What is the probability that out of 48 women, less than two will give birth to a male child? (Apply the binomial distribution.)
Solution
The number of male births follows a binomial distribution, since a birth is either a male with probability p = 0.03 or a female with probability q = 0.97, and n = 48.
Hence:
P(X < 2) = P(X = 0) + P(X = 1)
= (0.97)^48 + 48C1 (0.03)(0.97)^47
Observe that without a calculator this calculation is very tedious. To simplify it, we can use the Poisson distribution, which gives a good approximation to the binomial distribution for large n and small p.
NOW:
Let λ = np
= 48 × 0.03
= 1.44
Then
P(X = 0) + P(X = 1) ≈ e^(−1.44) + 1.44 e^(−1.44)
= 0.5781
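The quality of the approximation can be seen directly in a few lines of Python (added for illustration):

```python
from math import comb, exp, factorial

n, p = 48, 0.03
lam = n * p  # 1.44

# Exact binomial probability P(X < 2)
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(2))

# Poisson approximation: P(X < 2) ~ e^(-lam) * (1 + lam)
approx = sum(exp(-lam) * lam**x / factorial(x) for x in range(2))

print(round(exact, 4), round(approx, 4))  # ~0.5758 (exact) vs ~0.5781 (Poisson)
```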
CUMULATIVE DISTRIBUTION FUNCTION FOR A DISCRETE RANDOM VARIABLE:
We sometimes study random variables by looking at their cumulative probability. That is, for any random variable X we may look at P(X ≤ b) for any real number b. This is the cumulative probability for X evaluated at b. Thus we can define a function F(b) as:
F(b) = P(X ≤ b).
And if X is discrete,
F(b) = Σ_{x ≤ b} P(x)
where P(x) is the probability function.
The distribution function is often called the cumulative distribution function.
In other words, the cumulative distribution function of a random variable X, denoted by F_X(x), is defined to be that function with domain the real line and counterdomain the interval [0, 1].
It is defined as:
F_X(x) = P(X ≤ x) = Σ_{x_j ≤ x} P_X(x_j)
It gives the probability that the random variable X takes on a value less than or equal to a given number x.
Properties of the c.d.f:
i. 0 ≤ F_X(x) ≤ 1
ii. F_X(x1) ≤ F_X(x2) if x1 < x2
iii. lim_{x → ∞} F_X(x) = F_X(∞) = 1
iv. lim_{x → −∞} F_X(x) = F_X(−∞) = 0
v. F_X(x) is defined uniquely for each random variable.
Example 1
Let X be the number of boys in a family of 3. Find the distribution of X and the cumulative distribution of X.
Solution
Let 'b' denote a boy and 'g' denote a girl.
Then the sample space is given by:
S = {bbb, bbg, bgb, gbb, ggb, gbg, bgg, ggg}
To find the distribution of X, let:
X(ggg) = 0; X(ggb, gbg, bgg) = 1; X(bbg, bgb, gbb) = 2; X(bbb) = 3
The values of X are:
X = {0, 1, 2, 3}
The cumulative distribution function is:
F_X(x) = 0,   if x < 0
       = 1/8, if 0 ≤ x < 1
       = 4/8, if 1 ≤ x < 2
       = 7/8, if 2 ≤ x < 3
       = 1,   if x ≥ 3
Example 2
Consider the experiment of tossing a fair die. Find the cumulative distribution.
Solution
Let X denote the outcome of the toss.
Then the distribution of X is given by:
Values of X = {1, 2, 3, 4, 5, 6}
The distribution is presented as:

x        1    2    3    4    5    6
F_X(x)  1/6  1/6  1/6  1/6  1/6  1/6

The cumulative distribution function is obtained as shown below:
F_X(x) = 0,   if x < 1
       = 1/6, if 1 ≤ x < 2
       = 2/6, if 2 ≤ x < 3
       = 3/6, if 3 ≤ x < 4
       = 4/6, if 4 ≤ x < 5
       = 5/6, if 5 ≤ x < 6
       = 1,   if x ≥ 6
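The c.d.f of a discrete random variable is just the running sum of its P.M.F, as a short Python sketch (added for illustration) makes concrete:

```python
from fractions import Fraction
from itertools import accumulate

# P.M.F of a fair die
values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# C.D.F = running sum of the P.M.F
cdf = list(accumulate(pmf))
for v, F in zip(values, cdf):
    print(f"F({v}) = {F}")   # 1/6, 1/3, 1/2, 2/3, 5/6, 1
```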
Example 1
Let X be the number of boys in a family of 3. Find:
i. The distribution of X
ii. The expected value of X
Solution
(i) Let 'b' denote a boy and 'g' denote a girl; then the sample space is given by:
S = {bbb, bbg, bgb, gbb, ggb, gbg, bgg, ggg}
To find the distribution of X, let:
X(ggg) = 0; X(ggb, gbg, bgg) = 1;
X(bbg, bgb, gbb) = 2; X(bbb) = 3.
The values of X are:
X = {0, 1, 2, 3}
Hence the distribution of X is summarized as follows:

x        0    1    2    3
F_X(x)  1/8  3/8  3/8  1/8

(ii) Expected value of X:
E(X) = Σ x P(x)
= (0 × 1/8) + (1 × 3/8) + (2 × 3/8) + (3 × 1/8)
= (0 + 3 + 6 + 3)/8
= 12/8
= 3/2
= 1½
Example 2
Consider an experiment of tossing a fair die once and let X be the number that turns up. Find the expected value of X.
Solution
The sample space is
S = {1, 2, 3, 4, 5, 6}
Hence:
P(X = 1) = P(X = 2) = P(X = 3) = … = P(X = 6) = 1/6
Then:
E(X) = (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6)
= (1 + 2 + 3 + 4 + 5 + 6)/6
= 21/6
= 3½
PROPERTIES OF EXPECTATION
1. E(c) = c (where c is a constant)
2. E(aX) = a E(X)
3. E(aX + b) = a E(X) + b
4. E(X ± Y) = E(X) ± E(Y)
5. E(XY) = E(X) E(Y) (if X and Y are independent)
VARIANCE
The variance of a random variable X with probability mass function P(x) and expected value µ is denoted by Var(X) or σ²:
σ² = E(X − µ)²
The variance of X shows how the values of X are dispersed about the mean. If the values of X are close to the mean, then the variance of X will be small, and vice versa. The positive square root of the variance gives the standard deviation. It has the same unit of measurement as the values of X, unlike the variance.
The variance of X is usually evaluated using the formula below:
Var(X) = Σ (x − µ)² P(x)
Example 1:
Below is the probability distribution of the random variable X denoting the number of boys in a family of 3:

x      0    1    2    3
P(x)  1/8  3/8  3/8  1/8

Find the variance of X.
First,
E(X) = (0 × 1/8) + (1 × 3/8) + (2 × 3/8) + (3 × 1/8)
= (0 + 3 + 6 + 3)/8
= 12/8 = 1.5
and
E(X²) = (0 × 1/8) + (1 × 3/8) + (4 × 3/8) + (9 × 1/8)
= (0 + 3 + 12 + 9)/8 = 24/8 = 3
Hence
Var(X) = 3 − (3/2)²
= 3/4
= 0.75
PROPERTIES OF VARIANCE
1. Var(c) = 0 (where c is a constant)
2. Var(aX) = a² Var(X)
3. Var(aX + b) = a² Var(X)
4. Var(X ± Y) = Var(X) + Var(Y) (if X and Y are independent)
STANDARD DEVIATION
The standard deviation of a random variable X is the square root of the variance, given by:
σ = √(σ²)
ii. Variance of X:
Var(X) = Σ (x − µ)² P(x)
= (0 − 1.3)²(0.1) + (1 − 1.3)²(0.5) + (2 − 1.3)²(0.4)
= (1.69)(0.1) + (0.09)(0.5) + (0.49)(0.4)
= 0.410
iii. Standard deviation of X:
σ = √(σ²) = √(Σ (x − µ)² P(x))
σ = √0.410
= 0.6403
≈ 0.64
THEOREM ON SOME PROPERTIES OF EXPECTATION AND VARIANCE
Theorem 1:
For any random variable X and constants a and b:
i. E(aX + b) = a E(X) + b
ii. Var(aX + b) = a² Var(X)
Proof:
i. E(aX + b) = Σ (ax + b) P(x)   (from E(X) = Σ x P(x))
= Σ [ax P(x) + b P(x)]
= Σ ax P(x) + Σ b P(x)
= a Σ x P(x) + b Σ P(x)
where Σ x P(x) = E(X) and Σ P(x) = 1.
Thus:
E(aX + b) = a E(X) + b(1)
= a E(X) + b
ii. Var(aX + b) = E[(aX + b) − E(aX + b)]²   (from Var(X) = E(X − µ)²)
= E[aX + b − (a E(X) + b)]²
= E[aX − a E(X)]²
= E[a²(X − E(X))²]
= a² E[(X − E(X))²]
= a² Var(X)
Theorem 2:
If X is a random variable with mean µ, then:
Var(X) = E(X²) − µ².
Proof:
Starting with the definition of variance:
Var(X) = E(X − µ)²
Expanding the square of the difference:
= E(X² − 2µX + µ²)
= E(X²) − 2µ E(X) + µ²
= E(X²) − 2µ(µ) + µ²
= E(X²) − 2µ² + µ²
= E(X²) − µ²
Example 1:
The manager of a stockroom in a factory knows from his study of records that the daily demand (number of times used) for a certain tool has the following probability distribution:

Demand       0    1    2
Probability 0.1  0.5  0.4

If X denotes the daily demand, use the theorems above to find the variance of X.
Solution:
From the previous example (work), we see that:
E(X) = 1.3
E(X²) = Σ x² P(x)
= (0)²(0.1) + (1)²(0.5) + (2)²(0.4)
= 0(0.1) + 1(0.5) + 4(0.4)
= 0 + 0.5 + 1.6
= 2.1
By Theorem 2 above:
Var(X) = E(X²) − µ²
= 2.1 − (1.3)²
= 2.1 − 1.69
= 0.41
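A minimal Python sketch (added for illustration) reproducing the mean, second moment, variance and standard deviation of this demand distribution:

```python
pmf = {0: 0.1, 1: 0.5, 2: 0.4}    # daily demand distribution

mean = sum(x * p for x, p in pmf.items())
ex2 = sum(x**2 * p for x, p in pmf.items())
var = ex2 - mean**2                # Theorem 2: Var(X) = E(X^2) - mu^2
sd = var ** 0.5

print(mean, ex2, round(var, 3), round(sd, 4))  # 1.3 2.1 0.41 0.6403
```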
EXPECTATION AND VARIANCE OF DISCRETE PROBABILITY DISTRIBUTIONS
1 – THE BERNOULLI DISTRIBUTION
The probability mass function (P.M.F) of a Bernoulli distribution is given by:
F(x; p) = p^x q^(1−x), if x = 0 or 1
i. Expectation of the random variable X:
E(X) = Σ x P(x)
where the values of x are (0, 1), so that:
E(X) = 0 × q + 1 × p
= 0 + p
= p
or
E(X) = 0(1 − p) + 1(p)
= 0 + p
= p
ii. Variance of X:
Var(X) = E(X²) − [E(X)]²
= Σ x² P(x) − p²
where p² is the square of the expected value, and
E(X²) = 0² × q + 1² × p = p
Hence,
Var(X) = E(X²) − [E(X)]²
= p − p²
= p(1 − p)
= pq
2 – THE BINOMIAL DISTRIBUTION
The P.M.F of the binomial distribution is given by:
F(x; n, p) = C(n, x) p^x q^(n−x), for x = 0, 1, 2, …, n
Expectation:
E(X) = Σ_{x=0}^{n} x C(n, x) p^x q^(n−x)
= Σ_{x=1}^{n} x [n(n−1)! / (x!(n−x)!)] p^x q^(n−x)
= Σ_{x=1}^{n} [n(n−1)! / ((x−1)!(n−x)!)] p·p^(x−1) q^(n−x)
= np Σ_{x=1}^{n} [(n−1)! / ((x−1)!(n−x)!)] p^(x−1) q^(n−x)
If we let j = x − 1, then x = 1 gives j = 0 and x = n gives j = n − 1.
By substitution, we will have:
E(X) = np Σ_{j=0}^{n−1} C(n−1, j) p^j q^(n−1−j)
Summing the binomial series over j:
E(X) = np (p + q)^(n−1)
= np
Variance of X:
E(X²) = E(X(X−1)) + E(X)
where:
E(X(X−1)) = Σ_{x=0}^{n} x(x−1) [n! / (x!(n−x)!)] p^x q^(n−x)
= Σ_{x=2}^{n} x(x−1) [n(n−1)(n−2)! / (x(x−1)(x−2)!(n−x)!)] p^x q^(n−x)
= n(n−1)p² Σ_{x=2}^{n} [(n−2)! / ((x−2)!(n−x)!)] p^(x−2) q^(n−x)
= n(n−1)p²
Hence:
Var(X) = n(n−1)p² + np − (np)²
= n²p² − np² + np − n²p²
= np − np²
By factorizing, we have:
Var(X) = np(1 − p)
But 1 − p = q, so
Var(X) = npq.
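The two results E(X) = np and Var(X) = npq can be verified numerically from the P.M.F; a minimal Python sketch (added for illustration, with arbitrary n and p):

```python
from math import comb

n, p = 10, 0.3
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

mean = sum(x * f for x, f in enumerate(pmf))
var = sum(x**2 * f for x, f in enumerate(pmf)) - mean**2

print(round(mean, 6), n * p)            # 3.0 vs np = 3.0
print(round(var, 6), n * p * (1 - p))   # 2.1 vs npq = 2.1
```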
3 – The Geometric Distribution
The probability mass function of a geometric random variable is given by:
P(x) = P(X = x) = p q^(x−1), for x = 1, 2, 3, …
where q = 1 − p.
Therefore: P(x) = p(1 − p)^(x−1), for x = 1, 2, …
Expectation of X:
The expected value of a geometric random variable of the first kind is:
E(X) = Σ_{x=1}^{∞} x p(1 − p)^(x−1)
= p Σ_{x=1}^{∞} x q^(x−1)
= p × 1/(1 − q)²
= p/p²
= 1/p
(Recall the sum of a G.P. with first term a and common ratio r: S_n = a(1 − rⁿ)/(1 − r), and for |r| < 1 the infinite sum is S = a/(1 − r), e.g. 1 + x + x² + x³ + … = 1/(1 − x); differentiating term by term gives Σ x q^(x−1) = 1/(1 − q)².)
Also:
E(X(X − 1)) = Σ_{x=2}^{∞} x(x − 1) p q^(x−1)
= pq Σ_{x=2}^{∞} x(x − 1) q^(x−2)
= pq × 2/(1 − q)³
= 2pq/p³
= 2q/p²
So that
E(X²) = E(X(X − 1)) + E(X)
= 2(1 − p)/p² + 1/p
= (2 − p)/p²
Variance of X:
Var(X) = E(X²) − [E(X)]²
= (2 − p)/p² − 1/p²
= (1 − p)/p²
= q/p²
4 - THE HYPERGEOMETRIC DISTRIBUTION
The probability mass function (P.M.F) of the hypergeometric random variable is given by:
F(x; N, D, n) = [C(D, x) C(N−D, n−x)] / C(N, n), x = 0, 1, 2, …
Expectation:
E(X) = Σ_x x [C(D, x) C(N−D, n−x)] / C(N, n)
= (nD/N) Σ_x [C(D−1, x−1) C(N−D, n−x)] / C(N−1, n−1)
= nD/N
since Σ_x C(D−1, x−1) C(N−D, n−x) = C(N−1, n−1).
Variance:
E(X²) = E(X(X − 1)) + E(X)
But,
E(X(X − 1)) = Σ_x x(x − 1) [C(D, x) C(N−D, n−x)] / C(N, n)
= n(n − 1) [D(D − 1) / (N(N − 1))] Σ_x [C(D−2, x−2) C(N−D, n−x)] / C(N−2, n−2)
= n(n − 1) D(D − 1) / (N(N − 1))
Hence:
Var(X) = E(X²) − [E(X)]²
= n(n − 1) D(D − 1)/(N(N − 1)) + nD/N − (nD/N)²
= nD(N − D)(N − n) / (N²(N − 1))
5 – THE NEGATIVE BINOMIAL DISTRIBUTION:
The probability mass function of a negative binomial distribution with parameters r, p is given by:
F(x; r, p) = C(r + x − 1, x) p^r q^x, x = 0, 1, 2, …
Expectation:
E(X) = Σ_{x=0}^{∞} x C(r + x − 1, x) p^r q^x
= Σ_{x=1}^{∞} [(r + x − 1)! / ((x − 1)!(r − 1)!)] p^r q^x
= r q p^r Σ_{x=1}^{∞} C(r + x − 1, x − 1) q^(x−1)
= r q p^r (1 − q)^(−(r+1))
= rq/p
ii. Variance:
Var(X) = E(X²) − [E(X)]²
But:
E(X(X − 1)) = Σ_{x=1}^{∞} x(x − 1) C(r + x − 1, x) p^r q^x
= Σ_{x=2}^{∞} [(r + x − 1)! / ((x − 2)!(r − 1)!)] p^r q^x
= r(r + 1) q² p^r Σ_{x=2}^{∞} C(r + x − 1, x − 2) q^(x−2)
= r(r + 1) q² p^r (1 − q)^(−(r+2))
= r(r + 1) q²/p²
Also
E(X²) = E(X(X − 1)) + E(X)
= r(r + 1) q²/p² + rq/p
= (rq/p²)[(r + 1)q + p]
Hence
Var(X) = E(X²) − (rq/p)²
= r(r + 1) q²/p² + rq/p − r²q²/p²
= rq²/p² + rq/p
= (rq/p²)(q + p)
= rq/p²
6 – THE POISSON DISTRIBUTION:
The probability mass function of a Poisson distribution with parameter λ > 0 is given by:
F(x; λ) = e^(−λ) λ^x / x!, x = 0, 1, 2, …
Expectation:
E(X) = Σ_{x=0}^{∞} x F(x; λ)
= Σ_{x=1}^{∞} x e^(−λ) λ^x / x!
= λ Σ_{x=1}^{∞} e^(−λ) λ^(x−1) / (x − 1)!
Let y = x − 1. Then we obtain:
E(X) = λ Σ_{y=0}^{∞} e^(−λ) λ^y / y!
= λ.
Alternatively, the mean of the Poisson distribution is easily derived formally if one remembers the simple Taylor series expansion of e^x, namely:
e^x = 1 + x + x²/2! + x³/3! + …
Then:
E(X) = Σ_y y P(y)
= Σ_{y=1}^{∞} y λ^y e^(−λ) / y!
= λ e^(−λ) Σ_{y=1}^{∞} λ^(y−1) / (y − 1)!
= λ e^(−λ) (1 + λ + λ²/2! + λ³/3! + …)
= λ e^(−λ) e^λ
= λ
ii. Variance:
Since Σ_{y=0}^{∞} e^(−λ) λ^y / y! = 1, we have
E(X(X − 1)) = Σ_{x=2}^{∞} x(x − 1) e^(−λ) λ^x / x!
= λ² Σ_{x=2}^{∞} e^(−λ) λ^(x−2) / (x − 2)!
Let y = x − 2, so that
E(X(X − 1)) = λ² Σ_{y=0}^{∞} e^(−λ) λ^y / y!
= λ²
NOW,
E(X²) = E(X(X − 1)) + E(X)
= λ² + λ
and hence
Var(X) = E(X²) − (E(X))²
= λ² + λ − λ²
= λ.
PROBABILITY
Probability is the study of random or nondeterministic experiments. The probability of an event represents the proportion of times, under identical conditions, that the outcome can be expected to occur. For instance, if a die or coin is tossed or thrown, it is certain that the die or coin will fall down, but it is not certain which side will face up.
FUNCTION
A function is defined as a relation between a set of inputs having one output each. In simple words, a function is a relationship between inputs where each input is related to exactly one output. Every function has a domain and a codomain or range. A function is generally denoted by F(x), where x is the input.
PROBABILITY FUNCTION
This is used as a measure of the probability of an event and is written P(·). It can be defined as a set function with a domain (the class of events) and counterdomain the interval [0, 1] which satisfies the axioms of probability.
RANDOM EXPERIMENTS
An experiment is said to be random if its outcome cannot be predicted with certainty prior to the performance of the experiment. Its outcome is determined by chance alone. Some examples are the tossing of a coin and observing the face that turns up, and planting a crop and observing its yield.
SAMPLE SPACE
A sample space of a random experiment, denoted by S, is the collection of all possible outcomes of the experiment. A sample space may be finite or infinite: it is finite if the number of elements in the space can be counted; otherwise it is infinite. It may also be discrete or continuous. A sample point is a particular outcome in a sample space.
AN EVENT:
An event can be defined as an appropriate subset of the sample space. The empty set ∅ ⊆ S is an event which is sometimes called an impossible event, with probability zero, while the sample space itself is a sure event, with probability one.
DISCRETE SAMPLE SPACE
A sample space S is said to be discrete if it contains a finite number of sample points (i.e. in counting the number of sample points, the counting process can come to an end) or countably infinitely many sample points, e.g.:
a. S = {0, 1, 2, 3, 4, 5, …} the set of non-negative integers
b. S = {0, 2, 4, 6, …} the set of even numbers
c. S = {…, −2, −1, 0, 1, 2, …} the set of all integers
A discrete sample space may contain a finite or infinite number of elements. The elements of a discrete sample space are isolated points on the real line: between any two elements of S there are points which do not belong to S.
CONTINUOUS SAMPLE SPACE
A sample space S whose elements are all the points in an interval, or all the points in a union of intervals, on the real line is called continuous. Some examples are:
a. {x: 0 ≤ x ≤ 10}
b. {x: 0 ≤ x ≤ 1 or 2 ≤ x ≤ 3}
A continuous sample space always has an infinite number of elements; hence it contains an uncountable or non-denumerable number of points.
RANDOM VARIABLE:
Suppose S is a sample space of some random experiment. The outcomes of the experiment, i.e. the sample points of S, need not be numbers.
A function which assigns a real number to each experimental outcome (sample point) is called a random variable. A random variable is therefore a function which assigns a real number to each element of S. It has the sample space as its domain.
Random variables are usually denoted by upper case letters such as X, Y, Z, etc., while their corresponding realizable values are denoted by lower case letters such as x, y, z, etc.
The number of heads obtained when a coin is thrown two times, the outcome when a die is tossed, and the time (in hours) it takes for a light bulb to burn out are all examples of random variables. A random variable can be discrete or continuous depending on its range.
DISCRETE RANDOM VARIABLES
A random variable X is called discrete if its range is finite or countably infinite. The range of a random variable X is the set of values which it assumes. The range of X is said to be countable if there exists a countable set of real numbers such that X takes values only in that set.
In other words, a discrete random variable X has a countable number of possible values (for example, the integers). Its probability distribution is given by a probability mass function, which directly maps each value of the random variable to a probability.
PROBABILITY DISTRIBUTION
A probability distribution is a statistical function that describes all the possible values and likelihoods that a random variable can take within a given range. The range will be bounded between the minimum and the maximum possible values, but where a value is likely to fall on the probability distribution depends on a number of factors. These factors include the distribution's mean (average), standard deviation, skewness and kurtosis.
Examples of discrete probability distributions include the binomial, Poisson, Bernoulli, geometric, hypergeometric and discrete uniform distributions.
PROBABILITY MASS FUNCTION
In probability and statistics, a probability mass function (PMF) is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete density function. The value of the random variable having the highest or largest probability mass is called the mode. The P.M.F is defined thus:
If X is a discrete random variable, then the function denoted by P_X(x) and defined as:
P_X(x) = P(X = x_j) if x = x_j, j = 1, 2, …, n
is called the probability mass function of X.
The P.M.F of the discrete random variable X is a function which associates a real number with each value of X. Hence P_X(x) is a function with domain the real line and counterdomain the interval [0, 1]. For instance, the probability that the random variable X equals 1, i.e. P(X = 1), is referred to as the probability mass function of X evaluated at 1.
CONTINUOUS RANDOM VARIABLE:
A random variable X is said to be continuous if its range contains an interval of real numbers. Continuous random variables represent measured data such as weight, height, temperature, time, blood pressure, etc. If X is continuous, the probability of it taking a single value is zero. Hence we write:
P(X = a) = 0
PROBABILITY DENSITY FUNCTION (P.D.F)
Let X be a continuous random variable. The p.d.f of X is denoted by f_X(x) and defined as:
f_X(x) = dF_X(x)/dx
where F_X(x) is the cumulative distribution function. The probability density function (PDF) is the probability function representing the density of a continuous random variable lying between a specific range of values. In other words, the probability density function gives the likelihood of values of the continuous random variable.
The p.d.f of a continuous random variable is constructed so that the area under its curve bounded by the x-axis is equal to one (1).
For f_X(x) to be valid, it must lie entirely above the x-axis, since probabilities are non-negative quantities. Unlike for a discrete random variable, the probability associated with a continuous random variable is evaluated using integral calculus. For example, P(a < X < b) is evaluated as follows:
P(a < X < b) = ∫_a^b f_X(x) dx
Properties of the p.d.f:
1. f_X(x) ≥ 0
2. ∫_{−∞}^{∞} f_X(x) dx = 1
3. f_X(x) is piecewise continuous
4. P(a < X < b) = ∫_a^b f_X(x) dx.
Difference between the PDF and CDF of a continuous random variable: the PDF f_X(x) gives the density of the random variable X at the value x, while the CDF F_X(x) gives the probability that X takes a value less than or equal to x.
Example 1:
Let X be a random variable with p.d.f
f(x) = kx², 0 ≤ x ≤ 2.
Find the value of k that makes f(x) a p.d.f.
Solution
For f(x) to be a p.d.f,
∫_{−∞}^{∞} f(x) dx = 1
So that:
∫_0^2 kx² dx = 1
k[x³/3]_0^2 = 1
k(2³/3 − 0³/3) = 1
8k/3 = 1
k = 3/8
Hence f(x) = 3x²/8, 0 ≤ x ≤ 2
0, elsewhere
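The normalization can be checked numerically. A minimal Python sketch (added for illustration, assuming SciPy is available):

```python
from scipy.integrate import quad

k = 3 / 8
pdf = lambda x: k * x**2          # f(x) = 3x^2/8 on [0, 2]

total, _ = quad(pdf, 0, 2)
print(total)                       # ~1.0, so k = 3/8 normalizes the density
```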
THE CONTINUOUS UNIFORM DISTRIBUTION
A random variable X has a continuous uniform distribution on the interval [a, b] if its p.d.f is:
f(x) = 1/(b − a), a ≤ x ≤ b
0, elsewhere.
Here, the real numbers 'a' and 'b', called the parameters of the distribution, are such that a < b.
The uniform distribution is thus the most basic form of probability distribution. It is a rectangular distribution with constant probability density, implying that every interval of values of the same length has an equal probability of occurrence.
In summary, a uniform distribution is a distribution in which every possible result is equally likely; that is, the probability of each occurring is the same.
PROPERTIES OF THE UNIFORM DISTRIBUTION
The following are the key characteristics of the uniform distribution:
i. The density function integrates to unity
ii. Each of the inputs that go in to form the function has equal weighting
iii. The mean of the uniform distribution is given by:
µ = (a + b)/2
iv. The variance is given by the equation:
Var(X) = (b − a)²/12
PLOT OF THE GRAPH OF THE UNIFORM DISTRIBUTION:
The graph is a rectangle over [a, b] of height 1/(b − a), so that
area = width × height = (b − a) × 1/(b − a) = 1.
Note:
The location of the interval has little influence in deciding whether the uniform random variable falls within an interval of fixed length. The two factors that influence this the most are the interval size and whether the interval falls within the distribution's support.
AREAS OF APPLICATION:
1. It is useful for sampling from arbitrary distributions.
2. Computer scientists use it for random number generation within a range.
3. A general method is the inverse transform sampling method, which uses the cumulative distribution function of the target random variable, as sketched below.
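A minimal Python sketch of inverse transform sampling (added for illustration; the exponential target is an assumed example, not from the notes):

```python
import math
import random

def sample_exponential(lam: float) -> float:
    """Inverse transform sampling: if U ~ Uniform(0, 1), then
    X = -ln(1 - U)/lam has CDF F(x) = 1 - e^(-lam*x), i.e. Exponential(lam)."""
    u = random.random()              # U ~ Uniform(0, 1)
    return -math.log(1 - u) / lam    # invert F at u

random.seed(0)
samples = [sample_exponential(0.5) for _ in range(100_000)]
print(sum(samples) / len(samples))   # ~2.0, matching E(X) = 1/lam
```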
RELATIONSHIP WITH OTHER DISTRIBUTIONS
If the range is restricted to the interval (0, 1), it is called the standard uniform distribution.
Example 2:
Suppose a random variable X has a uniform distribution on the interval (1, 6). Find:
i. P(2 ≤ X ≤ 4)
ii. P(X ≤ 2)
iii. P(X ≥ 3)
Solution:
f(x) = 1/(b − a) for a ≤ x ≤ b
0, otherwise
But a = 1 and b = 6.
Thus:
f(x) = 1/5, for 1 ≤ x ≤ 6
0, otherwise
i. P(2 ≤ X ≤ 4) = ∫_2^4 (1/5) dx
= [x/5]_2^4
= (4 − 2)/5
= 2/5
ii. P(X ≤ 2) = ∫_1^2 (1/5) dx
= [x/5]_1^2
= (2 − 1)/5
= 1/5
iii. P(X ≥ 3) = ∫_3^6 (1/5) dx
= [x/5]_3^6
= (6 − 3)/5
= 3/5
Example 3:
The failure of a circuit board interrupts work in a computing system until a new board is delivered. The delivery time X is uniformly distributed over the interval 1 to 5 days. The cost C of this failure and interruption consists of a fixed cost C0 for the new part and a cost that increases proportionally to X², so that C = C0 + C1 X². Find P(X ≥ 2) and the expected cost E(C).
Solution
f(x) = 1/(5 − 1) = 1/4, 1 ≤ x ≤ 5
0, otherwise
Thus:
P(X ≥ 2) = ∫_2^5 (1/4) dx
= [x/4]_2^5
= (5 − 2)/4
= 3/4
We know that:
E(C) = C0 + C1 E(X²)
So it remains to find E(X²). This could be found directly from the definition, or by using the variance and the fact that E(X²) = Var(X) + [E(X)]²:
E(X²) = (5 − 1)²/12 + ((1 + 5)/2)²
= 4/3 + 9
= 31/3
Thus:
E(C) = C0 + C1 (31/3)
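A short Python check of these two quantities (added for illustration, assuming SciPy; note scipy.stats.uniform takes loc = a and scale = b − a):

```python
from scipy.stats import uniform

# Uniform on [1, 5]: loc = a, scale = b - a
X = uniform(loc=1, scale=4)

print(1 - X.cdf(2))              # P(X >= 2) = 0.75
ex2 = X.var() + X.mean()**2      # E(X^2) = Var + mean^2
print(ex2)                       # 31/3 ~ 10.333
```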
THE NORMAL DISTRIBUTION
For a normal random variable with mean µ and standard deviation σ, intervals on either side of µ enclose approximately a total probability of 68.27% for 1 SD, 95.45% for 2 SD, and 99.73% for 3 SD.
(Figure: the probability intervals under the normal distribution curve, showing the area covered by each standard deviation about the mean.)
STANDARD NORMAL DISTRIBUTION
The standard normal distribution is a normal distribution with a mean of zero and a standard deviation of one. It is centered at zero, and the degree to which a given measurement deviates from the mean is expressed in standard deviations.
Along the abscissa, instead of X we have a transformation of X called the standard score Z. The Z-score tells us how many standard deviations from the mean a particular value lies. Any distribution of a normal variable can be transformed to a distribution of Z by taking each X value, subtracting from it the mean of X, and dividing this deviation of X from its mean by the standard deviation.
The random variable Z is said to have a standard normal distribution if its p.d.f is:
φ(z) = f(z; 0, 1) = (1/√(2π)) e^(−z²/2), −∞ < z < ∞
Determining Probabilities for a Standard Normal Distribution:
Suppose Z is a standard normal random variable; the probabilities associated with Z have been tabulated and can be found in statistical tables.
It is advisable, in finding probabilities associated with Z, to draw the graph of the standard normal distribution. This will assist in locating the appropriate probabilities. Observe that, due to the symmetry of the curve of φ(z), the following relations are helpful:
i. P(Z ≤ z) = Φ(z)
ii. P(Z ≥ z) = 1 − Φ(z)
iii. P(Z ≤ −z) = P(Z ≥ z), i.e. Φ(−z) = 1 − Φ(z)
iv. Φ(z) + Φ(−z) = 1
Example 1:
Let Z be a random variable with the standard normal distribution. Find:
i. P(Z ≥ 1.13)
ii. P(0.65 ≤ Z ≤ 1.26)
iii. P(−1.37 ≤ Z ≤ 2.01)
iv. P(0.00 ≤ Z ≤ 1.42)
v. P(−0.73 ≤ Z ≤ 0.00)
vi. P(−1.79 ≤ Z ≤ −0.54)
Solution (a sketch of the shaded area accompanies each part):
i. P(Z ≥ 1.13) = 1 − Φ(1.13)
From the standard normal table, Φ(1.13) = 0.8708, so
P(Z ≥ 1.13) = 1 − 0.8708 = 0.1292
ii. P(0.65 ≤ Z ≤ 1.26) = Φ(1.26) − Φ(0.65)
From tables: Φ(1.26) = 0.8962 and Φ(0.65) = 0.7422, so
P(0.65 ≤ Z ≤ 1.26) = 0.8962 − 0.7422 = 0.1540
iii. P(−1.37 ≤ Z ≤ 2.01) = Φ(2.01) − Φ(−1.37)
= 0.9778 − 0.0853
= 0.8925
iv. P(0.00 ≤ Z ≤ 1.42) = Φ(1.42) − Φ(0)
= 0.9222 − 0.5000
= 0.4222
v. P(−0.73 ≤ Z ≤ 0.00) = Φ(0) − Φ(−0.73)
= 0.5000 − 0.2327
= 0.2673
vi. P(−1.79 ≤ Z ≤ −0.54) = Φ(−0.54) − Φ(−1.79)
= 0.2946 − 0.0367
= 0.2579
If X = σZ + µ, then
X − µ = σZ
Z = (X − µ)/σ
Hence:
P(X ≤ x) = P(Z ≤ (x − µ)/σ)
This means that probabilities concerning X can be determined by the above linear transformation of X (standardization of X).
Remember, we have already established that P(a < X < b) = ∫_a^b f(x) dx. In terms of Z:
P(a < X < b) = P((a − µ)/σ < (X − µ)/σ < (b − µ)/σ)
But (X − µ)/σ = Z, so that
P(a < X < b) = P((a − µ)/σ < Z < (b − µ)/σ)
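Standardization is exactly how normal probabilities are computed in practice. A minimal Python sketch (added for illustration, assuming SciPy; the parameters match the next example):

```python
from scipy.stats import norm

mu, sigma = 10, 5   # X ~ N(10, 5^2), as in the next example

def prob_between(a: float, b: float) -> float:
    """P(a < X < b) via standardization: Phi((b-mu)/sigma) - Phi((a-mu)/sigma)."""
    return norm.cdf((b - mu) / sigma) - norm.cdf((a - mu) / sigma)

print(norm.cdf((11 - mu) / sigma))   # P(X < 11) ~ 0.5793
print(prob_between(5, 11))           # P(5 < X < 11) ~ 0.4206
```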
Example 2:
If X has a normal distribution with mean 10 and standard deviation 5, find:
i. P(X < 11)
ii. P(X > 11)
iii. P(X < 5)
iv. P(X > 5)
v. P(5 < X < 11)
Solution:
Let Z = (X − µ)/σ, where σ = 5 and µ = 10.
i. P(X < 11) = P((X − µ)/σ < (11 − 10)/5)
= P(Z < 1/5)
= Φ(0.2)
From tables, Φ(0.2) = 0.5793, so P(X < 11) = 0.5793.
ii. P(X > 11) = P((X − µ)/σ > (11 − 10)/5)
= P(Z > 0.2)
= 1 − Φ(0.2)
= 1 − 0.5793
= 0.4207
iii. P(X < 5) = P((X − µ)/σ < (5 − 10)/5)
= P(Z < −1.0)
= Φ(−1.0)
From tables, Φ(−1.0) = 0.1587, so P(X < 5) = 0.1587.
iv. P(X > 5) = P((X − µ)/σ > (5 − 10)/5)
= P(Z > −1.0)
= 1 − Φ(−1.0)
= 1 − 0.1587
= 0.8413
v. P(5 < X < 11) = P(−1.0 < Z < 0.2)
= Φ(0.2) − Φ(−1.0)
= 0.5793 − 0.1587
= 0.4206
Example 3:
If X has a normal distribution with mean 6 and variance 25, i.e. X ~ N(6, 25), find:
i. P(|X − 6| < 5)
ii. P(−2 < X < 0)
iii. P(|X − 6| < 15)
iv. P(|X − 6| < 10)
Solution:
i. P(|X − 6| < 5) = P(−5 < X − 6 < 5)
= P(−5/5 < (X − 6)/5 < 5/5)
= P(−1 < Z < 1)
= Φ(1) − Φ(−1)
= 0.8413 − 0.1587
= 0.6826
iii. P(|X − 6| < 15) = P(−15/5 < (X − 6)/5 < 15/5)
= P(−3 < Z < 3)
= Φ(3) − Φ(−3)
= 0.9987 − 0.0013
= 0.9973
(Parts ii and iv are handled in the same way.)
Example 4:
The temperature at a certain location is normally distributed with mean 40 °C and standard deviation 3.33 °C. Find the probability that the temperature is between 41.11 °C and 46.66 °C.
Solution
Let T be the random variable representing temperature.
Then T ~ N(40, 3.33²);
Therefore:
P(41.11 < T < 46.66)
= P((41.11 − 40)/3.33 ≤ Z ≤ (46.66 − 40)/3.33)
= P(0.33 ≤ Z ≤ 2.00)
= Φ(2.00) − Φ(0.33)
= 0.9772 − 0.6293
= 0.3479
Example 1:
A certain type of electric bulb has a mean life span of 810 hours and variance of 1600 hours². Assume that the bulbs' life spans are normally distributed. Find the probability that a bulb burns:
i. between 788 and 844 hours
ii. less than 834 hours
iii. more than 788 hours.
Solution:
Let the life span of the light bulbs be represented by X.
Then X ~ N(810, 1600), so σ = 40.
i. P(788 ≤ X ≤ 844) = P((788 − 810)/40 ≤ (X − µ)/σ ≤ (844 − 810)/40)
= P(−0.55 ≤ Z ≤ 0.85)
= Φ(0.85) − Φ(−0.55)
= 0.8023 − 0.2912
= 0.5111
ii. P(X < 834) = P((X − µ)/σ < (834 − 810)/40)
= P(Z < 0.6)
= Φ(0.6)
= 0.7257
iii. P(X > 788) = P((X − µ)/σ > (788 − 810)/40)
= P(Z > −0.55)
= 1 − Φ(−0.55)
= 1 − 0.2912
= 0.7088
NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION
The approximation is best when n is large and p is not extremely close to 0 or 1; even for moderate n the approximation is still fairly adequate.
Since the binomial is a discrete random variable and the normal is a continuous random variable, the best approximation is obtained by employing the correction for continuity.
The correction for continuity is used as follows: suppose P(X = x) is desired; then it is approximated by P(x − 0.5 < Y < x + 0.5), where Y is normal with µ = np and σ² = npq.
Example 1:
Suppose a random variable X has a binomial distribution with n = 40 and p = 0.5. Find:
i. P(X = 20)
ii. P(X < 20)
iii. P(X > 20)
Solution
Since X is binomial,
µ = np = 20
σ² = npq = 10.
iii. P(X > 20) ≈ P(Y > 20.5)
= P((Y − 20)/√10 > (20.5 − 20)/√10)
= P(Z > 0.16)
= 1 − Φ(0.16)
= 0.4364
(Parts i and ii are handled similarly, using P(19.5 < Y < 20.5) and P(Y < 19.5) respectively.)
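The accuracy of the continuity-corrected approximation can be checked against the exact binomial tail; a minimal Python sketch (added for illustration, assuming SciPy):

```python
from math import comb, sqrt
from scipy.stats import norm

n, p = 40, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

# Continuity correction: P(X > 20) ~ P(Y > 20.5)
approx = 1 - norm.cdf((20.5 - mu) / sigma)
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(21, n + 1))

print(round(approx, 4), round(exact, 4))  # ~0.4372 vs ~0.4373
```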
b. The probability that on a given day at least 65 kidney transplants will be performed is:
P(X ≥ 65) = 1 − P(Y ≤ 64.5)
Exponential Distribution:
The exponential distribution is a right-skewed continuous probability distribution that models variables in which small values occur more frequently than higher values. It is a unimodal distribution where small values have relatively high probabilities, which consistently decline as data values increase.
In probability theory and statistics, the exponential distribution is a continuous probability distribution that often concerns the amount of time until some specific event happens. It describes a process in which events happen continuously and independently at a constant average rate. The exponential distribution has the key property of memorylessness. An exponential random variable tends to take many small values and fewer large values; for example, the amount of money spent by a customer on one trip to the supermarket approximately follows an exponential distribution.
A continuous random variable X is said to have an exponential distribution if it has the following probability density function:
f_X(x; λ) = λe^(−λx), for x > 0
0, for x ≤ 0
where λ is called the rate parameter of the distribution.
Properties of the exponential distribution:
i. Right-skewed shape:
The exponential distribution is right-skewed, meaning that it has a long tail on the right side of the distribution.
ii. Non-negative:
The exponential distribution is always non-negative, since the time between events can never be negative.
iii. Memoryless:
From the point of view of waiting time until the arrival of a customer, the memoryless property means that it does not matter how long you have already waited.
Example 1:
Cars arrive at a certain roundabout according to an exponential process with parameter λ = 5 per hour. If an observer stands at the roundabout for a specified period of time, what is the probability that:
i. it is at least 15 minutes until the next car arrives;
ii. it is not more than 10 minutes?
Solution:
Let the random variable X be the waiting time in minutes. Then
E(X) = 1/λ = 12 minutes
so the density is f(x) = (1/12)e^(−x/12), x > 0.
i. P(X ≥ 15) = ∫_15^∞ (1/12) e^(−x/12) dx
= [−e^(−x/12)]_15^∞
= e^(−15/12)
= 0.2865
ii. P(X ≤ 10) = ∫_0^10 (1/12) e^(−x/12) dx
= [−e^(−x/12)]_0^10
= 1 − e^(−10/12)
= 1 − 0.4346
= 0.5654
Example 2:
Suppose the waiting time in minutes in a queue follows an exponential distribution with λ = 1/10. Find the probability that an arriving customer will wait:
i. More than 10 minutes
ii. Between 10 and 20 minutes
Solution
Let X denote the waiting time of customers.
Then:
i. P(X > 10) = ∫_10^∞ (1/10) e^(−x/10) dx
= [−e^(−x/10)]_10^∞
= e^(−1)
= 0.368
ii. P(10 < X < 20) = ∫_10^20 (1/10) e^(−x/10) dx
= [−e^(−x/10)]_10^20
= e^(−1) − e^(−2)
= 0.36788 − 0.13533
= 0.23255
≈ 0.233
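Because the exponential c.d.f is F(x) = 1 − e^(−λx), these probabilities reduce to differences of exponentials, as a short Python sketch (added for illustration) shows:

```python
from math import exp

lam = 1 / 10  # rate per minute, as in Example 2

# Exponential tail: P(X > t) = e^(-lam * t)
print(exp(-lam * 10))                   # P(X > 10) ~ 0.368
print(exp(-lam * 10) - exp(-lam * 20))  # P(10 < X < 20) ~ 0.233
```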
THE EXPECTATION AND VARIANCE OF A CONTINUOUS RANDOM VARIABLE:
EXPECTATION:
Let X be a continuous random variable with p.d.f f_X(x). The expectation or expected value of X is denoted by E(X) and defined by:
E(X) = ∫_{R_X} x f_X(x) dx
where R_X is the range of X.
Example 1:
Let X be a random variable with p.d.f f(x) = kx², 0 ≤ x ≤ 2. Find:
i. The value of k that makes f(x) a p.d.f
ii. The expectation of X.
Solution
i. For f(x) to be a p.d.f,
∫_{R_X} f(x) dx = 1
So that:
∫_0^2 kx² dx = 1
k[x³/3]_0^2 = 1
8k/3 = 1
k = 3/8
Hence f(x) = 3x²/8, 0 ≤ x ≤ 2.
ii. E(X) = ∫_{R_X} x f(x) dx
= ∫_0^2 x(3x²/8) dx
= ∫_0^2 (3x³/8) dx
= (3/8)[x⁴/4]_0^2
= 3(2⁴)/32 − 3(0)/32
= 48/32
= 3/2
ii. VARIANCE:
Var(X) = E(X²) − [E(X)]²
where E(X²) = ∫_{R_X} x² f_X(x) dx.
Then: Var(X) = ∫_{R_X} x² f_X(x) dx − [∫_{R_X} x f_X(x) dx]²
Example 1:
Let X be a random variable with p.d.f
f_X(x) = 4x³, 0 ≤ x ≤ 1.
Find the variance of X.
Solution
Var(X) = E(X²) − [E(X)]²
but E(X) = ∫_0^1 x f_X(x) dx.
Here f_X(x) = 4x³, so that:
E(X) = ∫_0^1 x(4x³) dx
= ∫_0^1 4x⁴ dx
= [4x⁵/5]_0^1
= 4/5 − 0
E(X) = 4/5
Now, E(X²) = ∫_0^1 x²(4x³) dx
= ∫_0^1 4x⁵ dx
= [4x⁶/6]_0^1
= 4/6 − 0
= 2/3
so that:
Var(X) = 2/3 − (4/5)²
= 2/3 − 16/25
= (50 − 48)/75
= 2/75
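The same integrals can be evaluated numerically; a minimal Python sketch (added for illustration, assuming SciPy):

```python
from scipy.integrate import quad

pdf = lambda x: 4 * x**3                # f(x) = 4x^3 on [0, 1]

mean, _ = quad(lambda x: x * pdf(x), 0, 1)
ex2, _ = quad(lambda x: x**2 * pdf(x), 0, 1)
var = ex2 - mean**2

print(mean, ex2, var)                   # 0.8, 0.666..., 0.02666... (= 2/75)
```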
Example 2:
For a lathe in a machine shop, let X denote the proportion of time out of a 40-hour week that the lathe is actually in use. Suppose X has probability density function given by
f(x) = 3x², 0 ≤ x ≤ 1
0, elsewhere.
Find the mean and variance of X.
Solution:
i. Mean:
E(X) = ∫_{−∞}^{∞} x f(x) dx
but f(x) = 3x² for 0 ≤ x ≤ 1, so
E(X) = ∫_0^1 x(3x²) dx
= ∫_0^1 3x³ dx
= [3x⁴/4]_0^1
= 3/4
E(X) = 0.75
Thus, on the average, the lathe is in use 75% of the time.
ii. To compute Var(X), we first find E(X²):
E(X²) = ∫_{−∞}^{∞} x² f(x) dx
= ∫_0^1 x²(3x²) dx
= ∫_0^1 3x⁴ dx
= [3x⁵/5]_0^1
= 3/5
= 0.60
So that:
Var(X) = E(X²) − µ²
= 0.60 − (0.75)²
= 0.60 − 0.5625
Var(X) = 0.0375
≈ 0.04 to 2 d.p.
Example 3:
The weekly demand X for kerosene at a certain supply station has density function given by:
f(x) = x, 0 ≤ x ≤ 1
1/2, 1 < x ≤ 2
0, elsewhere.
Find the expected weekly demand.
Solution
f(x) has different nonzero forms over two disjoint regions. Thus:
E(X) = ∫_{−∞}^{∞} x f(x) dx
= ∫_0^1 x(x) dx + ∫_1^2 x(1/2) dx
= ∫_0^1 x² dx + (1/2)∫_1^2 x dx
= [x³/3]_0^1 + (1/2)[x²/2]_1^2
= (1/3 − 0) + (4/4 − 1/4)
= 1/3 + 3/4
= (4 + 9)/12
= 13/12
EXPECTATION AND VARIANCE OF A CONTINUOUS UNIFORM DISTRIBUTION:
Expectation:
If X has a continuous uniform distribution on [a, b], then
E(X) = ∫_a^b x/(b − a) dx
= [1/(b − a)] ∫_a^b x dx
= [1/(b − a)] [x²/2]_a^b
= [1/(b − a)] (b²/2 − a²/2)
= (b² − a²)/(2(b − a))
= (b − a)(b + a)/(2(b − a))
E(X) = (a + b)/2
(2) Variance:
Var(X) = E(X²) − [E(X)]²
but E(X²) = ∫_a^b x²/(b − a) dx
= [1/(b − a)] ∫_a^b x² dx
= [1/(b − a)] (b³/3 − a³/3)
= (b³ − a³)/(3(b − a))
= (b² + ab + a²)/3
So that:
Var(X) = (b² + ab + a²)/3 − ((a + b)/2)²
= (b² + ab + a²)/3 − (b² + 2ab + a²)/4
= (4b² + 4ab + 4a² − 3b² − 6ab − 3a²)/12
= (b² − 2ab + a²)/12
= (b − a)²/12
EXPECTATION AND VARIANCE OF THE NORMAL DISTRIBUTION:
1 – EXPECTATION:
The p.d.f of the normal distribution is
f(x) = [1/(σ√(2π))] e^(−(x − µ)²/(2σ²))
so that:
E(X) = [1/(σ√(2π))] ∫_{−∞}^{∞} x e^(−(x − µ)²/(2σ²)) dx
Now, let t = (x − µ)/σ, so that
σt = x − µ
x = σt + µ
dx/dt = σ
dx = σ dt
Then:
E(X) = [1/√(2π)] ∫_{−∞}^{∞} (σt + µ) e^(−t²/2) dt
= [µ/√(2π)] ∫_{−∞}^{∞} e^(−t²/2) dt + [σ/√(2π)] ∫_{−∞}^{∞} t e^(−t²/2) dt
= µ × 1 + 0
= µ
2 – VARIANCE:
Recall that:
Var(X) = E[(X − µ)²]
= [1/(σ√(2π))] ∫_{−∞}^{∞} (x − µ)² e^(−(x − µ)²/(2σ²)) dx
Substitute:
t = (x − µ)/σ, so (x − µ) = σt and dx = σ dt.
Hence:
Var(X) = [σ²/√(2π)] ∫_{−∞}^{∞} t² e^(−t²/2) dt
= [σ²/√(2π)] × √(2π)
= σ²
EXPECTATION AND VARIANCE OF THE EXPONENTIAL DISTRIBUTION:
1 – EXPECTATION:
If the random variable X has an exponential distribution, then:
E(X) = ∫_0^∞ x f(x) dx
= ∫_0^∞ x λe^(−λx) dx
Integrating by parts, let u = x and dv = λe^(−λx) dx, so that du = dx and v = −e^(−λx).
Then:
E(X) = [−x e^(−λx)]_0^∞ + ∫_0^∞ e^(−λx) dx
= 0 + [−e^(−λx)/λ]_0^∞
= 1/λ
2 – VARIANCE:
Var(X) = E(X²) − [E(X)]²
but E(X²) = ∫_0^∞ x² f(x) dx
= ∫_0^∞ x² λe^(−λx) dx
Let u = x², so du = 2x dx, and dv = λe^(−λx) dx, so v = −e^(−λx).
So that:
E(X²) = [−x² e^(−λx)]_0^∞ + 2∫_0^∞ x e^(−λx) dx
= 0 + (2/λ)∫_0^∞ x λe^(−λx) dx
= (2/λ)(1/λ)
= 2/λ²
Then, since E(X) = 1/λ and [E(X)]² = 1/λ²:
Var(X) = 2/λ² − 1/λ²
= 1/λ²
MOMENTS IN STATISTICS:
The nth raw moment of a random variable X is defined as E(Xⁿ); for example, the fifth moment is E(X⁵).
Notes:
The 3rd moment (skewness) measures the asymmetry of a distribution, while the 4th moment (kurtosis) measures how heavy the tail values are. Physicists generally use the higher-order moments in applications of physics.
The moments of a continuous probability distribution are often used to describe the shape of the probability density function (p.d.f). The first four moments (if they exist) are well known because they correspond to familiar descriptive statistics.
For a continuous probability distribution with density function f(x), the nth raw moment (also called the moment about zero) is defined as:
µ′_n = ∫_{−∞}^{∞} xⁿ f(x) dx
The mean is defined as the first raw moment. The most famous central moment is the second central moment, which is the variance. The second central moment is usually denoted by σ² to emphasize that the variance is a positive quantity.
Similar definitions exist for discrete distributions. Technically, the moments are defined by using the notion of the expected value of a random variable. Loosely speaking, one can replace the integrals by summations. For example, if X is a discrete random variable with a countable set of possible values x₁, x₂, x₃, … that have probabilities p₁, p₂, p₃, … of occurring respectively, then the nth raw moment of X is the sum:
µ′_n = Σ_j x_jⁿ p_j
The moment generating function (mgf) is a function often used to characterize the distribution of a random variable.
Definition
The moment generating function of a random variable X is defined as M_X(t) = E[e^(tX)], provided this expectation exists for all t in some interval around zero.
The following are the three basic characteristics (properties) of the mgf:
i. If X and Y are independent, then M_{X+Y}(t) = M_X(t) M_Y(t), which is true on the common interval where both mgf's exist.
ii. M_{aX+b}(t) = e^(tb) M_X(at)
iii. If X and Y have the same mgf on an interval around zero, then X and Y have the same distribution.
How is the mgf applied?
Example 1:
Let X be an exponential random variable with parameter λ. Find the mgf of X and use it to obtain the mean and variance.
Solution:
f(x) = λe^(−λx)
Now:
M_X(t) = E[e^(tX)]
= ∫_0^∞ e^(tx) λe^(−λx) dx
= λ ∫_0^∞ e^(−x(λ − t)) dx
= [λ/(λ − t)] [−e^(−x(λ − t))]_0^∞
= [λ/(λ − t)] [0 − (−1)]
= λ/(λ − t), for t < λ
Differentiating:
M′_X(t) = λ/(λ − t)²
and M″_X(t) = 2λ/(λ − t)³
so that E(X) = M′_X(0) = 1/λ
and E(X²) = M″_X(0) = 2/λ².
Therefore
Var(X) = E(X²) − [E(X)]²
= 2/λ² − 1/λ²
= 1/λ²
Furthermore, the above expected value exists and is finite for any t ∈ [−h, h], provided 0 < h < λ. As a consequence, X possesses the mgf:
M_X(t) = λ/(λ − t).
Note
The moment generating function (mgf) takes its name from the fact that it can be used to derive the moments of X, as stated thus:
If a random variable X possesses a mgf M_X(t), then the nth moment of X, denoted by µ′_n, exists and is finite for any n ∈ N. Furthermore:
µ′_n = E[Xⁿ] = dⁿM_X(t)/dtⁿ evaluated at t = 0
where dⁿM_X(t)/dtⁿ at t = 0 is the nth derivative of M_X(t) with respect to t, evaluated at the point t = 0.
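This differentiation can be carried out symbolically; a minimal Python sketch using SymPy (added for illustration, assuming SymPy is available) recovers the exponential moments from its mgf:

```python
import sympy as sp

t, lam = sp.symbols("t lam", positive=True)
M = lam / (lam - t)                # mgf of Exponential(lam), for t < lam

m1 = sp.diff(M, t, 1).subs(t, 0)   # E(X)   = M'(0)
m2 = sp.diff(M, t, 2).subs(t, 0)   # E(X^2) = M''(0)
var = sp.simplify(m2 - m1**2)

print(m1, m2, var)                 # 1/lam, 2/lam**2, 1/lam**2
```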
Example 2:
If X is a discrete random variable having R_X = {0, 1} (a Bernoulli random variable), derive the moment generating function of X, if it exists.
Solution
Recall that the probability mass function of a Bernoulli distribution is given by:
F(x) = p^x q^(1−x), for x = 0 or 1
By the definition of the mgf, we will have:
M_X(t) = E[e^(tX)]
= Σ_{x ∈ R_X} e^(tx) P_X(x)
= e^(t×1) × p + e^(t×0) × (1 − p)
= 1 − p + pe^t
Example 3
Suppose X has moment generating function M_X(t) = ½(1 + e^t). Find the variance of X.
Solution:
We can use the formula below to compute the variance:
Var(X) = E[X²] − [E(X)]²
The expected value of X is computed by taking the first derivative of the moment generating function:
dM_X(t)/dt = ½ exp(t) = ½ e^t
and evaluating it at t = 0:
E[X] = dM_X(t)/dt at t = 0 = ½ e⁰ = ½
The second moment is computed by taking the second derivative and evaluating it at t = 0:
E(X²) = d²M_X(t)/dt² at t = 0 = ½ e⁰ = ½ × 1 = ½
Therefore:
Var(X) = ½ − (½)²
= ½ − ¼
= ¼
Example 4:
Derive the moment generating function of a normal random variable X ~ N(µ, σ²), and use it to find the mean and variance.
Solution:
This can easily be done by first computing the moment generating function of a standard normal random variable (with parameters 0 and 1). Letting Z be such a random variable, we have:
M_Z(t) = E[e^(tZ)]
= [1/√(2π)] ∫_{−∞}^{∞} e^(tz) e^(−z²/2) dz
= [1/√(2π)] ∫_{−∞}^{∞} e^(−(z − t)²/2 + t²/2) dz
Since e^(t²/2) does not involve z, it can be taken outside the integral:
M_Z(t) = e^(t²/2) [1/√(2π)] ∫_{−∞}^{∞} e^(−(z − t)²/2) dz
Putting y = z − t, so that dz = dy:
M_Z(t) = e^(t²/2) [1/√(2π)] ∫_{−∞}^{∞} e^(−y²/2) dy
= e^(t²/2) × [1/√(2π)] × √(2π)
= e^(t²/2)
Now write X = µ + σZ, so that Z = (X − µ)/σ.
Then:
M_X(t) = E[e^(tX)]
= E[e^(t(µ + σZ))]
= E[e^(tµ) e^(tσZ)]
= e^(tµ) E[e^(tσZ)]
= e^(tµ) M_Z(tσ)
= e^(tµ) e^((tσ)²/2)
= e^(σ²t²/2 + µt)
So that:
M′_X(t) = (µ + σ²t) e^(σ²t²/2 + µt)
And:
M″_X(t) = (µ + σ²t)² e^(σ²t²/2 + µt) + σ² e^(σ²t²/2 + µt)
Evaluating at t = 0 gives E(X) = µ and E(X²) = µ² + σ².
But:
Var(X) = E(X²) − [E(X)]²
= µ² + σ² − µ²
= σ²