
STA 122: STATISTICAL THEORY

LECTURE NOTE

MR D.P SALAMA

MONDAY (0-12)

CHAPTER ONE:

Random Variable:

Recall that in previous work we defined the concept of an experiment and its
associated experimental outcomes. A random variable provides a means for
describing experimental outcomes using numerical values. Random variables must
assume numerical values.

Definition:

A random variable is a numerical description of the outcome of an experiment.

In effect, a random variable associates a numerical value with each possible
experimental outcome. The particular numerical value of the random variable
depends on the outcome of the experiment.

In another form, suppose S is the sample space of some random experiment. The
outcomes of the experiment, i.e. the sample points of S, need not be numbers. A
function which assigns a real number to each experimental outcome (sample point)
is called a random variable. A random variable is therefore a function which
assigns a real number to each element of S (the sample space). It has the sample
space as its domain. In its simplest mathematical expression, a random variable
is a real-valued function whose domain is a sample space.

Random variables will be denoted by upper case letters such as X, Y and Z. The
actual numerical values that a random variable can assume will be denoted by
lower case letters such as x, y and z.

Importantly, we can talk about the "probability that X takes on the value x" as

P(X = x), denoted by P(x) or F(x).

A random variable can be classified as being either discrete or continuous
depending on the numerical values it assumes.

DISCRETE RANDOM VARIABLES:

A random variable that may assume only a finite or countably infinite set of
values such as 0, 1, 2, … is referred to as a discrete random variable.

Examples of discrete random variables.

Experiment                     Random Variable (X)              Possible Values for the Random Variable (x)
a. Contact five customers      Number of customers who          0, 1, 2, 3, 4, 5
                               place an order
b. Inspect a shipment of       Number of defective radios       0, 1, 2, …, 49, 50
   50 radios
c. Operate a restaurant        Number of customers              0, 1, 2, 3, …
   for one day
d. Sell an automobile          Gender of the customer           0 if male, 1 if female

CONTINUOUS RANDOM VARIABLES:

A random variable that may assume any numerical value in an interval or
collection of intervals is called a continuous random variable. Experimental
outcomes based on measurement scales such as time, weight, distance, and
temperature can be described by continuous random variables.

Examples of continuous random variables:

Experiment                     Random Variable (X)              Possible Values for the Random Variable (x)
1. Operate a bank              Time between customer            x ≥ 0
                               arrivals in minutes
2. Fill a soft drink tank      Number of kilograms              0 ≤ x ≤ 12.1
   (max = 12.1 kg)
3. Construct a new library     Percentage of project            0 ≤ x ≤ 100
                               complete after six months
4. Test a new chemical         Temperature when the desired     150°F ≤ x ≤ 212°F
                               reaction takes place
                               (min 150°F, max 212°F)

Example 1:

Consider the experiment of tossing a coin two times. Let X be the random variable
giving the number of tails obtained and Y the number of heads obtained.

Find: (i) the values of X

(ii) the values of Y.

Solution:

The sample space S is given by:

S = {HH, HT, TH, TT}

(i) X (HH) = 0, X (HT) = X (TH) = 1 and X (TT) = 2


Therefore, the values of X are:
X = 0, 1, 2
(ii) Y (HH) = 2, Y (HT) = Y (TH) = 1; and Y (TT) = 0
Therefore, the values of Y are:
Y = 0, 1, 2

Example 2

A fair die is tossed.

Let X denote 0 or 1 according as an even number or an odd number occurs.

Find the values of X.

Solution:

The sample space S is given by:

S = {1, 2, 3, 4, 5, 6}

Then we have:

X (2, 4, 6) = 0; and X (1, 3, 5) = 1

The values of X are:

X = {0, 1}

Discrete probability distributions:

The probability distribution for a random variable describes how probabilities are
distributed over the values of the random variable. For a discrete random variable
X, the probability distribution is defined by a probability function, denoted by F
(x). The probability function provides the probability for each value of the random
variable.

The conditions required for a discrete probability distribution are as follows:

(i) F(x) ≥ 0, or P(X = x) = P(x) ≥ 0

(ii) Ʃ F(x) = 1, or Ʃ P(X = x) = 1, where the sum is over all possible values of x

Probability Mass Function (P.M.F):

The term probability mass function, sometimes used for the probability function
of X, denotes the idea that a mass of probability is piled up at discrete points.
It is often very convenient to list the probabilities for a discrete random
variable in a table.

In other words, if X is a discrete random variable, then the function denoted by
Fx(x) or Px(x) and defined by:

Fx(x) = P[X = xj], if x = xj, j = 1, 2, …, n, …; and 0 if x ≠ xj

is called the probability mass function of X. The P.M.F of the discrete random
variable X is a function which associates a real number with each value of X.
Hence Fx(x) is a function with domain the real line and counterdomain the
interval [0, 1].

Some properties of P.M.F:

1. 0 ≤ Fx(x) ≤ 1
2. Ʃj Fx(xj) = 1
3. Fx(x) = 0, if x ≠ xj

In probability and statistics, a probability mass function is a function that gives


the probability that a discrete random variable is exactly equal to some value.
Sometimes it is also known as the discrete probability density function.

Example 1:

Suppose X is a discrete random variable with P.M.F: F(x) = Cx, x = 0, 1, 2.

Find the value of C for which the P.M.F is valid.

Solution

Using property 2:

Ʃ Fx(x) = 1

Ʃ Cx = 1

C Ʃ x = 1

So that C(0 + 1 + 2) = 1

3C = 1

Hence: C = 1/3

Hence the P.M.F is: Fx(x) = x/3, x = 0, 1, 2.

Example 2:

An experiment consists of tossing a fair coin three times. Let the random
variable X be the number of heads obtained. Find the probability mass function
(P.M.F) of X.

Solution:

The Sample Space is:

S = {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT}

Then

X(TTT) = 0, X(TTH, THT, HTT) = 1,

X(HHT, HTH, THH) = 2; X(HHH) = 3

The P.M.F is:

F(0) = 1/8, F(1) = 3/8, F(2) = 3/8, F(3) = 1/8

and in tabular form, the P.M.F can be summarized thus:

x        0     1     2     3
Fx(x)   1/8   3/8   3/8   1/8
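As a quick check of this P.M.F, the sketch below (an illustrative addition to the note, assuming plain Python with only the standard library) enumerates the eight equally likely outcomes and tabulates the same probabilities:

    # Enumerate three fair coin tosses and tabulate the P.M.F of
    # X = number of heads, reproducing the table above.
    from itertools import product
    from collections import Counter

    outcomes = list(product("HT", repeat=3))          # HHH, HHT, ..., TTT
    counts = Counter(seq.count("H") for seq in outcomes)
    pmf = {x: counts[x] / len(outcomes) for x in sorted(counts)}
    print(pmf)   # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}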

Some Common Discrete Probability Functions:

The commonest discrete probability functions are the:

i. Bernoulli Distribution (trial)


ii. Binomial Distribution
iii. Hypergeometric Distribution
iv. Poisson Distribution
v. Discrete Uniform Distribution
vi. Negative Binomial Distribution

Now:

1. THE BERNOULLI DISTRIBUTION:

A Bernoulli trial is an experiment with two possible outcomes, labelled
"success" and "failure". Examples of Bernoulli trials are:

The result of a flipped coin (head or tail);

The sex of a newborn child (boy or girl);
The outcome of a rolled die (even or odd).

It is a discrete probability distribution named after the Swiss scientist Jacob
Bernoulli (1654-1705).

If X is a random variable which takes the value 1 when the outcome is a success,
with probability p, and the value 0 when the outcome is a failure, with
probability q = 1 - p, then X is called a Bernoulli random variable with P.M.F:

F(x; p) = p^x q^(1-x), if x = 0 or 1;
0, otherwise (elsewhere)

or P(x) = p^x (1 - p)^(1-x), x = 0, 1

THE BINOMIAL DISTRIBUTION:

Instead of inspecting a single item as we do with the Bernoulli random variable,
suppose we now independently inspect n items and record values for X1, X2, …, Xn,
where Xi = 1 (i = 1, 2, …, n) if the experiment is a success and Xi = 0 if the
experiment results in a failure.

An experiment that consists of finitely many repeated independent Bernoulli
trials is called a binomial experiment. If we let the random variable X be the
number of successes obtained in the n independent trials, then X is called a
binomial random variable with parameters (n, p). It is a discrete random variable
representing the number of successes in a sequence of n independent Bernoulli
trials, each of which yields a success with probability p and a failure with
probability q = 1 - p.

The P.M.F of X is called the binomial distribution. If the random variable
follows the binomial distribution with parameters n and p, we write X ~ B(n, p).
The probability of getting exactly x successes out of n trials is given by the
probability mass function (P.M.F) as:

F(x; n, p) = C(n, x) p^x q^(n-x)

where the binomial coefficient C(n, x) is also written (n x) or nCx.

When n = 1, the binomial distribution becomes the Bernoulli distribution.

PROPERTIES OF BINOMIAL EXPERIMENT:

1. The experiment consists of a sequence of n identical trials


2. Two outcomes are possible on each trial. We refer to one outcome as a success
   and the other outcome as a failure.
3. The probability of a success, denoted by p, does not change from trial to
   trial. Consequently, the probability of a failure, denoted by q = 1 - p, does
   not change from trial to trial.
4. The trials are independent.

Example 1:

Let X be the number of boys in a family of 3. Find the distribution of X if boys
and girls are equally likely.

Solution

n = 3, p = 1/2

Then:

P(X = x) = C(n, x) p^x q^(n-x)

where C(n, x) = nCx; x = 0, 1, 2, 3 and n = 3.

Thus: P(X = 0) = 3C0 (1/2)^0 (1/2)^3 = 1/8

P(X = 1) = 3C1 (1/2)^1 (1/2)^2 = 3/8

P(X = 2) = 3C2 (1/2)^2 (1/2)^1 = 3/8

P(X = 3) = 3C3 (1/2)^3 (1/2)^0 = 1/8

Hence the distribution is:

x       0     1     2     3
P(x)   1/8   3/8   3/8   1/8

Example 2:

Given that there are five children in a family, find the probability that there are:

i. More boys than girls

ii. Fewer boys than girls. Assume equal probability for boys and girls.

Solution

Let X be the number of boys

Then n = 5, and p = q = 1/2

i. There are more boys than girls if X = 5 or X = 4 or X = 3.

Hence: P(X = 5 or 4 or 3) = P(X = 5) + P(X = 4) + P(X = 3)
= (1/2)^5 + 5(1/2)^5 + 10(1/2)^5
= 16/32
= 1/2

ii. There are fewer boys than girls if X = 0 or 1 or 2.

Then: P(X = 0 or 1 or 2) = P(X = 0) + P(X = 1) + P(X = 2)
= (1/2)^5 + 5(1/2)^5 + 10(1/2)^5
= 1/2
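The binomial calculation above can be verified numerically. The sketch below (an illustrative addition, assuming Python with the scipy library available) sums the binomial P.M.F over the relevant values:

    # P(more boys than girls) in a family of 5 with p = 1/2:
    # sum the binomial P.M.F over x = 3, 4, 5.
    from scipy.stats import binom

    n, p = 5, 0.5
    p_more_boys = sum(binom.pmf(x, n, p) for x in (3, 4, 5))
    print(p_more_boys)   # 0.5, matching the hand calculation

By symmetry, the same code with x in (0, 1, 2) also gives 0.5 for part (ii).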

4 – THE NEGATIVE BINOMIAL DISTRIBUTION

The probability mass function of a negative binomial distribution with parameters
r, p is given by:

F(x; r, p) = C(r + x - 1, x) p^r q^x, x = 0, 1, 2, …

Suppose a Bernoulli trial is repeated until r successes are observed. Let the
random variable X be the number of failures before the rth success occurs. Then
r + x denotes the total number of trials needed to get exactly r successes and x
failures. If the probability of success on each trial is p and of failure
q = 1 - p, then X is called a negative binomial random variable. This particular
distribution is also called the Pascal distribution, since r is restricted to be
an integer.

5-THE GEOMETRIC DISTRIBUTION

Suppose a Bernoulli trial is repeated several independent times. This gives rise to
an infinite sequence of Bernoulli trials. Let the random variable X be the number
of trials needed to get the first success. If the probability of a success on
each trial is p and of a failure is q = 1 - p, then X is called a geometric
random variable.

The probability mass function (P.M.F) of X for this first kind is:

F(x) = P(X = x) = p q^(x-1), for x = 1, 2, 3, …

If however the random variable is defined as the number of failures before the
first success, then the P.M.F of this second kind is:

F(x) = p q^x, for x = 0, 1, 2, …
6- THE HYPERGEOMETRIC DISTRIBUTION
This is a discrete probability distribution that describes the number of
successes in a sequence of n draws from a finite population N without
replacement. Equivalently, if a population of m members is divided into two
groups (k members and m - k members) and r members are selected at random without
replacement, the probability that exactly s of the selected members come from the
group of k is P(Y = s) = C(k, s) C(m - k, r - s) / C(m, r).
Suppose a batch of N items contains D defectives and N - D non-defectives, a
sample of n distinct objects is drawn from the batch, and interest is on the
probability of selecting exactly x defectives from the D defectives and n - x
non-defectives from the N - D non-defectives. This is known as a hypergeometric
experiment. The probability distribution of the hypergeometric random variable X
is called the hypergeometric distribution. It will be denoted by F(x; N, D, n),
where the parameters are N, D and n.
The probability mass function of the hypergeometric random variable is given by:

F(x; N, D, n) = C(D, x) C(N - D, n - x) / C(N, n), for x = 0, 1, 2, …
0 otherwise.
This distribution possesses a unique property in that:
F(x; N, D, n) = F(n - x; N, N - D, n)
i.e. the probability of getting x successes out of n samples drawn equals the
probability of getting (n - x) failures out of the same number of samples.
7- THE POISSON DISTRIBUTION
If an experiment involves the number of occurrences of an event in a given time
interval or in a specified region, then it is called a Poisson experiment. The
given time interval may be of any length, such as a minute, an hour, a day, a
week, a month, or even a year, while the specified region could be a line
segment, an area, or a volume. Let the random variable X be the number of
occurrences in a Poisson experiment; then X is called a Poisson random variable.

The Poisson distribution is one of the simplest and perhaps most frequently used
probability distributions to model the time instants at which events occur. It is
used in calculating the probability of a number of events occurring in a fixed
period of time, if these events occur with a known average rate and independently
of the time since the last event occurred.
The probability that there are exactly x occurrences of an event during such a
time interval is given by the probability mass function:

F(x; λ) = e^(-λ) λ^x / x!, x = 0, 1, 2, …

where λ is the average number of occurrences in the given time interval or
specified region.
If events in a Poisson process occur at an average rate of λ per unit, then the
number of occurrences X in an interval of length t has the Poisson P.M.F:

F(x; λt) = e^(-λt) (λt)^x / x!, x = 0, 1, 2, …
Some properties of the Poisson distribution:
1. The average number of successes λ, called the rate of events, is known.
2. The numbers of successes occurring in non-overlapping intervals are
   independent.
3. The probability of exactly one success in a sufficiently short interval is
   proportional to the length of the interval.
4. The probability of two or more successes in a sufficiently short interval is
   zero.
8- THE DISCRETE UNIFORM DISTRIBUTION
Suppose a random variable X can assume finitely many values and all the possible
values are equally probable. Then X is said to have a discrete uniform
distribution. It is the simplest of all discrete probability distributions. If a
random variable has k possible outcomes where the probability of any outcome is
1/k, then it has the discrete uniform distribution with probability mass function
(P.M.F):

F(x) = 1/k, for x = 1, 2, …, k
0 otherwise

It has the parameter k.

In general, the value of k in the P.M.F of a uniform distribution is given by
C(N, n) when selecting a subset of size n from a finite sample space of size N.

GENERAL EXAMPLES ON THE DISTRIBUTIONS DISCUSSED
Example 1:
A basketball player hits on 75% of his shots from the free-throw line. What is the
probability that he makes his first hit on the fourth throw?
(Apply P.M.F of geometric distribution)
Solution
Let X be the number of throws needed to make the first hit.

p = 75/100 = 3/4

q = (100 - 75)/100 = 25/100 = 1/4

Also, x = 4. Recall that:

F(x) = P(X = x) = p q^(x-1), for x = 1, 2, 3, …

Then P(X = 4) = (3/4)(1/4)^(4-1)

= (3/4) × (1/4)^3

= (3/4) × (1/64) = 3/256
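The same answer can be checked numerically. In the sketch below (an illustrative addition, assuming scipy is available), scipy.stats.geom models exactly this first kind of geometric variable, the number of trials up to and including the first success:

    # P(first hit on the fourth throw) for p = 0.75.
    from scipy.stats import geom

    p = 0.75
    print(geom.pmf(4, p))        # ≈ 0.0117 = 3/256
    print(p * (1 - p) ** 3)      # same value from F(x) = p q^(x-1)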

Example 2:

A basketball player makes repeated shots from the free throw line until the
second basket is hit. If the probability of hitting a basket is 0.3, find the
probability of making the second basket on the twelfth throw. (Apply the negative
binomial.)
Solution
Let X be the number of non-baskets, and r the number of hits.
Recall the negative binomial distribution and its P.M.F:

F(x; r, p) = C(r + x - 1, x) p^r q^x, x = 0, 1, 2, …

where r = 2, x = 10, p = 0.3, and q = 0.7.

By substitution:

F(10; 2, 0.3) = C(2 + 10 - 1, 10) (0.3)^2 (0.7)^10

= 11C10 (0.3)^2 (0.7)^10

= 11 (0.09) (0.0282)

= 0.0279

≈ 0.03
Example 3:
A fair die is tossed until the second five is observed. Find the probability of
observing the second five at the tenth toss. (Apply the negative binomial
concept.)
Solution
Let X be the number of non-fives, and r the number of fives.
Then:

x = 8, r = 2, p = 1/6, q = 5/6

So that:

F(x; r, p) = C(r + x - 1, x) p^r q^x

F(8; 2, 1/6) = C(2 + 8 - 1, 8) (1/6)^2 (5/6)^8

= 9C8 (1/6)^2 (5/6)^8

= 9 (0.1667)^2 (0.8333)^8

= 9 (0.0278) (0.2326)

= 0.0582

≈ 0.06
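Both negative binomial examples can be confirmed with a few lines of code. The sketch below (an illustrative addition, assuming scipy) uses scipy.stats.nbinom, which counts the failures before the r-th success, matching the P.M.F above:

    # Example 2: second basket on the twelfth throw (x = 10 misses, r = 2).
    # Example 3: second five on the tenth toss (x = 8 non-fives, r = 2).
    from scipy.stats import nbinom

    print(nbinom.pmf(10, 2, 0.3))    # ≈ 0.0280
    print(nbinom.pmf(8, 2, 1 / 6))   # ≈ 0.0581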
Example 4:
A bag contains three green balls and five white balls. Two balls are drawn at
random without replacement from the bag. If X denotes the number of green balls
in the sample, then:
i. Find the P.M.F of X.
ii. Find the probability distribution of X.
iii. Find P(X = 1 or 2).
(Apply the hypergeometric distribution concept.)
Solution
X follows the hypergeometric distribution with:
N = 8, D = 3, n = 2.
Then, recall the P.M.F of the hypergeometric distribution:

F(x; N, D, n) = C(D, x) C(N - D, n - x) / C(N, n)

Therefore:

i. F(x; 8, 3, 2) = C(3, x) C(5, 2 - x) / C(8, 2), x = 0, 1, 2.

ii. P(X = 0) = F(0; 8, 3, 2)

= C(3, 0) C(5, 2) / C(8, 2) = 10/28 = 5/14

P(X = 1) = F(1; 8, 3, 2)

= C(3, 1) C(5, 1) / C(8, 2)

= 15/28

P(X = 2) = F(2; 8, 3, 2)

= C(3, 2) C(5, 0) / C(8, 2)

= 3/28

iii. P(X = 1 or 2) = F(1; 8, 3, 2) + F(2; 8, 3, 2)

= 15/28 + 3/28

= 18/28 = 9/14
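The hypergeometric probabilities above can be reproduced as follows (an illustrative sketch, assuming scipy; in scipy.stats.hypergeom the parameters are M = population size, n = number of "success" items, N = sample size):

    # Example 4: 8 balls in total, 3 green, 2 drawn without replacement.
    from scipy.stats import hypergeom

    rv = hypergeom(M=8, n=3, N=2)
    print([round(rv.pmf(x), 4) for x in (0, 1, 2)])   # [0.3571, 0.5357, 0.1071]
    print(rv.pmf(1) + rv.pmf(2))                      # 9/14 ≈ 0.6429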


Example 5:

In a large hospital, the probability of giving birth to a male child is 0.03.
What is the probability that out of 48 women, fewer than two will give birth to a
male child? (Apply the binomial distribution.)
Solution
The number of male births follows a binomial distribution, since a birth is
either a male with probability p = 0.03 or a female with probability q = 0.97,
and n = 48.
Hence:
P(X < 2) = P(X = 0) + P(X = 1)
= (0.97)^48 + 48C1 (0.97)^47 (0.03)
Observe that without a calculator this calculation is very tedious. To simplify
it, we therefore use the Poisson distribution, which gives a good approximation
to the binomial distribution for large n and small p.
Now:
Let λ = np
= 48 × 0.03
= 1.44
Then:
P(X = 0) + P(X = 1) = e^(-1.44) + 1.44 e^(-1.44)
= 0.5781
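The quality of the Poisson approximation here can be seen directly. The sketch below (an illustrative addition, assuming scipy) compares the exact binomial probability with its Poisson approximation:

    # Exact binomial vs Poisson approximation for n = 48, p = 0.03.
    from scipy.stats import binom, poisson

    n, p = 48, 0.03
    exact = binom.pmf(0, n, p) + binom.pmf(1, n, p)
    approx = poisson.pmf(0, n * p) + poisson.pmf(1, n * p)
    print(exact, approx)   # ≈ 0.5758 (exact) vs ≈ 0.5781 (Poisson)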
CUMULATIVE DISTRIBUTION FUNCTION FOR DISCRETE RANDOM
VARIABLE:
We sometimes describe random variables by looking at their cumulative
probability. That is, for any random variable X we may look at P(X ≤ b) for any
real number b. This is the cumulative probability for X evaluated at b. Thus we
can define a function F(b) as:
F(b) = P(X ≤ b).
And if X is discrete:
F(b) = Ʃ P(x), where the sum is over all x ≤ b
and P(x) is the probability function.
The distribution function is often called the cumulative distribution function.
In other words, the cumulative distribution function of a random variable X,
denoted by Fx(x), is defined to be that function with domain the real line and
counterdomain the interval [0, 1].
It is defined as:
Fx(x) = P(X ≤ x) = Ʃ Px(xj), the sum over all xj ≤ x.
It gives the probability that the random variable X takes on a value less than
or equal to a given number x.
Properties of the C.D.F
i. 0 ≤ Fx(x) ≤ 1
ii. Fx(x1) ≤ Fx(x2) if x1 < x2
iii. lim Fx(x) = Fx(∞) = 1 as x → ∞
iv. lim Fx(x) = Fx(-∞) = 0 as x → -∞
v. Fx(x) is defined uniquely for each random variable.
Example 1
Let X be the number of boys in a family of 3. Find the distribution of X and the
cumulative distribution of X.
Solution
Let 'b' denote a boy and 'g' a girl.
Then the sample space is given by:
S = {bbb, bbg, bgb, gbb, ggb, gbg, bgg, ggg}
To find the distribution of X:
Let X(ggg) = 0; X(ggb, gbg, bgg) = 1; X(bbg, bgb, gbb) = 2; X(bbb) = 3.
The values of X are:
X = {0, 1, 2, 3}

Hence the distribution of X is given by;


x        0     1     2     3
Fx(x)   1/8   3/8   3/8   1/8

The cumulative distribution is then obtained as seen below:

Fx(x) = 0      if x < 0
        1/8    if 0 ≤ x < 1
        4/8    if 1 ≤ x < 2
        7/8    if 2 ≤ x < 3
        8/8    if x ≥ 3
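The step-function C.D.F above is just the running total of the P.M.F, as the following sketch shows (an illustrative addition, plain Python standard library):

    # Build the C.D.F of X = number of boys in a family of 3 by
    # cumulatively summing the P.M.F.
    from itertools import accumulate

    pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
    cdf = dict(zip(pmf, accumulate(pmf.values())))
    print(cdf)   # {0: 0.125, 1: 0.5, 2: 0.875, 3: 1.0}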
Example 2
Consider the experiment of tossing a fair die. Find the cumulative distribution.
Solution
Let X denote the outcome of the toss.
Then the distribution of X is given by:
Values of X = {1, 2, 3, 4, 5, 6}
The distribution is presented as:

x        1     2     3     4     5     6
Fx(x)   1/6   1/6   1/6   1/6   1/6   1/6

The cumulative distribution function is obtained as shown below:

Fx(x) = 0      if x < 1
        1/6    if 1 ≤ x < 2
        2/6    if 2 ≤ x < 3
        3/6    if 3 ≤ x < 4
        4/6    if 4 ≤ x < 5
        5/6    if 5 ≤ x < 6
        6/6    if x ≥ 6

EXPECTATIONS AND VARIANCE:

EXPECTATION (EXPECTED VALUE)
The expected value or mean of a random variable is a measure of the central
location for the random variable.
Let X be a discrete random variable with probability mass function P(x). The
mean or expected value of X is denoted by E(X) or µ and defined as:
E(X) = Ʃ x P(x)
(the sum is over all values of x for which P(x) > 0).
Sometimes the notation E(X) = µ is used.
The mean of a random variable X is obtained by multiplying each value x by its
corresponding probability and summing the products. It tells us where the values
of X are centred on average.

Example 1
Let X be the number of boys in a family of 3. Find:

i. The distribution of X
ii. The expected value of X

Solution
(i) Let 'b' denote a boy and 'g' a girl; then the sample space is given by:
S = {bbb, bbg, bgb, gbb, ggb, gbg, bgg, ggg}
To find the distribution of X:
Let X(ggg) = 0; X(ggb, gbg, bgg) = 1;
X(bbg, bgb, gbb) = 2; X(bbb) = 3.
The values of X are:
X = 0, 1, 2, 3
Hence the distribution of X is summarized as follows:

x        0     1     2     3
Fx(x)   1/8   3/8   3/8   1/8

(ii) Expected value of X:
E(X) = Ʃ x P(x)
= (0 × 1/8) + (1 × 3/8) + (2 × 3/8) + (3 × 1/8)
= 0/8 + 3/8 + 6/8 + 3/8
= (0 + 3 + 6 + 3)/8
= 12/8
= 3/2
= 1½

Example 2
Consider an experiment of tossing a fair die once and let X be the number that
turns up. Find the expected value of X.
Solution
The sample space is:
S = {1, 2, 3, 4, 5, 6}
Hence:
P(X = 1) = P(X = 2) = P(X = 3) = … = P(X = 6) = 1/6
Then:
E(X) = (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6)
= 1/6 + 2/6 + 3/6 + 4/6 + 5/6 + 6/6
= (1 + 2 + 3 + 4 + 5 + 6)/6
= 21/6
= 3½

PROPERTIES OF EXPECTATION
1. E(c) = c (where c is a constant)
2. E(aX) = a E(X)
3. E(aX + b) = a E(X) + b
4. E(X ± Y) = E(X) ± E(Y)
5. E(XY) = E(X) E(Y) (if X and Y are independent)
VARIANCE
The variance of a random variable X with a probability mass function P(x) and
expected value µ is denoted by Var(X) or σ² and defined as σ² = E(X - µ)².
The variance of X shows how the values of X are dispersed about the mean. If the
values of X are close to the mean, then the variance of X will be small, and
vice versa. The positive square root of the variance gives the standard
deviation. It has the same unit of measurement as the values of X, unlike the
variance.
The variance of X is usually evaluated using the formula below:
Var(X) = Ʃ (x - µ)² P(x)
Example 1:
Below is the probability distribution of the random variable X denoting the
number of boys in a family of 3:

x       0     1     2     3
P(x)   1/8   3/8   3/8   1/8

Find the variance of X.


Solution
Recall that:
Var(X) = Ʃ (x - µ)² P(x)
Here, the value of µ = Ʃ x P(x)
= (0 × 1/8) + (1 × 3/8) + (2 × 3/8) + (3 × 1/8)
= 0 + 3/8 + 6/8 + 3/8
= 12/8 = 1.5
Now the variance will be:
Var(X) = Ʃ (x - µ)² P(x)
= (0 - 1.5)² (1/8) + (1 - 1.5)² (3/8) + (2 - 1.5)² (3/8) + (3 - 1.5)² (1/8)
= (-1.5)² (1/8) + (-0.5)² (3/8) + (0.5)² (3/8) + (1.5)² (1/8)
= (2.25)(0.125) + (0.25)(0.375) + (0.25)(0.375) + (2.25)(0.125)
= 0.28125 + 0.09375 + 0.09375 + 0.28125
= 0.75
or
Var(X) = E(X²) - µ²
From the values:
E(X²) = (0² × 1/8) + (1² × 3/8) + (2² × 3/8) + (3² × 1/8)
= 0/8 + 3/8 + 12/8 + 9/8
= 24/8 = 3
Hence:
Var(X) = 3 - (3/2)²
= 3/4
= 0.75
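Both routes to the variance can be checked in a few lines (an illustrative sketch, plain Python standard library):

    # E(X) and Var(X) by definition, and Var(X) = E(X²) - µ² for comparison.
    pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
    mu = sum(x * p for x, p in pmf.items())                # 1.5
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # 0.75
    ex2 = sum(x ** 2 * p for x, p in pmf.items())          # 3.0
    print(mu, var, ex2 - mu ** 2)                          # 1.5 0.75 0.75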
PROPERTIES OF VARIANCE

1. Var(c) = 0 (where c is a constant)
2. Var(aX) = a² Var(X)
3. Var(aX + b) = a² Var(X)
4. Var(X ± Y) = Var(X) + Var(Y), if X and Y are independent
STANDARD DEVIATION
The standard deviation of a random variable X is the square root of the
variance, given by:
σ = √σ²
But σ² = Ʃ (x - µ)² P(x), so

σ = √[Ʃ (x - µ)² P(x)]
Example 2
The manager of a stockroom in a factory knows from his study of records that the
daily demand (number of times used) for a certain tool has the following
probability distribution:

Demand (x)          0     1     2
Probability P(x)   0.1   0.5   0.4

If X denotes the daily demand:

i. Find E(X)
ii. Find Var(X)
iii. Find the standard deviation of X.
Solution
i. E(x) = Ʃ x P(x)
= 0 (0.1) + 1 (0.5) + 2 (0.4)
= 0 + 0.5 + 0.8
= 1.3
That is, the tool is used an average of 1.3 times per day.

ii. Variance of X:
Var(X) = Ʃ (x - µ)² P(x)
= (0 - 1.3)² (0.1) + (1 - 1.3)² (0.5) + (2 - 1.3)² (0.4)
= (1.69)(0.1) + (0.09)(0.5) + (0.49)(0.4)
= 0.410
iii. Standard deviation of X:
σ = √σ²
= √[Ʃ (x - µ)² P(x)]
σ = √0.410
= 0.6403
≈ 0.64
THEOREMS ON SOME PROPERTIES OF EXPECTATION AND
VARIANCE
Theorem 1:
For any random variable X and constants a and b:
i. E(aX + b) = a E(X) + b
ii. Var(aX + b) = a² Var(X)
Proof:
i. E(aX + b) = Ʃ (ax + b) P(x)
From E(X) = Ʃ x P(x), we will have:
= Ʃ [(ax) P(x) + b P(x)]
= Ʃ ax P(x) + Ʃ b P(x)
= a Ʃ x P(x) + b Ʃ P(x)
where Ʃ x P(x) = E(X) and Ʃ P(x) = 1.
Thus:
E(aX + b) = a E(X) + b(1)
= a E(X) + b
ii. Var(aX + b) = E[(aX + b) - E(aX + b)]²
From Var(X) = E(X - µ)², we will have:
= E[aX + b - (a E(X) + b)]²
= E[aX - a E(X)]²
= E[a² (X - E(X))²]
= a² E[(X - E(X))²]
= a² Var(X)
Theorem 2:
If X is a random variable with mean µ, then:
Var(X) = E(X²) - µ².
Proof:
Starting with the definition of variance:
Var(X) = E(X - µ)²
Expanding the square of the difference:
= E(X² - 2µX + µ²)
= E(X²) - 2µ E(X) + µ²
= E(X²) - 2µ(µ) + µ²
= E(X²) - 2µ² + µ²
= E(X²) - µ²
Example 1:
The manager of a stockroom in a factory knows from his study of records that the
daily demand (number of times used) for a certain tool has the following
probability distribution:

Demand (x)          0     1     2
Probability P(x)   0.1   0.5   0.4

If X denotes the daily demand, using the theorems above, find the variance of X.
Solution:
From the previous example (work), we have:
E(X) = 1.3
E(X²) = Ʃ x² P(x)
= (0)² (0.1) + (1)² (0.5) + (2)² (0.4)
= 0(0.1) + 1(0.5) + 4(0.4)
= 0 + 0.5 + 1.6
= 2.1
By theorem 2 above:
Var(X) = E(X²) - µ²
= 2.1 - (1.3)²
= 2.1 - 1.69
= 0.41
EXPECTATION AND VARIANCE OF DISCRETE PROBABILITY
DISTRIBUTIONS
1 – THE BERNOULLI DISTRIBUTION
The probability mass function (P.M.F) of a Bernoulli distribution is given by:
F(x; p) = p^x q^(1-x), if x = 0 or 1
i. Expectation of the random variable X:
E(X) = Ʃ x P(x)
where the values of x are 0 and 1, so that:
E(X) = 0 × q + 1 × p
= 0 + p
= p
or
E(X) = 0(1 - p) + 1(p)
= 0 + p
= p
ii. Variance of X:
Var(X) = E(X²) - [E(X)]²
= Ʃ x² P(x) - p²
where p² is the square of the expected value,
and E(X²) = 0² × q + 1² × p
= p
Hence:
Var(X) = E(X²) - [E(X)]²
= p - p²
= p(1 - p)
= pq
2 – THE BINOMIAL DISTRIBUTION
The P.M.F of the binomial distribution is given by:
F(x; n, p) = C(n, x) p^x q^(n-x), for x = 0, 1, 2, …, n
where the binomial coefficient is C(n, x), also written (n x) or nCx.
i. Expectation of X:
E(X) = Ʃ x C(n, x) p^x q^(n-x), the sum from x = 0 to n
= Ʃ x [n! / (x! (n - x)!)] p^x q^(n-x)
= Ʃ x [n(n - 1)! / (x(x - 1)! (n - x)!)] p p^(x-1) q^(n-x)
= np Ʃ [(n - 1)! / ((x - 1)! (n - x)!)] p^(x-1) q^(n-x)
If we let j = x - 1, then x = 1 gives j = 0, and by substitution we will have:
E(X) = np Ʃ C(n - 1, j) p^j q^(n-1-j), the sum from j = 0 to n - 1
Taking the summation (by the binomial theorem):
E(X) = np (p + q)^(n-1)
= np
ii. Variance of X:
E(X²) = E[X(X - 1)] + E(X)
where:
E[X(X - 1)] = Ʃ x(x - 1) [n! / (x! (n - x)!)] p^x q^(n-x)
= Ʃ x(x - 1) [n(n - 1)(n - 2)! / (x(x - 1)(x - 2)! (n - x)!)] p^x q^(n-x)
= n(n - 1) p² Ʃ [(n - 2)! / ((x - 2)! (n - x)!)] p^(x-2) q^(n-x), the sum from x = 2 to n
= n(n - 1) p²
so that E(X²) = n(n - 1)p² + np.
Hence:
Var(X) = n(n - 1)p² + np - (np)²
= n²p² - np² + np - n²p²
= np - np²
By factorizing, we have:
Var(X) = np(1 - p)
But 1 - p = q, so
Var(X) = npq.
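A quick numerical check of these results (an illustrative sketch, assuming scipy):

    # The binomial mean and variance equal np and npq.
    from scipy.stats import binom

    n, p = 10, 0.3
    print(binom.mean(n, p), n * p)            # 3.0  3.0
    print(binom.var(n, p), n * p * (1 - p))   # 2.1  2.1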
3 – THE GEOMETRIC DISTRIBUTION
The probability mass function of a geometric random variable is given by:
P(x) = P(X = x) = p q^(x-1), for x = 1, 2, 3, …
where q = 1 - p.
Therefore: P(x) = p(1 - p)^(x-1), for x = 1, 2, …
i. Expectation of X:
The expected value of a geometric random variable of the first kind is:
E(X) = Ʃ x p(1 - p)^(x-1), the sum from x = 1 to ∞
= p Ʃ x (1 - p)^(x-1)
= p × 1/(1 - q)²
= p/p²
= 1/p
(Recall the sum of a G.P.: Sn = a(1 - rⁿ)/(1 - r) for r < 1, so that
S∞ = a/(1 - r); e.g. 1 + x + x² + x³ + … = 1/(1 - x). Differentiating term by
term gives Ʃ x q^(x-1) = 1/(1 - q)².)
and:
E[X(X - 1)] = Ʃ x(x - 1) p q^(x-1), the sum from x = 2 to ∞
= 2pq/(1 - q)³
= 2q/p²
So that:
E(X²) = E[X(X - 1)] + E(X)
= 2(1 - p)/p² + 1/p
= (2 - p)/p²
ii. Variance of X:
Var(X) = E(X²) - [E(X)]²
= (2 - p)/p² - 1/p²
= (1 - p)/p²
= q/p²
4 - THE HYPERGEOMETRIC DISTRIBUTION
The probability mass function (P.M.F) of the hypergeometric random variable is
given by:
F(x) = C(D, x) C(N - D, n - x) / C(N, n), x = 0, 1, 2, …
i. Expectation:
E(X) = Ʃ x C(D, x) C(N - D, n - x) / C(N, n)
= n (D/N) Ʃ C(D - 1, x - 1) C(N - D, n - x) / C(N - 1, n - 1)
= n D/N
since Ʃ C(D - 1, x - 1) C(N - D, n - x) = C(N - 1, n - 1).
ii. Variance:
E(X²) = E[X(X - 1)] + E(X)
But:
E[X(X - 1)] = Ʃ x(x - 1) C(D, x) C(N - D, n - x) / C(N, n)
= n(n - 1) [D(D - 1) / (N(N - 1))] Ʃ C(D - 2, x - 2) C(N - D, n - x) / C(N - 2, n - 2)
= n(n - 1) D(D - 1) / (N(N - 1))
Hence:
Var(X) = E(X²) - [E(X)]²
= n(n - 1) D(D - 1)/(N(N - 1)) + nD/N - (nD/N)²
= nD(N - D)(N - n) / (N²(N - 1))
5 – THE NEGATIVE BINOMIAL DISTRIBUTION:
The probability mass function of a negative binomial distribution with
parameters r, p is given by:
F(x; r, p) = C(r + x - 1, x) p^r q^x, x = 0, 1, 2, …
i. Expectation:
E(X) = Ʃ x C(r + x - 1, x) p^r q^x, the sum from x = 0 to ∞
= Ʃ x [(r + x - 1)! / (x! (r - 1)!)] p^r q^x
= r q p^r Ʃ [(r + x - 1)! / ((x - 1)! r!)] q^(x-1)
= r q p^r (1 - q)^-(r+1)
= rq/p
ii. Variance:
Var(X) = E(X²) - [E(X)]²
But:
E[X(X - 1)] = Ʃ x(x - 1) C(r + x - 1, x) p^r q^x
= r(r + 1) q² p^r Ʃ [(r + x - 1)! / ((x - 2)! (r + 1)!)] q^(x-2)
= r(r + 1) q² p^r (1 - q)^-(r+2)
= r(r + 1) q² / p²
Also:
E(X²) = E[X(X - 1)] + E(X)
= r(r + 1) q²/p² + rq/p
= (rq/p²) [(r + 1)q + p]
By substitution, the variance of X becomes:
Var(X) = E(X²) - [E(X)]²
= (rq/p²)[(r + 1)q + p] - (rq/p)²
= [(rq)² + rq² + rqp - (rq)²] / p²
= rq(q + p)/p²
= rq/p²
6 – THE POISSON DISTRIBUTION:
The probability mass function of a Poisson distribution with parameter λ > 0 is
given by:
F(x; λ) = e^(-λ) λ^x / x!, x = 0, 1, 2, …
i. Expectation:
E(X) = Ʃ x F(x; λ), the sum from x = 0 to ∞
= Ʃ x e^(-λ) λ^x / x!, the sum from x = 1 to ∞
= λ Ʃ e^(-λ) λ^(x-1) / (x - 1)!
Let y = x - 1. Then we obtain:
E(X) = λ Ʃ e^(-λ) λ^y / y!, the sum from y = 0 to ∞
= λ.
or
The mean of the Poisson distribution is easily derived formally if one remembers
the simple Taylor series expansion of e^x, namely:
e^x = 1 + x + x²/2! + x³/3! + …
Then:
E(Y) = Ʃ y P(y)
= Ʃ y λ^y e^(-λ) / y!, the sum from y = 1 to ∞
= λ e^(-λ) Ʃ λ^(y-1) / (y - 1)!
= λ e^(-λ) (1 + λ + λ²/2! + λ³/3! + …)
= λ e^(-λ) e^λ
= λ
ii. Variance:
Since Ʃ e^(-λ) λ^y / y! = 1 (the sum from y = 0 to ∞),
E[X(X - 1)] = Ʃ x(x - 1) F(x; λ)
= Ʃ x(x - 1) e^(-λ) λ^x / x!, the sum from x = 2 to ∞
= λ² Ʃ e^(-λ) λ^(x-2) / (x - 2)!
Let y = x - 2, so that:
E[X(X - 1)] = λ² Ʃ e^(-λ) λ^y / y!, the sum from y = 0 to ∞
= λ²
Now:
E(X²) = E[X(X - 1)] + E(X)
= λ² + λ
and hence:
Var(X) = E(X²) - [E(X)]²
= λ² + λ - λ²
= λ.
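Again the results are easy to confirm numerically (an illustrative sketch, assuming scipy):

    # For a Poisson random variable, the mean and variance both equal λ.
    from scipy.stats import poisson

    lam = 4.5
    print(poisson.mean(lam), poisson.var(lam))   # 4.5 4.5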

PROBABILITY
Probability is the study of random or nondeterministic experiments. The
probability of an event represents the proportion of times, under identical
conditions, that the outcome can be expected to occur. For instance, if a die or
coin is tossed, it is certain that the die or coin will fall down, but it is not
certain which side will face up.
FUNCTION
A function is defined as a relation between a set of inputs in which each input
has exactly one output. In simple words, a function is a relationship between
inputs where each input is related to exactly one output. Every function has a
domain and a codomain or range. A function is generally denoted by F(x), where x
is the input.
PROBABILITY FUNCTION
This is used as a measure of the probability of an event and written P(·). It
could be defined as a set function with domain the set of events and
counterdomain the interval [0, 1] which satisfies the axioms of probability.
RANDOM EXPERIMENTS
An experiment is said to be random if its outcome cannot be predicted with
certainty prior to the performance of the experiment. Its outcome is determined by
chance alone. Some examples are the tossing of a coin and observing the face
that turns up, and planting a crop and observing its yield.
SAMPLE SPACE
A sample space of a random experiment, denoted by S (or Ω), is the collection of
all possible outcomes of the experiment. A sample space may be finite or
infinite. It is finite if the number of elements in the space can be counted;
otherwise it is infinite. It may also be discrete or continuous. A sample point
is a particular outcome in a sample space.

AN EVENT:
An event can be defined as an appropriate subset of the sample space. The empty
set ∅ ⊆ S is an event which is sometimes called an impossible event, with
probability zero, while the event consisting of the whole sample space is a sure
event, with probability one.
DISCRETE SAMPLE SPACE
A sample space S is said to be discrete if it contains a finite number of sample
points (i.e. in counting the number of sample points, the counting process can
come to an end) or countably infinite sample points, e.g.
a. S = {0, 1, 2, 3, 4, 5, …} the set of non-negative integers
b. S = {0, 2, 4, 6, …} the set of even numbers
c. S = {…, -2, -1, 0, 1, 2, …} the set of all integers
A discrete sample space may contain a finite or infinite number of elements. The
elements of a discrete sample space are isolated points on the real line:
between any two elements of S there are points which do not belong to S.
CONTINUOUS SAMPLE SPACE
A sample space S that has as elements all the points in an interval, or all the
points in a union of intervals, on the real line is called continuous. Some
examples are:
a. S = {x: 0 ≤ x ≤ 10}
b. S = {x: 0 ≤ x ≤ 1 or 2 ≤ x ≤ 3}
A continuous sample space always has an infinite number of elements; hence it
contains an uncountable or non-denumerable number of points.
RANDOM VARIABLE:
Suppose S is a sample space of some random experiment. The outcomes of the
experiment, i.e. the sample points of S, need not be numbers.
A function which assigns a real number to each experimental outcome (sample
point) is called a random variable. A random variable is therefore a function
which assigns a real number to each element of S. It has the sample space as its
domain.
Random variables are usually denoted by upper case letters such as X, Y, Z,
etc., while their corresponding realizable values are denoted by lower case
letters such as x, y, z, etc.
The number of heads obtained when a coin is thrown two times, the outcome when a
die is tossed, and the time (in hours) it takes for a light bulb to burn out are
all examples of random variables. A random variable could be discrete or
continuous depending on its range.
DISCRETE RANDOM VARIABLES
A random variable X is called discrete if its range is finite or countably
infinite. The range of a random variable X is the set of values which it
assumes. The range of X is said to be countable if there exists a countable set
of real numbers such that X takes values only in that set.
In other words, a discrete random variable X has a countable number of possible
values (for example, the integers). Its probability distribution is given by a
probability mass function, which directly maps each value of the random variable
to a probability.
PROBABILITY DISTRIBUTION
A probability distribution is a statistical function that describes all the
possible values and likelihoods that a random variable can take within a given
range. The range will be bounded between the minimum and the maximum possible
values, but where precisely a value is likely to fall on the probability
distribution depends on a number of factors. These factors include the
distribution's mean (average), standard deviation, skewness and kurtosis.

Examples of discrete probability distributions include: the binomial, Poisson,
Bernoulli, geometric, hypergeometric, and discrete uniform distributions, etc.
PROBABILITY MASS FUNCTION
In probability and statistics, a probability mass function (P.M.F) is a function
that gives the probability that a discrete random variable is exactly equal to
some value. Sometimes it is also known as the discrete density function. The
value of the random variable having the highest or largest probability mass is
called the mode. The P.M.F is defined thus:
If X is a discrete random variable, then the function denoted by Px(x) and
defined as Px(x) = P(X = xj) if x = xj, j = 1, 2, …, n is called the probability
mass function of X.
The P.M.F of the discrete random variable X is a function which associates a
real number with each value of X. Hence Px(x) is a function with domain the real
line and counterdomain the interval [0, 1]. For instance, the probability that
the random variable X equals 1, i.e. P(X = 1), is referred to as the probability
mass function of X evaluated at 1.
CONTINUOUS RANDOM VARIABLE:
A random variable X is said to be continuous if its range contains an interval
of real numbers. Continuous random variables represent measured data such as
weight, height, temperature, time, blood pressure, etc. If X is continuous, the
probability of it taking a single value is zero. Hence we write:
P(X = a) = 0
PROBABILITY DENSITY FUNCTION (P.D.F)
Let X be a continuous random variable. The p.d.f of X is denoted by f(x) and
defined as the derivative of the cumulative distribution function:

f(x) = d Fx(x) / dx

The probability density function (P.D.F) defines the probability function
representing the density of a continuous random variable lying between a
specific range of values. In other words, the probability density function gives
the relative likelihood of values of the continuous random variable.
The p.d.f of a continuous random variable is constructed so that the area under
its curve bounded by the x-axis is equal to one (1).
For f(x) to be valid, it must lie entirely on or above the x-axis, since
probabilities are non-negative quantities. Unlike for a discrete random
variable, the probability associated with a continuous random variable is
evaluated using integral calculus. For example, P(a < X < b) is evaluated as
follows:
P(a < X < b) = ∫ from a to b of f(x) dx
Difference between the P.D.F and C.D.F of a continuous random variable: the
P.D.F gives the density of the random variable at the value x, whereas the C.D.F
gives the probability that the random variable X will take a value less than or
equal to x.
Properties of the p.d.f:
1. f(x) ≥ 0
2. ∫ from -∞ to ∞ of f(x) dx = 1
3. f(x) is piecewise continuous
4. P(a < X < b) = ∫ from a to b of f(x) dx.
Example 1:
Let X be a random variable with p.d.f:
f(x) = k x², 0 ≤ x ≤ 2.
Find the value of k that makes f(x) a p.d.f.
Solution
For f(x) to be a p.d.f:
∫ from -∞ to ∞ of f(x) dx = 1
So that:
∫ from 0 to 2 of k x² dx = 1
k [x³/3] evaluated from 0 to 2 = 1
k [2³/3 - 0³/3] = 1
8k/3 = 1
k = 3/8
Hence f(x) = 3x²/8, 0 ≤ x ≤ 2.
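A quick numeric sanity check that this f(x) integrates to one (an illustrative sketch, plain Python standard library, using a simple Riemann sum):

    # ∫ from 0 to 2 of 3x²/8 dx should equal 1.
    n = 1_000_000
    dx = 2 / n
    total = sum(3 * ((i + 0.5) * dx) ** 2 / 8 * dx for i in range(n))
    print(total)   # ≈ 1.0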

CUMULATIVE DISTRIBUTION OF A CONTINUOUS RANDOM
VARIABLE:
For a continuous random variable X, its C.D.F is denoted by Fx(x) and defined as:
Fx(x) = P(X ≤ x) = ∫ from -∞ to x of f(t) dt.
The cumulative distribution function describes the distribution of values of the
random variable. Fx(x) is a cumulative distribution function since it gives the
distribution of the values of the continuous random variable X in cumulative
form. It gives the probability that the random variable X takes on a value less
than or equal to a given number x.
PROPERTIES OF THE CUMULATIVE DISTRIBUTION FUNCTION (C.D.F)
1. 0 ≤ Fx(x) ≤ 1
2. Fx(x1) ≤ Fx(x2), if x1 < x2
3. lim Fx(x) = Fx(∞) = 1 as x → ∞
4. lim Fx(x) = Fx(-∞) = 0 as x → -∞
5. Fx(x) is defined uniquely for each random variable.
THE CONTINUOUS UNIFORM DISTRIBUTION:
Let the random variable X represent the outcome of an experiment in which a
point is selected at random from the interval [a, b], in such a way that the
probability that X will belong to any subinterval of [a, b] is proportional to
the length of the subinterval, i.e. the probability that X assumes any value in
a subinterval of [a, b] is the ratio of the length of the subinterval to the
length of the range (b - a). The random variable X is said to have a uniform
distribution with p.d.f:

f(x) = 1/(b - a), for a ≤ x ≤ b
0, elsewhere

Here the real numbers 'a' and 'b', called the parameters of the distribution,
are such that a < b.
The uniform distribution is thus the most basic form of probability
distribution. It is a rectangular distribution with constant probability
density, which reflects the fact that each range of values of equal length has
equal probability of occurrence.
In summary, the uniform distribution is a distribution in which every possible
result is equally likely; that is, the probability density of each value is the
same.
PROPERTIES OF THE UNIFORM DISTRIBUTION
The following are the key characteristics of the uniform distribution:
i. The density function integrates to unity.
ii. Each of the inputs that go in to form the function has equal weighting.
iii. The mean of the uniform distribution is given by:
µ = (a + b)/2
iv. The variance is given by the equation:
Var(X) = (b - a)²/12
PLOT OF THE GRAPH OF THE UNIFORM DISTRIBUTION:

[The graph is a rectangle of height 1/(b - a) over the interval [a, b], so its
area = width × height = (b - a) × 1/(b - a) = 1.]

Note:
The location of the interval has little influence in deciding whether the
uniformly distributed variable falls within a fixed length. The two factors that
influence this the most are the interval size and the fact that the interval
falls within the distribution's support.
AREAS OF APPLICATION:
1. It is useful for sampling from arbitrary distributions.
2. Computer scientists use it for random number generation within a range.
3. A general method is the inverse transform sampling method, which uses the
   cumulative distribution function of the target random variable.

RELATIONSHIP WITH OTHER DISTRIBUTIONS
If the range is restricted to (0, 1), it is called the standard uniform
distribution.
Example 2:
Suppose a random variable X has a uniform distribution on the interval (1, 6).
Find:
i. P(2 ≤ X ≤ 4)
ii. P(X ≤ 2)
iii. P(X ≥ 3)
Solution:
f(x) = 1/(b - a), for a ≤ x ≤ b
0, otherwise
But a = 1 and b = 6.
Thus:
f(x) = 1/5, for 1 ≤ x ≤ 6
0, otherwise

i. P(2 ≤ X ≤ 4) = ∫ from 2 to 4 of (1/5) dx

= [x/5] from 2 to 4

= 4/5 - 2/5 = (4 - 2)/5

= 2/5

ii. P(X ≤ 2) = ∫ from 1 to 2 of (1/5) dx

= [x/5] from 1 to 2

= 2/5 - 1/5 = (2 - 1)/5

= 1/5

iii. P(X ≥ 3) = ∫ from 3 to 6 of (1/5) dx

= [x/5] from 3 to 6

= 6/5 - 3/5 = (6 - 3)/5

= 3/5
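The same three probabilities can be obtained with scipy (an illustrative sketch; scipy.stats.uniform is parameterized by loc = a and scale = b - a):

    # U(1, 6): loc = 1, scale = 5.
    from scipy.stats import uniform

    rv = uniform(loc=1, scale=5)
    print(rv.cdf(4) - rv.cdf(2))   # P(2 ≤ X ≤ 4) = 0.4
    print(rv.cdf(2))               # P(X ≤ 2)     = 0.2
    print(1 - rv.cdf(3))           # P(X ≥ 3)     = 0.6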

Example 3:

The failure of a circuit board interrupts work in a computing system until a new
board is delivered. The delivery time X is uniformly distributed over the
interval 1 to 5 days.

The cost C of this failure and interruption consists of a fixed cost C0 for the
new part and a cost that increases proportionally to X², so that C = C0 + C1 X².

i. Find the probability that the delivery time is 2 or more days.

ii. Find the expected cost of a single failure in terms of C0 and C1.

Solution

i. The delivery time X is distributed uniformly over 1 to 5 days, which gives:

f(x) = 1/(b - a), a ≤ x ≤ b
0, otherwise

which is:

f(x) = 1/(5 - 1) = 1/4, 1 ≤ x ≤ 5
0, otherwise

Thus:

P(X ≥ 2) = ∫ from 2 to 5 of (1/4) dx

= [x/4] from 2 to 5

= 5/4 - 2/4

= (5 - 2)/4

P(X ≥ 2) = 3/4

ii. We know that:

E(C) = C0 + C1 E(X²)

So it remains to find E(X²). This could be found directly from the definition or
by using the variance and the fact that:

E(X²) = Var(X) + µ²

Using the latter approach:

E(X²) = (b - a)²/12 + [(a + b)/2]²

= (5 - 1)²/12 + [(1 + 5)/2]²

= 16/12 + 9

= 31/3

Thus:

E(C) = C0 + C1 (31/3)

THE NORMAL DISTRIBUTION:

A random variable X is said to be normally distributed with parameters µ and σ²
if its probability density function is:

f(x) = [1/(σ√(2π))] exp[-(x - µ)²/(2σ²)], -∞ < x < ∞, σ > 0

If a random variable X is normally distributed with mean µ and variance σ², we
write X ~ N(µ, σ²).
The normal distribution curve, also called the Gaussian curve, is unimodal with
the total area under the curve equal to 100% or a unit. It is symmetrical about
µ, with mean, mode and median equal, lying on the same axis. It is characterized
by the population mean µ and variance σ². For a constant σ, a change in µ from
µ1 to µ2 shifts the curve along the x-axis: to the right if µ1 < µ2 and to the
left when µ1 > µ2. For a constant µ, a change in σ from σ1 to σ2 alters the
peakedness of the curve: it is more peaked (taller or thinner) if σ2 < σ1 and
less peaked (fatter or flatter) if σ2 > σ1. The x-axis is an asymptote to the
curve, while the intervals on either side of µ enclose approximately a total
probability of 68.27% for 1 SD, 95.45% for 2 SD, and 99.73% for 3 SD.

The Normal Distribution Curve

Properties Of A Normal Distribution

i. It is bell shaped, that is, unimodal.
ii. The total area under the curve is 1 (100%) or a unit.
iii. It is symmetrical about µ, i.e. it can be bisected into two equal
     symmetrical halves.
iv. The mean, mode and median coincide (are equal), i.e. lie on the same axis.
v. It is characterized by the population mean µ and variance σ².
vi. The x-axis is an asymptote to the curve.
vii. The intervals on either side of µ enclose approximately a total probability
     of:
a. 68.27% for 1 SD (standard deviation)
b. 95.45% for 2 SD
c. 99.73% for 3 SD

[Figure: areas covered by each standard deviation about the mean; this figure is
called the probability interval under the normal distribution curve.]
STANDARD NORMAL DISTRIBUTION
The standard normal distribution is a normal distribution with a mean of zero
and standard deviation 1. The standard normal distribution is centered at zero,
and the degree to which a given measurement deviates from the mean is given by
the standard deviation.
Along the abscissa, instead of x we have a transformation of X called the
standard score Z. Thus, the Z-score tells us how many standard deviations from
the mean a particular value lies. Any distribution of a normal variable can be
transformed to a distribution of Z by taking each X value, subtracting from it
the mean of X, and dividing this deviation of X from its mean by the standard
deviation.
The random variable Z is said to have a standard normal distribution if its
p.d.f is:

ф(z) = f(z; 0, 1) = [1/√(2π)] e^(-z²/2), -∞ < z < ∞
Determining Probability For A Standard Normal Distribution:

Suppose Z is a standard normal random variable. The probabilities associated
with Z have been tabulated and can be found in statistical tables.
It is advisable, in finding probabilities associated with Z, to draw the graph
of the standard normal distribution. This will assist in locating the
appropriate probabilities. Observe that, due to the symmetry of the curve of
∅(z), the following relations are helpful:
i. P(Z ≤ z) = ∅(z)
ii. P(Z ≥ z) = 1 - ∅(z)
iii. P(Z ≥ z) = P(Z ≤ -z), i.e. P(Z ≥ z) = ∅(-z)
iv. ∅(z) + ∅(-z) = 1

Example 1:
Let Z be a random variable with the standard normal distribution. Find:
i. P(Z ≥ 1.13)
ii. P(0.65 ≤ Z ≤ 1.26)
iii. P(-1.37 ≤ Z ≤ 2.01)
iv. P(0.00 ≤ Z ≤ 1.42)
v. P(-0.73 ≤ Z ≤ 0.0)
vi. P(-1.79 ≤ Z ≤ -0.54)
Solution
i. P(Z ≥ 1.13) = 1 - ∅(1.13)
But from the standard normal table:

∅(1.13) = 0.8708

P(Z ≥ 1.13) = 1 - 0.8708

= 0.1292.

The sketch:

ii. P(0.65 ≤ Z ≤ 1.26) = ∅(1.26) - ∅(0.65)
But from tables:
∅(1.26) = 0.8962 and ∅(0.65) = 0.7422

So that: P(0.65 ≤ Z ≤ 1.26) = 0.8962 - 0.7422

= 0.1540

The sketch

iii. P(-1.37 ≤ Z ≤ 2.01) = ∅(2.01) - ∅(-1.37)

From tables: ∅(2.01) = 0.9778

∅(-1.37) = 0.0853

Therefore: P(-1.37 ≤ Z ≤ 2.01) = 0.9778 - 0.0853

= 0.8925

The sketch:

-1.37    0    2.01

iv. P(0.0 ≤ Z ≤ 1.42) = ∅(1.42) - ∅(0.0)


But ∅(1.42) = 0.9222

and ∅(0.0) = 0.5000

So that:

P(0.0 ≤ Z ≤ 1.42) = 0.9222 - 0.5000

= 0.4222

The sketch:

v. P(-0.73 ≤ Z ≤ 0.0) = ∅(0.0) - ∅(-0.73)

But from tables: ∅(0.0) = 0.5000

∅(-0.73) = 0.2327

So that:

P(-0.73 ≤ Z ≤ 0.0) = 0.5000 - 0.2327

= 0.2673

The sketch:

-0.73    0

vi. P(-1.79 ≤ Z ≤ -0.54) = ∅(-0.54) - ∅(-1.79)

But from tables: ∅(-0.54) = 0.2946

and ∅(-1.79) = 0.0367

So that:

P(-1.79 ≤ Z ≤ -0.54) = 0.2946 - 0.0367

= 0.2579

The sketch:

-1.79    -0.54    0
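All of the probabilities above can be reproduced from the cumulative normal function. The sketch below (an illustrative addition, assuming scipy, where norm.cdf plays the role of ∅(z)) checks the first three:

    from scipy.stats import norm

    print(1 - norm.cdf(1.13))                 # P(Z ≥ 1.13)          ≈ 0.1292
    print(norm.cdf(1.26) - norm.cdf(0.65))    # P(0.65 ≤ Z ≤ 1.26)   ≈ 0.1540
    print(norm.cdf(2.01) - norm.cdf(-1.37))   # P(-1.37 ≤ Z ≤ 2.01)  ≈ 0.8925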

Evaluation of probabilities for normal random variables:

If a random variable X follows a normal distribution, the probability of X lying
between any two values, say 'a' and 'b', is found by integrating the p.d.f over
that range. However, by standardization, we proceed as follows:

X = σZ + µ

To make Z the subject, we subtract µ from both sides:

X - µ = σZ + µ - µ

X - µ = σZ

Divide both sides by σ:

(X - µ)/σ = Z

Z = (X - µ)/σ

Hence:

P(Z ≤ z) = P((X - µ)/σ ≤ z)

This means that probabilities concerning X can be determined by the above linear
transformation of X (standardization of X).
Remember, we have already established that P(a < X < b) = ∫ from a to b of
f(x) dx. Standardizing:

P(a < X < b) = P((a - µ)/σ < (X - µ)/σ < (b - µ)/σ)

but (X - µ)/σ = Z

so that:

P(a < X < b) = P((a - µ)/σ < Z < (b - µ)/σ)

= P(z1 < Z < z2)

= ∅(z2) - ∅(z1).

Example 2:
If X has a normal distribution with mean 10 and standard deviation 5, find:
i. P(X < 11)
ii. P(X > 11)
iii. P(X < 5)
iv. P(X > 5)
v. P(5 < X < 11)
x−ɤ
Let Z = σ

61
here: X = 11, σ = 5 and φ = 10
so that;
X−ɤ 11−10
P (X < 11) = P ( σ ∝
5
)
1
= P (Z < 5 )

∅ (0.2 )

and from table: ∅ (0.2) = 0.5793


Therefore:
P (X < 11) = 0.5793
The sketch:

X−ɤ 11−10
P (X > 11) = P (( σ ¿
5
)
1
= P (Z > 5 )

= P (Z > 0.2)
= 1 - ∅ (0.2)
from tables ; ∅ (0.2) = 0.5793
Therefore;
P(X > 11) = 1 – 0.5793
= 0.4207
The sketch:

iii. P(X < 5) = P((X - µ)/σ < (5 - 10)/5)

= P(Z < -5/5)

= P(Z < -1.0)
= ∅(-1.0)
But from tables: ∅(-1.0) = 0.1587
P(X < 5) = 0.1587
The sketch:

-1.0

iv. P(X > 5) = P((X - µ)/σ > (5 - 10)/5)

= P(Z > -5/5)

= P(Z > -1.0)
= 1 - ∅(-1.0)
From tables: ∅(-1.0) = 0.1587
So that: P(Z > -1.0) = 1 - 0.1587
= 0.8413
The sketch:

v. P(5 < X < 11) = P((5 - 10)/5 < (X - µ)/σ < (11 - 10)/5)

= P(-1.0 < Z < 0.2)
= ∅(0.2) - ∅(-1.0)
= 0.5793 - 0.1587
= 0.4206.
The sketch:

Example 2:
If X has a normal distribution with mean 6 and variance 25, i.e. X ~ N(6, 25),
find:
i. P(|X - 6| < 5)
ii. P(-2 < X < 0)
iii. P(|X - 6| < 15)
iv. P(|X - 6| < 10)
Solution:
i. P(|X - 6| < 5) = P(-5 < X - 6 < 5)

= P(-5/5 < (X - 6)/5 < 5/5)

= P(-1.0 < Z < 1.0)

= 2P(0.0 < Z < 1.0)
= 2[∅(1.0) - ∅(0.0)]
= 0.6826.

ii. P(-2 < X < 0) = P((-2 - 6)/5 < (X - 6)/5 < (0 - 6)/5)

= P(-1.6 < Z < -1.2)

= ∅(-1.2) - ∅(-1.6)
= 0.0603

iii. P(|X - 6| < 15) = P(-15 < X - 6 < 15)

= P(-15/5 < (X - 6)/5 < 15/5)

= P(-3.0 < Z < 3.0)
= 2P(0.0 < Z < 3.0)
= 2[∅(3.0) - ∅(0.0)]
= 0.9974

iv. P(|X - 6| < 10) = P(-10 < X - 6 < 10)

= P(-10/5 < (X - 6)/5 < 10/5)

= P(-2.0 < Z < 2.0)

= 2P(0.0 < Z < 2.0)
= 2[∅(2.0) - ∅(0.0)]
= 0.9544
Example 3
If the record of temperature in a given month is normally distributed with mean
40°C and standard deviation 3.33°C, find the probability that the temperature is
between 41.11°C and 46.66°C.
Solution
Let T be the random variable representing temperature.
Then T ~ N(40°C, (3.33°C)²).
Therefore:
P(41.11°C < T < 46.66°C)
= P((41.11 - 40)/3.33 < Z < (46.66 - 40)/3.33)
= P(0.33 ≤ Z ≤ 2.00)
= ∅(2.0) - ∅(0.33)
= 0.9772 - 0.6293

= 0.3479.

Example 4:
A certain type of electric bulb has a mean life span of 810 hours and variance
of 1600 hours². Assume that the bulbs' life spans are normally distributed. Find
the probability that a bulb burns:
i. between 788 and 844 hours;
ii. less than 834 hours;
iii. more than 788 hours.
Solution:
Let the life span of the light bulbs be represented by X.
Then X ~ N(810, 1600), so σ = 40.

i. P(788 ≤ X ≤ 844) = P((788 - 810)/40 ≤ (X - µ)/σ ≤ (844 - 810)/40)

= P(-0.55 ≤ Z ≤ 0.85)
= ∅(0.85) - ∅(-0.55)
= 0.8023 - 0.2912
= 0.5111

ii. P(X < 834) = P((X - µ)/σ < (834 - 810)/40)

= P(Z < 0.6)
= ∅(0.6)
= 0.7257

iii. P(X > 788) = P((X - µ)/σ > (788 - 810)/40)

= P(Z > -0.55)

= 1 - ∅(-0.55)
= 1 - 0.2912
= 0.7088

The Normal Distribution As An Approximation To The Binomial Distribution

The normal distribution can be used as an approximation to the binomial
distribution. It can be shown that if X is a binomial random variable
representing the total number of successes in n independent trials with mean
µ = np and variance σ² = npq, then the limiting form of the standardized
variable Z as n → ∞ is the standard normal:

Z = (X - np)/√(npq)

The approximation is best when n is large and p is not extremely close to 0 or
1, though even then the approximation is still fairly adequate.
Since the binomial is a discrete random variable and the normal is a continuous
random variable, the best approximation is obtained by employing the correction
for continuity.
The correction for continuity is used as follows:
Suppose P(X = x) is desired; then:

P(X = x) = P(x - 0.5 < X < x + 0.5)

= P((x - 0.5 - np)/√(npq) < Z < (x + 0.5 - np)/√(npq))
= P(z1 < Z < z2)
Also:
P(X > x) = P(X > x + 0.5)
and P(X < x) = P(X < x - 0.5)
P(x1 ≤ X ≤ x2) = P(x1 - 0.5 < X < x2 + 0.5)

Example 1:
Suppose a random variable X has a binomial distribution with n = 40 and p = 0.5.
Find:
i. P(X = 20)
ii. P(X < 20)
iii. P(X > 20)
Solution
Since X is binomial, then:
µ = np = 20
σ² = npq = 10.

i. P(X = 20) = P(20 - 0.5 < X < 20 + 0.5)

= P((19.50 - 20)/√10 < (X - 20)/√10 < (20.50 - 20)/√10)
= P(-0.16 < Z < 0.16)
= ∅(0.16) - ∅(-0.16)
= 0.1272
ii. P(X < 20) = P(X < 20 - 0.5)
= P((X - 20)/√10 < (19.5 - 20)/√10)
= P(Z < -0.16)
= ∅(-0.16)
= 0.4364
iii. P(X > 20) = P(X > 20 + 0.5)
= P((X - 20)/√10 > (20.5 - 20)/√10)
= P(Z > 0.16)
= 1 - ∅(0.16)
= 0.4364
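The continuity-corrected approximation can be compared with the exact binomial value directly (an illustrative sketch, assuming scipy; the small difference from 0.1272 above comes from rounding z to 0.16 in the table):

    # Normal approximation with continuity correction vs exact binomial.
    from math import sqrt
    from scipy.stats import binom, norm

    n, p = 40, 0.5
    mu, sd = n * p, sqrt(n * p * (1 - p))
    approx = norm.cdf(20.5, mu, sd) - norm.cdf(19.5, mu, sd)
    print(approx, binom.pmf(20, n, p))   # ≈ 0.1256 vs ≈ 0.1254 exact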

Normal approximation to the Poisson distribution:

For large values of λ (the mean of a Poisson variate), the Poisson distribution
can be well approximated by a normal distribution with the same mean and
variance.
Let X be a Poisson distributed random variable with mean λ.
The mean of X is µ = E(X) = λ and
the variance of X is σ² = V(X) = λ.
The general rule of thumb for using the normal approximation to the Poisson
distribution is that λ is sufficiently large (i.e. λ ≥ 5). For sufficiently
large λ, X ~ N(µ, σ²) approximately. That is:

Z = (X - µ)/σ = (X - λ)/√λ ~ N(0, 1)

Formula For Continuity Correction

The Poisson distribution is a discrete distribution, whereas the normal
distribution is a continuous distribution. When we use the normal approximation
to the Poisson distribution, we need to make a correction while calculating the
various probabilities:
i. P(X = x) = P(x - 0.5 < X < x + 0.5)
ii. P(X < x) = P(X < x - 0.5)
iii. P(X ≥ x) = P(X > x - 0.5)
iv. P(x1 < X ≤ x2) = P(x1 + 0.5 < X < x2 + 0.5)
v. P(x1 ≤ X < x2) = P(x1 - 0.5 < X < x2 - 0.5)
vi. P(x1 ≤ X ≤ x2) = P(x1 - 0.5 < X < x2 + 0.5)
Example 1:
The mean number of kidney transplants performed per day in the United States in
a recent year was about 45. Find the probability that on a given day:
a. exactly 50 kidney transplants will be performed;
b. at least 65 kidney transplants will be performed; and
c. fewer than 40 kidney transplants will be performed.
Solution
Let X denote the number of kidney transplants per day. The mean number of kidney
transplants performed per day in the United States in a recent year was about
45, so λ = 45.
Since λ = 45 is large enough, we use the normal approximation to the Poisson
distribution. That is:

Z = (X - λ)/√λ → N(0, 1) for large λ.

We now use the continuity correction.
a. The probability that on a given day exactly 50 kidney transplants will be
performed is:
P(X = 50) = P(50 - 0.5 < X < 50 + 0.5)
(using the continuity correction)

= P((49.5 - 45)/√45 < Z < (50.5 - 45)/√45)

= P(0.67 < Z < 0.82)
= P(Z < 0.82) - P(Z < 0.67)
From the normal table:
= 0.7939 - 0.7486
= 0.0453.

70
b. The probability that on a given day at least 65 kidney transplants will be
performed is:
P (X ≥ 65) = 1 – P (X ≤ 64.5)
= 1 – P (X ≤ 64.5)

Using continuity correction:


x−λ 64.5−45
=1–p( < )
√ɤ √ 45
= 1 – P (Z ≤ 3.06)
= 1 – 0.9989
(Using normal tables)
= 0.0011
c. The probability that on a given day no more than 40 kidney transplants will
be performed is:
P (X < 40) = P (X < 40 – 0.5)
= P (X < 39.3)
(Using continuity correction:)
X− λ 39.5−45
=( < )
√λ √ 45
= P (Z < - 0.82)
= P (Z < - 0.82) (from) table is 0.2061
P (X < 40) = 0.2061
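
As a rough check of the approximation, the exact Poisson probabilities can
be compared with the continuity-corrected normal values. This is an
illustrative sketch (the names are mine) assuming scipy is available:

from scipy.stats import poisson, norm
import math

lam = 45
sd = math.sqrt(lam)

# (a) P(X = 50): exact vs P(49.5 < X < 50.5) under the normal
print(poisson.pmf(50, lam))
print(norm.cdf(50.5, lam, sd) - norm.cdf(49.5, lam, sd))  # ~0.0453

# (b) P(X >= 65) = 1 - P(X <= 64): exact vs 1 - P(X < 64.5)
print(1 - poisson.cdf(64, lam))
print(1 - norm.cdf(64.5, lam, sd))                        # ~0.0018

# (c) P(X < 40) = P(X <= 39): exact vs P(X < 39.5)
print(poisson.cdf(39, lam))
print(norm.cdf(39.5, lam, sd))                            # ~0.2061

Agreement is close in (a) and (c); in (b) the two differ somewhat, since
far-tail probabilities are where the normal approximation is weakest.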
Example 2:
A radioactive element disintegrates such that it follows a Poisson
distribution. If the mean number of particles emitted in a 1-second interval
is 69, evaluate the probability that:
a. less than 60 particles are emitted in 1 second;
b. between 65 and 75 particles inclusive are emitted in 1 second.
Solution:
Let X denote the number of particles emitted in a 1-second interval. The
mean number of particles emitted per second is 69, thus λ = 69, and the
random variable X follows a Poisson distribution. That is:

Z = (X − λ)/√λ → N (0, 1) for large λ

a. The probability that less than 60 particles are emitted in 1 second is:
P (X < 60) = P (X < 60 − 0.5)
= P (X < 59.5)
(using the continuity correction)
= P ( Z < (59.5 − 69)/√69 )
= P (Z < −1.14)
From the normal table:
P (X < 60) = 0.1271

b. The probability that between 65 and 75 particles inclusive are emitted in
1 second is:
P (65 ≤ X ≤ 75) = P (64.5 < X < 75.5)
(using the continuity correction)
= P ( (64.5 − 69)/√69 < Z < (75.5 − 69)/√69 )
= P (−0.54 < Z < 0.78)
From the normal table:
P (65 ≤ X ≤ 75) = 0.7823 − 0.2946
= 0.4877

Exponential Distribution:
The exponential distribution is a right-skewed continuous probability
distribution that models variables in which small values occur more
frequently than large values. It is a unimodal distribution in which small
values have relatively high probabilities that consistently decline as the
data values increase.
In probability theory and statistics, the exponential distribution is a
continuous probability distribution that often concerns the amount of time
until some specific event happens. It describes a process in which events
happen continuously and independently at a constant average rate. The
exponential distribution has the key property of memorylessness. An
exponential random variable tends to take many small values and few large
values; for example, the amount of money spent by a customer on one trip to
the supermarket follows an exponential distribution.
A continuous random variable, say X, is said to have an exponential
distribution if it has the following probability density function:

F(x; λ) = λe^(−λx) , for x > 0
          0        , for x ≤ 0

where λ is called the rate parameter.

Properties of the exponential distribution:
i. Right-skewed shape:
The exponential distribution is right skewed, meaning that it has a long
tail on the right side of the distribution.
ii. Non-negative:
The exponential distribution is always non-negative, since the time between
events can never be negative.
iii. Memoryless:
From the point of view of the waiting time until the arrival of a customer,
the memoryless property means that it does not matter how long you have
waited so far. A short numerical illustration is sketched below.
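
The following minimal sketch illustrates the memoryless property
P (X > s + t | X > s) = P (X > t) numerically, using the survival function
P (X > x) = e^(−λx). The parameter values are arbitrary choices of mine, not
from the notes:

import math

lam = 0.5        # an assumed rate parameter, for illustration only
s, t = 2.0, 3.0  # arbitrary waiting times

def surv(x):
    """Exponential survival function P(X > x) = exp(-lam * x)."""
    return math.exp(-lam * x)

cond = surv(s + t) / surv(s)   # P(X > s + t | X > s)
print(cond)                    # exp(-1.5) ~ 0.2231
print(surv(t))                 # identical: having already waited s changes nothing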
Example 1:
Cars arrive at a certain roundabout according to an exponential distribution
(process) with parameter λ = 5 per hour. If an observer stands at the
roundabout for a specified period of time, what is the probability that:
i. it is at least 15 minutes until the next car arrives;
ii. it is not more than 10 minutes?
Solution:
Let the random variable X be the waiting time in minutes. Since λ = 5 per
hour, the mean waiting time is

E (X) = 1/λ = 60/5 = 12 minutes,

so X has density (1/12)e^(−x/12) for x > 0.

i. P (X ≥ 15) = ∫_15^∞ (1/12) e^(−x/12) dx

= [ −e^(−x/12) ]_15^∞

= e^(−15/12)

= 0.2865

ii. P (X ≤ 10) = ∫_0^10 (1/12) e^(−x/12) dx

= [ −e^(−x/12) ]_0^10

= 1 − e^(−10/12)

= 1 − 0.4346

= 0.5654
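
Both answers can be verified with scipy's exponential distribution, whose
scale argument is the mean 1/λ (here 12 minutes). A sketch, assuming scipy
is installed:

from scipy.stats import expon
import math

mean = 12.0  # mean waiting time in minutes (1/lambda)

# i. P(X >= 15) via the survival function, and via the closed form
print(expon.sf(15, scale=mean), math.exp(-15 / 12))       # ~0.2865

# ii. P(X <= 10) via the cdf, and via the closed form
print(expon.cdf(10, scale=mean), 1 - math.exp(-10 / 12))  # ~0.5654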

Example 2:
Suppose the waiting time in minutes on a queue follows an exponential
distribution with λ = 1/10. Find the probability that an arriving customer
will wait:
i. more than 10 minutes;
ii. between 10 and 20 minutes.
Solution
Let X denote the waiting time of customers. Then:

i. P (X > 10) = ∫_10^∞ (1/10) e^(−x/10) dx

= [ −e^(−x/10) ]_10^∞

= e^(−1)

= 0.368

ii. P (10 < X ≤ 20) = ∫_10^20 (1/10) e^(−x/10) dx

= [ −e^(−x/10) ]_10^20

= e^(−1) − e^(−2)

= 0.36788 − 0.13533

= 0.23254

≈ 0.233
THE EXPECTATION AND VARIANCE OF A CONTINUOUS RANDOM VARIABLE:
EXPECTATION:
Let X be a continuous random variable with p.d.f F(x). The expectation or
expected value of X is denoted by E (X) and defined by:

E (X) = ∫_Rx x F(x) dx

Example 1:
Let X be a random variable with p.d.f F(x) = Kx², 0 ≤ x ≤ 2. Find:
i. the value of K that makes F(x) a p.d.f;
ii. the expectation of X.
Solution
i. For F(x) to be a p.d.f,

∫_Rx F(x) dx = 1

so that:

∫_0^2 Kx² dx = 1

K [ x³/3 ]_0^2 = 1

8K/3 = 1

K = 3/8

Hence F(x) = 3x²/8, 0 ≤ x ≤ 2.

ii. E (X) = ∫_Rx x F(x) dx

= ∫_0^2 x (3x²/8) dx

= ∫_0^2 (3x³/8) dx

= (3/8) ∫_0^2 x³ dx

= (3/8) [ x⁴/4 ]_0^2

= (3/8)(16/4 − 0)

= 48/32

= 3/2
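
A quick numerical confirmation of both parts, using scipy's quadrature
routine (an illustrative sketch; nothing here comes from the notes
themselves):

from scipy.integrate import quad

k = 3 / 8
total, _ = quad(lambda x: k * x**2, 0, 2)     # integral of f: should be 1
mean, _ = quad(lambda x: x * k * x**2, 0, 2)  # E(X): should be 1.5
print(total, mean)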
VARIANCE:

Let X be a continuous random variable with p.d.f F(x).

The variance of X is denoted by Var (X) and defined by:

Var (X) = E (X²) − [ E (X) ]²

where E (X²) = ∫_Rx x² F(x) dx,

so that Var (X) = ∫_Rx x² F(x) dx − [ ∫_Rx x F(x) dx ]²

Example 1:
Let X be a random variable with p.d.f
F(x) = 4x³, 0 ≤ x ≤ 1.
Find the variance of X.
Solution
Var (X) = E (X²) − [ E (X) ]²

but E (X) = ∫_0^1 x F(x) dx, where F(x) = 4x³

so that:

E (X) = ∫_0^1 x (4x³) dx

= ∫_0^1 4x⁴ dx

= 4 ∫_0^1 x⁴ dx

= [ 4x⁵/5 ]_0^1

= 4/5 − 0

E (X) = 4/5

Now, E (X²) = ∫_0^1 x² (4x³) dx

= ∫_0^1 4x⁵ dx

= 4 ∫_0^1 x⁵ dx

= [ 4x⁶/6 ]_0^1

= 4/6 − 0

= 2/3

so that:

Var (X) = 2/3 − (4/5)²

= 2/3 − 16/25

= (50 − 48)/75

= 2/75
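
The value 2/75 can be confirmed symbolically; the sketch below is my own and
assumes sympy is available:

import sympy as sp

x = sp.symbols('x')
f = 4 * x**3                                # the p.d.f on [0, 1]
EX = sp.integrate(x * f, (x, 0, 1))         # 4/5
EX2 = sp.integrate(x**2 * f, (x, 0, 1))     # 2/3
print(EX, EX2, sp.simplify(EX2 - EX**2))    # 4/5, 2/3, 2/75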

Example 2:
For a lathe in a machine shop, let X denote the proportion of time out of a
40-hour week that the lathe is actually in use. Suppose X has probability
density function given by:

F(x) = 3x² , 0 ≤ x ≤ 1
       0   , elsewhere

Find the:
i. mean of X
ii. variance of X

Solution:
i. Mean:

E (X) = ∫ x F(x) dx, where F(x) = 3x² on [0, 1]

= ∫_0^1 x (3x²) dx

= ∫_0^1 3x³ dx

= 3 ∫_0^1 x³ dx

= [ 3x⁴/4 ]_0^1

= 3/4

E (X) = 0.75

Thus, on the average, the lathe is in use 75% of the time.

ii. To compute Var (X), we first find E (X²):

E (X²) = ∫_0^1 x² F(x) dx

= ∫_0^1 x² (3x²) dx

= ∫_0^1 3x⁴ dx

= 3 ∫_0^1 x⁴ dx

= [ 3x⁵/5 ]_0^1

= 3/5 = 0.60

So that:
Var (X) = E (X²) − [ E (X) ]²
= 0.60 − (0.75)²
= 0.60 − 0.5625
Var (X) = 0.0375
≈ 0.04 to 2 d.p.
Example 3:
The weekly demand X for kerosene at a certain supply station has a density
function given by:

F(x) = x   , 0 ≤ x ≤ 1
       1/2 , 1 < x ≤ 2
       0   , elsewhere

Find the expected weekly demand.

Solution

F(x) has different nonzero forms over two disjoint regions, so E (X) splits
into two integrals. Thus:

E (X) = ∫ x F(x) dx

= ∫_0^1 x (x) dx + ∫_1^2 x (1/2) dx

= ∫_0^1 x² dx + (1/2) ∫_1^2 x dx

= [ x³/3 ]_0^1 + [ x²/4 ]_1^2

= (1/3 − 0) + (4/4 − 1/4)

= 1/3 + 3/4

= (4 + 9)/12

= 13/12
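
Because the density is piecewise, it is easy to slip on the arithmetic; the
following sympy sketch (my own, for verification only) reproduces 13/12:

import sympy as sp

x = sp.symbols('x')
# E(X) split over the two pieces: f(x) = x on [0, 1] and f(x) = 1/2 on (1, 2]
EX = sp.integrate(x * x, (x, 0, 1)) \
     + sp.integrate(x * sp.Rational(1, 2), (x, 1, 2))
print(EX)   # 13/12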
EXPECTATION AND VARIANCE OF A CONTINUOUS UNIFORM DISTRIBUTION:

Expectation:
If X has a continuous uniform distribution on [a, b], then:

E (X) = ∫_a^b x (1/(b − a)) dx

= (1/(b − a)) ∫_a^b x dx

= (1/(b − a)) [ x²/2 ]_a^b

= (1/(b − a)) (b²/2 − a²/2)

= (b² − a²) / (2(b − a))

but b² − a² = (b − a)(b + a), a difference of two squares, so:

E (X) = (b − a)(b + a) / (2(b − a))

E (X) = (a + b)/2

(2) Variance:

Var (X) = E (X²) − [ E (X) ]²

but E (X²) = ∫_a^b x² (1/(b − a)) dx

= (1/(b − a)) ∫_a^b x² dx

= (1/(b − a)) (b³/3 − a³/3)

= (b³ − a³) / (3(b − a))

= (b² + ab + a²)/3

So that:

Var (X) = E (X²) − [ E (X) ]²

= (b² + ab + a²)/3 − ((a + b)/2)²

= (b² + ab + a²)/3 − (a + b)²/4

= ( 4(b² + ab + a²) − 3(a + b)² ) / 12

= ( 4b² + 4ab + 4a² − 3b² − 6ab − 3a² ) / 12

= (b² − 2ab + a²)/12

and by factorization (a perfect square):

Var (X) = (b − a)²/12
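
The same derivation can be reproduced symbolically; a sketch assuming sympy
is available:

import sympy as sp

a, b, x = sp.symbols('a b x', positive=True)
f = 1 / (b - a)                                   # uniform density on [a, b]
EX = sp.simplify(sp.integrate(x * f, (x, a, b)))  # (a + b)/2
EX2 = sp.integrate(x**2 * f, (x, a, b))
Var = sp.factor(sp.simplify(EX2 - EX**2))         # (a - b)**2/12, i.e. (b - a)^2/12
print(EX, Var)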

EXPECTATION AND VARIANCE OF THE NORMAL DISTRIBUTION

1 – EXPECTATION:

Let X be a continuous, normally distributed random variable; then

E (X) = ∫_Rx x F(x) dx ---- (1)

where F(x) = (1/(σ√(2π))) e^( −(x − µ)²/(2σ²) )

so that:

E (X) = ∫_−∞^∞ x (1/(σ√(2π))) e^( −(x − µ)²/(2σ²) ) dx ---- (2)

Now, let t = (x − µ)/σ ---- (3)

By cross multiplication we have σt = x − µ,

so x = σt + µ ---- (4)

Differentiating with respect to t:

dx/dt = σ, so dx = σ dt ---- (5)

Substituting (4) and (5) into (2):

E (X) = (1/(σ√(2π))) ∫_−∞^∞ (σt + µ) e^(−t²/2) σ dt

= (µ/√(2π)) ∫_−∞^∞ e^(−t²/2) dt + (σ/√(2π)) ∫_−∞^∞ t e^(−t²/2) dt

= µ × 1 + 0

= µ

(The first integral equals √(2π); the second vanishes because its integrand
is an odd function.)

2 – VARIANCE:
Recall that:

Var (X) = E [ (X − µ)² ]

= (1/(σ√(2π))) ∫_−∞^∞ (x − µ)² e^( −(x − µ)²/(2σ²) ) dx

Substituting t = (x − µ)/σ, so that x − µ = σt and dx = σ dt:

Var (X) = (1/√(2π)) ∫_−∞^∞ σ²t² e^(−t²/2) dt

= (σ²/√(2π)) ∫_−∞^∞ t² e^(−t²/2) dt

= (σ²/√(2π)) × √(2π)

= σ²
EXPECTATION AND VARIANCE OF THE EXPONENTIAL DISTRIBUTION:

1 – EXPECTATION
If the random variable X has an exponential distribution, then:

E (X) = ∫_Rx x F(x) dx --------------------- (1)

= ∫_0^∞ x λe^(−λx) dx ----- (2)

Using integration by parts, ∫ u dv = uv − ∫ v du:

Let u = x, so du = dx ----- (3)

and dv = λe^(−λx) dx, so v = −e^(−λx) ----- (4)

Then:

E (X) = [ −x e^(−λx) ]_0^∞ + ∫_0^∞ e^(−λx) dx

= 0 + [ −e^(−λx)/λ ]_0^∞

= 1/λ

2 – VARIANCE:

Recall the variance of X:

Var (X) = E (X²) − [ E (X) ]² ---- (1)

but E (X²) = ∫_Rx x² F(x) dx ---- (2)

= ∫_0^∞ x² λe^(−λx) dx ---- (3)

Let u = x², then du = 2x dx ----- (4)

Also, dv = λe^(−λx) dx, so v = −e^(−λx) ----- (5)

so that:

E (X²) = [ −x² e^(−λx) ]_0^∞ + 2 ∫_0^∞ x e^(−λx) dx

= 0 + (2/λ) ∫_0^∞ x λe^(−λx) dx

= (2/λ) E (X)

= 2/λ² ---- (6)

Then:

Var (X) = E (X²) − [ E (X) ]²

but E (X) = 1/λ ----- (7)

and [ E (X) ]² = 1/λ² ----- (8)

By substituting (6), (7) and (8) into (1) we have:

Var (X) = 2/λ² − 1/λ²

= (2 − 1)/λ²

= 1/λ²

MOMENTS IN STATISTICS:

Moments in statistics are quantities (a set of statistical parameters) that
describe specific characteristics of a probability distribution. In simple
terms, a moment is a way to measure how spread out or concentrated the
numbers in a data set are around a central value, such as the mean. The
moments of a random variable (or its distribution) are expected values of
powers or related functions of the random variable. Say the random variable
of interest is X; then the moments are defined as expected values of powers
of X, for example E (X), E (X²), E (X³), and so on.

In statistics, four moments are commonly used:

i. First (1st) moment, which gives the mean or average.

ii. Second (2nd) moment, i.e. the variance.
iii. Third (3rd) moment, i.e. skewness.
iv. Fourth (4th) moment, i.e. kurtosis.

Notes:

The 3rd moment (skewness) measures the asymmetry of a distribution, while
the 4th moment (kurtosis) measures how heavy the tails are. Physicists
generally use the higher-order moments in applications of physics.

MOMENTS OF CONTINUOUS RANDOM VARIABLES ABOUT THE ORIGIN AND THE MEAN

The moments of a continuous probability distribution are often used to
describe the shape of the probability density function (p.d.f). The first
four moments (if they exist) are well known because they correspond to
familiar descriptive statistics.

For a continuous probability distribution with density function F(x), the
nth raw moment (also called the moment about zero) is defined as:

µ'n = ∫_−∞^∞ xⁿ F(x) dx

The mean is defined as the first raw moment. The most famous central moment
is the second central moment, which is the variance. The second central
moment is usually denoted by σ² to emphasize that the variance is a positive
quantity.
MOMENTS FOR DISCRETE DISTRIBUTIONS:

Similar definitions exist for discrete distributions. Technically, the
moments are defined using the notion of the expected value of a random
variable; loosely speaking, one replaces the integrals by summations. For
example, if X is a discrete random variable with a countable set of possible
values x1, x2, x3, … that have probabilities p1, p2, p3, … of occurring
respectively, then the nth raw moment of X is the sum:

E [Xⁿ] = Σi xiⁿ pi

and the nth central moment (the moment about the mean µ) is:

E [(X − µ)ⁿ] = Σi (xi − µ)ⁿ pi
MOMENT GENERATING FUNCTION

The moment generating function (mgf) is a function often used to
characterize the distribution of a random variable.

Definition

Let X be a random variable. If the expected value E [e^(tX)] exists and is
finite for every number t belonging to a closed interval [−h, h] ⊂ R, with
h > 0, then we say that X possesses a moment generating function, and the
function given by:

Mx (t) = E [exp (tX)] = E [e^(tX)] = ∫ e^(tx) F(x) dx

is called the moment generating function of X. Not all random variables
possess a moment generating function; however, all random variables possess
a characteristic function, another transform that enjoys properties similar
to those enjoyed by the mgf.

PROPERTIES OF MOMENT GENERATING FUNCTION

The following are three basic properties of the mgf:

i. If X and Y are independent, then Mx+y (t) = Mx (t) My (t), which holds on
the common interval where both mgf's exist.
ii. MaX+b (t) = e^(tb) Mx (at)
iii. If X and Y have the same mgf on an interval around t = 0, then X and Y
have the same distribution.

How is the mgf applied?

The moment generating function has great practical relevance because:

a. It can be used to easily derive moments; its derivatives at zero are
equal to the moments of the random variable.
b. A probability distribution is uniquely determined by its mgf. This,
coupled with the analytical tractability of mgfs, makes them a handy tool
for solving several problems, such as deriving the distribution of a sum of
two or more random variables.
SOLVED EXAMPLES WITH MGF'S

The examples cover both discrete and continuous random variables.

Example 1:

Compute the moment generating function (mgf) of an exponential random
variable.

Solution:

The p.d.f of an exponential distribution with parameter λ > 0 and
Rx = [0, ∞) is given by:

F(x) = λe^(−λx)

Now:

Mx (t) = E [e^(tX)]

but recall that E [g(X)] = ∫ g(x) F(x) dx, so:

Mx (t) = ∫_0^∞ e^(tx) λe^(−λx) dx

= λ ∫_0^∞ e^(tx − λx) dx

= λ ∫_0^∞ e^(−x(λ − t)) dx

= (λ/(λ − t)) [ −e^(−x(λ − t)) ]_0^∞

= (λ/(λ − t)) (0 − (−1))

= λ/(λ − t), for t < λ

The above expected value exists and is finite for any t ∈ [−h, h], provided
0 < h < λ. As a consequence, X possesses the mgf:

Mx (t) = λ/(λ − t)

The moments follow by differentiation of Mx (t):

M'x (t) = λ/(λ − t)²

M''x (t) = 2λ/(λ − t)³

so that:

E (X) = M'x (0) = λ/λ² = 1/λ

E (X²) = M''x (0) = 2λ/λ³ = 2/λ²

Therefore:

Var (X) = E (X²) − [ E (X) ]²

= 2/λ² − (1/λ)²

= (2 − 1)/λ²

= 1/λ²
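
The differentiation steps can be checked symbolically; a minimal sketch of
mine, assuming sympy is available:

import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = lam / (lam - t)                  # mgf of the exponential, for t < lam

EX = sp.diff(M, t, 1).subs(t, 0)     # first moment: 1/lam
EX2 = sp.diff(M, t, 2).subs(t, 0)    # second moment: 2/lam**2
print(EX, EX2, sp.simplify(EX2 - EX**2))   # 1/lam, 2/lam**2, 1/lam**2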

Note
The moment generating function (mgf) takes its name from the fact that it
can be used to derive the moments of X, as stated thus:
If a random variable X possesses a mgf Mx (t), then the nth moment of X,
denoted by µ'n, exists and is finite for any n ∈ N. Furthermore:

µ'n = E [Xⁿ] = dⁿMx (t)/dtⁿ evaluated at t = 0

where dⁿMx (t)/dtⁿ at t = 0 is the nth derivative of Mx (t) with respect to
t, evaluated at the point t = 0.

Example 2:
Let X be a discrete (Bernoulli) random variable having Rx = {0, 1}. Derive
the moment generating function of X, if it exists.
Solution
Recall that the probability mass function of a Bernoulli distribution is
given by:
P(x) = pˣ q^(1−x), for x = 0 or 1, where q = 1 − p.
By the definition of the mgf we have:

Mx (t) = E [e^(tX)]

= Σ e^(tx) P(x), summed over x ∈ Rx

= e^(t×0) P(0) + e^(t×1) P(1)

= 1 × (1 − p) + e^t × p

= 1 − p + pe^t

Obviously, the moment generating function exists and is well defined,
because the above expected value exists for any t ∈ R.

Example 3

Let X be a random variable with moment generating function

Mx (t) = ½ (1 + e^t)

Derive the variance of X.

Solution:
We can use the formula below to compute the variance:
Var (X) = E [X²] − [ E (X) ]²
The expected value of X is computed by taking the first derivative of the
moment generating function:

dMx (t)/dt = ½ e^t

and evaluating it at t = 0:

E [X] = dMx (t)/dt at t = 0 = ½ e⁰ = ½

The second moment of X is computed by taking the second derivative of the
moment generating function:

d²Mx (t)/dt² = ½ e^t

and evaluating it at t = 0:

E (X²) = d²Mx (t)/dt² at t = 0 = ½ e⁰ = ½

Therefore:

Var (X) = E (X²) − [ E (X) ]²

= ½ − (½)²

= ½ − ¼

= ¼

Example 4:

If X ~ N (µ, σ²), derive the first and second moments of the random
variable X.

Solution:

This can easily be done by first computing the moment generating function of
a standard normal random variable (parameters 0 and 1). Letting Z be such a
random variable, we have:

Mz (t) = E [e^(tZ)]

= (1/√(2π)) ∫_−∞^∞ e^(tz) e^(−z²/2) dz

= (1/√(2π)) ∫_−∞^∞ e^( −(z² − 2tz)/2 ) dz

= (1/√(2π)) ∫_−∞^∞ e^( −(z − t)²/2 + t²/2 ) dz

Since e^(t²/2) does not involve z, it can be taken outside the integral:

Mz (t) = e^(t²/2) (1/√(2π)) ∫_−∞^∞ e^( −(z − t)²/2 ) dz

= e^(t²/2) (1/√(2π)) ∫_−∞^∞ e^(−y²/2) dy

(substituting y = z − t, so that dy = dz)

= e^(t²/2) × (1/√(2π)) × √(2π)

Mz (t) = e^(t²/2)

Now, to obtain the moment generating function of an arbitrary normal random
variable, recall that:

X = µ + σZ, i.e. Z = (X − µ)/σ

where X has a normal distribution with parameters µ and σ², and Z is a unit
(standard) normal random variable.

Hence the moment generating function of such a random variable is given by:

Mx (t) = E [e^(tX)]

= E [e^(t(µ + σZ))]

= E [e^(tµ) e^(tσZ)]

= e^(tµ) E [e^(tσZ)]

= e^(tµ) Mz (tσ)

= e^(tµ) e^((tσ)²/2)

= e^(σ²t²/2 + tµ)

So that:

M'x (t) = (µ + σ²t) e^(σ²t²/2 + µt)

and:

M''x (t) = (µ + σ²t)² e^(σ²t²/2 + µt) + σ² e^(σ²t²/2 + µt)

But:

E (X) = M'x (0) = µ

E (X²) = M''x (0) = µ² + σ²

Var (X) = E (X²) − [ E (X) ]²

= µ² + σ² − (µ)²

= σ²
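
As a closing check, both moments fall out of the normal mgf by symbolic
differentiation; a sketch of mine, assuming sympy is available:

import sympy as sp

t, mu, sigma = sp.symbols('t mu sigma', real=True)
M = sp.exp(mu * t + sigma**2 * t**2 / 2)   # mgf of N(mu, sigma^2)

EX = sp.diff(M, t, 1).subs(t, 0)           # mu
EX2 = sp.diff(M, t, 2).subs(t, 0)          # mu**2 + sigma**2
print(EX, sp.simplify(EX2 - EX**2))        # mu, sigma**2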
