PROBABILITY DISTRIBUTIONS
BINOMIAL, POISSON, NORMAL
Probability and Statistics
Probability is the chance that an outcome (also called an event) occurs in an experiment.
Experiment: Tossing a fair coin
Outcome: Head, Tail
Probability deals with predicting the likelihood of future events.
Statistics involves the analysis of the frequency of past events.
A random variable is a rule that assigns a numerical value to an outcome of interest.
A probability distribution specifies the probabilities of the values of a random
variable.
DISTRIBUTION
Frequency Distribution: It is a listing of the observed /
actual frequencies of all the outcomes of an
experiment that actually occurred when the experiment
was done.
Probability Distribution: It is a listing of the
probabilities of all the possible outcomes that could
occur if the experiment were done.
It can be described as:
A diagram (Probability Tree)
A table
A mathematical formula
Types of Probability Distributions
Discrete probability distributions
Binomial distribution
Multinomial distribution
Poisson distribution
Hypergeometric distribution
Continuous probability distributions
Normal distribution
Standard normal distribution
Gamma distribution
Exponential distribution
Chi square distribution
Lognormal distribution
Weibull distribution
TYPES OF PROBABILITY DISTRIBUTION [PD]
Probability Distribution
  Discrete PD
    Binomial Distribution
    Poisson Distribution
  Continuous PD
    Normal Distribution
Discrete Probability Distributions
A discrete random variable is a variable that can assume only a
countable number of values
Many possible outcomes:
number of complaints per day
number of TVs in a household
number of rings before the phone is answered
Only two possible outcomes:
gender: male or female
defective: yes or no
spreads peanut butter first vs. spreads jelly first
Continuous Probability Distributions
A continuous random variable is a variable that can assume any
value on a continuum (can assume an uncountable number of
values)
thickness of an item
time required to complete a task
temperature of a solution
height, in inches
These can potentially take on any value, depending only on the
ability to measure accurately.
PROBABILITY DISTRIBUTION
Discrete Distribution: The random variable can take
only a limited (countable) number of values. Ex: No. of heads
in two tosses.
Continuous Distribution: The random variable can
take any value within a range. Ex: Height of students in the
class.
TREE DIAGRAM – A FAIR COIN IS TOSSED TWICE
1st toss   2nd toss   Possible outcome
H          H          HH
H          T          HT
T          H          TH
T          T          TT
Attach probabilities
1st toss   2nd toss   Outcome   Probability
H (½)      H (½)      HH        P(H,H) = ½ × ½ = ¼
H (½)      T (½)      HT        P(H,T) = ½ × ½ = ¼
T (½)      H (½)      TH        P(T,H) = ½ × ½ = ¼
T (½)      T (½)      TT        P(T,T) = ½ × ½ = ¼
INDEPENDENT EVENTS – 1st toss has no effect on the 2nd toss
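Calculate probabilities
1st toss   2nd toss   Outcome   Probability
H (½)      H (½)      HH *      P(H,H) = ½ × ½ = ¼
H (½)      T (½)      HT *      P(H,T) = ½ × ½ = ¼
T (½)      H (½)      TH *      P(T,H) = ½ × ½ = ¼
T (½)      T (½)      TT        P(T,T) = ½ × ½ = ¼
(* = outcome with at least one head)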
Probability of at least one Head?
Ans: ¼ +¼+¼=¾
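The tree calculation can be checked by listing every branch and multiplying along it. The following is a minimal Python sketch, assuming a fair coin and independent tosses; the names p_side, outcomes and p_at_least_one_head are illustrative and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Assumption: a fair coin, so each side has probability 1/2 on every toss.
p_side = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Multiply the probabilities along each branch (the tosses are independent).
outcomes = {
    first + second: p_side[first] * p_side[second]
    for first, second in product("HT", repeat=2)
}

# Add the probabilities of every branch that contains at least one head.
p_at_least_one_head = sum(p for seq, p in outcomes.items() if "H" in seq)

for seq, p in outcomes.items():
    print(seq, p)
print("At least one head:", p_at_least_one_head)  # 3/4
```

Exact fractions are used so that the result reads the same way as the hand calculation.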
DISCRETE PD – EXAMPLE (TABLE)
Tossing a coin three times:
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let X represent the number of heads.
X    Frequency    P(X = x)
0    1            1/8
1    3            3/8
2    3            3/8
3    1            1/8
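The table can be rebuilt by enumerating the sample space and counting heads in each sequence. This is a small Python sketch under the assumption that all eight sequences are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S for three tosses: 2**3 = 8 equally likely sequences.
sample_space = ["".join(seq) for seq in product("HT", repeat=3)]

# Random variable X = number of heads in a sequence.
frequency = Counter(seq.count("H") for seq in sample_space)

# P(X = x) = frequency of x / size of the sample space.
distribution = {
    x: Fraction(frequency[x], len(sample_space)) for x in sorted(frequency)
}

for x, p in distribution.items():
    print(x, frequency[x], p)
```

The probabilities in a distribution always add up to 1, which gives a quick sanity check on any table built this way.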
BINOMIAL DISTRIBUTION
There are certain phenomena in nature which can be
identified as Bernoulli processes, in which:
There is a fixed number n of trials carried out
Each trial has only two possible outcomes say success or failure,
true or false etc.
Probability of occurrence of any outcome remains same over
successive trials
Trials are statistically independent
BINOMIAL DISTRIBUTION
Binomial distribution was discovered by James Bernoulli (1654-1705).
Let a random experiment be performed repeatedly, and let the occurrence of an
event in a trial be called a success and its non-occurrence a failure.
Consider a set of n independent trials (n being finite), in which the probability p
of success is constant for each trial. Then q = 1 - p is the probability of
failure in any trial.
THE BINOMIAL DISTRIBUTION
BERNOULLI RANDOM VARIABLES
Imagine a simple trial with only two possible outcomes
Success (S)
Failure (F)
Examples
Toss of a coin (heads or tails)
Sex of a newborn (male or female)
Survival of an organism in a region (live or die)
Suppose that the probability of success is p
What is the probability of failure?
q=1–p
Binomial Distribution Settings
A manufacturing plant labels items as either
defective or acceptable
A firm bidding for a contract will either get the
contract or not
A marketing research firm receives survey
responses of “yes I will buy” or “no I will not”
New job applicants either accept the offer or
reject it
THE BINOMIAL DISTRIBUTION
OVERVIEW
What is the probability of obtaining x successes in n trials?
Example
What is the probability of obtaining 2 heads from a coin that
was tossed 5 times?
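P(HHTTT) = (1/2)^5 = 1/32 for one particular order
2 heads and 3 tails can be arranged in 5C2 = 10 orders, so
P(2 heads) = 10 × 1/32 = 5/16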
Condition for Binomial distribution
We get the binomial distribution under the
following experimentation conditions -
1. The number of trials n is finite
2. The trials are independent of each
other.
3. The probability of success p is constant
for each trial.
4. Each trial must result in a success or
failure.
5. The events are discrete events.
Properties
1. If p and q are equal, the binomial
distribution is symmetrical.
If p and q are not equal, the distribution is
skewed.
2. Mean = E(X) = np
3. Variance = V(X) = σ² = npq = np(1 − p) (mean > variance)
4. Standard Deviation of BD: σ = √(npq) = √(np(1 − p))
BINOMIAL DISTRIBUTION
Binomial distribution is a discrete PD which
expresses the probability of one set of alternatives –
success (p) and failure (q)
P(X = r) = nCr · p^r · q^(n−r) (Prob. of r successes in n trials)
n = no. of trials undertaken
r = no. of successes desired
p = probability of success
q = probability of failure
The shape of the binomial distribution depends on the
values of p and n
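The formula translates directly into a few lines of code. The sketch below is one possible Python version, assuming Python 3.8 or later for math.comb; the function name binomial_pmf is an illustrative choice. It is applied to the earlier question of 2 heads in 5 tosses of a fair coin.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) = nCr * p^r * q^(n-r), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, r) * p**r * q ** (n - r)

# 2 heads in 5 tosses of a fair coin: 10 orderings, each with probability 1/32.
print(binomial_pmf(2, 5, 0.5))  # 0.3125

# The probabilities over r = 0..n add up to 1.
total = sum(binomial_pmf(r, 5, 0.5) for r in range(6))
print(total)
```

Changing p away from 0.5 in the call shows how the distribution becomes skewed.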
Application
1. Quality control measures and sampling
process in industries to classify items
as defectives or non-defective.
2. Medical applications such as success or
failure, cure or no-cure.
PRACTICE QUESTIONS – BD
Four coins are tossed simultaneously. What is the probability
of getting:
No head 1/16
No tail 1/16
Two heads 3/8
The probability of a bomb hitting a target is 1/5. Two bombs are
enough to destroy a bridge. If six bombs are fired at the bridge,
find the probability that the bridge is destroyed. (0.345)
If 8 out of 10 ships arrive safely, find the probability that
at least one would arrive safely out of 5 ships selected at
random. (0.999)
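These answers can be reproduced numerically. The sketch below is a hedged Python check, not the method used on the slides; prob_at_least is a hypothetical helper that uses the complement rule P(X ≥ k) = 1 − P(X < k).

```python
from math import comb

def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), via the complement 1 - P(X < k)."""
    below = sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k))
    return 1.0 - below

# Four fair coins: no head, and exactly two heads.
print(comb(4, 0) * 0.5**4)  # 0.0625 = 1/16
print(comb(4, 2) * 0.5**4)  # 0.375 = 3/8

# Bridge: six bombs, hit probability 1/5, at least two hits needed.
p_bridge = prob_at_least(2, 6, 0.2)
print(round(p_bridge, 3))  # 0.345

# Ships: p = 8/10, at least one of five arrives safely.
p_ships = prob_at_least(1, 5, 0.8)
print(round(p_ships, 4))  # 0.9997
```

The ships result is 1 − (0.2)^5 = 0.99968, which is 0.999 when truncated to three decimals.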
PRACTICE QUESTIONS – BD
A pair of dice is thrown 7 times. If getting a total of 7 is
considered as success, find the probability of getting:
No success (5/6)^7
6 successes 35 × (1/6)^7
At least 6 successes 36 × (1/6)^7
Eight-tenths of the pumps were correctly filled. Find the
probability of getting exactly three of six pumps correctly filled.
(0.082)
MEASURES OF CENTRAL TENDENCY AND
DISPERSION FOR THE BINOMIAL DISTRIBUTION
Mean of BD: µ = np
Standard Deviation of BD: σ = √(npq)
The mean of BD is 20 and its SD is 4. Find n, p, q.
(100, 1/5, 4/5)
The mean of BD is 20 and its SD is 7. Comment.
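Both questions follow from dividing the variance npq by the mean np, which leaves q. The Python sketch below works through that step; binomial_parameters is a hypothetical helper name, and integer inputs are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def binomial_parameters(mean: int, sd: int):
    """Recover (n, p, q) from mean = np and variance = npq.

    Returns None when no proper binomial distribution has these moments.
    """
    q = Fraction(sd**2, mean)  # npq / np = q
    if not 0 < q < 1:
        return None            # needs variance < mean, i.e. q < 1
    p = 1 - q
    n = mean / p
    return n, p, q

first = binomial_parameters(20, 4)
second = binomial_parameters(20, 7)
print(first)   # n = 100, p = 1/5, q = 4/5
print(second)  # None, because q = 49/20 exceeds 1
```

For the second question the variance (49) is larger than the mean (20), which cannot happen for a binomial distribution, so the given data are inconsistent.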
FITTING OF BINOMIAL DISTRIBUTION
Four coins are tossed 160 times and the following results
were obtained:
No. of heads   0    1    2    3    4
Frequency      17   52   54   31   6
Fit a binomial distribution under the assumption that the
coins are unbiased.
Fit a binomial distribution to the following data:
X   0    1    2    3    4
f   28   62   46   10   4
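Fitting here means computing the expected frequencies N × P(X = x), where N is the total frequency. The following Python sketch shows one way to do it; fit_binomial is a hypothetical helper, and estimating p as (sample mean) / n for the second data set is the usual textbook approach, assumed here rather than stated on the slide.

```python
from math import comb

def fit_binomial(frequencies, p=None):
    """Expected frequencies N * P(X = x) for x = 0..n.

    If p is None it is estimated from the data as (sample mean) / n.
    """
    n = len(frequencies) - 1
    total = sum(frequencies)
    if p is None:
        mean = sum(x * f for x, f in enumerate(frequencies)) / total
        p = mean / n
    expected = [
        total * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)
    ]
    return p, expected

# First data set: coins assumed unbiased, so p = 1/2 is fixed in advance.
p1, expected1 = fit_binomial([17, 52, 54, 31, 6], p=0.5)
print(expected1)  # [10.0, 40.0, 60.0, 40.0, 10.0]

# Second data set: p is estimated from the observed mean (200/150 over n = 4).
p2, expected2 = fit_binomial([28, 62, 46, 10, 4])
print(round(p2, 4), [round(e, 2) for e in expected2])
```

Comparing the expected frequencies with the observed ones shows how well the binomial model describes each data set.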
THE POISSON DISTRIBUTION
When there is a large number of trials, but a small
probability of success, binomial calculation becomes
impractical
Example: Number of deaths from horse kicks in the Army in
different years
The mean number of successes from n trials is µ = np
Example: 64 deaths in 20 years from thousands of
soldiers
POISSON DISTRIBUTION
Characteristics of the Poisson Distribution:
The outcomes of interest are rare relative to the possible
outcomes
The average number of outcomes of interest per time or space
interval is λ
The number of outcomes of interest is random, and the occurrence
of one outcome does not influence the chances of another outcome
of interest
The probability that an outcome of interest occurs in a given
segment is the same for all segments
Condition for Poisson distribution
Poisson distribution is the limiting case of binomial
distribution under the following assumptions.
1. The number of trials n should be indefinitely large, i.e.,
n → ∞
2. The probability of success p for each trial is
indefinitely small.
3. np= λ, should be finite where λ is constant.
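The limiting behaviour can be seen numerically by holding np = λ fixed while n grows. The sketch below is an illustration in Python; the values λ = 2 and x = 3 are arbitrary choices, and the function names are not from the slides.

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability of x occurrences when the mean is lam."""
    return lam**x * exp(-lam) / factorial(x)

lam, x = 2.0, 3  # illustrative values only
gaps = []
for n in (10, 100, 1000, 10000):
    p = lam / n  # keeps np = lam constant while p shrinks
    b = binomial_pmf(x, n, p)
    gaps.append(abs(b - poisson_pmf(x, lam)))
    print(n, round(b, 5), round(poisson_pmf(x, lam), 5))
```

The binomial value moves steadily towards the Poisson value as n increases and p decreases.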
Properties
1. Poisson distribution is defined by single parameter λ.
2. Mean = λ
3. Variance = λ.
Mean and Variance are equal.
POISSON DISTRIBUTION
If λ = mean no. of occurrences of an event per unit
interval of time/space, then the probability that it will occur
exactly x times is given by
P(x) = (λ^x · e^(−λ)) / x!, where e is Napier's constant, e ≈ 2.71828
THE POISSON DISTRIBUTION
If we substitute µ/n for p, and let n tend to infinity, the
binomial distribution becomes the Poisson distribution:
P(x) = (e^(−µ) · µ^x) / x!
Poisson distribution is applied where random events in space or
time are expected to occur
Deviation from Poisson distribution may indicate some degree
of non-randomness in the events under study.
Application
1. It is used in quality control statistics to count the number of defects of an item.
2. In biology, to count the number of bacteria.
3. In determining the number of deaths in a district in a given period from a rare
disease.
4. The number of errors per page in typed material.
5. The number of plants infected with a particular disease in a plot of field.
6. The number of weeds of a particular species in different plots of a field.
PRACTICE PROBLEMS – POISSON
DISTRIBUTION
On a road crossing, records show that on an average, 5 accidents
occur per month. What is the probability that 0, 1, 2, 3, 4, 5 accidents
occur in a month? (0.0067, 0.0337, 0.0842, 0.1404, 0.1755, 0.1755)
In case the probability of more than 3 accidents per month exceeds 0.7, the
road must be widened. Should the road be widened? (Yes)
If on an average 2 calls arrive at a telephone switchboard per minute, what is
the probability that exactly 5 calls will arrive during a randomly selected
3-minute interval? (0.1606)
It is given that 2% of the screws are defective. Use the Poisson distribution
to find the probability that a packet of 100 screws contains:
No defective screws (0.135)
One defective screw (0.271)
Two or more defective screws (0.594)
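The practice problems can be worked with a short script. The following is a Python sketch rather than the slides' own method; poisson_pmf is an illustrative name, and the only modelling steps are λ = 6 for a 3-minute interval at 2 calls per minute and λ = np = 2 for the screws. Small differences in the last decimal place against hand-worked answers come from rounding e^(−λ) early.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) = lam^x * e^(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

# Accidents: lam = 5 per month, x = 0..5.
accidents = [poisson_pmf(x, 5) for x in range(6)]
print([round(p, 4) for p in accidents])

# P(more than 3 accidents) = 1 - P(0) - P(1) - P(2) - P(3).
p_more_than_3 = 1 - sum(accidents[:4])
print(round(p_more_than_3, 3), p_more_than_3 > 0.7)

# Calls: 2 per minute, so lam = 6 for a 3-minute interval.
p_calls = poisson_pmf(5, 6)
print(round(p_calls, 4))

# Screws: n = 100, p = 0.02, so lam = np = 2.
p0, p1 = poisson_pmf(0, 2), poisson_pmf(1, 2)
print(round(p0, 3), round(p1, 3), round(1 - p0 - p1, 3))
```

Since P(more than 3 accidents) comes out at roughly 0.735, which is above 0.7, the road should be widened.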
CHARACTERISTICS OF POISSON
DISTRIBUTION
It is a discrete distribution
Occurrences are statistically independent
Mean no. of occurrences in a unit of time is
proportional to size of unit (if 5 in one year, 10 in 2
years etc.)
Mean of the Poisson distribution is λ = np
Standard Deviation of the Poisson distribution is √λ = √(np)
It is always right skewed.
The Poisson distribution is a good approximation to BD when n ≥ 20
and p ≤ 0.05
THE Gaussian or NORMAL DISTRIBUTION
Discovered in 1733 by de Moivre as an approximation to the
binomial distribution when the number of trials is large
Derived in 1809 by Gauss
Importance lies in the Central Limit Theorem, which states that the
sum of a large number of independent random variables (binomial,
Poisson, etc.) will approximate a normal distribution
Example: Human height is determined by a large number of
factors, both genetic and environmental, which are additive in
their effects. Thus, it follows a normal distribution.
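The Central Limit Theorem can be illustrated by simulation. The sketch below, in Python, adds up 30 independent uniform random numbers many times and checks how much of the resulting data lies within one and two standard deviations of the mean. The choice of uniform variables, k = 30, 20000 repetitions and the seed are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

def sum_of_uniforms(k: int) -> float:
    """Sum of k independent Uniform(0, 1) random numbers."""
    return sum(random.random() for _ in range(k))

k, trials = 30, 20000
sums = [sum_of_uniforms(k) for _ in range(trials)]

mu, sigma = mean(sums), pstdev(sums)

# Share of simulated sums within 1 and 2 standard deviations of the mean.
within_1sd = sum(abs(s - mu) < sigma for s in sums) / trials
within_2sd = sum(abs(s - mu) < 2 * sigma for s in sums) / trials
print(round(mu, 2), round(sigma, 2), round(within_1sd, 3), round(within_2sd, 3))
```

Although each individual term is uniformly distributed, the shares should come out close to 68% and 95%, as expected for a normal distribution.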
Normal distribution
The normal distribution is a continuous probability distribution.
It is also known as the error law, Normal law, Laplacian law or Gaussian
distribution. Many sampling distributions, such as Student's t, the F distribution and
the χ² distribution, are based on it.
Definition
A continuous random variable X is said to follow a normal distribution with
parameters µ and σ² if its density function is given by the probability law
f(x) = (1 / (σ√(2π))) · e^(−(x − µ)² / (2σ²)), −∞ < x < ∞, σ > 0
The mean µ and standard deviation σ are called the parameters of the
Normal distribution.
The normal distribution is expressed by X ~ N(µ, σ²)
Condition of Normal Distribution
i) Normal distribution is a limiting form of the binomial distribution under the
following conditions.
a) n, the number of trials is indefinitely large and
b) Neither p nor q is very small.
iii) Constants of normal distribution are mean = µ, variance = σ², Standard
deviation = σ.
Properties of normal distribution
1. The normal curve is bell shaped and is symmetric about x = µ.
2. The mean, median and mode of the distribution coincide,
i.e., Mean = Median = Mode = µ
3. It has only one mode, at x = µ (i.e., it is unimodal)
4. The points of inflection are at x = µ ± σ
5. The maximum ordinate occurs at x = µ and its value is 1 / (σ√(2π))
6. Area Property P(µ − σ < X < µ + σ) = 0.6826
P(µ − 2σ < X < µ + 2σ) = 0.9544
P(µ − 3σ < X < µ + 3σ) = 0.9973
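The area property can be verified without tables, because the area within k standard deviations of the mean equals erf(k / √2) for any normal distribution. Below is a minimal Python sketch using the standard library; area_within is an illustrative name.

```python
from math import erf, sqrt

def area_within(k: float) -> float:
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

areas = {k: round(area_within(k), 4) for k in (1, 2, 3)}
for k, area in areas.items():
    print(k, area)
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

Values read from four-decimal z-tables (2 × 0.3413 and 2 × 0.4772) give 0.6826 and 0.9544, which differ from these in the last digit only because of table rounding.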
The Normal Distribution
‘Bell Shaped’
Symmetrical
Mean, Median and Mode are Equal
Location is determined by the mean, µ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical
range: −∞ to +∞
[Figure: bell curve centred at µ with spread σ; Mean = Median = Mode]
NORMAL DISTRIBUTION
It is a continuous PD, i.e. the random variable can take on any
value within a given range. Ex: Height, Weight, Marks etc.
Developed by the mathematician-astronomer
Karl Gauss, so also called Gaussian Distribution.
It is symmetrical, unimodal (one peak).
Since it is symmetrical, its mean, median and mode all
coincide, i.e. all three are the same.
The tails are asymptotic to the horizontal axis, i.e. the curve extends to
infinity without touching the horizontal axis.
X axis represents the random variable, like height, weight etc.
Y axis represents its probability density function.
Area under the curve tells the probability.
The total area under the curve is 1 (or 100%)
Mean = µ, SD = σ
DEFINING A NORMAL DISTRIBUTION
Only two parameters are considered: Mean &
Standard Deviation
Same Mean, Different Standard Deviations
Same SD, Different Means
Different Mean & Different Standard Deviations
68-95-99.7 RULE
[Figure: normal curve with nested bands covering 68% of the data, 95.5% of the data and 99.7% of the data]
AREA UNDER THE CURVE
The mean ± 1 standard deviation covers
approx. 68% of the area under the curve
The mean ± 2 standard deviations covers approx.
95.5% of the area under the curve
The mean ± 3 standard deviations covers approx. 99.7% of
the area under the curve
STANDARD NORMAL PD
In standard Normal PD, Mean = 0, SD = 1
Z = (x − µ) / σ
Z = No. of std. deviations from x to mean. Also called Z Score
Let X be a random variable which follows a normal distribution with mean µ and
variance σ². The standard normal variate is defined as Z = (X − µ) / σ, which follows
the standard normal distribution with mean 0 and standard deviation 1, i.e., Z ~
N(0,1).
PRACTICE PROBLEMS – NORMAL
DISTRIBUTION
Mean height of Gurkhas is 147 cm with SD of 3 cm. What is
the probability of:
Height being greater than 152 cm. 4.75%
Height between 140 and 150 cm. 83.14%
Mean demand of an oil is 1000 ltr per month with SD of 250
ltr.
If 1200 ltrs are stocked, what is the satisfaction level? 78.8%
For an assurance of 95%, what stock must be kept? 1411.25 ltr
Nancy keeps bank balance on an average at Rs. 5000 with SD
of Rs. 1000. What is the probability that her account will have
balance of:
Greater than Rs. 7000 0.0228
Between Rs. 5000 and Rs. 6000
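These problems can be checked with Python's standard library instead of a z-table. The sketch below assumes Python 3.8 or later for statistics.NormalDist; the variable names are illustrative. Because it uses exact z values rather than z rounded to two decimals, some results differ slightly in the last digits from table-based answers.

```python
from statistics import NormalDist

# Heights: mean 147 cm, SD 3 cm.
heights = NormalDist(mu=147, sigma=3)
z_152 = (152 - 147) / 3                      # z score of 152 cm, about 1.67
p_tall = 1 - heights.cdf(152)                # P(height > 152)
p_mid = heights.cdf(150) - heights.cdf(140)  # P(140 < height < 150)
print(round(z_152, 2), round(p_tall, 4), round(p_mid, 4))

# Oil demand: mean 1000 ltr, SD 250 ltr.
demand = NormalDist(mu=1000, sigma=250)
satisfaction = demand.cdf(1200)              # P(demand <= 1200)
stock_95 = demand.inv_cdf(0.95)              # stock covering 95% of months
print(round(satisfaction, 4), round(stock_95, 2))

# Bank balance: mean Rs. 5000, SD Rs. 1000.
balance = NormalDist(mu=5000, sigma=1000)
p_over_7000 = 1 - balance.cdf(7000)
p_5000_6000 = balance.cdf(6000) - balance.cdf(5000)
print(round(p_over_7000, 4), round(p_5000_6000, 4))
```

The cdf method gives the area to the left of a value, so "greater than" questions use 1 − cdf and "between" questions use a difference of two cdf values.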