Probability Distributions
4.0 LEARNING OUTCOMES
After completion of this unit, the students will be able to:
* Understand the concept of Random Variables
* Distinguish between discrete and continuous random variables
* Understand and apply the concept of Probability Distribution
* Write the probability distribution of a discrete random variable
* Calculate the mathematical expectation and variance of a discrete random variable
* Understand and apply the concept of Binomial Distribution
* Calculate the mathematical expectation and variance for a binomial distribution
* Understand and apply the concept of Poisson Distribution
* Calculate the mathematical expectation and variance for a Poisson distribution
* Understand and apply the concept of Normal Distribution
* Calculate the mathematical expectation and variance for a normal distribution
* Calculate the Z-Score and use the Z-Table to interpret a normally distributed data set
4.0.0 BEFORE YOU START, YOU SHOULD KNOW
1. Random experiment, Sample space, Event associated with a sample space
2. Mutually exclusive and mutually exhaustive events
3. Independent and dependent events
4. Multiplication theorem of Probability
5. Addition theorem of Probability
6. Total Probability
7. Bayes' theorem
4.1 CONCEPT MAP
[Figure: concept map of the unit; surviving labels: Z-Score, Z-Table]
4.2 INTRODUCTION
Suhani has two black sweaters and a white sweater in her cupboard. She takes out a sweater
at random, notes the colour and puts it back in the cupboard. She repeated the process once more
before making up her mind.
What shall be the sample space of the situation stated above?
Let us denote the two black sweaters by $B_1$ and $B_2$ and the white sweater by $W$.
For the two draws made with replacement,
the sample space is $S = \{B_1B_1, B_1B_2, B_2B_1, B_2B_2, B_1W, B_2W, WB_1, WB_2, WW\}$
Clearly these draws form a random experiment whose outcomes cannot be predicted.
Let X represent the number of white sweaters drawn in this situation. In that case, what can you
say about the value of X?
Here, $X(B_1B_1) = X(B_1B_2) = X(B_2B_1) = X(B_2B_2) = 0$ as the sample element does not have any white
sweater.
Also, $X(B_1W) = X(B_2W) = X(WB_1) = X(WB_2) = 1$ as the sample element has one white sweater
And, $X(WW) = 2$ as the sample element has two white sweaters
⇒ X can take values 0, 1 or 2
Here, X is a function whose domain is the set of possible outcomes (or sample space) of a random
experiment. Also, the variable X takes real values, therefore its co-domain is the set of real
numbers.
In such a case X is considered a random variable.
Definition: A random variable is a real-valued function whose domain is the sample space of
a random experiment.
Let us consider an experiment of tossing a coin two times in succession.
Clearly the sample space of this experiment is S = {HH, HT, TH, TT}.
If X represents the number of heads obtained in this situation,
then X(HH) = 2, X(HT) = X(TH) = 1 and X(TT) = 0.
Let Y represent the number of tails minus the number of heads for each random outcome of the
above sample space S.
Then Y(TT) = 2
Y(TH) = Y(HT) = 0
And Y(HH) = -2
In this case, Y is a random variable which can take values 2, 0 or -2
Please note that more than one random variable can be defined on a given sample space. In both
the situations above, we shall assume that each random outcome is equally likely to be selected.
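The idea that a random variable is simply a function on the sample space can be sketched in a few lines of Python. This is only an illustration of the coin example above; the function names X and Y mirror the text, while `sample_space` is a name chosen here for convenience.

```python
from itertools import product

# Sample space for two tosses of a coin: HH, HT, TH, TT
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

def X(outcome):
    """Random variable X: number of heads in the outcome."""
    return outcome.count("H")

def Y(outcome):
    """Random variable Y: number of tails minus number of heads."""
    return outcome.count("T") - outcome.count("H")

# Two different random variables defined on the same sample space
for s in sample_space:
    print(s, X(s), Y(s))
```

Each outcome is mapped to exactly one real number by X and by Y, which is what makes them functions on S.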
Example 1
Rajat is playing a game of rolling a die with his friends.
According to the game rules, he wins Rs 5 for
rolling an even number and loses Rs 2 for rolling an odd
number on the die. Let X represent the amount of
money Rajat wins or loses. Show that X is a random
variable and also represent it as a function on the
sample space of the game play.
Solution: Sample space of the game play S = {1, 2, 3, 4, 5, 6}
As X represents the amount of money Rajat wins or loses ⇒ X is a function whose values are
defined on the basis of random outcomes, therefore it is a random variable.
X(2) = X(4) = X(6) = 1 × 5 = Rs 5 as Rajat wins Rs 5 when he rolls an even number on the die,
X(1) = X(3) = X(5) = 1 × (-2) = -2 as Rajat loses Rs 2 on rolling an odd number on the die
Thus, for each element of the sample space, X takes a unique value, hence, X is a function on
the sample space whose range is {+5, -2}
4.2.1 Discrete and Continuous Random Variables
Recall that a variable is a quantity that keeps varying.
Let us consider the toss of a fair coin and let X be the random variable defined as
$X = \begin{cases} 0, & \text{if the coin toss results in a head} \\ 1, & \text{if the coin toss results in a tail} \end{cases}$
Here, the random variable takes two distinct and countable (measurable) values.
Hence X in this case has distinct and countable outcomes with no number in between these
values, therefore it is a discrete random variable.
A wrist watch with only an hour and minute display shows time as 12:00, then
12:01, 12:02, and so on, and no time is shown in between. In this case the
displayed time changes in distinct and countable steps. Therefore the displayed
time in this case is a discrete random variable.
Each possible value of a discrete random variable can be associated with a non-
zero probability.
Whereas a wrist watch that displays the seconds count as well shows times such as
22:31:25 and 22:32:17 and the elapsed time in between as well. A random
variable whose value is obtained by measuring, and which can take infinitely many
values between any two values, is called a continuous random variable.
In other words, a continuous random variable is a random variable whose set
of possible values (known as the range) is infinite and uncountable.
4.2.2 Probability Distribution of Discrete Random Variable
Let us consider another random variable X defined as sum of digits on rolling of two dice.
The following grid shows the sample space of this random experiment:
+ | 1 | 2 | 3 | 4  | 5  | 6
1 | 2 | 3 | 4 | 5  | 6  | 7
2 | 3 | 4 | 5 | 6  | 7  | 8
3 | 4 | 5 | 6 | 7  | 8  | 9
4 | 5 | 6 | 7 | 8  | 9  | 10
5 | 6 | 7 | 8 | 9  | 10 | 11
6 | 7 | 8 | 9 | 10 | 11 | 12
Clearly, X will take values 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12 which are distinct and countable,
hence X is a discrete random variable in this case.
Now let us find the probability for each random outcome
X        | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12
P(X = x) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36
Table (i)
Observe that in table (i); for all possible values of the discrete random variable X, all elements of
the sample space are covered.
This table of possible values and their respective probabilities is called the probability distribution
table for the given random variable X. A probability distribution table links every possible value
of the random variable with the probability that it occurs.
In a probability distribution table, the sum of all the probabilities is one. (Refer table (i) )
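As a cross-check of table (i), the distribution of the sum of two dice can be built by listing all 36 equally likely outcomes and counting. The sketch below is one possible way to do this; the names `counts` and `distribution` are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes give each sum
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability of each sum; Fraction reduces automatically (2/36 prints as 1/18)
distribution = {x: Fraction(c, 36) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)

# The probabilities of a distribution always add up to one
print("Total:", sum(distribution.values()))
```

Using exact fractions instead of decimals keeps the check that the probabilities add up to one free of rounding error.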
Example 2
A coin is tossed thrice and outcomes are recorded. Prepare the probability distribution table for the
number of heads.
Solution: Let a random variable X denote the number of heads in
three tosses of a coin.
Here the sample space S = {HHH, HHT, HTH, THH, HTT,
THT, TTH, TTT}
which means that X can attain the value 0 for no heads, 1 for one head and two tails, 2 for two
heads and one tail, or 3 for three heads
⇒ X = 0, 1, 2 and 3
The probability of no heads, i.e. P(TTT), can also be written as
$P(X = 0) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$
(Recall multiplication theorem of probability from class XI)
The probability of obtaining one head and two tails, i.e. P(HTT, THT, TTH), is denoted by
$P(X = 1) = 3 \times \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{3}{8}$
The probability of obtaining two heads and one tail, i.e. P(HHT, HTH, THH), is denoted by
$P(X = 2) = 3 \times \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{3}{8}$
And the probability of obtaining three heads, i.e. P(HHH), is written as
$P(X = 3) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$
Therefore, the probability distribution table is:
X        | 0   | 1   | 2   | 3
P(X = x) | 1/8 | 3/8 | 3/8 | 1/8
4.3 MATHEMATICAL EXPECTATION OF DISCRETE PROBABILITY DISTRIBUTION
Recall that the mean is a measure of central tendency: it locates the middle
or average value of a random variable in an experiment.
Definition: In an experiment, let X be a random variable whose possible finite values $x_1, x_2,
x_3, \ldots, x_n$ occur with probabilities $p_1, p_2, p_3, \ldots, p_n$ respectively, such that $\sum p_i = 1$.
Then the mathematical expectation is the weighted average of the possible values of X, given by:
$E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3 + \cdots + x_n p_n = \sum_{i=1}^{n} x_i p_i$
In a nutshell, the mathematical expectation, also known as expected value for a random variable X
is the summation of product of all possible values for the given random variable X and their
respective probabilities.
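The weighted-average formula translates directly into code. The sketch below assumes a distribution is stored as a dictionary mapping each value $x_i$ to its probability $p_i$ (a representation chosen here, not taken from the text) and applies it to the number of heads in three tosses of a coin from Example 2.

```python
from fractions import Fraction

def expectation(distribution):
    """E(X) = sum of x_i * p_i over all possible values x_i."""
    return sum(x * p for x, p in distribution.items())

# Number of heads in three tosses of a fair coin (Example 2)
heads_in_three_tosses = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}

print(expectation(heads_in_three_tosses))  # 3/2
```

The result, 3/2, is not itself a possible value of X, which shows that the expectation is a long-run average rather than an outcome.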
Example 3
A coin is tossed twice and outcomes are recorded. Prepare the probability distribution table for
random variable X which represents the number of heads in the experiment, Also calculate the
mathematical expectation of X.
Solution: Let a random variable X denote the number of heads in two tosses of a coin
⇒ the sample space S = {HH, HT, TH, TT}
Clearly X = 0, 1 and 2
The probability of occurrence of a head = probability of occurrence of a tail = 1/2
The probability distribution table:
X               | 0   | 1      | 2
Sample event    | TT  | HT, TH | HH
P(X = x_i) = p_i | 1/4 | 1/2    | 1/4
x_i p_i         | 0   | 1/2    | 1/2
Note that $\sum p_i = 1$
Therefore, $E(X) = \sum_{i=1}^{n} x_i p_i = 0 + \frac{1}{2} + \frac{1}{2} = 1$
Example 4
In a manufacturing unit inspection, from a lot of 20 baskets which include 6 defectives, a sample
of 2 baskets is drawn at random without replacement. Prepare the probability distribution of the
number of defective baskets. Also calculate E(X) for the random variable X.
Solution:
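As X denotes the number of defective baskets in a draw of 2 without replacement
⇒ X = 0, 1 and 2
Therefore, in a draw of two baskets:
X         | 0                    | 1                      | 2
Event     | No defective baskets | One defective basket   | Two defective baskets
P(X = x_i) = p_i | 14/20 × 13/19 = 182/380 | 2 × 6/20 × 14/19 = 168/380 | 6/20 × 5/19 = 30/380
x_i p_i   | 0                    | 168/380                | 60/380
Note that $\sum p_i = 1$
Therefore, $E(X) = \sum_{i=1}^{n} x_i p_i = 0 + \frac{168}{380} + \frac{60}{380} = \frac{228}{380} = 0.6$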
The expectation E(X) of a random variable X is the theoretical mean of X. It is not based on sample data
but on the distribution of X.
So, the mean expectation value is a parameter and not a statistic.
Sometimes it is also represented by the Greek letter mu (μ).
Also, random variables with different probability distributions can have equal means. Let us take
an example to study this statement in detail.
Have a look at the probability distributions of two different random variables X and Y as given below:
X               | 1   | 2   | 3   | 4
P(X = x_i) = p_i | 1/7 | 3/7 | 2/7 | 1/7
x_i p_i         | 1/7 | 6/7 | 6/7 | 4/7
Here, $E(X) = \sum_{i=1}^{n} x_i p_i = \frac{17}{7}$
Y               | …    | 4   | 5
P(Y = y_i) = p_i | …    | 3/7 | 1/7
y_i p_i         | -2/7 | …
Here, $E(Y) = \sum_{i=1}^{n} y_i p_i$
Clearly the random variables X and Y with different probability distributions can have equal
means. In such cases, we need a technique to check variability and extent to which the values of
random variable are spread out.
4.4 VARIANCE OF DISCRETE PROBABILITY DISTRIBUTION
While the mean is a measure of central tendency, known as the average of a group of data, the variance
measures the average degree to which each number in the data differs from the mean value.
The size of the variance is related to the size of the overall range of the given sample.
Variance enables us to study the variability of a random variable about the mean expectation.
When there is a narrow range among the sample elements in a given sample space, the values
of the random variable are close to the mean expectation and hence the variance is small.
And when there is a wide range among the sample elements, the values of the random
variable are far from the mean expectation and thus the variance is high.
Basically, the variance measures the average degree to which each sample element differs from
the mean of the sample space.
In a probability distribution of a discrete random variable X, the variance, denoted by Var(X), is the
summation of the products of the squared deviations of $x_i$ from the mean E(X) and the corresponding
probabilities $p_i$.
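Definition: Let X be a discrete random variable whose possible finite values $x_1, x_2, x_3, \ldots, x_n$ occur
with probabilities $p_1, p_2, p_3, \ldots, p_n$ respectively.
Then $\sigma_x^2 = \mathrm{Var}(X) = \sum_{i=1}^{n} x_i^2 p_i - \left( \sum_{i=1}^{n} x_i p_i \right)^2$
In other words, $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$ where $E(X^2) = \sum_{i=1}^{n} x_i^2 p_i$
And the standard deviation, denoted by $\sigma_x$, is given by: $\sigma_x = \sqrt{\mathrm{Var}(X)}$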
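The shortcut $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$ can be checked against the definition as the probability-weighted sum of squared deviations. The sketch below does so for a fair die, an example chosen here for illustration; the function names and the dictionary representation of a distribution are likewise assumptions of this sketch.

```python
from fractions import Fraction
from math import sqrt

def expectation(distribution):
    """E(X) = sum of x_i * p_i."""
    return sum(x * p for x, p in distribution.items())

def variance(distribution):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean = expectation(distribution)
    mean_of_squares = sum(x**2 * p for x, p in distribution.items())
    return mean_of_squares - mean**2

def variance_from_deviations(distribution):
    """Var(X) as the weighted sum of squared deviations from the mean."""
    mean = expectation(distribution)
    return sum((x - mean) ** 2 * p for x, p in distribution.items())

# A fair die: each face 1..6 has probability 1/6 (illustrative example)
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expectation(die))            # 7/2
print(variance(die))               # 35/12
print(sqrt(variance(die)))         # standard deviation as a decimal
```

Both variance functions return the same value, which is why the shorter formula is usually preferred for hand calculation.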
Example 5
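A class XII has 20 students whose marks (out of 30) are 14, 17, 25, 14, 21, 17, 17, 19, 18, 26, 18,
17, 17, 26, 19, 21, 21, 25, 14 and 19. Let the random variable X denote the marks of a selected
student, given that each student is equally likely to be selected.
a) Prepare the probability distribution of the random variable X.
b) Find mean, variance and standard deviation of X.
Solution: Based on the given data, let us prepare a table.
As the selection of each student is equally likely,
P(a given student is selected) = 1/20
Therefore, the probability distribution is:
Marks x_i          | 14   | 17   | 18   | 19   | 21   | 25   | 26
Number of students | 3    | 5    | 2    | 3    | 3    | 2    | 2
P(X = x_i) = p_i   | 3/20 | 5/20 | 2/20 | 3/20 | 3/20 | 2/20 | 2/20
Here, $E(X) = \sum x_i p_i = \frac{42}{20} + \frac{85}{20} + \frac{36}{20} + \frac{57}{20} + \frac{63}{20} + \frac{50}{20} + \frac{52}{20} = \frac{385}{20} = 19.25$
And, $E(X^2) = \sum x_i^2 p_i = \frac{588}{20} + \frac{1445}{20} + \frac{648}{20} + \frac{1083}{20} + \frac{1323}{20} + \frac{1250}{20} + \frac{1352}{20} = \frac{7689}{20}$
⇒ $\mathrm{Var}(X) = \sum x_i^2 p_i - \left( \sum x_i p_i \right)^2 = \frac{7689}{20} - \left( \frac{385}{20} \right)^2$
$= \frac{7689}{20} - \frac{148225}{400}$
$= \frac{153780 - 148225}{400}$
$= \frac{5555}{400} \approx 13.89$
And standard deviation, $\sigma_x = \sqrt{\mathrm{Var}(X)} = \sqrt{13.89} \approx 3.73$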
Example 6
Let X denote the number of hours a person watches television during a randomly selected day. The
probability that X can take the values $x_i$ has the following form, where k is some unknown constant.
$P(X = x_i) = \begin{cases} 0.2, & \text{if } x_i = 0 \\ k x_i, & \text{if } x_i = 1 \text{ or } 2 \\ 2k, & \text{if } x_i = 3 \end{cases}$
a) Find the value of k.
b) What is the probability that the person watches two hours of television on a selected day?
c) What is the probability that the person watches at least two hours of television on a selected day?
d) What is the probability that the person watches at most 2 hours of television on a selected day?
e) Calculate mathematical expectation
f) Find variance and standard deviation of random variable X
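Solution:
x_i | 0   | 1 | 2  | 3
p_i | 0.2 | k | 2k | 2k
a) As $\sum p_i = 1$
⇒ 0.2 + k + 2k + 2k = 1
⇒ 5k = 0.8
⇒ k = 4/25
b) Probability that the person watches two hours of television
= P(X = 2)
= 2k = 2 × 4/25 = 8/25
c) Probability that the person will watch at least two hours of television
= P(X = 2, 3)
= 2k + 2k = 4k = 4 × 4/25 = 16/25
d) Probability that the person will watch at most two hours of television
= P(X = 0, 1, 2)
= 0.2 + k + 2k = 0.2 + 3k = 0.2 + 3 × 4/25 = 17/25
e)
x_i              | 0         | 1    | 2     | 3
P(X = x_i) = p_i | 0.2 = 1/5 | 4/25 | 8/25  | 8/25
x_i p_i          | 0         | 4/25 | 16/25 | 24/25
x_i^2 p_i        | 0         | 4/25 | 32/25 | 72/25
$E(X) = \sum x_i p_i = 0 + \frac{4}{25} + \frac{16}{25} + \frac{24}{25} = \frac{44}{25} = 1.76$
f) $\sum x_i^2 p_i = 0 + \frac{4}{25} + \frac{32}{25} + \frac{72}{25} = \frac{108}{25}$
⇒ $\mathrm{Var}(X) = \sum x_i^2 p_i - \left( \sum x_i p_i \right)^2 = \frac{108}{25} - \left( \frac{44}{25} \right)^2 = \frac{108}{25} - \frac{1936}{625}$
$= \frac{2700 - 1936}{625} = \frac{764}{625} \approx 1.22$
And standard deviation, $\sigma_x = \sqrt{\mathrm{Var}(X)} = \sqrt{1.22} \approx 1.1$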
4.5 BINOMIAL DISTRIBUTION
When you toss a coin, it either shows a ‘head’ or a ‘tail’. When you are asked to calculate 3 +
4, the answer is either ‘correct’ or ‘incorrect’. In such similar experiments the likely outcome is either
a ‘success’ or a ‘failure’.
In the case of discrete random variable X denoting a prime
number on throw of a die, we can say that numbers 2, 3 and 5
will be considered as ‘success’, while 1, 4 and 6 will be counted as
‘failure’ in the experiment.
Each time you roll a die or perform any experiment in
probability, it is called a trial. If in an experiment, a die is rolled
thrice, then the number of trials is counted as 3, each trial having
exactly two outcomes, namely, success or failure.
Independent trials which have only two outcomes, usually referred to as 'success' and 'failure', are
called Bernoulli trials. Here, the probabilities of success and failure remain the same from trial to trial.
Definition: In a random experiment, a collection of trials is called Bernoulli trials, if:
a) The number of trials is finite.
b) The trials are independent by nature.
c) Each trial has exactly two outcomes defined as success and failure.
d) The probability of success remains the same in each trial.
Recall example 4 where the random variable X denotes the number of defective baskets in a draw
of two baskets without replacement.
What would happen if 5 baskets are drawn with replacement?
⇒ X = 0, 1, 2, 3, 4 and 5
In this case, drawing a defective basket will be considered a success, with its probability usually
denoted by p, and drawing a non-defective basket will be a failure, with its probability denoted by q.
Also, the draw of a basket will be called a trial and, since we are drawing 5 baskets, the number
of trials is 5.
Do you think that these trials qualify as Bernoulli trials?
Since the probability of success remains the same in all the trials,
here $p = \frac{6}{20}$ and $q = 1 - \frac{6}{20} = \frac{14}{20}$,
hence we can say these are Bernoulli trials.
When the drawing is done without replacement, the probability of success (i.e., drawing a defective
basket) in the first trial is $\frac{6}{20}$; in the 2nd trial it will be $\frac{5}{19}$ if the first basket drawn
was defective (and $\frac{6}{19}$ otherwise), and so on.
Clearly, the probability of success is not the same for all trials, hence the trials in example 4 are not
Bernoulli trials.
Probability of 'r' successes in 'n' Bernoulli trials is given by:
$P(r \text{ successes}) = {}^{n}C_{r}\, p^{r} q^{n-r} = \frac{n!}{r!\,(n-r)!}\, p^{r} q^{n-r}$
Where n = number of trials
r = number of successful trials = 0, 1, 2, 3, ..., n
p = probability of a success in a trial
q = probability of a failure in a trial
And, p + q = 1
Clearly, P('r' successes) is the (r + 1)th term in the binomial expansion of $(q + p)^n$.
The probability distribution of the number of successes for a random variable X can be written as:
X            | 0 | 1 | 2 | 3 | ... | r | ... | n
P(X = r) = p_r | ${}^{n}C_{0}\, p^{0} q^{n}$ | ${}^{n}C_{1}\, p^{1} q^{n-1}$ | ${}^{n}C_{2}\, p^{2} q^{n-2}$ | ${}^{n}C_{3}\, p^{3} q^{n-3}$ | ... | ${}^{n}C_{r}\, p^{r} q^{n-r}$ | ... | ${}^{n}C_{n}\, p^{n} q^{0}$
This probability distribution is called the Binomial distribution with parameters n and p.
The binomial distribution with n Bernoulli trials and success probability p is also denoted by B(n, p)
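The formula ${}^{n}C_{r}\, p^{r} q^{n-r}$ can be evaluated with Python's built-in `math.comb`. The following is a minimal sketch; the function names `binomial_pmf` and `binomial_distribution` are chosen here for illustration, and B(3, 1/2) is used as the test case because it should reproduce the table for three coin tosses in Example 2.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r, n, p):
    """P(r successes in n Bernoulli trials) = nCr * p^r * q^(n-r)."""
    q = 1 - p
    return comb(n, r) * p**r * q ** (n - r)

def binomial_distribution(n, p):
    """Full probability distribution table of B(n, p) as a dictionary."""
    return {r: binomial_pmf(r, n, p) for r in range(n + 1)}

# B(3, 1/2): number of heads in three tosses of a fair coin
dist = binomial_distribution(3, Fraction(1, 2))
for r, prob in dist.items():
    print(r, prob)

# The mean of B(n, p) equals n * p
print(sum(r * prob for r, prob in dist.items()))  # 3/2
```

Passing `p` as a `Fraction` keeps every probability exact; passing a float such as 0.5 works as well and returns decimals.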
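Example 7:
Prepare the Binomial distribution $B\left(4, \frac{2}{3}\right)$
Solution: Here total number of trials = n = 4 and $p = \frac{2}{3}$
As p + q = 1 ⇒ $q = \frac{1}{3}$
Now, number of successes = r = 0, 1, 2, 3 or 4
The binomial distribution can be given as:
X        | 0 | 1 | 2 | 3 | 4
P(X = r) | ${}^{4}C_{0} \left(\frac{2}{3}\right)^{0} \left(\frac{1}{3}\right)^{4} = \frac{1}{81}$ | ${}^{4}C_{1} \left(\frac{2}{3}\right)^{1} \left(\frac{1}{3}\right)^{3} = \frac{8}{81}$ | ${}^{4}C_{2} \left(\frac{2}{3}\right)^{2} \left(\frac{1}{3}\right)^{2} = \frac{24}{81}$ | ${}^{4}C_{3} \left(\frac{2}{3}\right)^{3} \left(\frac{1}{3}\right)^{1} = \frac{32}{81}$ | ${}^{4}C_{4} \left(\frac{2}{3}\right)^{4} \left(\frac{1}{3}\right)^{0} = \frac{16}{81}$
Now let us calculate the mean expectation, variance and standard deviation for example 7.
Recall that $E(X) = \sum x_i p_i = 0 \times \frac{1}{81} + 1 \times \frac{8}{81} + 2 \times \frac{24}{81} + 3 \times \frac{32}{81} + 4 \times \frac{16}{81} = \frac{216}{81} \approx 2.67$
Also see that $np = 4 \times \frac{2}{3} \approx 2.67$
⇒ $\mathrm{Var}(X) = \sum x_i^2 p_i - \left( \sum x_i p_i \right)^2 = \left( 0 + \frac{8}{81} + \frac{96}{81} + \frac{288}{81} + \frac{256}{81} \right) - \left( \frac{216}{81} \right)^2 = \frac{5832}{6561} \approx 0.89$
Also see that $npq = 4 \times \frac{2}{3} \times \frac{1}{3} = \frac{8}{9} \approx 0.89$
And standard deviation = $\sqrt{\mathrm{Var}(X)} = \sqrt{\frac{8}{9}} \approx 0.94$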
Example 8
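If a fair coin is tossed 9 times, find the probability of
a) Exactly five tails
b) At least five tails
c) At most five tails
Solution:
Repeated tosses of a fair coin qualify as Bernoulli trials.
Let X denote the number of tails in an experiment of 9 such trials; hence X follows the binomial distribution.
Here, n = 9, $p = \frac{1}{2}$ and $q = \frac{1}{2}$
As $P(r \text{ successes}) = {}^{n}C_{r}\, p^{r} q^{n-r}$
a) Probability of exactly 5 successes in 9 trials = $P(X = 5) = {}^{9}C_{5}\, p^{5} q^{4}$
$= {}^{9}C_{5} \left(\frac{1}{2}\right)^{5} \left(\frac{1}{2}\right)^{4} = \frac{126}{512} = \frac{63}{256}$
b) Probability of at least 5 successes in 9 trials = $P(X \geq 5)$
$= \left(\frac{1}{2}\right)^{9} \left[ {}^{9}C_{5} + {}^{9}C_{6} + {}^{9}C_{7} + {}^{9}C_{8} + {}^{9}C_{9} \right] = \frac{126 + 84 + 36 + 9 + 1}{512} = \frac{256}{512} = \frac{1}{2}$
c) Probability of at most 5 successes in 9 trials = $P(X \leq 5) = 1 - P(X > 5)$
$= 1 - \left(\frac{1}{2}\right)^{9} \left[ {}^{9}C_{6} + {}^{9}C_{7} + {}^{9}C_{8} + {}^{9}C_{9} \right]$
$= 1 - \frac{130}{512} = \frac{382}{512} = \frac{191}{256}$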
Example 9
In a manufacturing unit inspection, from a lot of 20 baskets which include 6 defectives, a sample
of 2 baskets is drawn at random with replacement. Prepare the binomial distribution of the number
of defective baskets. Also find E(X) and Var(X) for the random variable X
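Solution: Here, X denotes the number of defective baskets in a draw of 2 baskets with replacement.
Clearly, the trials are Bernoulli trials
And X = 0, 1 and 2
Also number of trials = n = 2
If drawing a defective basket is considered a success,
then $p = \frac{6}{20} = \frac{3}{10}$ and $q = \frac{7}{10}$
X         | 0                    | 1                    | 2
Event     | No defective baskets | One defective basket | Two defective baskets
P(X = x_i) = p_i | ${}^{2}C_{0} \left(\frac{3}{10}\right)^{0} \left(\frac{7}{10}\right)^{2} = \frac{49}{100}$ | ${}^{2}C_{1} \left(\frac{3}{10}\right)^{1} \left(\frac{7}{10}\right)^{1} = \frac{42}{100}$ | ${}^{2}C_{2} \left(\frac{3}{10}\right)^{2} \left(\frac{7}{10}\right)^{0} = \frac{9}{100}$
$E(X) = np = 2 \times \frac{3}{10} = 0.6$ and $\mathrm{Var}(X) = npq = 2 \times \frac{3}{10} \times \frac{7}{10} = 0.42$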
Example 10
The probability that Rohit will hit a shooting target is $\frac{2}{3}$. While preparing
for an international shooting competition, Rohit aims to achieve a
probability of hitting the target at least once of 0.99. What is the
minimum number of shots he must take to attain this probability?
Solution: Let the number of shots Rohit takes at the target be n.
Here, the trials are Bernoulli, with p the probability of success (hitting
the target) $= \frac{2}{3}$ and q the probability of failure (missing the target) $= 1 - p = \frac{1}{3}$
As Rohit wants to hit the target at least once with a probability of 0.99
⇒ $P(r = 1, 2, 3, \ldots, n) \geq 0.99$
⇒ $1 - P(r = 0) \geq 0.99$
As $P(r \text{ successes}) = {}^{n}C_{r} \left(\frac{2}{3}\right)^{r} \left(\frac{1}{3}\right)^{n-r}$
⇒ $1 - {}^{n}C_{0} \left(\frac{2}{3}\right)^{0} \left(\frac{1}{3}\right)^{n} \geq 0.99$
⇒ $1 - 0.99 \geq \left(\frac{1}{3}\right)^{n}$
⇒ $0.01 \geq \left(\frac{1}{3}\right)^{n}$
⇒ $100 \leq 3^{n}$
As $3^4 = 81 < 100$ and $3^5 = 243 > 100$ ⇒ $n \geq 5$
⇒ Rohit should shoot at the target at least 5 times to achieve his aim.
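The same answer can be found by direct search: increase n until $1 - q^n$ reaches the required probability. The sketch below assumes the values of Example 10 (p = 2/3, required probability 0.99); the function name `min_trials` is hypothetical.

```python
from fractions import Fraction

def min_trials(p, target):
    """Smallest n such that P(at least one success in n trials) >= target."""
    if not 0 < p <= 1 or not 0 <= target < 1:
        raise ValueError("need 0 < p <= 1 and 0 <= target < 1")
    q = 1 - p
    n = 1
    # P(at least one success) = 1 - P(no success) = 1 - q^n
    while 1 - q**n < target:
        n += 1
    return n

print(min_trials(Fraction(2, 3), Fraction(99, 100)))  # 5
```

With four shots the probability is 1 - 1/81, which is just below 0.99, so the fifth shot is needed.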
Example 11
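Sonal and Anannya are playing a game by throwing a die alternately till one of them gets a '1'
and wins the game. Find their respective probabilities of winning, if Sonal starts first.
Solution: Clearly the trials are Bernoulli trials with n → ∞
Getting a 1 on a single throw of the die is considered a success
⇒ $p = \frac{1}{6}$ and $q = 1 - p = \frac{5}{6}$
Sonal starts the game by throwing the die first
P(Sonal wins in the first throw) = $\frac{1}{6}$
When will Sonal get a chance to win next?
Sonal will get to try winning in the third throw, when Sonal fails in the first throw and then Anannya
fails to win in the second throw
P(Sonal wins in the third throw) = $\left(\frac{5}{6}\right)^{2} \times \frac{1}{6}$
The next time Sonal can win is the fifth throw
P(Sonal wins in the fifth throw) = $\left(\frac{5}{6}\right)^{4} \times \frac{1}{6}$ and so on
Hence, P(Sonal will win) = P(in first throw) + P(in third throw) + P(in fifth throw) + ...
$= \frac{1}{6} + \left(\frac{5}{6}\right)^{2} \times \frac{1}{6} + \left(\frac{5}{6}\right)^{4} \times \frac{1}{6} + \cdots$
It is an infinite geometric series with first term $a = \frac{1}{6}$ and common ratio $r = \left(\frac{5}{6}\right)^{2} = \frac{25}{36}$
⇒ P(Sonal will win) = $\frac{a}{1 - r} = \frac{1/6}{1 - 25/36} = \frac{6}{11}$
And P(Anannya will win) = $1 - \frac{6}{11} = \frac{5}{11}$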
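The infinite geometric series for Sonal's winning probability has first term $a = \frac{1}{6}$ and common ratio $r = \left(\frac{5}{6}\right)^2$, so its sum is $\frac{a}{1-r}$. The sketch below evaluates this closed form exactly and compares it with a partial sum of the series; the variable names and the choice of 50 terms are illustrative.

```python
from fractions import Fraction

p = Fraction(1, 6)   # probability of rolling a 1 on a single throw
q = 1 - p            # probability of not rolling a 1

a = p                # first term: Sonal wins on the first throw
r = q**2             # common ratio: both players miss once before Sonal's next turn

sonal = a / (1 - r)  # sum of the infinite geometric series a + ar + ar^2 + ...
anannya = 1 - sonal  # the game ends with probability 1, so the two add to 1

# Partial sum of the first 50 terms, for comparison with the closed form
partial = sum(a * r**k for k in range(50))

print(sonal, anannya)   # 6/11 5/11
print(float(partial))
```

The player who throws first has the advantage, even though the die is fair.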
Example 12
A die is thrown again and again until three 5s are obtained. Find the probability of obtaining the
third 5 in the seventh throw of the die.
Solution: Clearly the trials are Bernoulli trials with n = 6
P(a 5 on a single throw of the die) ⇒ $p = \frac{1}{6}$ and $q = 1 - p = \frac{5}{6}$
For finding the probability of the third 5 in the seventh throw of the die, we know that there must
have been two 5s in the previous six throws
⇒ P(third 5 on seventh throw of die) = P(two 5s in six throws) × P(a 5 on the next single throw
of die)
$= {}^{6}C_{2} \left(\frac{1}{6}\right)^{2} \left(\frac{5}{6}\right)^{4} \times \frac{1}{6} = 15 \times \frac{1}{36} \times \frac{625}{1296} \times \frac{1}{6} = \frac{3125}{93312}$
4.6 POISSON DISTRIBUTION
Let us consider the car sales of a car dealer
showroom X in a city, on a given day.
Do you think that the number of car sales on a
given day will make for a random variable?
Assume that each car sale is an independent
event, meaning that one car sale gives no
information about when the next sale will happen,
and that the probability of one car sale in a given length
of time does not change over time.
Theoretically, the rate at which the car sales are occurring is not changing through time.
Therefore, we can conclude that the events defined as car sales in such a case are occurring
randomly and independently.
Based on these conditions, a random variable X, representing the number of events in a given
length of time, has a Poisson distribution.
A discrete probability distribution that expresses the probability of a given number of events
occurring over a fixed period of time or space is called a Poisson distribution if
1. The events occur with a known constant mean rate
2. The events are independent of the time since the occurrence of the previous event
3. The rate of occurrence of events is constant and not based on time
4. The probability of an event is proportional to the length of the period of time
The Poisson distribution can also be used for the number of events in other specified intervals
such as distance, area or volume.
DEFINITION: Let X be the discrete random variable which represents the number of occurrences
of events over a period of time.
If X follows the Poisson distribution, then the probability of occurrence of 'k' events
over a period of time is given by:
$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$
Where e is Euler's number (e = 2.71828...)
With existence conditions:
k is the number of occurrences of the event such that k = 0, 1, 2, ...
And λ = E(X) = Var(X) is a positive real number
Formula courtesy: https://en.wikipedia.org/wiki/Poisson_distribution
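The Poisson formula needs only the exponential function and a factorial, both available in Python's `math` module. The sketch below is a minimal illustration; the function names and the rate λ = 2 used in the usage lines are choices made here, not taken from the text.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) = lam^k * e^(-lam) / k!  for k = 0, 1, 2, ..."""
    return lam**k * exp(-lam) / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k): add up the probabilities of 0, 1, ..., k events."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 2.0  # illustrative mean rate
for k in range(5):
    print(k, round(poisson_pmf(k, lam), 4))

# "At least" questions use the complement: P(X >= 3) = 1 - P(X <= 2)
print(1 - poisson_cdf(2, lam))
```

Because k has no upper limit, probabilities of the form "k or more" are always computed through the complement, as in the last line.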
A restaurant is doing booming business. It was recorded that during their peak business time,
an average of 30 customers per hour arrive at the restaurant. Can we develop a Poisson probability
distribution model for the arrivals of customers, if 30 customers arrive in an interval of 1 hour on
an average?
You might say that the arrival average is 1 customer every 2 minutes.
But the thing to remember here is that the arrival time of each customer is random and hence this
approach is inappropriate.
Let us try another approach and divide the hour into one-minute intervals
so that a customer arrival is equally likely in each interval.
During each minute, let us consider that a customer may arrive in the middle of that time interval.
As the probability for a customer to arrive in a given minute is $\frac{1}{2}$ ⇒ this is going to be a binomial distribution $B\left(60, \frac{1}{2}\right)$
Thus the process will average at E(X) = np = 60 × $\frac{1}{2}$ = 30 arrivals during an hour.
But then again, we cannot assume that the customers are arriving at a uniform pace and at regular
time intervals.
What if we divide the time interval into seconds and consider that the probability for a customer to
arrive is not equally likely but biased at $\frac{1}{120}$? In such a case, the binomial distribution will be
$B\left(3600, \frac{1}{120}\right)$
And the process will average at E(X) = np = 3600 × $\frac{1}{120}$ =
30 arrivals during an hour.
To summarize this process, we can say that:
a) as n → ∞, the time intervals get smaller and smaller and p gets smaller than before, while E(X) =
np is kept constant at 30 customers per hour
b) In the limit as n → ∞, the number of customers arriving during an hour is ~ Poisson (30)
c) With the width $w = \frac{1}{n}$ of an hour and arrival rate λ = 30 customers per hour:
i) Probability of an arrival during the interval = $\lambda w = 30 \times \frac{1}{n} = \frac{30}{n}$
ii) Probability of more than one arrival during a time interval is 0
iii) Probability of an arrival during a time interval is independent of the previous arrivals
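The limiting argument can be observed numerically: keep np fixed at 30, let n grow, and compare the binomial probability of a given number of arrivals with the Poisson(30) probability. The sketch below does this for exactly 30 arrivals in the hour; the choice k = 30 and the third value of n are illustrative assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(k successes in n Bernoulli trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 30   # average arrivals per hour
k = 30     # number of arrivals whose probability we compare

errors = []
for n in (60, 3600, 360000):   # minutes, seconds, hundredths of a second
    b = binomial_pmf(k, n, lam / n)
    errors.append(abs(b - poisson_pmf(k, lam)))
    print(n, b, poisson_pmf(k, lam))
```

As n increases the binomial value moves towards the Poisson value, which is the sense in which B(n, 30/n) tends to Poisson(30).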
Example 13
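As the story goes, the Prussian army monitored 10 cavalry corps over a period of 20 years. The
number of observations (one per corps per year) with 'k' recorded deaths due to horse-kick is as shown in the table:
k                      | 0   | 1  | 2  | 3 | 4 | Total
Number of observations | 109 | 65 | 22 | 3 | 1 | 200
Does this data provide an adequate description of a Poisson distribution?
Solution:
Here $E(X) = \frac{0 \times 109 + 1 \times 65 + 2 \times 22 + 3 \times 3 + 4 \times 1}{200} = \frac{122}{200} = 0.61$
As k = 0, 1, 2, 3, 4
By the Poisson distribution formula: $P(X = k) = \frac{0.61^{k} e^{-0.61}}{k!}$
Then the Poisson model progression will be
$P(k = 0) = \frac{0.61^{0} e^{-0.61}}{0!} \approx 0.54$
$P(k = 1) = \frac{0.61^{1} e^{-0.61}}{1!} \approx 0.33$
$P(k = 2) = \frac{0.61^{2} e^{-0.61}}{2!} \approx 0.10$
$P(k = 3) = \frac{0.61^{3} e^{-0.61}}{3!} \approx 0.02$
$P(k = 4) = \frac{0.61^{4} e^{-0.61}}{4!} \approx 0.003$
Now the Poisson prediction will be: 200 × P(k)
k                        | 0     | 1    | 2    | 3   | 4
Observed                 | 109   | 65   | 22   | 3   | 1
Predicted (200 × P(k))   | 108.7 | 66.3 | 20.2 | 4.1 | 0.6
Yes, the Poisson predictions are adequate for the given data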
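The comparison between the recorded counts and the Poisson model can be reproduced in a few lines. The sketch below takes the observed frequencies from the table of Example 13, estimates the mean from them, and multiplies each Poisson probability by the 200 observations; the variable names are chosen here for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return lam**k * exp(-lam) / factorial(k)

# Observed number of corps-years with k deaths (Example 13)
observed = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}

total = sum(observed.values())                          # 200 observations
mean = sum(k * f for k, f in observed.items()) / total  # 122 / 200 = 0.61

# Expected frequency under the Poisson model: total * P(X = k)
predicted = {k: total * poisson_pmf(k, mean) for k in observed}

for k in observed:
    print(k, observed[k], round(predicted[k], 1))
```

The predicted frequencies stay within about two observations of the recorded ones for every k, which is the basis for calling the model adequate.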
Example 14
A traffic engineer records the number of bicycle riders that use a particular cycle track. He records
that an average of 3.2 bicycle riders use the cycle track every hour. Given that the number of bicycles
that use the cycle track follows a Poisson distribution, what is the probability that:
a) 2 or fewer bicycle riders will use the cycle track within an hour?
b) 3 or more bicycle riders will use the cycle track within an hour?
Also write the mean expectation and variance for the random variable X
Solution:
For this problem, E(X) = Var(X) = λ = 3.2
a) The goal is to find $P(X \leq 2)$
As $P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$
⇒ $P(X = 0) = \frac{3.2^{0} e^{-3.2}}{0!} \approx 0.041$
$P(X = 1) = \frac{3.2^{1} e^{-3.2}}{1!} \approx 0.13$
$P(X = 2) = \frac{3.2^{2} e^{-3.2}}{2!} \approx 0.21$
Therefore, $P(X \leq 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.041 + 0.13 + 0.21 = 0.381$
b) The goal is to find $P(X \geq 3)$
The probability that there are 3 or more bicycle riders using the track within an hour has no
upper limit on the value of 'k', which means that this probability cannot be calculated directly.
But, using the rule of complement we can say that
$P(X \geq 3) = 1 - P(X \leq 2) = 1 - 0.381 = 0.619$
In the given Poisson distribution, E(X) = Var(X) = λ = 3.2
For the calculation of powers of Euler's number: http://eguruchela.com/math/calculator/e-power-x
Example 15
A particular river near a small town floods and overflows twice in every 10 years on an average.
Assuming that the Poisson distribution is appropriate, what is the mean expectation? Also calculate
the probability of 3 or fewer overflow floods in a 10-year interval.
Solution: As the average number of overflow floods in every 10 years is two
⇒ In the given Poisson distribution, the mean expectation is λ = 2
The goal is to find $P(X \leq 3)$
$P(X = 0) = \frac{2^{0} e^{-2}}{0!} \approx 0.14$
$P(X = 1) = \frac{2^{1} e^{-2}}{1!} \approx 0.27$
$P(X = 2) = \frac{2^{2} e^{-2}}{2!} \approx 0.27$
$P(X = 3) = \frac{2^{3} e^{-2}}{3!} \approx 0.18$
Therefore, $P(X \leq 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$
= 0.14 + 0.27 + 0.27 + 0.18 = 0.86
Example 16:
For a Poisson distribution model, the arrival rate of passengers at
an airport is recorded as 30 per hour on a given day. Find
a) The expected number of arrivals in the first 10 minutes
of an hour
b) The probability of exactly 4 arrivals in the first 10 minutes
of an hour
c) The probability of 4 or fewer arrivals in the first 10 minutes
of an hour
d) The probability of 10 or more arrivals in an hour given
that there are 8 arrivals in the first 10 minutes of that
hour
Solution:
a) As 10 minutes = $\frac{1}{6}$th of an hour
⇒ In the given Poisson distribution, X is defined as the number of arrivals in the first 10 minutes,
and $w = \frac{1}{6}$th of an hour is the width of the time interval
Here, λ = 30 as the number of arrivals is 30 per hour
Therefore, $E(X) = \lambda w = 30 \times \frac{1}{6} = 5$ for the first 10 minutes of the hour
and $P(X = k) = \frac{5^{k} e^{-5}}{k!}$ where k = 0, 1, 2, 3, ...
b) Probability of exactly 4 arrivals in the first 10 minutes of an hour = $P(k = 4) = \frac{5^{4} e^{-5}}{4!} \approx 0.175$
c) The probability of 4 or fewer arrivals in the first 10 minutes of an hour = $P(k \leq 4)$ = P(k =
0) + P(k = 1) + P(k = 2) + P(k = 3) + P(k = 4)
≈ 0.007 + 0.03 + 0.08 + 0.14 + 0.18 ≈ 0.44
d) We are given that there have been 8 arrivals in the first 10 minutes ($\frac{1}{6}$th of an hour)
And we need to find the probability of 10 or more arrivals in the hour
That means we need to find the probability of at least 10 - 8 = 2 arrivals in the last 60 - 10 = 50 minutes
($\frac{5}{6}$th of an hour)
Therefore, $E(X) = \lambda w = 30 \times \frac{5}{6} = 25$ for the last 50 minutes of the hour
And, in this case $P(X = k) = \frac{25^{k} e^{-25}}{k!}$ where k = 0, 1, 2, 3, ...
⇒ probability of 10 or more arrivals in an hour given that there are 8 arrivals in the first 10
minutes of that hour = P(k = 2, 3, 4, 5, ...)
$= 1 - P(k = 0, 1) = 1 - e^{-25}(1 + 25) \approx 1$
4.7 NORMAL DISTRIBUTION
In this module, the distributions discussed up till now are applicable when the random variable
is discrete by nature. In case of a continuous random variable like heights or weights, there are
infinitely many values between any two distinct values, so the total probability cannot be
distributed among the individual values.
Therefore, a continuous random variable X is defined in terms of its probability density function
f(x), also known as PDF.
A continuous random variable X is said to follow the normal distribution with constant
parameters μ = E(X) and Var(X) = σ², written as X ~ N(μ, σ²).
In such a case the probability density function f(x) is defined as:
$f(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}$ ... (i)
where $\mu \in (-\infty, \infty)$ is the mean of the normal distribution
σ > 0 is the standard deviation
such that $f(x) \geq 0, \forall x \in (-\infty, \infty)$
A continuous probability distribution whose density function has the form (i) is called a normal
distribution or Gaussian distribution (continuous random variables are discussed in 4.2.1). A
random variable with a Gaussian distribution is said to be normally distributed, and is called a normal
deviate.
In the normal distribution function given by (i), the curve known as the probability curve is bell-
shaped with one peak point.
The normal distribution is used in the cases
where we need to make inferences by taking
random samples; and distribution of random
variable is not known. This type of distribution is
applied to fit the actual observed frequency
distribution on many phenomena like weights and
heights
A normal distribution has key features that are easy to spot in graphs:
1. The mean, median and mode of the distribution are exactly the same.
2. The bell-shaped probability curve has one peak point, which means that the normal distribution
has a unique mode.
3. The curve f(x), ∀ x ∈ (−∞, ∞), has two tails that extend on both sides and never touch the
axis. The line through x = μ divides the normal curve into two equal parts, which means that the
normal curve is symmetrical about x = μ: half the values fall below the mean and half above the mean.
4. In a normal distribution curve the total area below the curve is always equal to 1 unit; i.e.,
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
5. The distribution can be described by two values: the mean and the standard deviation.
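The features listed above can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `normal_pdf` and `area_under_pdf` are hypothetical, the integral is only approximated with the trapezoidal rule, and the values μ = 50 and σ = 3 are simply borrowed from the data set mentioned in section 4.7.1.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, following the normal PDF formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under_pdf(a, b, mu=0.0, sigma=1.0, steps=20000):
    """Trapezoidal approximation of the area under the density over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

# Illustrative parameters (an assumption for this sketch).
mu, sigma = 50.0, 3.0

# Ten standard deviations on each side stand in for (-infinity, infinity).
total_area = area_under_pdf(mu - 10 * sigma, mu + 10 * sigma, mu, sigma)
left_half = area_under_pdf(mu - 10 * sigma, mu, mu, sigma)

print("total area:", round(total_area, 6))    # close to 1
print("area left of mean:", round(left_half, 6))  # close to 0.5
print("symmetric:", math.isclose(normal_pdf(mu - 2, mu, sigma),
                                 normal_pdf(mu + 2, mu, sigma)))
```

The total area comes out very close to 1 and the area to the left of the mean very close to 0.5, which matches features 3 and 4.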
Picture credits: https://en.wikipedia.org/wiki/Bean_machine
4.7.1 Standard Normal Distribution
As discussed above, X is a normal variate based on two parameters, namely the mean (μ) and the
standard deviation (σ).
But in a real-life situation, there can be a data set with a mean of 50 and a standard deviation of
3, while there can be another data set with a mean of 100 and a standard deviation of 5. How do
we compare such different normally distributed data sets?
4.7.2 Z-Score of Normal Distribution
When mean (μ) = 0 and standard deviation (σ) = 1 for a data set, the normal distribution
is called the standard normal distribution.
[Figure: Standard normal distribution]
Picture courtesy: https://www.scribbr.com/statistics/normal-distribution/
We make use of data by converting it into a standard normal variate. Every normal variate can be
converted to the standard normal distribution. To do so, we calculate the standard score or
Z-score for each data value in the normal variate, which lets us compare information because the
values are then on the same scale. This distribution is also called the Z-Distribution.
Basically, the Z-score in a standard normal distribution represents how far the data point is
from the mean (μ), measured in units of the standard deviation (σ).
How do we find the Z-Score?
Recall that a normal distribution function is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$. Let $Z = \frac{x-\mu}{\sigma}$.
In this case Z is called the Z-Score.
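Example 17:
Calculate the Z-Score for a normal distribution of the lengths of 7 rare species of Indian
butterfly that you have in your garden.
Butterfly      | 1 | 2 | 3 | 4 | 5 | 6 | 7
Length (in cm) | 2 | 2 | 3 | 2 | 5 | 1 | 6
Solution: Here mean μ = (2 + 2 + 3 + 2 + 5 + 1 + 6)/7 = 3
and $\sigma = \sqrt{\frac{\sum (x-\mu)^{2}}{n}} = \sqrt{\frac{20}{7}} \approx 1.69$
Butterfly           | 1       | 2       | 3      | 4       | 5      | 6       | 7
Length (in cm)      | 2       | 2       | 3      | 2       | 5      | 1       | 6
x − μ               | 2 − 3   | 2 − 3   | 3 − 3  | 2 − 3   | 5 − 3  | 1 − 3   | 6 − 3
Z-Score = (x − μ)/σ | −1/1.69 | −1/1.69 | 0/1.69 | −1/1.69 | 2/1.69 | −2/1.69 | 3/1.69
                    | = −0.59 | = −0.59 | = 0    | = −0.59 | = 1.18 | = −1.18 | = 1.78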
Notice that butterfly number 3 has Z-Score = 0, which means that this data point equals the mean
of the data.
Also note that the Z-score is positive if the data point lies above the mean, and negative if it lies
below the mean.
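The Z-scores of Example 17 can be reproduced with a short script. This is a minimal sketch using Python's standard `statistics` module; it assumes, as the worked solution does, that σ is the population standard deviation (dividing by n = 7). Because the script keeps σ unrounded (about 1.6903) instead of 1.69, the last Z-score comes out as about 1.77 rather than 1.78.

```python
import statistics

# Butterfly lengths in cm, taken from Example 17.
lengths = [2, 2, 3, 2, 5, 1, 6]

mu = statistics.fmean(lengths)      # arithmetic mean
sigma = statistics.pstdev(lengths)  # population standard deviation (divide by n)

# Z = (x - mu) / sigma for every data point.
z_scores = [(x - mu) / sigma for x in lengths]

print("mean =", mu, "sigma =", round(sigma, 2))
for number, (x, z) in enumerate(zip(lengths, z_scores), start=1):
    print(f"butterfly {number}: length = {x}, Z = {z:+.2f}")
```

The sign of each Z-score shows whether the butterfly is longer (positive) or shorter (negative) than the mean length.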
68-95-99.7 Rule
[Figure: normal curve for the 68-95-99.7 rule; horizontal axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ]
Picture courtesy: https://www.simplypsychology.org/normal-distribution.html
As shown in the graph above, if the data values in a normal distribution are converted into
Z-scores in a standard normal distribution, then the percentage of the data that fall within specific
numbers of standard deviations (σ) from the mean (μ) for a bell-shaped curve is constant:
1. Data points are symmetrical about the mean (μ).
2. The Z-score describes the position of each data point in terms of its distance from the mean, when
measured in standard deviation units.
3. The Z-score is positive if the data point lies above the mean, and negative if it lies below the
mean.
4. There is a 68.27% probability of randomly selecting a score between −1 and +1 standard
deviations from the mean.
⇒ $\int_{\mu-\sigma}^{\mu+\sigma} f(x)\,dx$ has probability 68.27%
where $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
5. There is a 95.45% probability of randomly selecting a score between −2 and +2 standard deviations
from the mean.
⇒ $\int_{\mu-2\sigma}^{\mu+2\sigma} f(x)\,dx$ has probability 95.45%
where $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
6. There is a 99.73% probability of randomly selecting a score between −3 and +3 standard deviations
from the mean.
⇒ $\int_{\mu-3\sigma}^{\mu+3\sigma} f(x)\,dx$ has probability 99.73%
where $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
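The three percentages of the 68-95-99.7 rule can be verified without a table. The sketch below is an illustration only; it relies on the standard identity P(−k < Z < k) = erf(k/√2) for a standard normal variate Z, which the text itself does not state, and the helper name `prob_within` is hypothetical.

```python
import math

def prob_within(k):
    """P(-k < Z < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * prob_within(k):.2f}%")
```

Since the Z-score removes μ and σ from the picture, the same three values hold for every normal distribution.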
Example 18:
Given that the mean of a normal variate X is 12 and the standard deviation is 4, find:
a) The Z-Score of data point 20
b) The data point if its Z-Score is 5
c) The data point if its Z-Score is −2
Solution:
a) As μ = 12, σ = 4 and x = 20,
Z = (x − μ)/σ = (20 − 12)/4 = 2
b) As μ = 12, σ = 4 and Z = 5,
Z = (x − μ)/σ ⇒ 5 = (x − 12)/4
⇒ 20 = x − 12
⇒ x = 32
c) As μ = 12, σ = 4 and Z = −2,
Then Z = (x − μ)/σ ⇒ −2 = (x − 12)/4
⇒ −8 = x − 12
⇒ x = 4
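Both directions of the conversion used in Example 18 fit in two one-line functions. This is a minimal sketch; the names `z_score` and `data_point` are hypothetical helpers chosen for illustration.

```python
def z_score(x, mu, sigma):
    """Z = (x - mu) / sigma."""
    return (x - mu) / sigma

def data_point(z, mu, sigma):
    """Invert the Z-score formula: x = mu + Z * sigma."""
    return mu + z * sigma

mu, sigma = 12, 4  # mean and standard deviation from Example 18

print(z_score(20, mu, sigma))     # part (a)
print(data_point(5, mu, sigma))   # part (b)
print(data_point(-2, mu, sigma))  # part (c)
```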
4.7.3 Z-Test for Normal Distribution
If a drug company announces one day that it has found a new drug that cures diabetes, you
would want to be sure how true its claim is. Various hypothesis tests are able to
tell you whether it is probably true or probably not true. Some of the popular hypothesis tests used in
probability distribution are the F-test, chi-square test, t-test and Z-test.
We are going to discuss one of these tests, used with normally distributed data sets. To use the
Z-test, we need to see that:
a) the sample size is greater than 30.
b) data points are independent of each other.
c) data are randomly selected from a population, where each data point is equally likely
to be selected.
d) sample sizes are equal if at all possible.
Let us now see how to use the Z-test in a given normal distribution of a data set.
Example 19:
In a district, the exam scores of 300 students of class XII are recorded at the end of the session.
a) Ramesh scored 800 marks in total out of 1000. The average score for the batch was 700 and
the standard deviation was calculated to be 180. Find out how Ramesh has scored compared
to his batch mates in the whole district.
b) Sudha scored 420 marks in the same batch. What can you say about her performance as
compared to the batch of 300 students?
c) How much has Abhay scored if he has done better than 44.83% of his batchmates?
Solution:
a) Firstly, we need to find Ramesh's Z-Score and use the respective Z-table before we determine
how well he has performed as compared to his batch mates.
As μ = 700, σ = 180 and x = 800,
Z = (x − μ)/σ ⇒ Z = (800 − 700)/180 ≈ 0.56
Once you have the Z-Score, the next step is choosing between the two Z-Tables, one for negative
and one for positive Z-scores. (Refer to the Appendix at the end.)
In the Z-table, go vertically down the leftmost column to find the first two digits
of your Z-Score (0.5 in this case), and then go along the topmost row to find the
digit at the second decimal position (.06 in this case). The intersection of that row and that
column in the table gives the value 0.7123, i.e. the area to the left of the ordinate
corresponding to Z = 0.56. This area also represents the probability of scoring < 800 marks.
Lastly, to get this as a percentage we multiply that number by 100, i.e. 0.7123 × 100 = 71.23%.
Hence, we can say that Ramesh did better than 71.23% of students in the district.
b) In the case of Sudha, μ = 700, σ = 180 and x = 420
⇒ Z = (x − μ)/σ = (420 − 700)/180 ≈ −1.56
Looking at the Z-Table we can say that it maps to 0.0594 and hence we can say that Sudha did
better than 100 × 0.0594 = 5.94% of students in the district.
c) If Abhay has done better than 44.83% of his batchmates, then his value in the Z-Table is 44.83 ÷
100 = 0.4483, which corresponds to Z-Score = −0.13.
Here μ = 700, σ = 180 and Z = −0.13
Therefore Z = (x − μ)/σ ⇒ −0.13 = (x − 700)/180
⇒ −23.4 = x − 700
⇒ x = 676.6 ≈ 677
Which means that Abhay has scored approximately 677 marks out of 1000.
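The table lookups in Example 19 can be cross-checked in code. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the printed Z-table; that substitution is an assumption of this illustration, not the method of the text. The Z-scores are rounded to two decimals first, exactly as one would before using the table.

```python
from statistics import NormalDist

standard = NormalDist()                # standard normal: mean 0, sd 1
batch = NormalDist(mu=700, sigma=180)  # district scores from Example 19

# (a) Ramesh: Z-score rounded to two decimals, then area to the left of Z.
z_ramesh = round((800 - 700) / 180, 2)
p_ramesh = standard.cdf(z_ramesh)

# (b) Sudha: same steps with a score below the mean.
z_sudha = round((420 - 700) / 180, 2)
p_sudha = standard.cdf(z_sudha)

# (c) Abhay: go backwards from the area 0.4483 to the score.
x_abhay = batch.inv_cdf(0.4483)

print(f"Ramesh: Z = {z_ramesh}, better than {100 * p_ramesh:.2f}% of students")
print(f"Sudha:  Z = {z_sudha}, better than {100 * p_sudha:.2f}% of students")
print(f"Abhay:  score of about {x_abhay:.1f} marks")
```

`cdf` plays the role of reading the table from Z to area, and `inv_cdf` the role of reading it from area back to Z.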
Example 20:
Given that the scores of a set of candidates on an IQ test are normally distributed. If the IQ test has
a mean of 100 and a standard deviation of 10, what is the probability that a candidate who takes
the test will score between 90 and 110?
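Solution: P(90 < X < 110) = P((90 − 100)/10 < Z < (110 − 100)/10) = P(−1 < Z < 1) ≈ 0.6827, i.e. about 68.27%, in line with the 68-95-99.7 rule.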
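The probability asked for in Example 20 is the area under the normal curve between 90 and 110. A minimal sketch of that computation, assuming `statistics.NormalDist` from the Python standard library as a stand-in for the Z-table:

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=10)  # IQ scores from Example 20

# Area between the two scores = area left of 110 minus area left of 90.
p_between = iq.cdf(110) - iq.cdf(90)

print(f"P(90 < X < 110) is about {p_between:.4f}")
```

Both limits lie exactly one standard deviation from the mean, so the result agrees with the first figure of the 68-95-99.7 rule.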
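by $\sigma_x^{2} = Var(X) = \sum_{i=1}^{n} p_i\,\left(x_i - E(X)\right)^{2}$
In other words, $Var(X) = E(X^{2}) - [E(X)]^{2}$, where $E(X^{2}) = \sum_{i=1}^{n} x_i^{2}\, p_i$
11. The standard deviation is denoted by $\sigma_x = \sqrt{Var(X)}$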
12. In a random experiment, a collection of trials is called Bernoulli trials, if:
i. The number of trials is finite.
ii. The trials are independent by nature.
iii. Each trial has exactly two outcomes defined as success and failure.
iv. The probability of success remains the same in each trial.
13. Probability of ‘r’ successes in ‘n’ Bernoulli trials is given by
P(‘r’ successes) = $^{n}C_{r}\; p^{r}\, q^{n-r}$
Where n = number of trials
r = number of successful trials = 0, 1, 2, 3, ..., n
p = probability of a success in a trial
q = probability of a failure in a trial
And, p + q = 1
14. P(‘r’ successes) is the (r+1)th term in the binomial expansion of $(q + p)^{n}$
15. The binomial distribution with n Bernoulli trials and probability of success p is also denoted
by B(n, p)
16. In a binomial distribution having ‘n’ number of Bernoulli trials, where p denotes the probability
of success and q denotes the probability of failure,
i. Mean = np
ii. Variance = npq
iii. Standard Deviation = $\sqrt{npq}$
17. Let X be the discrete random variable which represents the number of occurrences of events over
a period of time. If X follows the Poisson distribution, then the probability of occurrence of
‘k’ number of events over a period of time is given by
$P(X = k) = f(k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$
Where e is Euler’s number (e = 2.71828...)
‘k’ is the number of occurrences of the event such that k = 0, 1, 2, ...
And λ = E(X) = Var(X), is a positive real number
With existence conditions:
i. $\sum_{k=0}^{\infty} f(k) = 1$
ii. f(k) ≥ 0, for k = 0, 1, 2, ...
18. A continuous random variable X is defined in terms of its probability density function f(x),
also known as the PDF.
19. A continuous random variable X is said to follow a normal distribution with constant
parameters μ = E(X) and Var(X) = σ², written as X ~ N(μ, σ²), if
f(x) ≥ 0, ∀ x ∈ (−∞, ∞)
such that $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
where μ ∈ (−∞, ∞) is the mean of the normal distribution and σ > 0 is the standard deviation
20. When a continuous random variable has a probability density function of the form given
above, its distribution is called a normal distribution or Gaussian distribution.
21. A random variable with a normal/Gaussian distribution is said to be normally distributed,
and is called a normal deviate. Key features of a normal distribution:
a. The mean, median and mode of the distribution are exactly the same.
b. The bell-shaped probability curve has one peak point, which means that the normal distribution
has a unique mode.
c. The curve f(x), ∀ x ∈ (−∞, ∞), has two tails that extend on both sides and never touch
the axis. The line through x = μ divides the normal curve into two equal parts, which means
that the normal curve is symmetrical about x = μ, as half the values fall below the mean and
half above the mean.
d. In a normal distribution curve the total area below the curve is always equal to 1 unit;
i.e., $\int_{-\infty}^{\infty} f(x)\,dx = 1$
e. The distribution can be described by two values: the mean and the standard deviation.
22. When mean (μ) = 0 and standard deviation (σ) = 1 for a data set, the normal distribution
is called the standard normal distribution.
23. In a normal distribution of data, the Z-score is given by $Z = \frac{x-\mu}{\sigma}$
24. The Z-score is positive if the data point lies above the mean, and negative if it lies below
the mean.
25. When the data values in a normal distribution are converted into Z-scores in a standard
normal distribution, then the percentage of the data that fall within specific numbers of
standard deviations (σ) from the mean (μ) for a bell-shaped curve is constant.
i. Data points are symmetrical about the mean (μ).
ii. The Z-score describes the position of each data point in terms of its distance from the mean,
when measured in standard deviation units.
iii. The Z-score is positive if the data point lies above the mean, and negative if it lies below
the mean.
iv. There is a 68.27% probability of randomly selecting a score between −1 and +1 standard
deviations from the mean.
⇒ $\int_{\mu-\sigma}^{\mu+\sigma} f(x)\,dx$ has probability 68.27%
v. There is a 95.45% probability of randomly selecting a score between −2 and +2 standard
deviations from the mean.
⇒ $\int_{\mu-2\sigma}^{\mu+2\sigma} f(x)\,dx$ has probability 95.45%
vi. There is a 99.73% probability of randomly selecting a score between −3 and +3 standard
deviations from the mean.
⇒ $\int_{\mu-3\sigma}^{\mu+3\sigma} f(x)\,dx$ has probability 99.73%
In each case, $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
26. Some of the popular hypothesis tests used in probability distribution are the F-test, chi-square
test, t-test and Z-test.
27. To use the Z-test, we need to see that:
i. the sample size is greater than 30.
ii. data points are independent of each other.
iii. data are randomly selected from a population, where each data point is equally
likely to be selected.
iv. sample sizes are equal if at all possible.
v. For convenient calculation with Z-Scores, we use the Z-Table to interpret a normally distributed data set.
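The binomial and Poisson formulas in the summary (points 13, 16 and 17) can be checked numerically. The following is a minimal sketch: the function names are hypothetical helpers, and the parameter values n = 10, p = 0.3 and λ = 2 are arbitrary illustrative choices, not values from the text.

```python
import math

def binomial_pmf(r, n, p):
    """P(r successes) = nCr * p^r * q^(n-r), with q = 1 - p."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(k, lam):
    """P(X = k) = lam^k * e^(-lam) / k!"""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Illustrative parameters (assumptions for this sketch).
n, p, lam = 10, 0.3, 2.0

# Binomial: mean and variance computed directly from the distribution.
b_mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
b_var = sum((r - b_mean) ** 2 * binomial_pmf(r, n, p) for r in range(n + 1))
print("binomial mean:", round(b_mean, 6), "expected np =", n * p)
print("binomial variance:", round(b_var, 6), "expected npq =", n * p * (1 - p))

# Poisson: the infinite sums are truncated at k = 59, where terms are negligible.
ks = range(60)
p_total = sum(poisson_pmf(k, lam) for k in ks)
p_mean = sum(k * poisson_pmf(k, lam) for k in ks)
p_var = sum((k - p_mean) ** 2 * poisson_pmf(k, lam) for k in ks)
print("poisson total probability:", round(p_total, 6))
print("poisson mean:", round(p_mean, 6), "variance:", round(p_var, 6))
```

The computed values agree with Mean = np and Variance = npq for the binomial case, and with E(X) = Var(X) = λ for the Poisson case.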
4.10 ANSWER KEY TO CHECK YOUR PROGRESS
1. (a) no, Σp ≠ 1 (b) no, p < 0 (c) yes 2. X = 0, 1, 2 3. 2 4. 105/512, 193/512, 53/64 5. CPS
6. k = 0.15, 0.75, 0.3, 0.55 7. 4 or more times 8. 7 9. 0.24 10. 32/81 11. 0.5 12. 7 13. 0.217 14. 99.89%
15. 2 16. 0.05 17. 0.235 18. 0.099 19. (a) 0.11507, (b) 0.88493, (c) 0.5328, (d) 0.6826 20. better than
43 other volunteers 21. 90% 23. (i) 1781, (ii) 424 24. Mean = 50, SD = 10
4.11 APPENDIX
Z-SCORE TABLE
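   z | .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
-3.6 | .0002 .0002 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001
-3.5 | .0002 .0002 .0002 .0002 .0002 .0002 .0002 .0002 .0002 .0002
-3.4 | .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0002
-3.3 | .0005 .0005 .0005 .0004 .0004 .0004 .0004 .0004 .0004 .0003
-3.2 | .0007 .0007 .0006 .0006 .0006 .0006 .0006 .0005 .0005 .0005
-3.1 | .0010 .0009 .0009 .0009 .0008 .0008 .0008 .0008 .0007 .0007
-3.0 | .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
-2.9 | .0019 .0018 .0018 .0017 .0016 .0016 .0015 .0015 .0014 .0014
-2.8 | .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0021 .0020 .0019
-2.7 | .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
-2.6 | .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
-2.5 | .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
-2.4 | .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
-2.3 | .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
-2.2 | .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
-2.1 | .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
-2.0 | .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
-1.9 | .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
-1.8 | .0359 .0351 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
-1.7 | .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
-1.6 | .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
-1.5 | .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
-1.4 | .0808 .0793 .0778 .0764 .0749 .0735 .0721 .0708 .0694 .0681
-1.3 | .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
-1.2 | .1151 .1131 .1112 .1093 .1075 .1056 .1038 .1020 .1003 .0985
-1.1 | .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
-1.0 | .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
-0.9 | .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
-0.8 | .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
-0.7 | .2420 .2389 .2358 .2327 .2296 .2266 .2236 .2206 .2177 .2148
-0.6 | .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451
-0.5 | .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
-0.4 | .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
-0.3 | .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483
-0.2 | .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859
-0.1 | .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
-0.0 | .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641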
Number in the table represents P(Z ≤ z)
   z | .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
 0.0 | .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
 0.1 | .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
 0.2 | .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
 0.3 | .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
 0.4 | .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
 0.5 | .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
 0.6 | .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
 0.7 | .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
 0.8 | .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
 0.9 | .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
 1.0 | .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
 1.1 | .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
 1.2 | .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
 1.3 | .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
 1.4 | .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
 1.5 | .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
 1.6 | .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
 1.7 | .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
 1.8 | .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
 1.9 | .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
 2.0 | .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
 2.1 | .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
 2.2 | .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
 2.3 | .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
 2.4 | .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
 2.5 | .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
 2.6 | .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
 2.7 | .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
 2.8 | .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
 2.9 | .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
 3.0 | .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
 3.1 | .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
 3.2 | .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
 3.3 | .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997
 3.4 | .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998
 3.5 | .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998 .9998
 3.6 | .9998 .9998 .9999 .9999 .9999 .9999 .9999 .9999 .9999 .9999
Table Courtesy: https://www.dummies.com/education/math/statistics/how-to-use-the-z-table/
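Any entry of the Z-table can be recomputed rather than looked up. The sketch below is an illustration only: it assumes the standard relation P(Z ≤ z) = ½[1 + erf(z/√2)], and the helper names `phi` and `table_row` are hypothetical. Each entry is rounded to four decimals, like the printed table.

```python
import math

def phi(z):
    """Area to the left of z under the standard normal curve, P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def table_row(first_digits, negative=False):
    """Ten table entries for one row, for the columns .00 to .09."""
    row = []
    for hundredths in range(10):
        z = first_digits + hundredths / 100
        if negative:
            z = -z
        row.append(round(phi(z), 4))
    return row

# Row 0.5 of the positive table and row -1.5 of the negative table.
print(table_row(0.5))
print(table_row(1.5, negative=True))

# The three lookups used in Example 19.
print(round(phi(0.56), 4), round(phi(-1.56), 4), round(phi(-0.13), 4))
```

For negative rows the helper takes the magnitude of the row label and flips the sign, so `table_row(1.5, negative=True)` covers Z = −1.50 to −1.59.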