

Week 4 Notes

Uploaded by

kewchinloong

SC2000/CZ2100

Probability & Statistics

Week 4

Ch 6. Probability Distributions

● Random variables – Discrete, Continuous
● Discrete probability distributions – Binomial, Poisson, Geometric
● Continuous probability distributions – Uniform, Exponential, Normal

Random Variables

The outcomes of random experiments commonly take on numerical values.
If the outcomes are not numerical, they can be represented by numbers (this facilitates mathematical analysis).
A random variable (r.v.) is a variable whose numerical value is defined on, or determined by, the outcomes or events of an experiment. The value cannot be predicted exactly, but it can be described by its probability distribution.

Random Variables
Random variables can be discrete or continuous.
Eg: The number of seeds germinating in a flower pot. Possible values for X are 0, 1, 2, … (discrete).
The maximum daily temperature in Singapore. Possible values are 23–35 °C, e.g. 26.1276 °C (continuous).
The response to a question with “Yes”, “No”, or “Don't know” answers. This is not a r.v. (the values are not numerical).
Let Y be the number of “Yes” responses. Y is a discrete r.v.

Random Variables
Expected Value of a r.v.
For a given set of observed data {x1, x2, …, xn}, one measure of central tendency is the arithmetic average:
\[ \bar{X} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
\(\bar{X}\) is calculated from a sample of \(n\) observed values.
If we know the probability associated with each value of a r.v., then we can obtain the probability-weighted average, which is known as the expectation of the r.v.

Random Variables
Expected Value of a r.v.
The probability distribution or probability mass function (pmf) of a r.v. X is given as:

Values of X      x1   x2   ...   xk
Probabilities    p1   p2   ...   pk

Its expected value is defined as:
\[ E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum_{i=1}^{k} x_i p_i \]
It is also denoted by \(\mu_X\).

Random Variables
Eg: (Decision Analysis)
An oil exploration company has a lease for which it must
decide on one of the following options:
1. Sell now and can get $125K.
2. Hold for a year and if oil prices rise (prob = 0.6) it can
sell for $300K or if oil prices fall (prob = 0.4) it can get
$100K.
3. Drill now. The cost of drilling is $200K and drilling will
lead to one of the following outcomes:
Well type Dry Wet Gusher
Probability 0.5 0.4 0.1
Profit $0 $400K $1500K

What should the company do?


Random Variables
Solution: Let X be the financial gain (in $K).

Option 1 (sell now): financial gain = $125K.

Option 2 (hold for 1 yr):
X             300    100
Probability   0.6    0.4

∴ E(X) = 300 × 0.6 + 100 × 0.4 = $220K

Option 3 (drill now): X = profit − drilling cost ($200K)
X             0 − 200 = −200    400 − 200 = 200    1,500 − 200 = 1,300
Probability   0.5               0.4                0.1

∴ E(X) = −200 × 0.5 + 200 × 0.4 + 1,300 × 0.1 = $110K

The best decision is to hold for a year and then sell, since this option has the highest expected gain.

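The comparison of the three options in the oil lease example can be checked with a few lines of Python. This is only a sketch: the helper name `expected_value` and the dictionary layout are illustrative choices, not part of the original notes, and all amounts are in $K.

```python
# Expected financial gain (in $K) for each option in the oil lease example.

def expected_value(values, probs):
    """Probability-weighted average of a discrete distribution."""
    return sum(v * p for v, p in zip(values, probs))

DRILL_COST = 200  # $K, from the problem statement

options = {
    "sell now": expected_value([125], [1.0]),
    "hold one year": expected_value([300, 100], [0.6, 0.4]),
    # Gain from drilling is the profit minus the drilling cost.
    "drill now": expected_value(
        [0 - DRILL_COST, 400 - DRILL_COST, 1500 - DRILL_COST],
        [0.5, 0.4, 0.1],
    ),
}

# Pick the option with the largest expected gain.
best = max(options, key=options.get)

for name, gain in options.items():
    print(f"{name}: {gain:.0f}K")
print("best option:", best)
```

Choosing by expected value alone ignores risk; the same dictionary could be extended with a variance for each option if that matters for the decision.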
Given the pmf of X below, if \(Y = (X-1)^2\), find E[Y].

x        0     1     2
PX(x)    0.2   0.5   0.3

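One way to approach the E[Y] exercise is to weight each value of \((x-1)^2\) by the probability of x, without first deriving the pmf of Y. The sketch below assumes that approach; `expect` is a hypothetical helper name, not something from the notes.

```python
# E[g(X)] for a discrete r.v.: sum of g(x) * P(X = x) over all x.
pmf_x = {0: 0.2, 1: 0.5, 2: 0.3}

def expect(g, pmf):
    """Expected value of g(X) given the pmf of X as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

# Y = (X - 1)^2 takes the value 1 at x = 0 and x = 2, and 0 at x = 1.
e_y = expect(lambda x: (x - 1) ** 2, pmf_x)
print(e_y)
```

The same helper gives E[X] with `expect(lambda x: x, pmf_x)`.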
Random Variables
Variance of a r.v.
Recall that variance is a measure of variability. For a sample of n independent observations, the sample variance is obtained using:
\[ s^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{X})^2}{n-1} \quad\text{or}\quad s^2 = \frac{1}{n-1}\left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right] \]
The variance of a r.v. X is defined as
\[ \operatorname{var}(X) = p_1(x_1-\mu)^2 + p_2(x_2-\mu)^2 + \cdots + p_n(x_n-\mu)^2 = \sum_{i=1}^{n} p_i (x_i-\mu)^2 = E\left[(X-\mu)^2\right] \]
where \(p_i\) = probability that \(x_i\) occurs.

Random Variables
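It represents the theoretical limit of the sample variance \(s^2\) as the sample size n becomes very large. Var(X) is often denoted by \(\sigma^2\).

An alternate formula for var(X), using \(\sum p_i x_i = \mu\) and \(\sum p_i = 1\), is as follows:

\[
\begin{aligned}
\operatorname{var}(X) &= \sum p_i (x_i^2 - 2x_i\mu + \mu^2) \\
&= \sum p_i x_i^2 - 2\mu \sum p_i x_i + \mu^2 \sum p_i \\
&= E[X^2] - 2\mu^2 + \mu^2 \\
&= E[X^2] - \mu^2 \quad\text{or}\quad E[X^2] - (E[X])^2
\end{aligned}
\]
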
Random Variables
Eg:
Let the random variable X be the number of male students in a group from a class of size 5. The probability distribution of X is

x           0      1      2       3       4      5
P(X = x)    1/32   5/32   10/32   10/32   5/32   1/32

What is the expected number of males E(X) in the group, and what is the standard deviation, σ, of X?

Random Variables
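Solution:
\[ E(X) = \sum_{i=1}^{n} x_i P(x_i) = 0 \times \tfrac{1}{32} + 1 \times \tfrac{5}{32} + 2 \times \tfrac{10}{32} + 3 \times \tfrac{10}{32} + 4 \times \tfrac{5}{32} + 5 \times \tfrac{1}{32} = \tfrac{80}{32} = 2.5 = \mu \]
i.e. on average such groups have 2.5 males.

\[ \operatorname{var}(X) = \sum x^2 P(x) - \mu^2 = \left(0^2 \times \tfrac{1}{32} + \cdots + 5^2 \times \tfrac{1}{32}\right) - (2.5)^2 = 7.5 - 2.5^2 = 1.25 \]
So \(\sigma = \sqrt{\operatorname{var}(X)} = \sqrt{1.25} \approx 1.12\)
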
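The mean, variance and standard deviation in the male-students example can be reproduced exactly with rational arithmetic. The sketch below is an illustration only; the variable names are not from the notes, and it uses the shortcut formula var(X) = E[X²] − µ².

```python
from fractions import Fraction
from math import sqrt

# P(X = x) = counts[x] / 32 for x = 0..5, as in the table of the example.
counts = [1, 5, 10, 10, 5, 1]
pmf = {x: Fraction(c, 32) for x, c in enumerate(counts)}

mean = sum(x * p for x, p in pmf.items())              # E[X]
second_moment = sum(x**2 * p for x, p in pmf.items())  # E[X^2]
variance = second_moment - mean**2                     # E[X^2] - (E[X])^2
std_dev = sqrt(float(variance))

print(mean, second_moment, variance, round(std_dev, 2))
```

Using `Fraction` avoids floating-point rounding, so the intermediate values 80/32 and 240/32 reduce exactly.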
Given the pmf of X below, if \(Y = (X-1)^2\), find Var[Y].

x        0     1     2
PX(x)    0.2   0.5   0.3

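A possible worked solution to the Var[Y] exercise, using Var[Y] = E[Y²] − (E[Y])². This is a sketch rather than the official answer from the slides. Note that \(Y^2 = (X-1)^4\).

```latex
% Y = (X-1)^2 takes the value 1 when x = 0 or x = 2, and 0 when x = 1.
\begin{aligned}
E[Y]   &= (0-1)^2(0.2) + (1-1)^2(0.5) + (2-1)^2(0.3) = 0.5 \\
E[Y^2] &= (0-1)^4(0.2) + (1-1)^4(0.5) + (2-1)^4(0.3) = 0.5 \\
\operatorname{Var}[Y] &= E[Y^2] - (E[Y])^2 = 0.5 - 0.25 = 0.25
\end{aligned}
```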
Random Variables
Empirical quantity (m observed data) vs. theoretical quantity (mathematical d.r.v.), where \(f_i\) is the observed frequency of \(x_i\):

Quantity             Empirical                                          Theoretical                                Remarks
Relative frequency   \(f_i/m\)                                          \(P(X = x_i) = p_i\)                       \(f_i/m \to p_i\) as \(m \to \infty\)
Total                \(\sum_i f_i/m = 1\)                               \(\sum_i p_i = 1\)
Mean / Expectation   \(\bar{X} = \sum_i x_i\, f_i/m\)                   \(E[X] = \mu = \sum_i p_i x_i\)            \(\bar{X} \to \mu\) as \(m \to \infty\)
Variance             \(s^2 = \sum_i (x_i - \bar{X})^2\, f_i/(m-1)\)     \(\sigma^2 = \sum_i (x_i - \mu)^2\, p_i\)  \(s^2 \to \sigma^2\) as \(m \to \infty\)

Random Variables
Expected Value and Variance for a Function of r.v.
(Linear transformation of r.v.)
If Y = a + bX, where X is a r.v. and a and b are known constants, then
\[ E(Y) = a + b\,E(X) \]
and
\[ \operatorname{var}(Y) = b^2 \operatorname{var}(X) \]
so
\[ \sigma_Y = \sqrt{b^2 \operatorname{var}(X)} = \sqrt{b^2 \sigma_X^2} = |b|\,\sigma_X \]

Random Variables
Similarly, if T = a + bX + cY, where X and Y are uncorrelated r.v.s and a, b and c are known constants, then
\[ E(T) = a + b\,E(X) + c\,E(Y) \]
and
\[ \operatorname{var}(T) = b^2 \operatorname{var}(X) + c^2 \operatorname{var}(Y) \]

Random Variables
Eg: A company makes products for local and export markets. Sales for next year are estimated as follows:

Local, X units     1,000   3,000   5,000   10,000
Probability        0.1     0.3     0.4     0.2

Export, Y units    300     500     700
Probability        0.4     0.5     0.1

Hence E(X) = 1,000 × 0.1 + 3,000 × 0.3 + 5,000 × 0.4 + 10,000 × 0.2
           = 5,000 (= expected local sales)

Random Variables
E(Y) = 300 × 0.4 + 500 × 0.5 + 700 × 0.1
     = 440 (= expected export sales)
Suppose the company makes a profit of $2,000 on each unit sold on the local market and $3,500 on each exported unit.
Hence the total profit is T = 2000X + 3500Y,
and the expected profit is
E(T) = 2000 E(X) + 3500 E(Y)
     = 2000 × 5000 + 3500 × 440
     = $11,540,000
⇒ this is next year’s estimated profit.

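The expected-profit calculation for the local/export sales example can be sketched in Python as below. The helpers `mean` and `variance` are illustrative names. The variance of T is not computed in the notes; it is included here only under the added assumption that local and export sales are uncorrelated, which the example does not state.

```python
# E(T) and var(T) for T = 2000 X + 3500 Y in the sales example.

def mean(pmf):
    """E[X] for a pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """var(X) = sum of p * (x - mu)^2."""
    mu = mean(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

local_sales = {1000: 0.1, 3000: 0.3, 5000: 0.4, 10000: 0.2}   # X
export_sales = {300: 0.4, 500: 0.5, 700: 0.1}                 # Y
b, c = 2000, 3500  # profit per unit: local, export

expected_profit = b * mean(local_sales) + c * mean(export_sales)

# ASSUMPTION (not in the notes): X and Y are uncorrelated, so variances add.
profit_variance = b**2 * variance(local_sales) + c**2 * variance(export_sales)

print(f"E(T) = ${expected_profit:,.0f}")
print(f"sd(T) = ${profit_variance ** 0.5:,.0f}  (assuming uncorrelated X, Y)")
```

The linearity of expectation used for E(T) holds whether or not X and Y are correlated; only the variance line depends on the assumption.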
Random Variables
Eg: A component is made by cutting a piece of metal to length X and trimming it by amount Y. Both processes are somewhat imprecise.
Net length T = bX + cY with b = 1 and c = −1.
\[ E(T) = b\,E(X) + c\,E(Y) = E(X) - E(Y) \]
\[ \operatorname{var}(T) = b^2 \operatorname{var}(X) + c^2 \operatorname{var}(Y) = \operatorname{var}(X) + \operatorname{var}(Y) \]
Note: var(T) is greater than either var(X) or var(Y), even though T = X − Y, because both X and Y contribute to the variability in T. (This uses the formula for uncorrelated X and Y.)

Prob. Distribution of Discrete r.v.
Special Probability Distributions
● Binomial Distribution
● Poisson Distribution
● Geometric Distribution

Prob. Distribution of Discrete r.v.
The list of possible values that a d.r.v. X can take, together with their probabilities, is called the discrete probability distribution (also known as the probability mass function, pmf) of X.
Eg:
Let the random variable X be the number of girls in a family of 3 children.
Possible values:
X = 3: GGG
X = 2: GGB, GBG, BGG
X = 1: BBG, BGB, GBB
X = 0: BBB

Prob. Distribution of Discrete r.v.
Assume that the 8 outcomes are equally likely:

x_i                                      0     1     2     3
Probability distribution  P(X = x_i)     1/8   3/8   3/8   1/8
Cumulative distribution   P(X ≤ x_i)     1/8   4/8   7/8   1

In general, we write:
\[ P(X = x_i) = p_i \quad\text{for } i = 1, \ldots, k, \quad\text{and } 0 \le p_i \le 1 \]
Notation convention: we often use capital letters for random variables and small letters for specific values.

Prob. Distribution of Discrete r.v.
Eg: Let X denote the sum of the results of 2 dice thrown.

Outcomes:
1,1   2,1   3,1   …   6,1
1,2   2,2   3,2   …   6,2
 ⋮     ⋮     ⋮         ⋮
1,6   2,6   3,6   …   6,6

There are 36 outcomes and the values of X are 2, 3, ..., 12.

Assuming that the outcomes are equally likely, i.e. the probability of each outcome is 1/36, the probability distribution of X is

x           2      3      4      ...   10     11     12
P(X = x)    1/36   2/36   3/36   ...   3/36   2/36   1/36

Exercise: find the cumulative distribution.

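For the two-dice exercise, the pmf and the cumulative distribution can be built by enumerating all 36 equally likely outcomes. The following is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

# Count how many of the 36 ordered outcomes (a, b) give each sum a + b.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# pmf: P(X = x) = (number of outcomes with sum x) / 36, in increasing x.
pmf = {x: Fraction(counts[x], 36) for x in sorted(counts)}

# cdf: P(X <= x) is the running total of the pmf.
cdf = dict(zip(pmf, accumulate(pmf.values())))

for x in pmf:
    print(x, pmf[x], cdf[x])
```

`Fraction` reduces automatically, so 2/36 is printed as 1/18; the values are the same as in the table.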
Prob. Distribution of Discrete r.v.
Binomial Distribution
● Consider n Bernoulli trials, where each trial is an “experiment” with exactly 2 possible outcomes, “success” and “failure”.
● Assume that the probability of success (S) is the same for all trials, P(S) = p, and the probability of failure is P(F) = 1 – p.
● Assume also that the trials are independent, i.e., the probability of any given combination of successes and failures can be obtained by multiplying the probabilities for each trial outcome.

Prob. Distribution of Discrete r.v.
Eg: Consider 5 trials. The probability of the sequence SSFSF is
\[ P(SSFSF) = p \cdot p \cdot (1-p) \cdot p \cdot (1-p) = p^3 (1-p)^2 \]

The probability of obtaining any particular arrangement of 3 successes and 2 failures in 5 trials, i.e. SSSFF, SSFSF, ... etc., is \(p^3(1-p)^2\) for each of the different ways this could occur.

Prob. Distribution of Discrete r.v.
The number of distinct “arrangements” of 3 successes and 2 failures can be calculated using the binomial coefficient \(\binom{n}{x}\) or \({}^{n}C_{x}\).
The binomial coefficient is defined as:
\[ \binom{n}{x} = \frac{n!}{x!\,(n-x)!} \]
where \(n! = n \times (n-1) \times \cdots \times 2 \times 1\).
In this example,
\[ \binom{5}{3} = \frac{5!}{3!\,2!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{(3 \times 2 \times 1)(2 \times 1)} = 10 \]
so there are 10 distinct ways to obtain 3 successes in 5 trials, with each arrangement having a probability of \(p^3(1-p)^2\).

Prob. Distribution of Discrete r.v.
Let X be the r.v. equal to the total number of successes in n trials. The probability of obtaining x successes is:
\[ P(X = x) = \binom{n}{x} \cdot p^x \cdot (1-p)^{n-x} \]
where \(\binom{n}{x}\) is the number of arrangements of x S’s and (n−x) F’s, \(p^x\) is the probability of x S’s, and \((1-p)^{n-x}\) is the probability of (n−x) F’s.
The distribution of the count of successes is called the binomial distribution with two parameters, n and p.
We say X ~ B(n, p).

X           0                             1                                 …   n
P(X = x)    \({}^{n}C_{0}\,p^0(1-p)^n\)   \({}^{n}C_{1}\,p^1(1-p)^{n-1}\)   …   \({}^{n}C_{n}\,p^n(1-p)^{n-n}\)

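The binomial probability formula translates directly into code. In the sketch below, `binomial_pmf` is a hypothetical helper name, and n = 5, p = 0.5 are illustrative values chosen only to produce a small table; `math.comb` is the standard-library binomial coefficient.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): arrangements * P(x successes) * P(n-x failures)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Illustrative parameters (an assumption for this sketch).
n, p = 5, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(probs):
    print(x, prob)
print("total:", sum(probs))  # the probabilities of a pmf sum to 1
```

With these values the probabilities come out as 1/32, 5/32, 10/32, 10/32, 5/32, 1/32.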
Prob. Distribution of Discrete r.v.
Mean and variance of X ~ B(n,p)
\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \]
Recall: \(E[X] = \sum x\,p(x)\)
The mean of X is given by:
\[ E[X] = \sum_{x=0}^{n} x \binom{n}{x} p^x (1-p)^{n-x} = np \]
And the variance is:
\[ \operatorname{Var}[X] = E[X^2] - (E[X])^2 = np(1-p) = npq \]
where \(q = 1 - p\)

Prob. Distribution of Discrete r.v.
Example:
A football team plays 3 games. Assume each game is a Bernoulli trial with prob(win) = 0.5. Let the r.v. X be the number of wins. What is the probability that the team will win exactly 2 games?

Solution
X has a binomial distribution with n = 3 and p = 0.5, with outcome Win (W) or Lose (L) on each trial, i.e. X ~ B(3, 0.5).

Using the formula for binomial probabilities:
\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \]
\[ P(X = 2) = \binom{3}{2}\, 0.5^2\, (1-0.5)^1 = 3/8 \]

There are 10 blue and 30 green balls in a bag. A ball is picked at random, its colour is noted and it is put back into the bag. This process is repeated 10 times. Let X be the number of times a blue ball is picked. Calculate E[X] and Var[X].

X ~ B(n, p) where n = 10 and p = 10/40 = 0.25

E[X] = np = 10 × 0.25 = 2.5

Var[X] = npq, where q = 1 − p, so Var[X] = 10 × 0.25 × 0.75 = 1.875

Prob. Distribution of Discrete r.v.
Example:
A quality control system selects a sample of 10 items from each batch of products for testing. If 2 or more of the items are defective, the whole batch is rejected.

If the probability of an item being defective is 0.05, what is the probability of
(i) having 2 defectives in the sample?
(ii) the batch being rejected?

Prob. Distribution of Discrete r.v.
Soln: Let X be the number of defectives in the sample of n = 10 items. X ~ B(10, 0.05)

(i) \(P(X = 2) = {}^{10}C_{2}\,(0.05)^2 (0.95)^8 \approx 0.0746\)

(ii) P(reject batch) = P(X ≥ 2)
= 1 – P(X < 2)
= 1 – P(X = 0) – P(X = 1)
= 1 – 0.5987 – 0.3151
≈ 0.0861 (using unrounded values)

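The two quality-control probabilities can be verified numerically. This is a sketch under the same model as the solution, X ~ B(10, 0.05); the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.05  # sample size and defect probability from the example

# (i) exactly 2 defectives in the sample
p_two = binomial_pmf(2, n, p)

# (ii) batch rejected: 2 or more defectives, via the complement
p_reject = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)

print(f"P(X = 2)  = {p_two:.4f}")
print(f"P(X >= 2) = {p_reject:.4f}")
```

Using the complement in (ii) needs only two terms instead of summing the nine terms from x = 2 to x = 10.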
Prob. Distribution of Discrete r.v.
Poisson Distribution – Notation: X ~ Pois(µ)
● Let the r.v. X be the number of “successes” in a given time interval, where X is a non-negative integer.
● The r.v. X is a Poisson r.v. with probability distribution given by
\[ P(X = x) = \frac{e^{-\mu} \mu^x}{x!}, \quad x = 0, 1, 2, \ldots \]
where µ is the average number of successes and e is a constant = 2.71828...
● The expected value and the variance of X are both equal to µ.

[Figure: probability distribution P(X = x) for X ~ Pois(µ = 3)]

Prob. Distribution of Discrete r.v.
Eg:
A salesman sells on average 3 iPhones per day. The number of iPhones sold per day is a Poisson r.v.
(i) Calculate the probability that on a given day he will sell some iPhones.
(ii) Given that there are 8 working hours per day, what is the probability that in a given hour he will sell one iPhone?

Prob. Distribution of Discrete r.v.
Soln:
(i) P(some iPhones) = 1 – P(no iPhone) = \(1 - e^{-3} \approx 0.9502\)

(ii) Avg no. of iPhones sold/hour = 3/8 = 0.375

In a given hour, \(P(X = 1) = 0.375\,e^{-0.375} \approx 0.2577\)

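The two parts of the iPhone example can be evaluated with a small Poisson pmf helper. This is a sketch; `poisson_pmf` is a hypothetical name, and part (ii) follows the solution's step of rescaling the daily mean of 3 to an hourly mean of 3/8.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for X ~ Pois(mu)."""
    return exp(-mu) * mu**x / factorial(x)

# (i) at least one sale in a day: complement of "no sale", with mu = 3 per day
p_some = 1 - poisson_pmf(0, 3)

# (ii) exactly one sale in a given hour, with mu = 3/8 per hour
p_one_in_hour = poisson_pmf(1, 3 / 8)

print(f"P(some iPhones in a day) = {p_some:.4f}")
print(f"P(one iPhone in an hour) = {p_one_in_hour:.4f}")
```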
Wooclap Ex: A shop sells a certain product whose weekly demand is a Poisson variable with mean 3. On the 1st day of each month, the stock is replenished. Find the minimum number of units that must be in stock on the 1st day of a month so that the shop is at least 50% sure of being able to meet the demand for the month.

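One possible way to attack this exercise numerically is sketched below. It rests on two assumptions that the exercise does not state: that a month is treated as 4 weeks, and that weekly demands are independent, so that monthly demand is Poisson with mean 4 × 3 = 12 (a sum of independent Poisson variables is again Poisson). The names `poisson_cdf`, `min_stock` and `WEEKS_PER_MONTH` are illustrative.

```python
from math import exp, factorial

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Pois(mu)."""
    return sum(exp(-mu) * mu**x / factorial(x) for x in range(k + 1))

def min_stock(mu, confidence):
    """Smallest k with P(demand <= k) >= confidence."""
    k = 0
    while poisson_cdf(k, mu) < confidence:
        k += 1
    return k

WEEKS_PER_MONTH = 4               # ASSUMPTION: the exercise does not define a month
monthly_mean = 3 * WEEKS_PER_MONTH

stock = min_stock(monthly_mean, 0.5)
print("minimum stock:", stock)
print("P(demand <= stock):", round(poisson_cdf(stock, monthly_mean), 4))
```

A different convention for the length of a month changes `monthly_mean` and may change the answer.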
Prob. Distribution of Discrete r.v.
Poisson Distribution – Approx to Binomial Dist
For a Binomial distribution X ~ B(n, p) with large n,
i.e. \(n \to \infty\), \(p \to 0\) and \(np \to\) a constant \(\mu\),
it can be shown that:
\[ \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x} \to \frac{e^{-\mu} \mu^x}{x!} \]
∴ X ~ B(n, p) ≈ Pois(µ), where µ = np, and we have:
\[ P(X = x) \approx \frac{e^{-\mu} \mu^x}{x!} \]

Prob. Distribution of Discrete r.v.
Poisson Distribution – Approx to Binomial Dist
Conditions under which the probabilities of X ~ B(n, p) can be approximated by those of X ~ Pois(µ):
● large n (n ≥ 100)
● small p (p ≤ 0.01)
● constant µ = np (µ ≤ 20)

Note that when \(p \to 0\), then \((1-p) \to 1\).

∴ the expected value np and the variance np(1 − p) of X ~ B(n, p) are approximately equal.

Intuitive Explanation:
To obtain the mean and variance of the Poisson distribution, consider the binomial distribution under the following conditions:
\(n \to \infty\), \(p \to 0\) and \(np \to\) a constant \(\mu\)
The mean and variance of the binomial distribution are given by:
\[ E[X] = np \]
\[ \operatorname{Var}[X] = npq \]
Since \(q = 1 - p\), and \(q \to 1\) as \(p \to 0\), we have:
\[ E[X] = \mu \quad\text{and}\quad \operatorname{Var}[X] = npq \to \mu \]

Example: Suppose that in a production line, 1 in 200 LEDs produced is defective. A random sample of 1000 LEDs is selected. What is the probability that at most 2 of them are defective?

Let X = no. of defective LEDs, so X ~ B(n, p) with n = 1000.

p = Prob that an LED is defective = 1/200 = 0.005
np = 5
n > 100, p < 0.01 and np < 20
∴ use the Poisson dist. with µ = np = 5 to approximate the Binomial dist.

P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
\[ = e^{-5} + 5e^{-5} + \frac{5^2 e^{-5}}{2!} \approx 0.125 \]

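To see how close the Poisson approximation is in the LED example, the exact binomial probability can be computed alongside it. This comparison is an addition for illustration, not part of the original solution; the variable names are arbitrary.

```python
from math import comb, exp, factorial

n, p = 1000, 0.005   # sample size and defect probability from the example
mu = n * p           # Poisson mean used in the approximation (= 5)

# Exact: P(X <= 2) for X ~ B(1000, 0.005)
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(3))

# Approximation: P(X <= 2) for X ~ Pois(5)
approx = sum(exp(-mu) * mu**x / factorial(x) for x in range(3))

print(f"binomial (exact):   {exact:.4f}")
print(f"Poisson (approx.):  {approx:.4f}")
print(f"absolute difference: {abs(exact - approx):.4f}")
```

Python integers are arbitrary precision, so `comb(1000, x)` is exact even though the factorials involved are very large.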
Prob. Distribution of Discrete r.v.
Geometric Distribution – Notation: X ~ G(p)
● Consider a sequence of independent Bernoulli trials, where each trial is an “experiment” with exactly 2 possible outcomes, “success” and “failure”.
● Assume that the probability of success (S) is the same for all trials, P(S) = p, and the probability of failure is P(F) = 1 – p.
● The experiment is repeated until the first success is obtained.
● Let X = no. of trials performed. Hence there are X − 1 failures before the 1st success is obtained.

Prob. Distribution of Discrete r.v.
Hence X is a random variable with a Geometric distribution X ~ G(p):
\[ P(X = x) = (1-p)^{x-1}\, p, \quad x = 1, 2, 3, \ldots \]
Expected value:
\[ E[X] = \frac{1}{p} \]
Variance:
\[ \operatorname{Var}[X] = \frac{1-p}{p^2} \]

Example:
In a data transmission, the probability that a data frame is received in error is 0.1. If a data frame is received in error, a retransmission will be requested.
a) What is the probability that a data frame will be successfully received in no more than 3 transmissions?
b) Determine the average number of transmissions per frame.

Let X = number of transmissions of a data frame needed to get the 1st success

∴ X ~ G(p = 0.9)
a) P(X ≤ 3) = P(X = 1) + P(X = 2) + P(X = 3)
   = 0.9 + 0.1 × 0.9 + \(0.1^2\) × 0.9 = 0.999
b) E[X] = 1/p ≈ 1.111

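The data-frame example can be checked with a short geometric pmf helper. This is a sketch; `geometric_pmf` is an illustrative name, and the closed form 1 − (1 − p)³ is included only as a cross-check on part (a).

```python
def geometric_pmf(x, p):
    """P(X = x) for X ~ G(p): x - 1 failures followed by the first success."""
    return (1 - p) ** (x - 1) * p

p = 0.9  # probability that a frame is received without error

# a) success within at most 3 transmissions
p_within_3 = sum(geometric_pmf(x, p) for x in range(1, 4))

# Cross-check: "not all of the first 3 transmissions fail"
p_within_3_closed_form = 1 - (1 - p) ** 3

# b) average number of transmissions per frame
mean_transmissions = 1 / p

print(f"P(X <= 3) = {p_within_3:.3f}")
print(f"E[X]      = {mean_transmissions:.3f}")
```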
Prob. Distribution of Discrete r.v.
Special Probability Distributions
● Binomial Distribution
● Poisson Distribution
● Geometric Distribution

Distribution    Probability                                     Mean      Variance
X ~ B(n,p)      \(P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}\)     \(np\)    \(np(1-p)\)
X ~ Pois(µ)     \(P(X = x) = \dfrac{e^{-\mu}\mu^x}{x!}\)        \(\mu\)   \(\mu\)
X ~ G(p)        \(P(X = x) = (1-p)^{x-1}\,p\)                   \(1/p\)   \(\dfrac{1-p}{p^2}\)
