
Lecture 4

The document reviews concepts from probability theory including random variables, probability distributions, independence, conditional distributions, Bayes' law, and features of probability distributions such as expected value. It also provides examples of simulating airline overbooking strategies in R and working through the mammography problem.


Section 3

Review of probability theory

Sourabh B Paul (IIT Delhi) Econometric Methods II Semester 2023-24 23


Review of probability theory I

Some concepts we all know, but let us revise them once again:

- A random experiment is any procedure that can, at least in theory, be repeated infinitely often and has a well-defined set of outcomes.
- The outcome cannot be predicted with certainty before the experiment is run.
- A random variable is one that takes on numerical values and has an outcome that is determined by an experiment.
- It is a real-valued function defined over the sample space of an experiment (other types of measurable spaces are also possible).



Probability space

Probability space: $(\Omega, \mathcal{F}, P)$
Measurable space (or Borel space):
  - $(\Omega, \mathcal{F})$: a set $\Omega$ together with a collection $\mathcal{F}$ of subsets of $\Omega$ that includes the null set and is closed under complements, countable unions and countable intersections. The pair $(\Omega, \mathcal{F})$ is called a measurable space (it consists of a set and a $\sigma$-algebra). $\Omega$ could be the real number space.
The probability measure $P : \mathcal{F} \to [0, 1]$ is a function on $\mathcal{F}$ such that:
  - $P$ is countably additive (also called $\sigma$-additive): if $\{A_i\}_{i=1}^{\infty} \subseteq \mathcal{F}$ is a countable collection of pairwise disjoint sets, then $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$
  - The measure of the entire sample space is equal to one: $P(\Omega) = 1$, and $P(\emptyset) = 0$



Formal definition of random variable

Random variable: $X : \Omega \to \mathbb{R}$ is a r.v. if $\{\omega : X(\omega) \le r\} \in \mathcal{F}$ for all $r \in \mathbb{R}$


We follow the following convention: capital letter (e.g. X ) denotes a
random variable, whereas small letter (e.g. x ) denotes a particular
outcome of the random variable X .



Types of random variables I

- A random variable that can only take on the values zero and one is called a Bernoulli (or binary) random variable.
- A discrete random variable is one that takes on only a finite or countably infinite (one-to-one correspondence with the positive integers) number of values.
- A Bernoulli random variable, which takes only two possible values (0 and 1), is an example of a discrete random variable.
- The Probability Mass Function (pmf) of $X$ summarises the information concerning the possible outcomes of $X$ and the corresponding probabilities: $f(x_j) = p_j$, $j = 1, 2, \dots, k$. (It is sometimes useful to subscript the pmf or pdf by the r.v.; for example, the pdf of $X$ is denoted by $f_X$.)
- A variable $X$ is a continuous random variable if it takes on any real value with zero probability.
- While measurements are always discrete in practice, random variables that take on numerous values are best treated as continuous.



Types of random variables II

Probability Density Function (pdf) for continuous X .


When computing probabilities for continuous random variables, it is
easiest to work with the cumulative distribution function (cdf):

$F(x) \equiv P(X \le x)$.



Joint Distributions, Conditional Distributions, and
Independence

Joint probability density function of two discrete r.v.s:

$f_{X,Y}(x, y) = P(X = x, Y = y)$.

$X$ and $Y$ are said to be independent iff

$f_{X,Y}(x, y) = f_X(x) f_Y(y)$

for all $x$ and $y$.

- The pdfs $f_X$ and $f_Y$ are often called marginal probability density functions to distinguish them from the joint pdf $f_{X,Y}$.
- The concepts of joint probability and independence can be extended to more than two r.v.s.
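
The factorisation condition for independence can be checked mechanically for a small discrete example. The sketch below is in Python rather than R, and the joint pmf values are hypothetical numbers chosen only for illustration; the helper names `marginals` and `is_independent` are likewise assumptions, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint pmf of two binary variables X and Y (illustrative numbers).
joint = {
    (0, 0): Fraction(3, 10), (0, 1): Fraction(3, 10),
    (1, 0): Fraction(1, 5),  (1, 1): Fraction(1, 5),
}

def marginals(joint):
    """Sum the joint pmf over the other variable to get f_X and f_Y."""
    f_x, f_y = {}, {}
    for (x, y), p in joint.items():
        f_x[x] = f_x.get(x, 0) + p
        f_y[y] = f_y.get(y, 0) + p
    return f_x, f_y

def is_independent(joint):
    """True iff f_{X,Y}(x, y) == f_X(x) * f_Y(y) for every pair (x, y)."""
    f_x, f_y = marginals(joint)
    return all(joint.get((x, y), 0) == f_x[x] * f_y[y]
               for x, y in product(f_x, f_y))

f_x, f_y = marginals(joint)
independent = is_independent(joint)
print(f_x, f_y, independent)
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.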



Exercise I

- An airline has 100 seats for a particular flight. Can we decide on the optimal (or best) number of reservations the airline should make?
- What information do you need?
- Simulate your strategy in R.



R simulation I

We plot the probability of overbooking and the expected profit function.

Assumptions:
  - $\theta = 0.85$ (probability that a customer shows up),
  - Net profit per passenger travelled = 10,
  - Cost per overbooked passenger = 8 (compensation to pay)
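
One way to reproduce the two curves is sketched below. It is written in Python rather than the R used in the lecture, and it computes the binomial probabilities exactly instead of simulating. Several things are assumptions not stated on the slide: customers show up independently, "overbooking" means more than 100 customers show up, and profit is 10 per seated passenger minus 8 per passenger who shows up without a seat.

```python
from math import comb

SEATS = 100    # seats on the flight (from the exercise)
THETA = 0.85   # probability that a booked customer shows up
PROFIT = 10    # net profit per passenger who travels
COST = 8       # compensation per passenger who shows up but has no seat

def binom_pmf(k, n, p):
    """P(S = k) for S ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_overbooked(n):
    """P(more than SEATS of the n booked customers show up)."""
    return sum(binom_pmf(k, n, THETA) for k in range(SEATS + 1, n + 1))

def expected_profit(n):
    """E[PROFIT * min(S, SEATS) - COST * max(S - SEATS, 0)], S ~ Binomial(n, THETA)."""
    return sum(
        binom_pmf(k, n, THETA) * (PROFIT * min(k, SEATS) - COST * max(k - SEATS, 0))
        for k in range(n + 1)
    )

# Evaluate the profit function on the same range as the plot (0 to 200 reservations).
profits = {n: expected_profit(n) for n in range(0, 201)}
best_n = max(profits, key=profits.get)
print(best_n, round(profits[best_n], 1), round(prob_overbooked(best_n), 3))
```

The maximiser `best_n` can be compared with the value marked on the horizontal axis of the expected profit plot. A Monte Carlo version would replace the exact sums with averages over simulated show-up counts.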



R simulation II
[Figure: probability of overbooking (vertical axis, 0 to 1) against reservations made (horizontal axis, 100 to 200)]

Expected profit function

[Figure: expected profit (vertical axis, 0 to 1000) against reservations made (horizontal axis, 0 to 200), with 118 reservations marked on the horizontal axis]

Notion of independence

- Independence plays an important role in obtaining some of the classic distributions
- Example: the number of successes in a sequence of independent Bernoulli trials
- Independence is often a reasonable approximation of a more complicated situation



Conditional distribution

- In econometrics, we are usually interested in how one random variable, call it $Y$, is related to one or more other variables.
- How $X$ affects $Y$ is contained in the conditional distribution of $Y$ given $X$.
- This information is summarized by the conditional probability density function, defined by

$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$

for all values of $x$ such that $f_X(x) > 0$.

- Interpretation for the discrete case: $P(Y = y \mid X = x)$, "the probability of $Y = y$ given that $X = x$"



Bayes’ law

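What is the probability of an event, based on prior knowledge of a related event?
- $P(A \mid B)$: posterior probability of $A$ given $B$
- $P(B \mid A)$: probability of $B$ given $A$ (the likelihood)
- $P(A)$ and $P(B)$: prior probability of $A$ and marginal probability of $B$
Then

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Note that $P(B) = P(B \mid A)P(A) + P(B \mid \neg A)P(\neg A)$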



Mammography Problem I
Can doctors do a proper Bayesian inference?
Mammography problem:
The probability of breast cancer is 1% for a woman at age forty
who participates in routine screening. If a woman has breast cancer,
the probability is 80% that she will get a positive mammography.
If a woman does not have breast cancer, the probability is 9.6%
that she will also get a positive mammography. A woman in this
age group had a positive mammography in a routine screening.
What is the probability that she actually has breast cancer?
(Source: Gigerenzer, G., & Hoffrage, U. (1995). How to improve Bayesian reasoning
without instruction: Frequency formats. Psychological Review, 102(4))

Eddy (1982): 95 out of 100 physicians estimated the posterior probability $P(\text{cancer} \mid \text{positive})$ to be between 70% and 80%.
Many physicians, college students, and staff at Harvard Medical School could not diagnose properly using Bayesian inference.



Mammography Problem II

The answer is 7.8%.
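
The figure follows from Bayes' law with the three probabilities given in the problem statement. A minimal sketch in Python (the function name `posterior` and its parameter names are assumptions made for this illustration):

```python
def posterior(prior, p_pos_given_a, p_pos_given_not_a):
    """P(A | positive) via Bayes' law and the law of total probability."""
    p_pos = p_pos_given_a * prior + p_pos_given_not_a * (1 - prior)
    return p_pos_given_a * prior / p_pos

# Numbers from the mammography problem: prior 1%, hit rate 80%, false-positive rate 9.6%.
p_cancer_given_positive = posterior(prior=0.01,
                                    p_pos_given_a=0.80,
                                    p_pos_given_not_a=0.096)
print(f"{p_cancer_given_positive:.1%}")  # prints 7.8%
```

The denominator is $0.80 \times 0.01 + 0.096 \times 0.99 = 0.10304$, so most positive results come from the much larger group of women without cancer.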



Features of probability distribution I

A Measure of Central Tendency: The Expected Value

$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$$

For the discrete case,

$$E(X) = \sum_{j=1}^{k} x_j f(x_j).$$

- The expected value of $X$ can be a number that is not even a possible value of $X$.
- The expected value of a function $g(X)$ of a random variable is defined in the same way:

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx.$$
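
As a small illustration of the discrete formula, the sketch below uses a fair six-sided die (an example chosen here, not taken from the slides). Its expected value is 7/2, which is not a value the die can show. The helper `expectation` is a hypothetical name.

```python
from fractions import Fraction

# pmf of a fair six-sided die: f(x_j) = 1/6 for x_j = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum_j g(x_j) f(x_j) for a discrete pmf."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf)                          # Fraction(7, 2)
mean_square = expectation(pmf, lambda x: x * x)  # Fraction(91, 6)
print(mean, mean_square)
```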



Features of Expectation

- For any constant $c$, $E(c) = c$
- For any constants $a$ and $b$, $E(aX + b) = aE(X) + b$
- If $\{a_1, a_2, \dots, a_n\}$ are constants and $\{X_1, X_2, \dots, X_n\}$ are random variables, then

$$E(a_1 X_1 + a_2 X_2 + \dots + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + \dots + a_n E(X_n)$$

- We cannot extend this property to non-linear functions: in general, $E[g(X)] \neq g(E(X))$
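
A quick counterexample for the non-linear case, assuming for illustration that $X$ is Bernoulli with $P(X = 1) = 1/2$ and taking $g(x) = x^2$:

```latex
E(X^2) = 0^2 \cdot \tfrac{1}{2} + 1^2 \cdot \tfrac{1}{2} = \tfrac{1}{2}
\;\neq\;
\tfrac{1}{4} = \left(\tfrac{1}{2}\right)^2 = [E(X)]^2 .
```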



Other measures of central tendency

Another Measure of Central Tendency: The Median


- The formal definition is too complicated for our purpose. Intuitively, the median value of $X$ divides the area under the pdf into two equal parts (for the continuous case) or divides the ordered possible values into two equal parts (for the discrete case).
- If the distribution of the random variable is symmetric about the mean, then the median and the mean are the same.



Measures of variability

Measures of Variability: Variance and Standard Deviation

$$\mathrm{Var}(X) \equiv E\left[(X - \mu)^2\right]$$

The standard deviation of a r.v. $X$, denoted by $\mathrm{sd}(X)$, is the positive square root of the variance: $\mathrm{sd}(X) = +\sqrt{\mathrm{Var}(X)}$



Variance plot
[Figure: the pdfs $f_X$ and $f_Y$ plotted on a common axis labelled $X, Y$, with $\mu$ marked on the horizontal axis]

Properties of variance

- $\mathrm{Var}(X) = 0$ if and only if there is a constant $c$ such that $P(X = c) = 1$, in which case $E(X) = c$
- For any constants $a$ and $b$, $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$



Standardising a r.v. I

- Given a r.v. $X$, we can subtract its mean ($\mu$) and divide by its sd ($\sigma$) to define a new r.v.

$$Z = \frac{X - \mu}{\sigma}$$

such that $E(Z) = 0$ and $\mathrm{Var}(Z) = 1$
- We can use the standardised version of a r.v. to define other features of a distribution
- These features are described by higher order moments
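
The two moments of $Z$ follow from the linearity of expectation and the variance rule for $aX + b$, here with $a = 1/\sigma$ and $b = -\mu/\sigma$ (assuming $\sigma > 0$):

```latex
E(Z) = \frac{1}{\sigma}E(X) - \frac{\mu}{\sigma} = \frac{\mu - \mu}{\sigma} = 0,
\qquad
\mathrm{Var}(Z) = \frac{1}{\sigma^2}\,\mathrm{Var}(X) = \frac{\sigma^2}{\sigma^2} = 1 .
```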



Skewness and Kurtosis

- Skewness: $E(Z^3) = E[(X - \mu)^3]/\sigma^3$ (Fisher-Pearson coefficient of skewness)
- This is zero if the distribution is symmetric around the mean
- Negative skew (left-skewed): the left tail is longer. Positive skew (right-skewed): the right tail is longer
- Kurtosis: $E(Z^4) = E[(X - \mu)^4]/\sigma^4$. This is always positive.
- A larger value means the tails are thicker
- We compare kurtosis with the reference value of 3 for a Normal distribution (excess kurtosis).
- Mesokurtic: zero excess kurtosis. Leptokurtic: positive excess kurtosis. Platykurtic: negative excess kurtosis
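
Both quantities are standardised moments, so they can be computed directly from a pmf. The sketch below is in Python; the fair-die pmf and the function name `standardized_moment` are assumptions made for illustration. A fair die is symmetric, so its skewness is zero, and its flat shape gives negative excess kurtosis (platykurtic).

```python
def standardized_moment(pmf, order):
    """E[Z**order] with Z = (X - mu) / sigma, for a discrete pmf {x: p}."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    sigma = var ** 0.5
    return sum(((x - mu) / sigma) ** order * p for x, p in pmf.items())

# Illustrative pmf (an assumption): a fair six-sided die.
die = {x: 1 / 6 for x in range(1, 7)}
skewness = standardized_moment(die, 3)
excess_kurtosis = standardized_moment(die, 4) - 3
print(round(skewness, 6), round(excess_kurtosis, 4))
```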



Features of Joint and Conditional Distributions

Measures of Association: Covariance and Correlation

$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$$



Properties of covariance

- If $X$ and $Y$ are independent, then $\mathrm{Cov}(X, Y) = 0$ (the converse is not true)
- For any constants $a_1$, $b_1$, $a_2$ and $b_2$,

$$\mathrm{Cov}(a_1 X + b_1, a_2 Y + b_2) = a_1 a_2\,\mathrm{Cov}(X, Y)$$

- Cauchy-Schwarz inequality:

$$|\mathrm{Cov}(X, Y)| \le \mathrm{sd}(X)\,\mathrm{sd}(Y)$$
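
To see why zero covariance does not imply independence, consider $Y = X^2$ with $X$ uniform on $\{-1, 0, 1\}$ (the distribution of $X$ is an assumption chosen for this sketch). $Y$ is completely determined by $X$, yet the covariance is exactly zero.

```python
from fractions import Fraction

# Assumed example: X is uniform on {-1, 0, 1} and Y = X**2.
third = Fraction(1, 3)
joint = {(x, x * x): third for x in (-1, 0, 1)}   # f_{X,Y}(x, y)

e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y            # Cov(X, Y) = E(XY) - mu_X * mu_Y

# Independence would require f_{X,Y}(0, 0) == f_X(0) * f_Y(0).
f_x0 = sum(p for (x, y), p in joint.items() if x == 0)
f_y0 = sum(p for (x, y), p in joint.items() if y == 0)
dependent = joint[(0, 0)] != f_x0 * f_y0
print(cov, dependent)
```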



Correlation

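Correlation coefficient:

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\mathrm{sd}(X)\,\mathrm{sd}(Y)} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$

- Both $\mathrm{Cov}(\cdot)$ and $\mathrm{Corr}(\cdot)$ are measures of linear dependence
- Correlation is bounded by -1 and +1: $-1 \le \rho_{XY} \le 1$
- For any constants $a_1$, $b_1$, $a_2$ and $b_2$,

$$\mathrm{Corr}(a_1 X + b_1, a_2 Y + b_2) = \mathrm{Corr}(X, Y) \quad \text{if } a_1 a_2 > 0$$

$$\mathrm{Corr}(a_1 X + b_1, a_2 Y + b_2) = -\mathrm{Corr}(X, Y) \quad \text{if } a_1 a_2 < 0$$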



Variance of sum of random variables

- For any constants $a$ and $b$,

$$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$$

- We can extend this to more than two variables:

$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i) + 2\sum_{j=1}^{n}\sum_{i>j} a_i a_j\,\mathrm{Cov}(X_i, X_j)$$

- If the $X_i$ are pairwise uncorrelated, then this is simply the sum of the variances (weighted by the squares of the coefficients)



Conditional Expectation I

- Often in the social sciences, we would like to explain one variable, called $Y$, in terms of another variable, say, $X$.
- If $Y$ is related to $X$ in a nonlinear fashion, we would like to know this.
- We can summarize the relationship between $Y$ and $X$ by looking at the conditional expectation of $Y$ given $X$, sometimes called the conditional mean: $E(Y \mid X = x)$, in short, $E(Y \mid x)$
- When $Y$ is continuous,

$$E(Y \mid x) = \int_{-\infty}^{\infty} y\, f_{Y|X}(y \mid x)\,dy.$$

- $E(Y \mid x)$ is some function of $x$, which tells us how the expected value of $Y$ varies with $x$



Conditional Expectation II

For example,

E (WAGE | EDUC ) = 1.05 + 0.45 EDUC

Conditional expectation can be a non-linear function as well



Properties of Conditional Expectation I

- $E[c(X) \mid X] = c(X)$, for any function $c(X)$
- For functions $a(X)$ and $b(X)$, $E[a(X)Y + b(X) \mid X] = a(X)E(Y \mid X) + b(X)$
- If $X$ and $Y$ are independent, then $E(Y \mid X) = E(Y)$
- Law of iterated expectations: $E[E(Y \mid X)] = E(Y)$
- A more general case: $E(Y \mid X) = E[E(Y \mid X, Z) \mid X]$
- If $E(Y \mid X) = E(Y)$, then $\mathrm{Cov}(X, Y) = 0$. In fact, every function of $X$ is uncorrelated with $Y$. (The converse is NOT true.)
  - If $X$ and $Y$ are correlated, then $E(Y \mid X)$ must depend on $X$.
  - The conditional expectation captures the nonlinear relationship between $X$ and $Y$, whereas correlation captures linear association (remember the example of $Y = X^2$).
- Quick exercise: If $U$ and $X$ are random variables such that $E(U \mid X) = 0$, then argue that $E(U) = 0$, and that $U$ and $X$ are uncorrelated.
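
One possible line of argument for the quick exercise, offered as a sketch rather than the intended solution, uses the law of iterated expectations twice and assumes the relevant moments exist:

```latex
E(U) = E[E(U \mid X)] = E(0) = 0,
\qquad
\mathrm{Cov}(X, U) = E(XU) - E(X)E(U) = E[X\,E(U \mid X)] - 0 = 0 .
```

The middle step of the covariance uses $E(XU \mid X) = X\,E(U \mid X)$, which is the property $E[a(X)Y \mid X] = a(X)E(Y \mid X)$ with $a(X) = X$.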



Conditional Variance

- Given random variables $X$ and $Y$, the variance of $Y$, conditional on $X = x$, is simply the variance associated with the conditional distribution of $Y$, given $X = x$:

$$\mathrm{Var}(Y \mid X = x) = E(Y^2 \mid x) - [E(Y \mid x)]^2$$

- If $X$ and $Y$ are independent, then $\mathrm{Var}(Y \mid X) = \mathrm{Var}(Y)$



Some well known distributions

Normal distribution
Standard Normal distribution
Chi Square distribution
t distribution
F distribution



Normal distribution

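- The pdf of a normal variable $X \sim \mathrm{Normal}(\mu, \sigma^2)$ is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right]$$

where $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$
- Standard normal variable $Z \sim \mathrm{Normal}(0, 1)$. The pdf is

$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp(-z^2/2)$$

- Cumulative distribution: $\Phi(z) = P(Z \le z)$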



Standard normal properties

- Symmetric
- $P(Z > z) = 1 - \Phi(z)$
- $P(Z < -z) = P(Z > z)$
- $P(a \le Z \le b) = \Phi(b) - \Phi(a)$
- $P(|Z| > c) = 2[1 - \Phi(c)]$ for $c > 0$
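
These identities can be evaluated numerically with the standard normal cdf. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names `upper_tail`, `interval` and `two_sided` are assumptions made for this illustration.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)   # standard normal; Z.cdf plays the role of Phi

def upper_tail(z):
    """P(Z > z) = 1 - Phi(z)."""
    return 1 - Z.cdf(z)

def interval(a, b):
    """P(a <= Z <= b) = Phi(b) - Phi(a)."""
    return Z.cdf(b) - Z.cdf(a)

def two_sided(c):
    """P(|Z| > c) = 2 * [1 - Phi(c)], for c > 0."""
    return 2 * (1 - Z.cdf(c))

print(round(two_sided(1.96), 3))   # prints 0.05
```

The value 1.96 is the familiar critical value for a two-sided test at the 5% level.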



Properties of Normal

- If $X \sim N(\mu, \sigma^2)$, then $aX + b \sim N(a\mu + b, a^2\sigma^2)$
- If $X$ and $Y$ are jointly normally distributed, then they are independent if and only if $\mathrm{Cov}(X, Y) = 0$
- Any linear combination of independent, identically distributed normal random variables has a normal distribution



Chi square

- Let $Z_i$, $i = 1, 2, \dots, n$, be independent random variables, each distributed as standard normal. Define a new random variable

$$X = \sum_{i=1}^{n} Z_i^2$$

- $X$ has what is known as a chi-square distribution with $n$ degrees of freedom: $X \sim \chi^2_n$
- $E(X) = n$ and $\mathrm{Var}(X) = 2n$
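
The definition can be checked by simulation: draw $n$ standard normals, square and sum them, and compare the sample mean and variance with $n$ and $2n$. The sketch below is in Python; the seed, the choice $n = 4$ and the number of draws are arbitrary assumptions.

```python
import random
from statistics import fmean, pvariance

def chi_square_draw(n, rng):
    """One draw of X = sum of n squared independent N(0, 1) variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 4                           # degrees of freedom (illustrative choice)
draws = [chi_square_draw(n, rng) for _ in range(20_000)]

sample_mean = fmean(draws)      # should be close to n
sample_var = pvariance(draws)   # should be close to 2n
print(round(sample_mean, 2), round(sample_var, 2))
```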



Plot of chi-square
[Figure: chi-square densities for df = 2, 4 and 8, plotted for x from 0 to 25]

t distribution

- The t distribution is the workhorse in classical statistics and multiple regression analysis
- We obtain a t distribution from a standard normal and a chi-square random variable:

$$T = \frac{Z}{\sqrt{X/n}}$$

where $Z \sim N(0, 1)$ and $X \sim \chi^2_n$, and they are independent.
- We say $T \sim t_n$
- The degrees of freedom come from the chi-square random variable in the denominator
- The pdf of the t distribution has a shape similar to that of the standard normal distribution, except that it is more spread out
- As the degrees of freedom get large, the t distribution approaches the standard normal distribution.
- $E(T) = 0$ for $n > 1$ and $\mathrm{Var}(T) = n/(n - 2)$ for $n > 2$
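
The construction can be simulated directly, taking the ratio as $T = Z/\sqrt{X/n}$ (the standard definition, with the square root in the denominator). The Python sketch below uses an arbitrary seed, $n = 10$ and 20,000 draws, all of which are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

def t_draw(n, rng):
    """One draw of T = Z / sqrt(X / n), Z ~ N(0, 1) independent of X ~ chi-square(n)."""
    z = rng.gauss(0.0, 1.0)
    x = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return z / sqrt(x / n)

rng = random.Random(1)           # fixed seed so the run is repeatable
n = 10                           # degrees of freedom (illustrative choice)
draws = [t_draw(n, rng) for _ in range(20_000)]

sample_mean = fmean(draws)       # should be close to 0
sample_var = pvariance(draws)    # should be close to n / (n - 2) = 1.25
print(round(sample_mean, 3), round(sample_var, 3))
```

The sample variance above 1 reflects the heavier tails relative to the standard normal.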



Plot of t distribution
[Figure: t densities for df = 24, 2 and 1, plotted for x from -4 to 4]

F distribution

- Another important distribution for statistics and econometrics (hypothesis testing in the context of the multiple linear regression model, ANOVA)
- Let $X_1 \sim \chi^2_{k_1}$ and $X_2 \sim \chi^2_{k_2}$ be two independent random variables. Then the random variable

$$F = \frac{X_1/k_1}{X_2/k_2}$$

has a distribution known as the F distribution with $(k_1, k_2)$ degrees of freedom.
- We denote $F \sim F_{k_1, k_2}$
- The order of the degrees of freedom is important



Plot of F distribution
[Figure: F densities for df = (2, 8), (6, 8) and (6, 20), plotted for x from 0 to 4]
