
STAT 3008 Applied Regression Analysis

Tutorial 2 | Term 2, 2019−20

ZHAN Zebang
19 February 2020

1 Maximum Likelihood Estimates


• Maximum likelihood estimates (MLE): the estimates of parameters are obtained by
maximizing the (log-)likelihood function.
• Model setting: $y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + e_i$, where $e_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$. Equivalently, the responses are independent with
$$y_i \sim N(\beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip},\ \sigma^2).$$
$$L(\beta, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big\{-\frac{1}{2\sigma^2}\big(y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip}\big)^2\Big\} = \Big(\frac{1}{\sqrt{2\pi\sigma^2}}\Big)^{n} \exp\Big\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}\big(y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip}\big)^2\Big\}$$
$$\ell(\beta, \sigma^2) = \log L(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\big(y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip}\big)^2$$
$$(\tilde{\beta}, \tilde{\sigma}^2) = \arg\max_{(\beta, \sigma^2)} \ell(\beta, \sigma^2).$$
A notational convention: log(·) denotes the natural logarithm ln(·) when the base is omitted; this may differ from the convention on your calculator.
In R, log() is the natural logarithm and log10() is the common (base-10) logarithm.

• For simple linear regression,
$$\ell(\beta_0, \beta_1, \sigma^2) = \log L(\beta_0, \beta_1, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2.$$
Setting
$$\left.\frac{\partial \ell}{\partial \beta_0}\right|_{\tilde{\beta}_0, \tilde{\beta}_1, \tilde{\sigma}^2} = \left.\frac{\partial \ell}{\partial \beta_1}\right|_{\tilde{\beta}_0, \tilde{\beta}_1, \tilde{\sigma}^2} = \left.\frac{\partial \ell}{\partial \sigma^2}\right|_{\tilde{\beta}_0, \tilde{\beta}_1, \tilde{\sigma}^2} = 0$$
$$\Rightarrow\ \tilde{\beta}_1 = \frac{SXY}{SXX} = \hat{\beta}_1, \qquad \tilde{\beta}_0 = \bar{y} - \tilde{\beta}_1 \bar{x} = \hat{\beta}_0, \qquad \tilde{\sigma}^2 = \frac{1}{n}\Big(SYY - \frac{SXY^2}{SXX}\Big) = \frac{n-2}{n}\,\hat{\sigma}^2.$$
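The closed-form results above can be verified numerically. The following R sketch (not part of the original handout; the data are simulated and all object names are arbitrary) maximizes the log-likelihood directly with optim() and compares the result with the closed-form MLEs and with lm():

```r
# Illustrative check of the SLR MLEs (simulated data; names arbitrary)
set.seed(1)
n <- 50
x <- runif(n)
y <- 3 + 1.5*x + rnorm(n, sd = 0.4)

# Closed-form MLEs from the formulas above
SXX <- sum((x - mean(x))^2)
SXY <- sum((x - mean(x)) * (y - mean(y)))
SYY <- sum((y - mean(y))^2)
beta1_tilde  <- SXY/SXX
beta0_tilde  <- mean(y) - beta1_tilde * mean(x)
sigma2_tilde <- (SYY - SXY^2/SXX)/n

# Direct numerical maximization of l(beta0, beta1, sigma^2)
negloglik <- function(theta) {
  s2 <- exp(theta[3])                       # keep sigma^2 > 0
  0.5*n*log(2*pi*s2) + sum((y - theta[1] - theta[2]*x)^2)/(2*s2)
}
opt <- optim(c(0, 0, 0), negloglik)
c(opt$par[1:2], exp(opt$par[3]))            # ~ (beta0_tilde, beta1_tilde, sigma2_tilde)

# Relation to the OLS output: same intercept/slope, sigma^2 scaled by (n-2)/n
fit <- lm(y ~ x)
all.equal(unname(coef(fit)), c(beta0_tilde, beta1_tilde))
sigma2_hat <- sum(resid(fit)^2)/(n - 2)
all.equal(sigma2_tilde, (n - 2)/n * sigma2_hat)
```

The last line confirms that the MLE of σ² is the biased (n−2)/n multiple of the usual unbiased estimate σ̂².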

Example 1. (Alternative form of SLR) For a data set with observations $\{(x_i, y_i),\ i = 1, \ldots, n\}$, consider the regression model $y_i = \alpha_0 + \alpha_1(x_i - \bar{x}) + e_i$, where $e_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$. Find the maximum likelihood estimates of $\alpha_0$, $\alpha_1$ and $\sigma^2$.

Example 2. (MLE method) For a data set with observations $\{(x_i, y_i),\ i = 1, \ldots, n\}$, consider the regression model $y_i = \beta_1 x_i^2 + e_i$, where $e_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$. Find the maximum likelihood estimates of $\beta_1$ and $\sigma^2$.
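As a sanity check (not a substitute for the derivation asked for above), one can fit this model numerically in R; the sketch below uses simulated data with arbitrary values and compares a direct likelihood maximization with the through-the-origin least-squares fit:

```r
# Numerical sanity check for Example 2 (simulated data; names arbitrary)
set.seed(2)
n <- 40
x <- rnorm(n)
y <- 2*x^2 + rnorm(n)

# Through-the-origin least-squares fit on x^2
fit <- lm(y ~ 0 + I(x^2))

# Direct numerical maximization of the log-likelihood
nll <- function(theta) {
  s2 <- exp(theta[2])                        # keep sigma^2 > 0
  0.5*n*log(2*pi*s2) + sum((y - theta[1]*x^2)^2)/(2*s2)
}
opt <- optim(c(0, 0), nll)
opt$par[1]                 # ~ coef(fit): ML and least squares agree here
exp(opt$par[2])            # ~ sum(resid(fit)^2)/n, the MLE of sigma^2
```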

2 Expectations and Variances
Recall: If X and Y are two random variables, a and b are two constants, then

• E(aX + bY ) = a E(X) + b E(Y ).

• Var(X) = E[(X − E(X))2 ] = E(X 2 ) − [E(X)]2 .

• Var(aX + b) = a2 Var(X).

• Cov(X, Y ) = E[(X − E(X))(Y − E(Y ))] = E(XY ) − E(X) E(Y ).

• Cov(aX, bY ) = ab Cov(X, Y ).
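A small R illustration of these identities (simulated data, arbitrary names); note that the scaling identities hold exactly for sample variances and covariances as well:

```r
# Checking the scalar identities on data (names arbitrary)
set.seed(3)
X <- rnorm(100); Y <- rnorm(100); a <- 2; b <- -3
all.equal(var(a*X + b), a^2 * var(X))        # TRUE, exactly
all.equal(cov(a*X, b*Y), a*b * cov(X, Y))    # TRUE, exactly
# var() uses denominator n - 1, so it matches the plug-in version
# E(X^2) - [E(X)]^2 only up to the factor n/(n - 1):
all.equal(var(X), (mean(X^2) - mean(X)^2) * 100/99)  # TRUE
```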

If $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^q$ are random vectors, and $A \in \mathbb{R}^{m \times p}$ and $B \in \mathbb{R}^{n \times q}$ are constant matrices, then

• $E(AX) = A\,E(X)$.

• $\mathrm{Var}(AX) = A\,\mathrm{Var}(X)A'$, where $A'$ denotes the transpose of $A$.

• $\mathrm{Cov}(AX, BY) = A\,\mathrm{Cov}(X, Y)B'$.
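A quick Monte Carlo illustration in R of the variance identity for random vectors (all dimensions and names chosen arbitrarily):

```r
# Monte Carlo check of Var(AX) = A Var(X) A' (dimensions arbitrary)
set.seed(4)
p <- 3; m <- 2
A <- matrix(rnorm(m*p), m, p)
Sigma <- crossprod(matrix(rnorm(p*p), p, p))    # a valid p x p covariance

Z <- matrix(rnorm(1e5 * p), ncol = p)
X <- Z %*% chol(Sigma)                          # rows of X are ~ N(0, Sigma)
AX <- X %*% t(A)                                # row i is (A x_i)'

round(cov(AX), 2)                               # empirical Var(AX)
round(A %*% Sigma %*% t(A), 2)                  # theoretical A Var(X) A'
```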

Example 3. For the maximum likelihood estimate $\tilde{\beta}_1$ of $\beta_1$ in Example 2, find $E(\tilde{\beta}_1)$.

Example 4. (2016S Midterm) For the multiple regression model $Y = X\beta + e$, $e \sim N(0, \sigma^2 I_n)$, let $\hat{\beta} = (X'X + kI)^{-1}X'Y$, where $k$ is a constant. What are $E(\hat{\beta})$ and $\mathrm{Var}(\hat{\beta})$?
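One way to sanity-check your algebra for this kind of question is simulation. The R sketch below (illustrative $X$, $\beta$ and $k$, all chosen arbitrarily) compares the empirical mean and covariance of $\hat{\beta}$ with candidate closed-form expressions:

```r
# Simulation check for Example 4 (illustrative X, beta, k; all arbitrary)
set.seed(5)
n <- 30; sigma <- 1; k <- 0.7
X <- cbind(1, rnorm(n), rnorm(n))
beta <- c(1, 2, -1)
M <- solve(crossprod(X) + k*diag(3)) %*% t(X)   # beta_hat = M Y

est <- replicate(20000, drop(M %*% (X %*% beta + rnorm(n, sd = sigma))))
rowMeans(est)                  # compare with the non-random vector below
drop(M %*% X %*% beta)         # candidate E(beta_hat); note it is not beta
# Candidate Var(beta_hat) = sigma^2 M M'; compare with the empirical covariance
max(abs(cov(t(est)) - sigma^2 * M %*% t(M)))    # small (Monte Carlo error)
```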

It will be very useful later to define the hat matrix $H = X(X'X)^{-1}X'$. If $X \in \mathbb{R}^{n \times (p+1)}$ and $(X'X)^{-1}$ exists, then $H \in \mathbb{R}^{n \times n}$, and we can verify that

• $H$ is symmetric: $H' = H$. $\Rightarrow$ $I_n - H$ is symmetric.

• $H$ is idempotent: $H = HH$. $\Rightarrow$ $I_n - H$ is idempotent.

• $HX = X$ and $X'H = X'$.
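These three properties are easy to confirm numerically; a minimal R check, assuming an arbitrary full-column-rank design matrix:

```r
# Numerical check of the hat-matrix properties (illustrative X)
set.seed(6)
n <- 10; p <- 2
X <- cbind(1, matrix(rnorm(n*p), n, p))      # n x (p+1), full column rank
H <- X %*% solve(crossprod(X)) %*% t(X)

all.equal(H, t(H))         # TRUE: H is symmetric
all.equal(H, H %*% H)      # TRUE: H is idempotent
all.equal(H %*% X, X)      # TRUE: HX = X
```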

Example 5. (2019F Midterm) Let $A = I_n - X(X'X)^{-1}X'$. Show that $A^{10} = A$.

Example 6. (Probabilistic properties of residuals) Consider the regression model $Y = X\beta + e$ with $E(e) = 0$ and $\mathrm{Var}(e) = \sigma^2 I_n$. Let $\hat{\beta} = (X'X)^{-1}X'Y$, $\hat{Y} = HY$, and $\hat{e} = Y - \hat{Y}$.
Show that

(1) $E(\hat{e}) = 0$.

(2) $\mathrm{Var}(\hat{e}) = \sigma^2(I_n - H)$.

(3) $\mathrm{Cov}(\hat{e}, Y) = \sigma^2(I_n - H)$.

(4) $\mathrm{Cov}(\hat{e}, \hat{Y}) = O$.

(5) $\mathrm{Cov}(\hat{e}, \hat{\beta}) = O$.
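A Monte Carlo check of these properties in R (illustrative design matrix and parameters, all chosen arbitrarily) can help confirm your derivations:

```r
# Monte Carlo check of the residual properties in Example 6 (illustrative)
set.seed(7)
n <- 8; sigma <- 2
X <- cbind(1, rnorm(n)); beta <- c(1, 3)
H <- X %*% solve(crossprod(X)) %*% t(X)

sims <- replicate(50000, {
  Y <- drop(X %*% beta) + rnorm(n, sd = sigma)
  c((diag(n) - H) %*% Y, H %*% Y)              # stack (e_hat, Y_hat)
})
e_hat <- t(sims[1:n, ])
Y_hat <- t(sims[(n + 1):(2*n), ])

round(colMeans(e_hat), 2)                      # ~ 0: matches (1)
max(abs(cov(e_hat) - sigma^2*(diag(n) - H)))   # small: matches (2)
max(abs(cov(e_hat, Y_hat)))                    # small: matches (4)
```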
