Week 3 Annotated
An example with R

The model is Yi = β0 + β1 Xi + εi, where the εi are independent N(0, σ²), so E(Yi) = β0 + β1 Xi.

The least squares estimator of the slope is

b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
The sampling distribution of b1 is normal with mean and variance:
E(b1) = β1

V(b1) = σ²(b1) = σ²/Sxx
b1 is a linear combination of the Yi:

b1 = Σ_{i=1}^{n} ki Yi,

where

ki = (Xi − X̄) / Σ(Xi − X̄)²
Note: Sxy = Σ(Xi − X̄)(Yi − Ȳ) = Σ(Xi − X̄)Yi, since Σ(Xi − X̄)Ȳ = 0; hence b1 = Sxy/Sxx is linear in the Yi.
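A quick numerical check of this representation, as a sketch with toy data (names such as x_toy are illustrative, not from the slides):

# Toy data: b1 = Sxy/Sxx equals the linear combination sum(k_i * Y_i)
set.seed(1)
x_toy <- 1:10
y_toy <- 3 + 2 * x_toy + rnorm(10, sd = 1)
sxx   <- sum((x_toy - mean(x_toy))^2)
sxy   <- sum((x_toy - mean(x_toy)) * (y_toy - mean(y_toy)))
k     <- (x_toy - mean(x_toy)) / sxx
c(sxy / sxx, sum(k * y_toy))   # the two forms give the same slope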
Sampling distribution of b1

Recall Sxx = Σ(Xi − X̄)².
Normality: A linear combination of independent normal random
variables is normally distributed.
Mean: E(b1) = β1

E(b1) = E(Σ ki Yi) = Σ ki E(Yi) = Σ ki (β0 + β1 Xi) = β0 Σ ki + β1 Σ ki Xi = β1,

since Σ ki = 0 and Σ ki Xi = 1.
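These two identities can be checked numerically; a small sketch with assumed toy X values:

# Check sum(k_i) = 0 and sum(k_i * X_i) = 1 (illustrative toy values)
x_toy <- c(2, 5, 7, 8, 12)
k     <- (x_toy - mean(x_toy)) / sum((x_toy - mean(x_toy))^2)
sum(k)           # 0, up to rounding error
sum(k * x_toy)   # 1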
Sampling distribution of b1
Facts used: V(aX) = a² V(X); if X and Y are independent, V(X + Y) = V(X) + V(Y).

Variance: σ²(b1) = V(b1) = σ²/Sxx

V(b1) = V(Σ ki Yi) = Σ ki² V(Yi) = σ² Σ ki² = σ²/Sxx,

since Σ ki² = Σ(Xi − X̄)²/Sxx² = 1/Sxx.
Estimated variance:

s²(b1) = MSE/Sxx

(proved on p. 43)

We call s(b1) = √(MSE/Sxx) the standard error of b1.
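This is exactly the Std. Error that lm() reports for the slope; a minimal sketch with toy data (illustrative names, not from the slides):

# s(b1) = sqrt(MSE/Sxx) reproduces the slope's Std. Error from summary()
set.seed(2)
x_toy <- 1:20
y_toy <- 5 + 0.7 * x_toy + rnorm(20, sd = 3)
fit   <- lm(y_toy ~ x_toy)
mse   <- sum(residuals(fit)^2) / df.residual(fit)
sxx   <- sum((x_toy - mean(x_toy))^2)
sqrt(mse / sxx)                      # s(b1) from the formula
summary(fit)$coefficients[2, 2]      # Std. Error column for the slope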
Yi = 2 + Xi + εi, where εi ∼ N(0, 8²)

So β0 = 2, β1 = 1 and σ = 8.

Assume the X values are 1, 2, . . . , 25, so the sample size is 25.

We learnt from the theoretical results that the sampling distribution of b1 is normal with mean 1 and variance 64/Sxx. To construct a simulated sampling distribution of b1, we follow the steps below:
  s_b0[i] = sqrt(MSE[i]*(1/n + mean(X)^2/Sxx))
}
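Only the tail of the simulation loop survives on the slide above. A fuller sketch of the simulation, where nsim is an assumed replication count (not shown on the slide) and the parameter values follow the text:

# Simulated sampling distribution of b0 and b1 under Yi = 2 + Xi + ei, ei ~ N(0, 8^2)
nsim   <- 10000                  # assumed number of replications
n      <- 25
X      <- 1:n
Sxx    <- sum((X - mean(X))^2)
esigma <- 8                      # true sigma, used later for the theoretical sd
b0 <- b1 <- MSE <- s_b0 <- s_b1 <- numeric(nsim)
for (i in 1:nsim) {
  Y       <- 2 + 1 * X + rnorm(n, mean = 0, sd = esigma)   # one simulated sample
  fit     <- lm(Y ~ X)
  b0[i]   <- coef(fit)[1]
  b1[i]   <- coef(fit)[2]
  MSE[i]  <- sum(residuals(fit)^2) / df.residual(fit)
  s_b1[i] <- sqrt(MSE[i] / Sxx)
  s_b0[i] <- sqrt(MSE[i] * (1/n + mean(X)^2 / Sxx))
}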
Sampling distribution of b1 by simulation
mean(b1) # empirical mean of b1
## [1] 0.9983658
sdb1 = esigma/sqrt(Sxx) # theoretical sd
sdb1
## [1] 0.2218801
sd(b1) # empirical sd of b1
## [1] 0.2230607
[Figure: density histogram of the simulated b1 values; x-axis: b1 (0.5 to 1.5), y-axis: Density.]
Sampling distribution of (b1 − β1)/s(b1)

We have:

b1 ∼ N(β1, σ²(b1))

Standardize:

(b1 − β1)/σ(b1) ∼ N(0, 1)

Replacing σ(b1) with its estimate s(b1) gives

(b1 − β1)/s(b1) ∼ t_{n−2},

where s(b1) = √(MSE/Sxx).
[Figure: densities of the standard normal and the t2, t6 and t23 distributions on (−4, 4); the t densities have heavier tails and approach the normal as the degrees of freedom increase.]

[Figure: density histogram of the simulated values of (b1 − beta1)/s_b1 on (−4, 4); y-axis: Density.]
The 100(1 − α)% confidence interval for β1 is:

[ b1 − t_{α/2}(n − 2) s(b1), b1 + t_{α/2}(n − 2) s(b1) ]

where s(b1) = √(MSE/Sxx). (For example, α = 0.05 gives a 95% interval.)
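With n = 25, as in the simulated example, the annotated critical value 2.069 is the corresponding t quantile:

qt(1 - 0.05/2, df = 23)   # 2.068658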
[Figure: simulated b1 values with their 95% confidence intervals, plotted against sample index 1 to 50; y-axis: b1.]
Test statistic

t* = b1 / s(b1)
H0 : β1 ≤ 0 (β1 ≥ 0)
Ha : β1 > 0 (β1 < 0)

Test statistic

t* = b1 / s(b1)
Annotation: SSR = Σ(Ŷi − Ȳ)² = Σ(b0 + b1 Xi − Ȳ)² = Σ(b1(Xi − X̄))² = b1² Sxx, so

F = MSR/MSE = b1² Sxx / MSE = (b1 / s(b1))² = (t*)².
Hypothesis testing with β1 at a general value

Two-sided test

H0 : β1 = c vs. Ha : β1 ≠ c

Test statistic

t* = (b1 − c) / s(b1) ∼ t_{n−2} under H0
Sampling distribution of b0

E(b0) = β0

σ²(b0) = V(b0) = σ² [ 1/n + X̄²/Sxx ]

b0 = Ȳ − b1 X̄ is a linear combination of the Yi, since both Ȳ and b1 are linear combinations of the Yi.
Mean: E(b0) = β0

E(b0) = E(Ȳ − b1 X̄) = E(Ȳ) − X̄ E(b1) = (1/n) Σ E(Yi) − X̄ β1 = (1/n) Σ (β0 + β1 Xi) − β1 X̄ = β0 + β1 X̄ − β1 X̄ = β0.
Sampling distribution of b0
Variance: σ²(b0) = V(b0) = σ² [ 1/n + X̄²/Sxx ]

V(b0) = V(Ȳ − b1 X̄) = V(Ȳ) + X̄² V(b1) − 2 X̄ Cov(Ȳ, b1) = σ²/n + X̄² σ²/Sxx,

since Cov(Ȳ, b1) = Cov((1/n) Σ Yi, Σ ki Yi) = (σ²/n) Σ ki = 0.
Sampling distribution of b0
mean(b0) # empirical mean of b0
## [1] 2.014146
sd(b0) # empirical sd of b0
## [1] 3.275723
sdb0 = esigma*sqrt(1/n + mean(X)^2/Sxx)
sdb0 # Theoretical sd of b0
## [1] 3.298485
[Figure: density histogram of the simulated b0 values; x-axis: b0 (−10 to 10), y-axis: Density.]
An estimator of σ²(b0)

s²(b0) = MSE [ 1/n + X̄²/Sxx ]

We call s(b0) = √(MSE [1/n + X̄²/Sxx]) the standard error of b0.
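As for the slope, this is the Std. Error that lm() reports for the intercept; a sketch with toy data (illustrative names, not from the slides):

# s(b0) from the formula reproduces the (Intercept) Std. Error from summary()
set.seed(3)
x_toy <- 1:20
y_toy <- 5 + 0.7 * x_toy + rnorm(20, sd = 3)
fit   <- lm(y_toy ~ x_toy)
n_toy <- length(x_toy)
mse   <- sum(residuals(fit)^2) / df.residual(fit)
sxx   <- sum((x_toy - mean(x_toy))^2)
sqrt(mse * (1/n_toy + mean(x_toy)^2 / sxx))   # s(b0) from the formula
summary(fit)$coefficients[1, 2]               # Std. Error of the intercept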
[Figure: density histogram of the simulated values of (b0 − beta0)/s_b0 on (−4, 4); y-axis: Density.]
[ b0 − t_{1−α/2, n−2} s(b0), b0 + t_{1−α/2, n−2} s(b0) ]

where s(b0) = √(MSE [1/n + X̄²/Sxx]).
That is, the interval has the familiar form: point estimate ± t critical value × standard error.
Two-sided test

H0 : β0 = c
Ha : β0 ≠ c

Test statistic

t* = (b0 − c) / s(b0) ∼ t_{n−2} under H0

One-sided test

H0 : β0 ≤ c (β0 ≥ c)
Ha : β0 > c (β0 < c)

Test statistic

t* = (b0 − c) / s(b0)
Objective: estimate the mean for one or more probability distributions of Y.

Example: the Toluca Company is interested in the mean number of work hours for a range of lot sizes for the purpose of finding the optimum lot size.
The point estimator Ŷh = b0 + b1 Xh is a linear combination of the Yi, with

E{Ŷh} = E(Yh)

V{Ŷh} = σ² [ 1/n + (Xh − X̄)²/Sxx ]
The sampling distribution of Ŷh
Write Ŷh = b0 + b1 Xh = Ȳ + b1(Xh − X̄). Then

V{Ŷh} = V(Ȳ) + (Xh − X̄)² V(b1) + 2(Xh − X̄) Cov(Ȳ, b1) = σ²/n + (Xh − X̄)² σ²/Sxx,

since Cov(Ȳ, b1) = 0.
The sampling distribution of Ŷh

[Figure 1: The further Xh is from X̄, the greater the quantity (Xh − X̄)² and the larger the variance of Ŷh.]
[ Ŷh − t_{1−α/2, n−2} s(Ŷh), Ŷh + t_{1−α/2, n−2} s(Ŷh) ]

where s(Ŷh) = √(MSE [1/n + (Xh − X̄)²/Sxx]).
Denote the level of X for the new trial as Xh and the new observation
on Y as Yh(new ) .
Difference between estimation of the mean response E (Y ) and
prediction of a new response Yh(new ) :
The former estimates the mean of the distribution of Y.
The latter predicts an individual outcome drawn from the distribution of Y.
Ŷh = b0 + b1 Xh

Yh(new) = β0 + β1 Xh + εh(new)

V{Yh(new)} = V{Ŷh} + σ²

σ²{pred} = σ²{Ŷh} + σ² = σ² [ 1 + 1/n + (Xh − X̄)²/Sxx ]

s²{pred} = s²{Ŷh} + MSE = MSE [ 1 + 1/n + (Xh − X̄)²/Sxx ]

[ Ŷh − t_{1−α/2, n−2} s{pred}, Ŷh + t_{1−α/2, n−2} s{pred} ]

where s{pred} = √(MSE [1 + 1/n + (Xh − X̄)²/Sxx]).
Confidence interval: Ŷh ± t_{1−α/2, n−2} s(Ŷh), with s(Ŷh) = √(MSE [1/n + (Xh − X̄)²/Sxx]).

Prediction interval: Ŷh ± t_{1−α/2, n−2} s{pred}, with s{pred} = √(MSE [1 + 1/n + (Xh − X̄)²/Sxx]).
An example with R

coef(mymodel)

## (Intercept)           X
##   62.365859    3.570202
b0 = coef(mymodel)[1]
b1 = coef(mymodel)[2]
summary(mymodel)$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 62.365859 26.1774339 2.382428 2.585094e-02
## X 3.570202 0.3469722 10.289592 4.448828e-10
Interpretation of the coefficients
Ŷ = 62.37 + 3.5702X
[Figure: scatter plot of work hours against lot sizes (20 to 120) with the fitted regression line.]
alpha = 0.05
s_b0 = summary(mymodel)$coefficients[1,2]
s_b1 = summary(mymodel)$coefficients[2,2]
b0_ci = c(b0 + qt(alpha/2, df = n-2)*s_b0,
b0 + qt(1-alpha/2, df = n-2)*s_b0)
b1_ci = c(b1 + qt(alpha/2, df = n-2)*s_b1,
b1 + qt(1-alpha/2, df = n-2)*s_b1)
b0_ci
## (Intercept) (Intercept)
## 8.213711 116.518006
b1_ci
## X X
## 2.852435 4.287969
confint(mymodel)
## 2.5 % 97.5 %
## (Intercept) 8.213711 116.518006
## X 2.852435 4.287969
With confidence coefficient .95, we estimate that the mean number of
work hours increases by somewhere between 2.85 and 4.29 hours for
each additional unit in the lot.
confint(mymodel, level=0.90)
## 5 % 95 %
## (Intercept) 17.501100 107.230617
## X 2.975536 4.164868
Annotation: for H0: β1 = 0 vs Ha: β1 ≠ 0, the p-value 4.4 × 10⁻¹⁰ is far below any usual α, so we reject H0.
Equivalence of F Test and t Test
anova(mymodel)
## Analysis of Variance Table
##
## Response: Y
## Df Sum Sq Mean Sq F value Pr(>F)
## X 1 252378 252378 105.88 4.449e-10 ***
## Residuals 23 54825 2384
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
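The equivalence can be checked directly from the two outputs above; a quick sketch, assuming the mymodel fit used throughout this example:

t_slope <- summary(mymodel)$coefficients[2, 3]   # t value for the slope, 10.2896
t_slope^2                                        # 105.88, the F value in the ANOVA table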
Annotation: one-sided tests for the Toluca slope, with t* = 10.289 and t(0.95, 23) = 1.714.

Upper-tail test (H0: β1 ≤ 0 vs Ha: β1 > 0): reject H0 if t* > 1.714; since 10.289 > 1.714, reject H0.

Lower-tail test (H0: β1 ≥ 0 vs Ha: β1 < 0): reject H0 if t* < −1.714; since 10.289 is not below −1.714, fail to reject H0 (the lower-tail p-value is close to 1).
Hypothesis testing with β1

H0 : β1 = 2
Ha : β1 ≠ 2
ts = (b1-2)/s_b1
ts
## X
## 4.525441
pvalue = 2*(1-pt(ts, df = n-2))
pvalue
## X
## 0.0001519178

The p-value is well below 0.05, so we reject H0: β1 = 2.
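One caution about the chunk above: 2*(1 - pt(ts, df)) is correct only when the test statistic is positive. A sign-safe version of the two-sided p-value:

# works whether ts is positive or negative
2 * pt(abs(ts), df = n - 2, lower.tail = FALSE)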
Confidence interval for E (Yh )
Let’s try to find a 90% confidence interval for E (Yh ) when Xh = 65.
Ŷh ± t_{1−α/2; n−2} √(MSE [1/n + (Xh − X̄)²/Sxx])
X_h = 65
Y_hat = b0 + b1*X_h
Y_hat
## (Intercept)
## 294.429
MSE = sum(residuals(mymodel)^2) / df.residual(mymodel)
s_Yhat = sqrt(MSE*(1/n +(X_h-mean(X))^2/sum((X-mean(X))^2)))
alpha = 0.1
Y_hat_ci = c(Y_hat - qt(1-alpha/2, n-2)*s_Yhat,
Y_hat + qt(1-alpha/2, n-2)*s_Yhat)
names(Y_hat_ci) = c("lower", "upper")
Y_hat_ci
## lower upper
## 277.4315 311.4264
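The built-in predict() should reproduce this interval; a sketch, assuming mymodel was fitted as lm(Y ~ X) on the Toluca data:

predict(mymodel, newdata = data.frame(X = 65),
        interval = "confidence", level = 0.90)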
Prediction interval for Yh(new)
## lower upper
## 209.0432 379.8148
With confidence coefficient 90%, the mean work hours required when lots of
65 units are produced is between 277 and 311.
With confidence coefficient 90%, we predict that the number of work hours
required for any lot of 65 units is between 209 and 380.
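The code that produced the prediction interval is not shown on the slide; a sketch using the objects from the confidence-interval chunk (X_h, Y_hat, MSE, n, X, alpha), with Y_h_pi an illustrative name:

s_pred <- sqrt(MSE * (1 + 1/n + (X_h - mean(X))^2 / sum((X - mean(X))^2)))
Y_h_pi <- c(Y_hat - qt(1 - alpha/2, n - 2) * s_pred,
            Y_hat + qt(1 - alpha/2, n - 2) * s_pred)
names(Y_h_pi) <- c("lower", "upper")
Y_h_pi
# equivalently: predict(mymodel, data.frame(X = 65), interval = "prediction", level = 0.90)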
[Figure: work hours (Y) against lot size (20 to 120) with the fitted line, a 95% confidence band for the mean response, and a wider 95% prediction band.]