Chapter 10: Simple Linear Regression and
Correlation
Course Name: PROBABILITY & STATISTICS
Lecturer: Huong Pham
Hanoi, 2023
Content
1 Empirical Models
2 Simple Linear Regression
3 Hypothesis Tests in Simple Linear Regression
4 Correlation
1. Empirical Models
Many problems in engineering and science involve exploring the relationships between two or more variables.
Regression analysis (phân tích hồi quy) is a statistical technique
that is very useful for these types of problems.
Regression analysis is used to:
  Predict the value of a dependent variable based on the value of at least one independent variable.
  Explain the impact of changes in an independent variable on the dependent variable.
Dependent variable Y: the variable we wish to predict or explain.
Independent variable X: the variable used to predict or explain the dependent variable.
A scatter plot (biểu đồ phân tán) can be used to:
  Visualize the relationship between X and Y variables.
  Help suggest a starting point for regression analysis.
We think of the regression model as an empirical model.
2. Simple Linear Regression
We assume that each observation $Y$ can be described by the model
$$Y = \beta_0 + \beta_1 x + \varepsilon$$
where $\varepsilon$ is a random error with mean zero and (unknown) variance $\sigma^2$. The slope $\beta_1$ and intercept $\beta_0$ of the line are called regression coefficients.
The sample contains $n$ data points $(x_i, y_i)$, $i = 1, 2, \dots, n$, in the plane. The $n$ observations in the sample are expressed as
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
The point estimates of $\beta_0$, $\beta_1$, $\sigma^2$ are denoted by $\hat\beta_0$, $\hat\beta_1$, $\hat\sigma^2$.
The estimated regression equation (best-fit line) is $\hat y = \hat\beta_0 + \hat\beta_1 x$.
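To estimate the regression coefficients, we use the method of least squares, which means minimizing
$$L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$
Consider $L$ as a function of the two unknowns $\beta_0$, $\beta_1$ and solve the system
$$\frac{\partial L}{\partial \beta_0} = 0, \quad \frac{\partial L}{\partial \beta_1} = 0$$
The least squares estimators of $\beta_0$, $\beta_1$, say $\hat\beta_0$, $\hat\beta_1$, must satisfy
$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$$
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}$$
where $\bar x = (\sum x_i)/n$, $\bar y = (\sum y_i)/n$.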
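As a sketch of the step from the system above to the estimators: setting both partial derivatives of $L$ to zero gives two linear equations in $\hat\beta_0$ and $\hat\beta_1$ (commonly called the normal equations).

```latex
\begin{align*}
\frac{\partial L}{\partial \beta_0} &= -2\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0
  &&\Rightarrow\quad n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \\
\frac{\partial L}{\partial \beta_1} &= -2\sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0
  &&\Rightarrow\quad \hat\beta_0 \sum_{i=1}^{n} x_i + \hat\beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i
\end{align*}
```

Dividing the first equation by $n$ gives $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$. Substituting this into the second equation and collecting the terms in $\hat\beta_1$ gives the expression for the slope.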
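Then the point estimates of $\beta_0$, $\beta_1$, say $\hat\beta_0$, $\hat\beta_1$, are
$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x, \quad \hat\beta_1 = \frac{S_{xy}}{S_{xx}}$$
where
$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar x)^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$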
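The formulas for the two estimates translate directly into a few lines of code. The following is a minimal sketch in plain Python. The function name `fit_line` and the data are hypothetical, chosen so that the points lie exactly on the line y = 1 + 2x.

```python
def fit_line(xs, ys):
    """Return (b0, b1): the least squares intercept and slope."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # Computational forms of S_xy and S_xx.
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    b1 = s_xy / s_xx                   # slope = S_xy / S_xx
    b0 = sum_y / n - b1 * sum_x / n    # intercept = y_bar - slope * x_bar
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x.
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # 1.0 2.0
```

Because the illustrative points are perfectly collinear, the fit recovers the intercept 1 and slope 2. With real data the fitted line only approximates the points.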
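Total corrected sum of squares:
$$SST = \sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$
Regression sum of squares:
$$SSR = \sum_{i=1}^{n} (\hat y_i - \bar y)^2 = \hat\beta_1 S_{xy}$$
Error sum of squares:
$$SSE = \sum_{i=1}^{n} (y_i - \hat y_i)^2 = SST - \hat\beta_1 S_{xy} = SST - SSR$$
An unbiased estimator of $\sigma^2$ is $\hat\sigma^2 = \frac{SSE}{n-2}$.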
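To see the decomposition SST = SSR + SSE numerically, the sketch below computes each sum of squares from its definition. The function name `sums_of_squares` and the five data points are hypothetical and serve only as an illustration.

```python
def sums_of_squares(xs, ys):
    """Return (sst, ssr, sse, sigma2_hat) for a simple linear regression."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)                # total
    ssr = sum((f - y_bar) ** 2 for f in fitted)            # regression
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # error
    sigma2_hat = sse / (n - 2)                             # estimate of sigma^2
    return sst, ssr, sse, sigma2_hat


# Hypothetical sample of five points.
sst, ssr, sse, sigma2_hat = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(sst, ssr, sse, sigma2_hat)
```

For this sample the values are approximately SST = 6, SSR = 3.6, SSE = 2.4 and an estimated variance of 0.8, so SSR + SSE reproduces SST up to floating-point rounding.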
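Hypothesis tests on the slope and intercept
Remark
The estimate of the regression slope $\beta_1$ is $\hat\beta_1$.
The estimate of the regression intercept $\beta_0$ is $\hat\beta_0$.
The estimated standard error of the slope is
$$se(\hat\beta_1) = \sqrt{\frac{\hat\sigma^2}{S_{xx}}}$$
The estimated standard error of the intercept is
$$se(\hat\beta_0) = \sqrt{\hat\sigma^2 \left[\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right]}$$
where $\hat\sigma^2 = \frac{SSE}{n-2}$.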
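The two standard errors depend only on n, the sample mean of x, S_xx and SSE. The sketch below evaluates them from those summary quantities. The function name `standard_errors` and the input values are hypothetical.

```python
import math


def standard_errors(n, x_bar, s_xx, sse):
    """Return (se_b0, se_b1), the estimated standard errors of intercept and slope."""
    sigma2_hat = sse / (n - 2)
    se_b1 = math.sqrt(sigma2_hat / s_xx)
    se_b0 = math.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / s_xx))
    return se_b0, se_b1


# Hypothetical summary values: n = 5, x_bar = 3, S_xx = 10, SSE = 2.4.
se_b0, se_b1 = standard_errors(5, 3.0, 10.0, 2.4)
print(round(se_b0, 3), round(se_b1, 3))
```

Here the estimated variance is 0.8, so the slope's standard error is the square root of 0.08 and the intercept's is the square root of 0.88.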
Use of t-Tests
A very important special case is the following test on the slope:
$$H_0: \beta_1 = 0$$
$$H_1: \beta_1 \neq 0$$
These hypotheses relate to the significance of regression. Failure to reject $H_0$ is equivalent to concluding that there is no linear relationship between $x$ and $Y$.
Hypothesis Tests in Simple Linear Regression
Example 1
Suppose we have the following information from a simple regression:
$$\hat\beta_0 = 105.4, \quad \hat\beta_1 = -14.5, \quad se(\hat\beta_0) = 2.6, \quad se(\hat\beta_1) = 2.4, \quad n = 120$$
What is the value of the test statistic for testing $H_0: \beta_1 = -14$?
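The slides do not show a worked answer for this example. As a sketch, assuming the usual t statistic for a test on the slope, (estimate minus hypothesized value) divided by the estimated standard error, with n − 2 degrees of freedom under the null hypothesis, the value can be computed as follows. The function name `slope_t_statistic` is hypothetical.

```python
def slope_t_statistic(b1_hat, b1_null, se_b1):
    """t statistic for H0: beta1 = b1_null, given the slope estimate and its standard error."""
    return (b1_hat - b1_null) / se_b1


# Values from Example 1: slope estimate -14.5, standard error 2.4, H0: beta1 = -14.
t0 = slope_t_statistic(-14.5, -14.0, 2.4)
degrees_of_freedom = 120 - 2
print(round(t0, 3), degrees_of_freedom)  # -0.208 118
```

The intercept estimate and its standard error are not needed for a test on the slope.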
4. Correlation
To measure the strength of the linear relationship between $X$ and $Y$ we can use the correlation coefficient $\rho$, which has the following properties:
$-1 \le \rho \le 1$.
If $\rho \approx 1$, then there is a strong positive linear relationship.
If $\rho \approx -1$, then there is a strong negative linear relationship.
If $\rho \approx 0$, then the linear relationship between $X$ and $Y$ is weak.
The estimator of $\rho$ is the sample correlation coefficient.
Definition
The sample correlation coefficient of the data $(x_i, y_i)$, $i = 1, \dots, n$, is
$$R = \frac{S_{xy}}{\sqrt{S_{xx}\, SST}}$$
The coefficient of determination is
$$R^2 = \frac{S_{xy}^2}{S_{xx}\, SST} = \hat\beta_1 \frac{S_{xy}}{SST} = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
Note
$$-1 \le R \le 1, \quad 0 \le R^2 \le 1$$
The correlation coefficient measures the strength of the linear relationship between two variables.
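The chain of equalities for $R^2$, and the fact that $R$ carries the sign of the fitted slope, both follow from $\hat\beta_1 = S_{xy}/S_{xx}$ and $SSR = \hat\beta_1 S_{xy}$. A short derivation using only the quantities defined earlier:

```latex
\begin{align*}
R^2 &= \frac{S_{xy}^2}{S_{xx}\,SST}
     = \frac{S_{xy}}{S_{xx}} \cdot \frac{S_{xy}}{SST}
     = \frac{\hat\beta_1 S_{xy}}{SST}
     = \frac{SSR}{SST}
     = \frac{SST - SSE}{SST}
     = 1 - \frac{SSE}{SST}, \\
R &= \frac{S_{xy}}{\sqrt{S_{xx}\,SST}}
   = \hat\beta_1 \sqrt{\frac{S_{xx}}{SST}}.
\end{align*}
```

Since the square root is positive, $R$ and $\hat\beta_1$ have the same sign. Given $R^2$ and the fitted line, $R$ is therefore $+\sqrt{R^2}$ when the slope is positive and $-\sqrt{R^2}$ when it is negative.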
Example 2
Given the least squares regression line $\hat y = -2.87 - 1.6x$ and a coefficient of determination of 0.36, what is the coefficient of correlation?
Answer: We have $R^2 = 0.36$. Since $\hat\beta_1 = -1.6 < 0$, the coefficient of correlation is $-\sqrt{0.36} = -0.6$.
Example 3
Suppose we have the following information from a simple regression:
$$\hat\beta_0 = 116, \quad \hat\beta_1 = 10.2, \quad n = 148, \quad \bar x = 3.9, \quad SST = 15600, \quad SSE = 9200$$
What is the correlation coefficient?
Answer: The coefficient of determination is
$$R^2 = 1 - \frac{SSE}{SST} = 1 - \frac{9200}{15600} \approx 0.41$$
Since $\hat\beta_1 > 0$, the coefficient of correlation is $\sqrt{0.41} \approx 0.64$.
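Both answers use the same rule: take the square root of the coefficient of determination and give it the sign of the fitted slope. A minimal sketch of that rule follows. The function name `correlation_from_r2` is hypothetical.

```python
import math


def correlation_from_r2(r_squared, slope):
    """Sample correlation: sqrt(R^2) carrying the sign of the fitted slope."""
    return math.copysign(math.sqrt(r_squared), slope)


# Example 2: R^2 = 0.36 with slope -1.6.
r_example2 = correlation_from_r2(0.36, -1.6)
# Example 3: R^2 = 1 - SSE/SST with slope 10.2.
r_example3 = correlation_from_r2(1 - 9200 / 15600, 10.2)
print(round(r_example2, 2), round(r_example3, 2))  # -0.6 0.64
```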
Example 4
In a regression problem the following pairs of $(x, y)$ are given: $(-4, 8)$, $(-1, 3)$, $(0, 0)$, $(1, -3)$ and $(4, -7)$. What does this indicate about the value of the coefficient of determination?
Answer: With $n = 5$,
$$\sum x_i = 0, \quad \bar x = 0, \quad \sum y_i = 1, \quad \bar y = 0.2$$
$$\sum x_i^2 = 34, \quad \sum x_i y_i = -66, \quad \sum y_i^2 = 131$$
$$S_{xy} = -66 - \frac{0 \cdot 1}{5} = -66$$
$$S_{xx} = 34 - \frac{0^2}{5} = 34$$
$$SST = 131 - \frac{1^2}{5} = 130.8$$
The coefficient of determination is
$$R^2 = \frac{S_{xy}^2}{S_{xx}\, SST} = \frac{(-66)^2}{34 \cdot 130.8} \approx 0.98$$
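The arithmetic in this example can be checked by computing the three sums directly from the five data points. The sketch below uses the computational forms of S_xy, S_xx and SST. The function name `r_squared` is hypothetical.

```python
def r_squared(xs, ys):
    """Coefficient of determination S_xy^2 / (S_xx * SST) from raw data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    sst = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy ** 2 / (s_xx * sst)


# Data from Example 4.
xs = [-4, -1, 0, 1, 4]
ys = [8, 3, 0, -3, -7]
print(round(r_squared(xs, ys), 2))  # 0.98
```

Note that the correction terms use the sums of x and y (0 and 1), not the means.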
Example 5
You want to explore the relationship between the grades students receive on their first quiz (X) and their first exam (Y). The first quiz and exam scores for a sample of 16 students reveal the following summary statistics:
$$\sum (x_i - \bar x)(y_i - \bar y) = 320.5, \quad s_x = 2.05, \quad s_y = 16.9$$
What is the sample correlation coefficient?
Answer: We have $S_{xy} = 320.5$ and
$$S_{xx} = (n - 1)s_x^2 = 15 \cdot 2.05^2 \approx 63.04$$
$$SST = (n - 1)s_y^2 = 15 \cdot 16.9^2 = 4284.15$$
Hence, the sample correlation coefficient is
$$R = \frac{S_{xy}}{\sqrt{S_{xx}\, SST}} = \frac{320.5}{\sqrt{63.04 \cdot 4284.15}} \approx 0.617$$
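When only the sample standard deviations are given, S_xx and SST can be recovered by multiplying each sample variance by n − 1. A minimal sketch of that conversion, with a hypothetical function name `correlation_from_summary`:

```python
import math


def correlation_from_summary(s_xy, s_x, s_y, n):
    """Sample correlation from S_xy and the sample standard deviations of x and y."""
    s_xx = (n - 1) * s_x ** 2   # S_xx = (n - 1) * s_x^2
    sst = (n - 1) * s_y ** 2    # SST  = (n - 1) * s_y^2
    return s_xy / math.sqrt(s_xx * sst)


# Values from Example 5.
r = correlation_from_summary(320.5, 2.05, 16.9, 16)
print(round(r, 3))  # 0.617
```

The denominator simplifies to (n − 1) times the product of the two standard deviations, which gives an equivalent shortcut.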
Test for Zero Correlation
Test the hypothesis
$$H_0: \rho = 0$$
The test statistic
$$T_0 = \frac{R\sqrt{n-2}}{\sqrt{1-R^2}}$$
has a $t$-distribution with $n - 2$ degrees of freedom if $H_0$ is true.
Example 6
You want to explore the relationship between the grades students receive on their first two exams. For a sample of 25 students, you find a correlation coefficient of 0.45. What is the value of the test statistic for testing $H_0: \rho = 0$ against $H_1: \rho \neq 0$?
Answer: We know that $n = 25$ and $R = 0.45$. Hence, the test statistic is
$$t_0 = \frac{R\sqrt{n-2}}{\sqrt{1-R^2}} = \frac{0.45\sqrt{23}}{\sqrt{1-0.45^2}} \approx 2.417$$
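The computation above can be reproduced with a small helper. This is a sketch, and the function name `correlation_t_statistic` is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Values from Example 6: R = 0.45, n = 25.
t0 = correlation_t_statistic(0.45, 25)
print(round(t0, 3))  # 2.417
```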
Example 7
For a random sample of 22 professionals, the correlation between their age and their income was found to be 0.3. You are interested in testing the null hypothesis that there is no linear relationship between these two variables against the alternative that there is a positive relationship. What is your conclusion in testing $H_0: \rho = 0$ against $H_1: \rho > 0$ at $\alpha = 0.1$? Use $t_{0.1,20} = 1.325$, $t_{0.05,20} = 1.725$.
Answer: We know that $n = 22$ and $R = 0.3$. The value of the test statistic is
$$t_0 = \frac{0.3\sqrt{20}}{\sqrt{1-0.3^2}} \approx 1.4$$
Since $t_0 > t_{0.1,20} = 1.325$, we reject $H_0$.
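For the one-sided alternative, the null hypothesis is rejected when the statistic exceeds the upper critical value. The sketch below applies that rule with both critical values quoted in the example. The function name `upper_tail_test` is hypothetical, and the critical values are passed in rather than looked up.

```python
import math


def upper_tail_test(r, n, t_critical):
    """Return (t0, reject) for H0: rho = 0 against H1: rho > 0."""
    t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t0, t0 > t_critical


# Example 7: R = 0.3, n = 22, so there are 20 degrees of freedom.
t0, reject_at_10_percent = upper_tail_test(0.3, 22, 1.325)   # alpha = 0.10
_, reject_at_5_percent = upper_tail_test(0.3, 22, 1.725)     # alpha = 0.05
print(round(t0, 3), reject_at_10_percent, reject_at_5_percent)  # 1.406 True False
```

The same data lead to rejection at the 10% level but not at the 5% level, which shows how the conclusion depends on the chosen significance level.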