L2 TwoVariable Regression 2023

two variable intro

Uploaded by

potheadpandafk

Two-Variable Regression Analysis

A Hypothetical Example
● The data cover the total population of 60 families in a hypothetical community: their weekly income (X) and weekly consumption expenditure (Y), both in dollars.

● There are 10 fixed values of X, and a set of corresponding Y values for each of the X values.

● Thus, there are 10 Y subpopulations.



Example
● There are 10 mean values for the 10 subpopulations of Y.

● These are called the conditional expected values, as they depend on the
given values of the (conditioning) variable X.

● Each is written as E(Y | X), i.e. the expected value of Y given the value of X.

Example
● The unconditional expected value of weekly consumption expenditure is
denoted as E(Y).

● If we add the weekly consumption expenditures for all the 60 families in the
population and divide this number by 60, we get the number $121.20
($7272/60), which is the unconditional mean, or expected, value of weekly
consumption expenditure, E(Y).

● It is unconditional in the sense that in arriving at this number we have disregarded the income levels of the various families.
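
The link between the unconditional mean and the 10 conditional means can be written out as a weighted average. This is a sketch in notation that is assumed here, not taken from the slides: N_j is the number of families at income level X_j, and N = 60 is the population size.

```latex
% N_j: number of families at income level X_j (notation assumed here)
% N = \sum_j N_j = 60 families in total
\[
E(Y) \;=\; \sum_{j=1}^{10} \frac{N_j}{N}\, E(Y \mid X = X_j),
\qquad
E(Y) \;=\; \frac{7272}{60} \;=\; 121.20 .
\]
```

Each conditional mean is weighted by the share of families at that income level, so the income information is averaged out rather than used.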

Example
● “What is the expected value of weekly consumption expenditure of a family?”

● The answer is $121.20 (the unconditional mean).

● “What is the expected value of weekly consumption expenditure of a family whose weekly income is, say, $140?”

● The answer is $101 (the conditional mean).
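
The two questions above can be mirrored in a few lines of Python. This is a minimal sketch: the (income, expenditure) pairs are hypothetical values chosen for illustration and are not the 60-family data, and the function names are made up for this example.

```python
from collections import defaultdict

# Hypothetical (weekly income, weekly expenditure) pairs in dollars.
# Illustrative values only; NOT the data from the slides.
families = [
    (80, 55), (80, 65), (80, 75),
    (100, 70), (100, 80),
    (120, 84), (120, 90), (120, 96),
]


def conditional_means(pairs):
    """Return {x: mean of Y over the families whose income equals x}."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: sum(ys) / len(ys) for x, ys in groups.items()}


def unconditional_mean(pairs):
    """Return the mean of Y over all families, disregarding income."""
    return sum(y for _, y in pairs) / len(pairs)


cond = conditional_means(families)
overall = unconditional_mean(families)

print("E(Y | X):", cond)   # one mean per income level
print("E(Y):", overall)    # a single number for the whole population
```

The conditional mean answers the question for a given income level, while the unconditional mean is one number for everyone.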



The Meaning of the Term Linear

● Linearity in the Variables

○ The first “natural” meaning of linearity is that the conditional expectation of Y is a linear function of Xi, as in E(Y | Xi) = 𝛽1 + 𝛽2 Xi.

○ Under this interpretation, a regression function such as E(Y | Xi) = 𝛽1 + 𝛽2 Xi² is not a linear function, because X appears with a power of 2.

● Linearity in the Parameters

○ The second interpretation of linearity is that the conditional expectation of Y, E(Y | Xi), is a linear function of the parameters, the 𝛽s; it may or may not be linear in the variable X.
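
The two meanings of linearity can be set side by side. The three functional forms below are standard textbook illustrations chosen for this sketch; the slides' own equations were not recoverable.

```latex
\begin{align*}
E(Y \mid X_i) &= \beta_1 + \beta_2 X_i
  && \text{linear in the variable and in the parameters} \\
E(Y \mid X_i) &= \beta_1 + \beta_2 X_i^{2}
  && \text{linear in the parameters, not in the variable} \\
E(Y \mid X_i) &= \beta_1 + \beta_2^{2} X_i
  && \text{linear in the variable, not in the parameters}
\end{align*}
```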

PRL
● If we join the conditional mean values, we get the population regression
line (PRL), or more generally, the population regression curve.

● It is also known as the regression of Y on X.



The Concept of Population Regression Function (PRF)
● E(Y | Xi) = f (Xi)
where f (Xi) denotes some function of the explanatory variable X.

● This is known as the conditional expectation function (CEF), the population regression function (PRF), or the population regression (PR).

● The functional form of the PRF is an empirical question, although the underlying theory may also suggest some form.

PRF
● The simplest model is E(Y | Xi) = 𝛽1 + 𝛽2 Xi, where 𝛽1 and 𝛽2 are unknown but fixed parameters known as the regression coefficients.

● 𝛽1 and 𝛽2 are also known as the intercept and slope coefficients, respectively.

● The equation itself is known as the linear population regression function.

Stochastic Specification of PRF

● The deviation of an individual Yi from its conditional mean, 𝜇𝑖 = Yi − E(Y | Xi), is an unobservable random variable taking positive or negative values.

● 𝜇𝑖 is known as the stochastic disturbance or stochastic error term.

● Yi can therefore be expressed as the sum of two components, Yi = E(Y | Xi) + 𝜇𝑖:

○ (1) E(Y | Xi), the systematic, or deterministic, component, and

○ (2) 𝜇𝑖, the random, or nonsystematic, component.


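PRF
● Taking the expected value of Yi = E(Y | Xi) + 𝜇𝑖 on both sides, conditional on Xi, gives E(Yi | Xi) = E(Y | Xi) + E(𝜇𝑖 | Xi).

● Since E(Yi | Xi) is the same thing as E(Y | Xi), the above implies that

E(𝜇𝑖 | Xi) = 0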
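
A small numerical check of this result, as a sketch only: the population below is hypothetical (illustrative values, not the 60-family data) and the helper names are made up for this example. Each Y is split into its conditional mean and the deviation from it, and the deviations are then averaged within each income level.

```python
from collections import defaultdict

# Hypothetical (weekly income, weekly expenditure) pairs in dollars.
# Illustrative values only; NOT the data from the slides.
population = [
    (80, 55), (80, 65), (80, 75),
    (100, 70), (100, 80),
    (120, 84), (120, 90), (120, 96),
]


def decompose(pairs):
    """Return rows (x, y, systematic, disturbance) with
    systematic = E(Y | X = x) and disturbance = y - systematic."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    cond_mean = {x: sum(ys) / len(ys) for x, ys in groups.items()}
    return [(x, y, cond_mean[x], y - cond_mean[x]) for x, y in pairs]


def mean_disturbance_by_x(rows):
    """Average the disturbances within each income level."""
    groups = defaultdict(list)
    for x, _, _, disturbance in rows:
        groups[x].append(disturbance)
    return {x: sum(v) / len(v) for x, v in groups.items()}


rows = decompose(population)
for x, y, systematic, disturbance in rows:
    print(f"X={x}  Y={y}  E(Y|X)={systematic}  disturbance={disturbance:+}")

# Within every income level the disturbances average out to zero.
print(mean_disturbance_by_x(rows))
```

Individual disturbances are positive or negative, but because each one is measured from its own conditional mean, their average within every income level is zero.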

The Significance of the Stochastic Disturbance Term
● The disturbance term 𝜇𝑖 is a proxy for all those variables that are omitted from the model but that collectively affect Y.

● The relevant question is: why not introduce these variables into the model explicitly?



Disturbance Term
The reasons are:

1. Vagueness of theory
2. Unavailability of data
3. Core variables versus peripheral variables
4. Intrinsic randomness in human behavior
5. Poor proxy variables
6. Principle of parsimony
7. Wrong functional form

The Sample Regression Function (SRF)

● Can we estimate the PRF from the sample data?

● We may not be able to estimate the PRF “accurately” because of sampling fluctuations.

● The sample regression function can be written as Ŷi = 𝛽̂1 + 𝛽̂2 Xi, where Ŷi is the estimator of E(Y | Xi), and 𝛽̂1 and 𝛽̂2 are the estimators of 𝛽1 and 𝛽2.

SRF
● The sample regression function in its stochastic form is Yi = 𝛽̂1 + 𝛽̂2 Xi + 𝜇̂𝑖, where 𝜇̂𝑖 denotes the sample residual term.

SRF
● The objective is to estimate the PRF, Yi = 𝛽1 + 𝛽2 Xi + 𝜇𝑖,

● on the basis of the SRF, Yi = 𝛽̂1 + 𝛽̂2 Xi + 𝜇̂𝑖.

SRF
● How should the SRF be constructed so that 𝛽̂1 is as “close” as possible to the true 𝛽1 and 𝛽̂2 is as “close” as possible to the true 𝛽2, even though we will never know the true 𝛽1 and 𝛽2?
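
Sampling fluctuations can be illustrated with a short simulation. Everything specific here is an assumption made for the sketch: the PRF parameters (17 and 0.6), the Gaussian disturbances, and the ten income levels are hypothetical, and least squares is used only as one common fitting rule, since the slides have not yet said how the SRF should be constructed.

```python
import random

# Hypothetical "true" PRF: E(Y | X) = BETA1 + BETA2 * X (assumed values).
BETA1, BETA2 = 17.0, 0.6
SIGMA = 10.0                          # assumed spread of the disturbances
X_LEVELS = list(range(80, 261, 20))   # ten hypothetical income levels


def draw_sample(rng):
    """Draw one Y per income level: Y = BETA1 + BETA2 * X + disturbance."""
    return [(x, BETA1 + BETA2 * x + rng.gauss(0.0, SIGMA)) for x in X_LEVELS]


def fit_line(pairs):
    """Return (intercept, slope) of the least-squares line through the pairs."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


rng = random.Random(0)                # fixed seed so the run is repeatable
srf_a = fit_line(draw_sample(rng))    # SRF from a first sample
srf_b = fit_line(draw_sample(rng))    # SRF from a second sample

print("PRF   intercept, slope:", (BETA1, BETA2))
print("SRF A intercept, slope:", srf_a)
print("SRF B intercept, slope:", srf_b)
```

The two samples come from the same population, yet each yields its own estimated intercept and slope, and neither reproduces the PRF exactly. In practice only one such sample is available and the PRF line is never observed.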

Illustrative Examples
