Econometrics II (Econ3062)

YISAK NIGUSSE(MSc.)
yisaknigusse519@gmail.com

DEPARTMENT OF ECONOMICS
MIZAN TEPI UNIVERSITY
Light of the Green Valley!

June 5, 2024
CONTENTS OF CHAPTER ONE

I. Describing Qualitative Data
II. Dummy Regressors
III. Limited Dependent Variable Models
YISAK NIGUSSE (Dep’t of Econ MTU) Chapter-I-Limited DV Models 1 / 84
CHAPTER I: 1.1 Describing Qualitative Data

In regression analysis the dependent variable, or regressand, is influenced not only by ratio-scale variables but also by variables that are essentially qualitative, or nominal-scale, in nature, such as sex, race, color, religion, nationality, geographical region, political upheavals, and party affiliation.
Qualitative factors often come in the form of binary information (dummy variables).

CHAPTER I: 1.2 Dummy regressors

A dummy variable is a variable that takes on the value 1 or 0.
Dummy variables usually indicate a dichotomy such as “presence” or “absence”, “yes” or “no”, etc.
Alternative names are indicator variables, binary variables, categorical variables, and dichotomous variables.
Why do we use the values zero and one to describe
qualitative information?
The real benefit of capturing qualitative information
using zero-one variables is that it leads to regression
models where the parameters have very natural
interpretations. (See Table 1.1 of your handout)
Dummy explanatory variables can be used for several purposes:
1. To allow for differences in intercept terms
2. To allow for differences in slopes
3. To estimate equations with cross-equation restrictions
4. To test for stability of regression coefficients

Dummy variables for changes in the intercept term


With qualitative explanatory variables, one introduces dummy variables under the implicit assumption that the regression lines for the different groups differ only in the intercept term but have the same slope coefficients.
Example: the relationship between income y and years of schooling x for two groups (M for male and F for female):

$$
y = \begin{cases}
\alpha_1 + \beta x + u & \text{for the first group (M)} \\
\alpha_2 + \beta x + u & \text{for the second group (F)}
\end{cases} \tag{1.1}
$$

These equations can be combined into a single equation as

$$ y = \alpha_1 + (\alpha_2 - \alpha_1) D + \beta x + u \tag{1.2} $$

where

$$
D = \begin{cases}
1 & \text{for group 2} \\
0 & \text{for group 1}
\end{cases}
$$

In this case the coefficient of the dummy variable D measures the difference between the two intercept terms.
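As a rough illustration of the combined equation, the sketch below fits it by OLS on made-up, noise-free data. The numbers (intercepts 2 and 5, common slope 0.5), the helper names `solve` and `ols`, and the coding D = 1 for the second group are assumptions chosen for the example, not values from the handout.

```python
def solve(A, b):
    """Solve the linear system A z = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        # Partial pivoting: bring the largest entry of column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: years of schooling x and income y for two groups.
schooling = [8, 10, 12, 14, 16]
X, y = [], []
for D, alpha in ((0, 2.0), (1, 5.0)):   # D = 0: first group, D = 1: second group
    for x in schooling:
        X.append([1, D, x])             # constant, dummy, schooling
        y.append(alpha + 0.5 * x)       # common slope 0.5, no disturbance

coef = ols(X, y)                        # [intercept, dummy coefficient, slope]
print([round(c, 6) for c in coef])
```

Because the data contain no disturbance, the coefficient on D should reproduce the intercept difference 5 − 2 = 3, while the constant and slope reproduce 2 and 0.5.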

If there are more groups, we have to introduce more dummies. For three groups we have

$$
y = \begin{cases}
\alpha_1 + \beta x + u & \text{for group 1} \\
\alpha_2 + \beta x + u & \text{for group 2} \\
\alpha_3 + \beta x + u & \text{for group 3}
\end{cases}
$$

These can be written as

$$ y = \alpha_1 + (\alpha_2 - \alpha_1) D_1 + (\alpha_3 - \alpha_1) D_2 + \beta x + u \tag{1.3} $$

where

$$
D_1 = \begin{cases}
1 & \text{for group 2} \\
0 & \text{for groups 1 and 3}
\end{cases}
\qquad
D_2 = \begin{cases}
1 & \text{for group 3} \\
0 & \text{for groups 1 and 2}
\end{cases}
$$
It can be easily checked that by substituting the
values for D1 and D2 in (1.3), we get the intercepts
α1 , α2 and α3 respectively for the three groups.
Note that in combining the three equations, we are
assuming that the slope coefficient β is the same for
all groups and that the error term u has the same
distribution for the three groups.
If there is a constant term in the regression equation,
the number of dummies defined should always be one
less than the number of groupings by that category
because the constant term is the intercept for the
base group and the coefficients of the dummy
variables measure differences in intercepts.
If we do not introduce a constant term in the
regression equation, we can define a dummy variable
for each group, and in this case the coefficients of the
dummy variables measure the intercepts for the
respective groups.
If we include both the constant term and three
dummies, we will be introducing perfect
multicollinearity and the regression program will not
run (or will omit one of the dummies automatically).
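The perfect multicollinearity can be seen directly: with a constant and one dummy per group, the dummy columns add up to the constant column, so the design matrix loses a rank. The sketch below checks this on a small made-up sample; the data and the helper `rank` are illustrative assumptions.

```python
def rank(rows, tol=1e-10):
    """Rank of a matrix (given as a list of rows) via Gaussian elimination."""
    M = [[float(v) for v in r] for r in rows]
    rk = 0
    for c in range(len(M[0])):
        if rk == len(M):
            break
        p = max(range(rk, len(M)), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < tol:
            continue                      # no pivot available in this column
        M[rk], M[p] = M[p], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][c] / M[rk][c]
            M[r] = [u - f * v for u, v in zip(M[r], M[rk])]
        rk += 1
    return rk


groups = [1, 1, 2, 2, 3, 3]               # hypothetical group membership
x = [1, 2, 3, 4, 5, 7]                    # hypothetical regressor

# Constant plus one dummy for every group: falls into the dummy variable trap.
full = [[1, int(g == 1), int(g == 2), int(g == 3), xi] for g, xi in zip(groups, x)]
# Constant plus dummies for groups 2 and 3 only (group 1 is the base).
dropped = [[1, int(g == 2), int(g == 3), xi] for g, xi in zip(groups, x)]

print(rank(full), "of", len(full[0]), "columns")
print(rank(dropped), "of", len(dropped[0]), "columns")
```

A rank below the number of columns means X'X cannot be inverted, which is why the regression fails or the software drops one dummy.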

Example: suppose that we have data on consumption C and income Y for a number of households. In addition, we have data on:
1. S: the sex of the head of the household.
2. A: the age of the head of the household, given in three categories: < 25 years, 25 to 50 years, and > 50 years.
3. E: the education of the head of the household, also in three categories: < high school, ≥ high school but < college degree, and ≥ college degree.

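We include these qualitative variables in the form of dummy variables:

$$
D_1 = \begin{cases}
1 & \text{if male} \\
0 & \text{if female}
\end{cases}
\qquad
D_2 = \begin{cases}
1 & \text{if age is } < 25 \text{ years} \\
0 & \text{otherwise}
\end{cases}
\qquad
D_3 = \begin{cases}
1 & \text{if age is between 25 and 50 years} \\
0 & \text{otherwise}
\end{cases}
$$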
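$$
D_4 = \begin{cases}
1 & \text{if } < \text{high school degree} \\
0 & \text{otherwise}
\end{cases}
\qquad
D_5 = \begin{cases}
1 & \text{if } \geq \text{high school degree but } < \text{college degree} \\
0 & \text{otherwise}
\end{cases}
$$

For each category the number of dummy variables is one less than the number of classifications. Then we run the regression equation

$$ C = \alpha + \beta Y + \gamma_1 D_1 + \gamma_2 D_2 + \gamma_3 D_3 + \gamma_4 D_4 + \gamma_5 D_5 + \varepsilon \tag{1.4} $$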

The assumption made in the dummy-variable method
is that it is only the intercept that changes for each
group but not the slope coefficients (i.e., coefficients
of x).
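Example 2: The dummy variable method is also used if one has to take care of seasonal factors. For example, if we have quarterly data on C and Y, we fit the regression equation

$$ C = \alpha + \beta Y + \lambda_1 D_1 + \lambda_2 D_2 + \lambda_3 D_3 + u \tag{1.5} $$

where $D_1$, $D_2$, and $D_3$ are seasonal dummies for three of the four quarters, the remaining quarter serving as the base.
If we have monthly data, we use 11 seasonal dummies.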
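A minimal sketch of how three quarterly dummies might be constructed, assuming (purely for illustration) that the fourth quarter is the base category. The function name `quarter_dummies` is hypothetical.

```python
def quarter_dummies(quarter):
    """Return [D1, D2, D3] for a quarter in 1..4, with quarter 4 as the base."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    return [1 if quarter == q else 0 for q in (1, 2, 3)]


quarters = [1, 2, 3, 4, 1, 2, 3, 4]       # two hypothetical years of quarterly data
rows = [quarter_dummies(q) for q in quarters]
for q, d in zip(quarters, rows):
    print(q, d)
```

With monthly data the same idea would give 11 columns, one for each month except the base month.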


Dummy variables for changes in slope coefficients

We can use dummy variables to allow for differences
in slope coefficients as well.
Example:

$$ y_1 = \alpha_1 + \beta_1 x_1 + u_1 \quad \text{for the first group} $$

$$ y_2 = \alpha_2 + \beta_2 x_2 + u_2 \quad \text{for the second group} $$

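We can write these equations together as

$$
\begin{aligned}
y_1 &= \alpha_1 + (\alpha_2 - \alpha_1) \cdot 0 + \beta_1 x_1 + (\beta_2 - \beta_1) \cdot 0 + u_1 \\
y_2 &= \alpha_1 + (\alpha_2 - \alpha_1) \cdot 1 + \beta_1 x_2 + (\beta_2 - \beta_1) \cdot x_2 + u_2
\end{aligned}
$$

or

$$ y = \alpha_1 + (\alpha_2 - \alpha_1) D_1 + \beta_1 x + (\beta_2 - \beta_1) D_2 + u $$

where

$$
D_1 = \begin{cases}
0 & \text{for all observations in the first group} \\
1 & \text{for all observations in the second group}
\end{cases}
\qquad
D_2 = \begin{cases}
0 & \text{for all observations in the first group} \\
x_2 & \text{(the respective value of } x \text{) for the second group}
\end{cases}
$$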

The coefficient of D1 measures the difference in the
intercept terms and the coefficient of D2 measures the
difference in the slope.
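Because the combined equation gives each group its own intercept and its own slope, OLS on it reproduces the two separate group regressions, so the dummy coefficients can be read off as differences between the group estimates. The sketch below illustrates this on made-up data; the data values and the helper `simple_ols` are assumptions for the example.

```python
def simple_ols(x, y):
    """Intercept and slope of y on x from the closed-form OLS formulas."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: group 1 follows y = 1 + 2x, group 2 follows y = 4 + 3x,
# each with a small deterministic disturbance.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 3.0, 4.0, 5.0, 6.0]
e = [0.1, -0.1, 0.0, 0.1, -0.1]
y1 = [1 + 2 * x + u for x, u in zip(x1, e)]
y2 = [4 + 3 * x + u for x, u in zip(x2, e)]

a1, b1 = simple_ols(x1, y1)               # first-group intercept and slope
a2, b2 = simple_ols(x2, y2)               # second-group intercept and slope

coef_D1 = a2 - a1                         # differential intercept
coef_D2 = b2 - b1                         # differential slope
print(round(coef_D1, 3), round(coef_D2, 3))
```

The printed values should be close to the true differences of 3 (intercepts) and 1 (slopes) built into the data.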
Suitable dummy variables can be defined when there
are changes in slopes and intercepts at different times.
Suppose that we have data for three periods and in
the second period only the intercept changed (there
was a parallel shift). In the third period the intercept
and the slope have changed. Then we write

$$
\begin{aligned}
y_1 &= \alpha_1 + \beta_1 x_1 + u_1 && \text{for period 1} \\
y_2 &= \alpha_2 + \beta_1 x_2 + u_2 && \text{for period 2} \\
y_3 &= \alpha_3 + \beta_2 x_3 + u_3 && \text{for period 3}
\end{aligned} \tag{1.5}
$$

These can be combined as

$$ y = \alpha_1 + (\alpha_2 - \alpha_1) D_1 + (\alpha_3 - \alpha_1) D_2 + \beta_1 x + (\beta_2 - \beta_1) D_3 + u \tag{1.6} $$

where

$$
D_1 = \begin{cases}
1 & \text{for observations in period 2} \\
0 & \text{for other observations}
\end{cases}
\qquad
D_2 = \begin{cases}
1 & \text{for observations in period 3} \\
0 & \text{for other observations}
\end{cases}
$$

$$
x = \begin{cases}
x_1 & \text{for observations in period 1} \\
x_2 & \text{for observations in period 2} \\
x_3 & \text{for observations in period 3}
\end{cases}
\qquad
D_3 = \begin{cases}
0 & \text{for observations in periods 1 and 2} \\
x_3 & \text{(the respective value of } x \text{) for observations in period 3}
\end{cases}
$$

Note that in all these examples we are assuming that the error terms in the different groups all have the same distribution.
Dummy variables for cross equation constraints


Consider the joint estimation of the demand for beef, pork, and chicken as an example. The set of demand equations to be estimated is as follows:

$$
\begin{aligned}
p_1 &= \alpha_1 + \beta_{11} x_1 + \beta_{12} x_2 + \beta_{13} x_3 + \gamma_1 y + u_1 \\
p_2 &= \alpha_2 + \beta_{12} x_1 + \beta_{22} x_2 + \beta_{23} x_3 + \gamma_2 y + u_2 \\
p_3 &= \alpha_3 + \beta_{13} x_1 + \beta_{23} x_2 + \beta_{33} x_3 + \gamma_3 y + u_3
\end{aligned} \tag{1.9}
$$

where
$p_1$ = retail price of beef
$p_2$ = retail price of pork
$p_3$ = retail price of chicken
$x_1$ = consumption of beef per capita
$x_2$ = consumption of pork per capita
$x_3$ = consumption of chicken per capita
$y$ = disposable income per capita

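The special thing about the system of equations (1.9) is the symmetry in the coefficients. We have

$$
\frac{\partial p_1}{\partial x_2} = \frac{\partial p_2}{\partial x_1} = \beta_{12}; \qquad
\frac{\partial p_1}{\partial x_3} = \frac{\partial p_3}{\partial x_1} = \beta_{13}; \qquad
\frac{\partial p_2}{\partial x_3} = \frac{\partial p_3}{\partial x_2} = \beta_{23}
$$

Thus, there are cross-equation restrictions on the coefficients. If we assume that $V(u_1) = V(u_2) = V(u_3)$, we can minimize $\sum u_1^2 + \sum u_2^2 + \sum u_3^2$, obtain the normal equations, and estimate the regression coefficients.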

Dummy variables for testing stability of regression coefficients

The definition of the appropriate dummy variables
depends on whether we are using the analysis of
covariance test or the predictive test for stability.
Consider, for instance, the two equations
y1 = α1 + β1 x1 + γ1 z1 + u1 for the first period
y2 = α2 + β2 x2 + γ2 z2 + u2 for the second period.
Both the restricted and the unrestricted residual sums of squares needed for the test can be obtained from the same dummy variable regression if we define enough dummy variables.
For instance, we can write the equations for the two periods as

$$ Y = \alpha_1 + (\alpha_2 - \alpha_1) D_1 + \beta_1 X + (\beta_2 - \beta_1) D_2 + \gamma_1 Z + (\gamma_2 - \gamma_1) D_3 + u \tag{1.9} $$

Note that we write the equation in terms of differences in the parameters and define the dummy variables accordingly:

$$
D_1 = \begin{cases}
1 & \text{for period 2} \\
0 & \text{for period 1}
\end{cases}
$$

$$
D_2 = \begin{cases}
X & \text{(the corresponding value of } X \text{) for observations in period 2} \\
0 & \text{for all observations in period 1}
\end{cases}
$$

$$
D_3 = \begin{cases}
Z & \text{(the corresponding value of } Z \text{) for observations in period 2} \\
0 & \text{for all observations in period 1}
\end{cases}
$$

Cautions in the use of dummy variables: the dummy variable trap

If a qualitative variable has m categories, introduce only (m − 1) dummy variables. If we instead create m dummy variables in a model that has an intercept, we fall into what is called the dummy variable trap, that is, a situation of perfect collinearity or perfect multicollinearity.
Hence, for each qualitative regressor the number of dummy variables introduced must be one less than the number of categories of that variable.
Although dummy variables are easy to incorporate in regression models, one must use them carefully.
CHAPTER I: 1.2 Dummy regressors (Dummy as independent variables)

A Single dummy independent variable


In the simplest case, with only a single dummy
explanatory variable, we just add it as an independent
variable in the equation.
Suppose a researcher wants to find out whether sex
makes any difference in a college teacher’s salary,
assuming that all other variables such as age,
education level, experience etc. are held constant.
$$ Y = \beta_0 + \beta_1 D + u $$

where Y = annual salary of a college teacher and

$$
D = \begin{cases}
1 & \text{for male} \\
0 & \text{otherwise}
\end{cases}
$$
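The coefficient of D ($\beta_1$) measures the difference in mean salary between male and female teachers, and so indicates whether there is discrimination against females.
For example, consider the following simple model of hourly wage determination:

$$ \text{wage} = \beta_0 + \beta_1 D + \beta_2 \, \text{Educ} + u $$

where wage = wage rate of a given individual, Educ = level of education, and

$$
D = \begin{cases}
1 & \text{for male} \\
0 & \text{otherwise}
\end{cases}
$$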
The group, category, or classification that is assigned
the value of 0 is often referred to as the base,
benchmark, control, comparison, reference, or omitted
category.
The coefficient attached to the dummy variable D can
be called the differential intercept coefficient because
it tells by how much the value of the intercept term of
the category that receives the value of 1 differs from
the intercept coefficient of the base category.

Using dummy variables for multiple categories


Assuming that the three educational groups have a
common slope but different intercepts in the
regression of annual expenditure on health care on
annual income, we can use the following model:
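$$ Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 X_i + u_i $$

where $Y_i$ = annual expenditure on health care, $X_i$ = annual income, and

$$
D_1 = \begin{cases}
1 & \text{if high school education} \\
0 & \text{otherwise}
\end{cases}
\qquad
D_2 = \begin{cases}
1 & \text{if college education} \\
0 & \text{otherwise}
\end{cases}
$$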
The mean health expenditure function for less than
high school individual
E (Yi /D1 = 0, D2 = 0, Xi ) = β0 + β3 Xi
The mean health expenditure function for high school individual
E (Yi /D1 = 1, D2 = 0, Xi ) = (β0 + β1 ) + β3 Xi
The mean health expenditure function for college individual
E (Yi /D1 = 0, D2 = 1, Xi ) = (β0 + β2 ) + β3 Xi
which are, respectively the mean health care expenditure
functions for the three levels of education, namely, less than high
school, high school, and college.
The case of one quantitative and two qualitative variables: assume that, in addition to sex, a person's color determines salary. We can now write

$$ Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \beta X_i + u_i $$

where $Y_i$ = annual salary, $X_i$ = years of teaching experience, and

$$
D_2 = \begin{cases}
1 & \text{if male} \\
0 & \text{otherwise}
\end{cases}
\qquad
D_3 = \begin{cases}
1 & \text{if white} \\
0 & \text{otherwise}
\end{cases}
$$

Mean salary for black female professor:
E (Yi /D2 = 0, D3 = 0, Xi ) = α1 + βXi
Mean salary for black male professor:
E (Yi /D2 = 1, D3 = 0, Xi ) = (α1 + α2 ) + βXi
Mean salary for white female professor:
E (Yi /D2 = 0, D3 = 1, Xi ) = (α1 + α3 ) + βXi
Mean salary for white male professor:
E (Yi /D2 = 1, D3 = 1, Xi ) = (α1 + α2 + α3 ) + βXi
Interactions among dummy variables


Consider the following model:

$$ Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \beta X_i + u_i $$

where $Y_i$ = annual expenditure on clothing, $X_i$ = income, and

$$
D_2 = \begin{cases}
1 & \text{if female} \\
0 & \text{otherwise}
\end{cases}
\qquad
D_3 = \begin{cases}
1 & \text{if college graduate} \\
0 & \text{otherwise}
\end{cases}
$$

Implicit in this model is the assumption that the differential effect of the sex dummy $D_2$ is constant across the two levels of education and the differential effect of the education dummy $D_3$ is also constant across the two sexes. But in many applications such an assumption may be untenable.
A female college graduate may spend more on
clothing than a male graduate. In other words, there
may be interaction between the two qualitative
variables D2 and D3 and therefore their effect on
mean Y may not be simply additive as in above but
multiplicative as well, as in the following model:
Yi = α1 + α2 D2i + α3 D3i + α4 (D2i D3i ) + βXi + ui
E (Yi /D2 = 1, D3 = 1, Xi ) = (α1 + α2 + α3 + α4 ) + βXi
α2 = differential effect of being a female
α3 = differential effect of being a college graduate
α4 = differential effect of being a female graduate

This shows that the mean clothing expenditure of graduate females differs by $\alpha_4$ from what the separate effects of being female and of being a college graduate would imply on their own.
If $\alpha_2$, $\alpha_3$, and $\alpha_4$ are all positive, the average expenditure on clothing by a college graduate is higher than that of the base category, and much more so if the graduate happens to be a female. This shows how the interaction dummy modifies the effect of the two attributes considered individually.
Needless to say, incorrectly omitting a significant interaction term will lead to a specification bias.
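The role of the interaction coefficient can be checked with simple arithmetic. The sketch below plugs hypothetical coefficient values (they are not estimates from any data) into the interaction model and recovers $\alpha_4$ as the part of the female-graduate differential that the two separate effects do not explain.

```python
# Hypothetical coefficient values, chosen only to illustrate the arithmetic.
A1, A2, A3, A4, BETA = 100.0, 20.0, 30.0, 15.0, 0.05


def mean_clothing(female, graduate, income):
    """Mean clothing expenditure implied by the model with an interaction term."""
    return (A1 + A2 * female + A3 * graduate
            + A4 * female * graduate + BETA * income)


income = 1000.0
base = mean_clothing(0, 0, income)              # male non-graduate (base category)
female_only = mean_clothing(1, 0, income)
graduate_only = mean_clothing(0, 1, income)
female_graduate = mean_clothing(1, 1, income)

# Differential of female graduates beyond the two separate effects.
extra = (female_graduate - base) - (female_only - base) - (graduate_only - base)
print(female_only - base, graduate_only - base, extra)
```

The three printed differentials correspond to $\alpha_2$, $\alpha_3$, and $\alpha_4$ respectively.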
Testing for structural stability of regression models


Suppose we are interested in estimating a simple
saving function that relates domestic household
savings (S) with gross domestic product (Y) for
Ethiopia. Suppose further that, at a certain point of
time, a series of economic reforms have been
introduced. The hypothesis here is that such reforms might have considerably influenced the savings-income relationship, that is, the relationship between savings and income might be different in the post-reform period as compared to that in the pre-reform period. If this hypothesis is true, then we say a structural change has happened. How do we check if this is so?

We can solve such problems with the help of dummy variables. Write the savings function as

$$ S_t = \beta_0 + \beta_1 D_t + \beta_2 Y_t + \beta_3 (Y_t D_t) + u_t $$

where $S_t$ is household saving at time t, $Y_t$ is GDP at time t, and

$$
D_t = \begin{cases}
0 & \text{if pre-reform } (< 1991) \\
1 & \text{if post-reform } (> 1991)
\end{cases}
$$

Here $\beta_1$ is the differential intercept and $\beta_3$ is the differential slope coefficient, indicating by how much the slope coefficient of the post-reform savings function differs from that of the pre-reform savings function. If $\beta_1$ and $\beta_3$ are both statistically significant as judged by the t-test, the pre-reform and post-reform regressions differ in both the intercept and the slope.
However, if only β1 is statistically significant, then the
pre-reform and post-reform regressions differ only in
the intercept (meaning the marginal propensity to
save (MPS) is the same for pre-reform and
post-reform periods).
Similarly, if only β3 is statistically significant, then the
two regressions differ only in the slope (MPS).
Chow’s test
One approach to testing for the presence of structural change (structural instability) is by means of Chow’s test. The steps involved in this procedure are as follows.
1. Estimate the regression equation

$$ S_t = \alpha + \beta Y_t + \varepsilon_t, \quad t = 1, 2, \ldots, n_1 + n_2 \tag{s1} $$

for the whole period (pre-reform plus post-reform periods) and find the error sum of squares ($ESS_R$).
2. Estimate the same model using only the data in the pre-reform period (of size $n_1$, say), that is, estimate

$$ S_t = \alpha_1 + \beta_1 Y_t + \varepsilon_t, \quad t = 1, 2, \ldots, n_1 \tag{s2} $$

and find the error sum of squares ($ESS_1$).
3. Estimate the same model using only the data in the post-reform period (of size $n_2$, say), that is, estimate

$$ S_t = \alpha_2 + \beta_2 Y_t + \varepsilon_t, \quad t = 1, 2, \ldots, n_2 $$

and find the error sum of squares ($ESS_2$).
Then calculate $ESS_{UR} = ESS_1 + ESS_2$ and the Chow test statistic:

$$ F = \frac{(ESS_R - ESS_{UR}) / K}{ESS_{UR} / (n_1 + n_2 - 2K)} $$

Decision rule: reject the null hypothesis of identical intercepts and slopes for the pre-reform and post-reform periods, that is,

$$ H_0: \alpha_1 = \alpha_2 \ \text{and} \ \beta_1 = \beta_2, $$

if $F > F_\alpha(K, n_1 + n_2 - 2K)$, where $F_\alpha(K, n_1 + n_2 - 2K)$ is the critical value from the F-distribution with K (in our case K = 2) and $n_1 + n_2 - 2K$
degrees of freedom for a given significance level of α.
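The procedure can be sketched in a few lines for the two-parameter case (K = 2). The data below are invented for illustration, and the helper names `ess` and `chow_f` are assumptions; the statistic is the restricted-minus-unrestricted error sum of squares per restriction, divided by the unrestricted error sum of squares per degree of freedom.

```python
def ess(x, y):
    """Error (residual) sum of squares from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))


def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic for equality of all k coefficients across two periods."""
    ess_r = ess(x1 + x2, y1 + y2)             # pooled (restricted) regression
    ess_ur = ess(x1, y1) + ess(x2, y2)        # separate (unrestricted) regressions
    df = len(x1) + len(x2) - 2 * k
    return ((ess_r - ess_ur) / k) / (ess_ur / df)


# Hypothetical income (x) and savings (y) with a small deterministic disturbance.
e = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
x_pre = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_post = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
y_pre = [1 + 0.5 * x + u for x, u in zip(x_pre, e)]
y_post_same = [1 + 0.5 * x + u for x, u in zip(x_post, e)]    # no structural change
y_post_break = [3 + 0.2 * x + u for x, u in zip(x_post, e)]   # intercept and slope change

f_same = chow_f(x_pre, y_pre, x_post, y_post_same)
f_break = chow_f(x_pre, y_pre, x_post, y_post_break)
print(round(f_same, 3), round(f_break, 3))
```

Each statistic would then be compared with the tabulated critical value of the F-distribution with K and $n_1 + n_2 - 2K$ degrees of freedom (here 2 and 8).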
CHAPTER I: 1.2 Dummy regressors (Dummy as dependent variable)

There are many situations in which the dependent variable in a regression equation simply represents a discrete choice assuming only a limited number of values.
Models involving dependent variables of this kind are
called limited (discrete) dependent variable models
(also called qualitative response models).
For example, Y can be defined to indicate whether an
adult has a high school education; or Y can indicate
whether a college student used illegal drugs during a
given school year; or Y can indicate whether a firm
was taken over by another firm during a given year.
In short, qualitative response models are models in which the dependent variable is a discrete outcome. Look at the following example:

$$ Y_i = \alpha_0 + \alpha_1 x_{1i} + \alpha_2 x_{2i} + u_i $$

where

$$
Y_i = \begin{cases}
1 & \text{if individual } i \text{ attended college} \\
0 & \text{otherwise}
\end{cases}
$$

In the above example the dependent variable Y takes on only two values (0 and 1). Conventional regression is not well suited to analyzing a model with a qualitative dependent variable.
Such models are instead analyzed in a general framework of probability models.
CHAPTER I: 1.2 Dummy regressors (Categories of qualitative response models)

There are two broad categories of qualitative response models (QRM):
1. Binomial models: the choice is between two alternatives.
2. Multinomial models: the choice is between more than two alternatives.
The dependent variable in such models can be of several types:
Binary variables: variables that have two categories and are often used to indicate that an event has occurred or that some characteristic is present.
Ordinal variables: variables that have categories that can be ranked.
Nominal variables: variables that occur when there are multiple outcomes that cannot be ordered.
Count variables: variables that indicate the number of times some event has occurred.
In all of the above situations, the variables are discrete valued.
CHAPTER I: 1.2 Dummy regressors (Qualitative choice analysis)

Qualitative choice models may be used when a decision maker faces a choice among a set of alternatives that meets the following conditions:
1. The number of choices is finite.
2. The choices are mutually exclusive (the person chooses only one of the alternatives).
3. The choices are exhaustive (all possible alternatives are included).
For the sake of convenience, the dependent variable is
given a value of 0 or 1.
Example: Suppose the choice is whether to work or
not. The discrete dependent variable we are working
with will assume only two values 0 and 1.

CHAPTER I: 1.2 Dummy regressors (The regression approaches)

The economic interpretation of discrete choice models is typically based on the principle of utility maximization, leading to the choice of, say, A over B if the utility of A exceeds that of B.
For example, let $U^1$ be the utility from working/seeking work and let $U^0$ be the utility from not working. Then an individual will choose to be part of the labor force if $U^1 - U^0 > 0$, and this decision depends on a number of factors X.
The probability that the ith individual chooses alternative 1 (i.e., works) given his/her individual characteristics $X_i$ is:
$$ P_i = \Pr(Y_i = 1 \mid X_i) = \Pr\left[(U^1 - U^0)_i > 0\right] = G(X_i, \beta) $$

The vector of parameters $\beta$ measures the impact of changes in X (say, age, marital status, gender, education, occupation, and the like) on the probability of labor force participation. The probability that the ith individual chooses alternative 0 (i.e., not to work) is given by

$$ \Pr(Y_i = 0 \mid X_i) = 1 - P_i = 1 - \Pr\left[(U^1 - U^0)_i > 0\right] = 1 - G(X_i, \beta) $$

Here $P_i$ is called the response probability and $(1 - P_i)$ is called the non-response probability.
The mean response of the ith individual given his/her individual characteristics $X_i$ is

$$ E(Y_i \mid X_i) = 1 \cdot G(X_i, \beta) + 0 \cdot \left[1 - G(X_i, \beta)\right] = G(X_i, \beta) $$

The problem is thus to choose the appropriate form of $G(X_i, \beta)$.
There are several methods to analyze regression models where the dependent variable is binary.
Now let us turn our attention to the four most commonly used approaches to estimating binary response models (types of binomial models). The simplest procedure is to just use the usual OLS method.
In this case the model is called the linear probability model (LPM).

The four most commonly used approaches to estimating binary response models are:
1. The linear probability model (LPM)
2. The Logit model
3. The Probit model
4. The Tobit (censored regression) model

CHAPTER I: 1.2 Limited Dependent Variable Models (LPM)

The term linear probability model is used to denote a regression model in which the dependent variable y is a dichotomous variable taking the value one or zero.
The traditional approach to the estimation of limited
dependent variable (LDV) models is parametric
maximum likelihood.
When we use a linear regression model to estimate
probabilities, we call the model the linear probability
model.
The linear probability model is the regression model
applied to a binary dependent variable.
The linear probability model defines $G(X_i, \beta) = X_i \beta$. The regression model when Y is a binary variable is thus

$$ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k + \varepsilon = X\beta + \varepsilon $$

This is the usual linear regression model. Since Y can take on only two values, $\beta_j$ cannot be interpreted as the change in Y given a one-unit increase in $X_j$, holding all other factors constant: Y either changes from zero to one or from one to zero, or does not change. Nevertheless, the $\beta_j$ still have useful interpretations.
If we assume that the zero conditional mean assumption holds, that is, $E(\varepsilon \mid X) = 0$, then

$$ E(Y \mid X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k = X\beta $$

The key point is that when Y is a binary variable taking on the values zero and one, it is always true that $\Pr(Y = 1 \mid X) = E(Y \mid X)$, the probability of "success". Thus, we have the important equation

$$ \Pr(Y = 1 \mid X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k = X\beta $$

which says that the probability of success, say $P(X) = \Pr(Y = 1 \mid X)$, is a linear function of the $X_j$.
The above equation is an example of a binary response model, and $P(X) = \Pr(Y = 1 \mid X)$ is also called the response probability. Because probabilities must sum to one,

$$ \Pr(Y = 0 \mid X) = 1 - \Pr(Y = 1 \mid X) $$

(which is called the non-response probability) is also a linear function of the $X_j$.
The multiple linear regression model with a binary dependent variable is called the linear probability model (LPM) because the response probability is linear in the parameters $\beta_j$.
In the LPM, $\beta_j$ measures the change in the probability of success when $X_j$ changes, holding other factors fixed.
This makes linear probability models easy to estimate and interpret, but they have some shortcomings.
The drawbacks of the LPM are:
1. The predicted probability can be less than 0 or greater than 1, which violates the requirement that probabilities lie between 0 and 1.
2. The constant slope ($\beta_j$) in the LPM is less intuitive in many cases.
3. Except in cases where the probability does not depend on any of the covariates, the LPM is always heteroscedastic.
Because Y has only two possible outcomes (0 or 1), the error term u for a given value of x has two possible outcomes as well. In particular, the distribution of u can be summarized as:

$$ \Pr(u = -x\beta \mid x) = \Pr(y = 0 \mid x) = 1 - x\beta $$

$$ \Pr(u = 1 - x\beta \mid x) = \Pr(y = 1 \mid x) = x\beta $$

So $\mathrm{Var}(u \mid x) = x\beta(1 - x\beta)$. Hence, the variance of the error term is not constant but depends on the
explanatory variables x and the model parameters β.
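The variance expression follows in two short steps from the two-point distribution of u. Writing $p = x\beta$ as shorthand (a symbol introduced here only for brevity), the conditional mean of the error is zero and the variance is its expected square:

$$
\begin{aligned}
E(u \mid x) &= (-p)(1 - p) + (1 - p)\,p = 0 \\
\mathrm{Var}(u \mid x) &= (-p)^2 (1 - p) + (1 - p)^2 p \\
&= p(1 - p)\left[p + (1 - p)\right] \\
&= p(1 - p) = x\beta\,(1 - x\beta)
\end{aligned}
$$

Since $p$ changes with x, the variance differs across observations unless all slope coefficients are zero.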

Suppose that, using hypothetical data on home ownership (Y = 1 if
the family owns a house, 0 otherwise) and income (X), the LPM
estimated by OLS is:
Ŷi = −0.9457 + 0.1021 Xi
se = (0.1228) (0.0082)
t = (−7.6984) (12.515)
R² = 0.8048
The above regression is interpreted as follows: The intercept of
−0.9457 gives the "probability" that a family with zero income
owns a house. Since this value is negative, and since a
probability cannot be negative, we treat this value as zero. The
slope value of 0.1021 means that for a unit change in income, the
probability of owning a house increases on average by 0.1021, or
about 10 percentage points.
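To see what the fitted line implies at different incomes, the sketch below plugs a few income values into the estimated equation. The coefficients are those reported above. The income values tried are illustrative and are not taken from the underlying data, whose units are not stated here.

```python
# Coefficients copied from the estimated LPM in the text.
INTERCEPT = -0.9457
SLOPE = 0.1021

def lpm_prob(income):
    """Fitted LPM 'probability' of owning a house at a given income."""
    return INTERCEPT + SLOPE * income

# Income levels (same units as X) at which the fitted line crosses 0 and 1.
income_at_zero = -INTERCEPT / SLOPE
income_at_one = (1 - INTERCEPT) / SLOPE

for income in (0, 8, 12, 16, 20):  # illustrative values only
    print(income, round(lpm_prob(income), 4))

print(round(income_at_zero, 2), round(income_at_one, 2))
```

Below roughly 9.26 units of income the fitted value is negative, and above roughly 19.06 units it exceeds one, so the estimated line gives admissible probabilities only on a limited income range.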

This is so regardless of the income level at which the change
occurs, which seems patently unrealistic. In reality one would
expect Pi to be non-linearly related to Xi. Therefore, what we need
is a (probability) model that has the following two features:
1 As Xi increases, Pi = E(Y | Xi) = P(Y = 1 | Xi) increases but
never steps outside the 0 − 1 interval.
2 The relationship between Pi and Xi is non-linear, that is,
"one which approaches zero at slower and slower rates as
Xi gets small and approaches one at slower and slower
rates as Xi gets very large".


A curve with these two features is S-shaped and resembles the
cumulative distribution function (CDF) of a random variable.
Therefore, one can use a CDF to model regressions where the
response variable is dichotomous, taking 0-1 values.
The CDFs commonly chosen to represent the 0-1 response models are:
The logistic, which gives rise to the logit model
The normal, which gives rise to the probit (or normit) model



CHAPTER I: 1.2 LDVMs ( Logit Model)

In a binary response model, interest lies primarily in the response
probability

P(Y = 1 | X) = P(Y = 1 | X1, X2, . . . , Xk)

where we use X to denote the full set of explanatory variables.
For example, when Y is an employment indicator, X might contain
various individual characteristics such as education, age, marital
status, and other factors that affect employment status, including
a binary indicator variable for participation in a recent job
training program.

Consider a class of binary response models of the form:

P(Y = 1 | X) = G(β0 + β1 X1 + β2 X2 + . . . + βk Xk) = G(Xβ)

where G is a function taking on values strictly between zero and
one: 0 < G(z) < 1 for all real numbers z.
This ensures that the estimated response probabilities are strictly
between zero and one. As in earlier chapters, we write:

Xβ = β0 + β1 X1 + β2 X2 + . . . + βk Xk


Various nonlinear functions have been suggested for the function G
in order to make sure that the probabilities are between zero and
one. In the logit model, G is the logistic function:

$$G(z) = \Lambda(z) = \frac{\exp(z)}{1 + \exp(z)}, \qquad \text{so that with } z = X\beta: \quad G(X\beta) = \frac{e^{X\beta}}{1 + e^{X\beta}}$$

which is between zero and one for all real numbers z.
This is the cumulative distribution function (cdf) for a
standard logistic random variable.
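As a quick numerical check that the logistic function stays strictly inside the unit interval, here is a minimal sketch. The function name `logistic_cdf` and the evaluation points are illustrative choices.

```python
import math

def logistic_cdf(z):
    """Logistic CDF exp(z) / (1 + exp(z)), arranged to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Evaluate at a few illustrative index values.
values = {z: logistic_cdf(z) for z in (-5, -1, 0, 1, 5)}
for z, g in values.items():
    print(z, round(g, 4))
```

The two branches are algebraically identical. Splitting them only keeps `math.exp` from overflowing when z is very large in absolute value.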


Here the response probability P(Y = 1 | X) is evaluated as:

$$P_i = P(Y = 1 \mid X) = \frac{e^{X\beta}}{1 + e^{X\beta}}$$

Similarly, the non-response probability is evaluated as:

$$1 - P_i = P(Y = 0 \mid X) = 1 - \frac{e^{X\beta}}{1 + e^{X\beta}} = \frac{1}{1 + e^{X\beta}}$$

Note that the response and non-response probabilities both lie
strictly between 0 and 1 and hence are interpretable as
probabilities.

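The ratio of the response probability (Pi) to the non-response
probability (1 − Pi) is called the odds ratio. For the logit model,
the odds ratio is given by:

$$\frac{P_i}{1 - P_i} = \frac{P(Y = 1 \mid X)}{P(Y = 0 \mid X)} = \frac{e^{X\beta}/(1 + e^{X\beta})}{1/(1 + e^{X\beta})} = e^{X\beta} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k}$$

It is the odds in favor of the event: the probability that the
event occurs relative to the probability that it does not.
The natural logarithm of the odds ratio (log-odds ratio) is:

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$$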
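The two identities above (odds equal to the exponential of the index, and log-odds equal to the index itself) can be checked numerically. In this sketch the index value `z` is hypothetical and the helper names are illustrative.

```python
import math

def inv_logit(z):
    """Response probability P = exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds in favor of the event, P / (1 - P)."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds, ln(P / (1 - P))."""
    return math.log(odds(p))

z = 1.3                 # hypothetical value of the index X*beta
p = inv_logit(z)

print(round(p, 4))          # response probability
print(round(odds(p), 4))    # matches exp(z)
print(round(logit(p), 4))   # recovers z
```

Taking the log of the odds undoes the logistic transformation, which is why the logit is linear in the regressors even though the probability is not.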

L (the log of the odds ratio) is linear in X as well as in β (the
parameters). L is called the logit, hence the name logit model.
Thus, the log-odds ratio is a linear function of the explanatory
variables.
For the LPM it is Pi itself that is assumed to be a linear function
of the explanatory variables. The above transformation has
certainly helped the popularity of the logit model.
Notice these features of the logit model. As P goes from 0 to 1
(i.e., as Z = Xβ varies from −∞ to +∞), the logit L goes from −∞
to +∞. That is, although the probabilities (of necessity) lie
between 0 and 1, the logits are not so bounded.

Although L is linear in X, the probabilities themselves are not.
This property is in contrast with the LPM, where the probabilities
increase linearly with X.
One can include as many regressors as may be dictated by the
underlying theory.
If the slope coefficient βj is positive, the odds that the
regressand equals 1 (meaning some event of interest happens)
increase as the value of Xj increases. If βj is negative, the odds
that the regressand equals 1 decrease as the value of Xj increases.
As for the logit itself, it becomes negative and increasingly large
in magnitude as the odds ratio decreases from 1 to 0, and it
becomes increasingly large and positive as the odds ratio increases
from 1 to infinity.

The logit model is estimated by the maximum likelihood method.
Example: Assume that the logit of Y is linearly related to the
variables X1, ..., X5 and that the estimated logit results are:
L̂i = −10.84 − 0.74X1 − 11.6X2 − 5.7X3 − 1.3X4 + 2.5X5
t = (−3.20) (−2.51) (−3.01) (−2.4) (−1.37)

The estimated results show that the variables X1, X2 and X3 have a
negative effect on the probability that the event occurs (i.e.,
Y = 1), while the positive sign on X5 indicates a positive effect
on the probability that the event occurs.
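Because the logit is linear in the regressors, exponentiating a slope gives the factor by which the odds are multiplied when that regressor rises by one unit, other regressors held fixed. The sketch below applies this to the slopes reported in the example; the dictionary layout is only an illustrative way to hold them.

```python
import math

# Slope estimates copied from the logit example in the text.
coefs = {"X1": -0.74, "X2": -11.6, "X3": -5.7, "X4": -1.3, "X5": 2.5}

# exp(beta_j): multiplicative change in the odds per unit increase in X_j.
odds_multiplier = {name: math.exp(b) for name, b in coefs.items()}

for name, factor in odds_multiplier.items():
    direction = "lowers" if factor < 1 else "raises"
    print(f"{name}: odds multiplied by {factor:.4g} ({direction} the odds)")
```

A factor below one corresponds to a negative coefficient and a factor above one to a positive coefficient, so this restates the sign interpretation in terms of odds.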



CHAPTER I: 1.2 LDVMs ( Probit Model)

The estimating model that emerges from the normal CDF is popularly
known as the probit model. In the probit model, G is the standard
normal cumulative distribution function (cdf), which is expressed
as an integral:

$$G(z) = \Phi(z) = \int_{-\infty}^{z} \phi(v)\,dv$$

where φ(z) is the standard normal density function:

$$\phi(z) = (2\pi)^{-1/2} \exp\left(-\frac{z^2}{2}\right)$$
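The normal cdf has no closed form, but it can be evaluated through the error function. The sketch below is one minimal way to compute the density and the cdf with Python's standard library; the function names are illustrative.

```python
import math

def normal_pdf(z):
    """Standard normal density (2*pi)^(-1/2) * exp(-z^2 / 2)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cdf via the identity 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-1.96, 0.0, 1.96):
    print(z, round(normal_pdf(z), 4), round(normal_cdf(z), 4))
```

As with the logistic function, the cdf values stay strictly between zero and one, so the probit response probabilities are always admissible.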
Here the observed dependent variable Y takes one of the values 0
and 1 according to the following criterion.

Define a latent variable Y* such that Yi* = Xi β + εi and

$$Y_i = \begin{cases} 1 & \text{if } Y_i^* > 0 \\ 0 & \text{if } Y_i^* \le 0 \end{cases}$$

The latent variable Y* is continuous (−∞ < Y* < ∞). It generates
the observed binary variable Y.
The observed variable Y can take two states: if the event occurs
it takes the value 1, and if the event does not occur it takes the
value 0.
The latent variable is assumed to be a linear function of the
observed X's through the structural model.
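The link between the latent variable and the response probability can be illustrated by simulation. If ε is standard normal, then P(Y = 1 | X) = P(ε > −Xβ) = Φ(Xβ). The sketch below checks this for one hypothetical index value; the function names, seed, and number of draws are arbitrary illustrative choices.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simulate_share_of_ones(index, n_draws=100_000, seed=0):
    """Share of draws with Y = 1 when Y* = index + eps, eps ~ N(0, 1)."""
    rng = random.Random(seed)
    ones = sum(1 for _ in range(n_draws) if index + rng.gauss(0.0, 1.0) > 0)
    return ones / n_draws

index = 0.5  # hypothetical value of X*beta
share = simulate_share_of_ones(index)

print(round(share, 3), round(normal_cdf(index), 3))
```

The simulated share of ones should be close to Φ(0.5), which is the probit response probability at that index value.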

Example: Let Y measure whether one is employed or not. It is a
binary variable taking the values 0 and 1. Y* measures the
willingness to participate in the labor market. It changes
continuously and is unobserved. If X is the wage rate, then as X
increases the willingness to participate in the labor market
increases.
The individual's decision changes (Y becomes zero) if the wage rate
is below the critical point. Since Y* is continuous, the model
avoids the problems inherent in the LPM (i.e., non-normality of
the error term and heteroscedasticity).

However, since the latent dependent variable is unobserved, the
model cannot be estimated using OLS. Maximum likelihood can be used
instead. Maximization of the likelihood function for either the
probit or the logit model is accomplished by nonlinear estimation
methods.
Most often, the choice is between normal errors and logistic
errors, resulting in the probit (normit) and logit models,
respectively.
If we assume that the error term follows a normal distribution, the
coefficients obtained by maximum likelihood (ML) are those of the
probit model. If we assume instead that the error term follows a
logistic distribution, the ML coefficients are those of the logit
model.
In both cases, as with the LPM, it is assumed that E[εi | Xi] = 0.

In the probit model, it is assumed that Var(εi | Xi) = 1.
In the logit model, it is assumed that Var(εi | Xi) = π²/3. Hence
the estimates of the parameters (β's) from the two models are not
directly comparable.
But as Amemiya suggests, a logit estimate of a parameter
multiplied by 0.625 gives a fairly good approximation of the
probit estimate of the same parameter. Similarly, the coefficients
of the LPM and logit models are related as follows:
β_LPM ≈ 0.25 β_logit for the slope coefficients;
β_LPM ≈ 0.25 β_logit + 0.5 for the intercept.
The standard normal cdf has a shape very similar to that of the
logistic cdf.

The estimating model that emerges from the normal CDF is popularly
known as the probit model, although sometimes it is also known as
the normit model. Note that both the logit and the probit models
are estimated by Maximum Likelihood Estimation.



CHAPTER I: 1.2 LDVMs ( Tobit Model)

An extension of the probit model is the tobit model developed by
James Tobin.
Suppose we want to find out the amount of money a consumer spends
on buying a house in relation to his or her income and other
economic variables. Now we have a problem. If a consumer does not
purchase a house, obviously we have no data on housing expenditure
for that consumer; we have such data only on consumers who actually
purchase a house.
Thus, consumers are divided into two groups: one consisting of,
say, N1 consumers about whom we have information on the regressors
(say income, interest rate, etc.) as well as the regressand (amount
of expenditure on housing), and another consisting of, say, N2
consumers about whom we have information only on the regressors but
not on the regressand.

A sample in which information on the regressand is available only
for some observations is known as a censored sample.
Therefore, the tobit model is also known as a censored regression
model. Mathematically, we can express the tobit model as:

$$Y_i = \begin{cases} \beta_0 + \beta_1 X_{1i} + u_i & \text{if RHS} > 0 \\ 0 & \text{otherwise} \end{cases}$$

where RHS = right-hand side. The method of maximum likelihood can
be used to estimate the parameters of such models.
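The censoring rule in the tobit equation can be made concrete with a few hand-picked observations. Everything in this sketch (parameter values, regressor values, error draws, and the function name) is hypothetical and serves only to show which observations end up recorded as zero.

```python
def tobit_observed(x, beta0, beta1, u):
    """Observed Y under the tobit rule: the RHS if positive, otherwise 0."""
    rhs = beta0 + beta1 * x + u
    return rhs if rhs > 0 else 0.0

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1 = -2.0, 0.5

# Hypothetical (x, u) pairs: regressor value and error draw.
sample = [(1.0, 0.3), (3.0, -0.2), (6.0, 0.4), (8.0, -1.0)]

observed = [tobit_observed(x, beta0, beta1, u) for x, u in sample]
n_censored = sum(1 for y in observed if y == 0.0)

print(observed)
print(n_censored, "of", len(sample), "observations are censored at zero")
```

Running OLS on the uncensored observations alone, or on all observations with the zeros included, would not recover β0 and β1 consistently, which is why maximum likelihood is used for such samples.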
CHAPTER I: Interpreting Probit & Logit Model Estimates

In the LPM, the slope coefficient measures directly the change in
the probability of an event occurring as the result of a unit
change in the value of a regressor, with the effect of all other
variables held constant.
In the logit model the slope coefficient of a variable gives the
change in the log of the odds associated with a unit change in that
variable, again holding all other variables constant. In the probit
model the slope coefficient gives the corresponding change in the
index Zi = Xβ rather than in the log of the odds.


For the logit model the rate of change in the probability of an
event happening is given by βj Pi(1 − Pi), where βj is the (partial
regression) coefficient of the jth regressor. But in evaluating Pi,
all the variables included in the analysis are involved.
In the probit model the rate of change in the probability is
somewhat more complicated and is given by βj f(Zi), where f(Zi) is
the density function of the standard normal variable and
Zi = β1 + β2 X2i + · · · + βk Xki, that is, the regression model
used in the analysis.
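The two marginal-effect formulas can be compared directly. The sketch below evaluates βj Pi(1 − Pi) for the logit and βj f(Zi) for the probit at a few index values; the coefficient value and the index values are hypothetical, and the function names are illustrative.

```python
import math

def logit_marginal_effect(beta_j, z):
    """Logit: beta_j * P * (1 - P), with P the logistic cdf at index z."""
    p = 1.0 / (1.0 + math.exp(-z))
    return beta_j * p * (1.0 - p)

def probit_marginal_effect(beta_j, z):
    """Probit: beta_j * f(z), with f the standard normal density."""
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return beta_j * density

beta_j = 1.0  # hypothetical slope coefficient
for z in (-3.0, 0.0, 3.0):  # hypothetical index values
    print(z,
          round(logit_marginal_effect(beta_j, z), 4),
          round(probit_marginal_effect(beta_j, z), 4))
```

In both models the effect of a given coefficient is largest near an index of zero (probability one half) and shrinks toward zero in the tails, unlike the constant effect in the LPM.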
CHAPTER I: 1.2 LDVMs (Is the logit or the probit model preferable?)

Thus, in both the logit and probit models all the regressors are
involved in computing the changes in probability, whereas in the
LPM only the jth regressor is involved. This difference may be
one reason for the early popularity of the LPM.
Between logit and probit, which model is preferable?
In most applications the models are quite similar, the main
difference being that the logistic distribution has slightly fatter
tails, which can be seen from Figure 1.2. That is to say, the
conditional probability Pi approaches zero or one at a slower rate
in logit than in probit.
Therefore, there is no compelling reason to choose one over the
other.

In practice many researchers choose the logit model because of its
comparative mathematical simplicity. The standard normal cdf has a
shape very similar to that of the logistic cdf. The probit and
logit models differ in the specification of the distribution of
the error term u.
The difference between these specifications and the linear
probability model is that in the linear probability model we
analyze the dichotomous variable as it is, whereas in the probit
and logit models we assume the existence of an underlying latent
variable for which we observe a dichotomous realization.
The coefficient estimates of the probit model and the logit model
are not directly comparable.

The reason is that, although the standard logistic (the basis of
logit) and the standard normal distributions (the basis of probit)
both have a mean value of zero, their variances are different: 1
for the standard normal (as we already know) and π²/3 for the
logistic distribution, where π ≈ 22/7. Therefore, if you multiply
the probit coefficient by about 1.81 (which is approximately
π/√3), you will get approximately the logit coefficient. For
example, if the probit coefficient of an X variable is 1.6258,
multiplying this by 1.81 gives 2.94, which is close to the logit
coefficient.
Alternatively, if you multiply a logit coefficient by
0.55 (= 1/1.81), you will get approximately the probit coefficient.
Amemiya, however, suggests multiplying a logit estimate by 0.625 to
get a better estimate of the corresponding probit estimate.

Conversely, multiplying a probit coefficient by 1.6 (= 1/0.625)
gives the corresponding logit coefficient. Incidentally, Amemiya
has also shown that the coefficients of the LPM and logit models
are related as follows: β_LPM ≈ 0.25 β_logit except for the
intercept, and β_LPM ≈ 0.25 β_logit + 0.5 for the intercept.
To put all three sets of estimates on the probit scale, the
coefficients of the logit model are multiplied by 0.625, and the
coefficients of the linear probability model are multiplied by 2.5
throughout, after which 1.25 is subtracted from the constant term.
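These rules of thumb are simple rescalings, so they can be collected in a few helper functions. The sketch below uses only the factors quoted in the text (0.625, 0.25, 0.5, 2.5, 1.25); the function names and the sample coefficient are illustrative.

```python
def logit_to_probit(b_logit):
    """Amemiya's approximation: probit coefficient from a logit one."""
    return 0.625 * b_logit

def logit_to_lpm(b_logit, is_intercept=False):
    """Approximate LPM coefficient: 0.25 * logit, plus 0.5 for the intercept."""
    return 0.25 * b_logit + (0.5 if is_intercept else 0.0)

def lpm_to_probit_scale(b_lpm, is_intercept=False):
    """Rescale an LPM coefficient: times 2.5, minus 1.25 for the intercept."""
    return 2.5 * b_lpm - (1.25 if is_intercept else 0.0)

b = 1.6  # hypothetical logit coefficient
print(logit_to_probit(b))                         # about 1.0
print(logit_to_lpm(b), logit_to_lpm(b, True))     # slope rule, intercept rule
print(lpm_to_probit_scale(logit_to_lpm(b, True), True))
```

The last line shows that the rules are mutually consistent: converting a logit intercept to the LPM scale and then to the probit scale gives the same number as multiplying it by 0.625 directly.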


The R² for the linear probability model is significantly lower than
those for the logit and probit models. Alternative ways of
comparing the models would be:
1 To calculate the sum of squared deviations from predicted
probabilities
2 To compare the percentages correctly predicted
3 To look at the derivatives of the probabilities with respect
to a particular independent variable.



CHAPTER I: 1.2 LDVMs ( Measuring goodness of fit)

The conventional measure of goodness of fit, R², is not
particularly meaningful in binary regressand models.
Measures similar to R², called pseudo R², are available, and there
are a variety of them.
Measures based on likelihood ratios: Let Lur be the maximum of the
likelihood function when maximized with respect to all the
parameters (without restriction, i.e., unrestricted) and Lr be the
maximum of the likelihood function when maximized subject to the
restrictions βj = 0. With n denoting the number of observations,

$$R^2 = 1 - \left(\frac{L_r}{L_{ur}}\right)^{2/n}$$

One can use an analogous measure for the logit and probit models
as well. However, for the qualitative dependent variable model,
the likelihood function attains an absolute maximum of 1. This
means that Lr ≤ Lur ≤ 1.
Cragg and Uhler (1970) suggested a pseudo R² that lies between
0 and 1.
McFadden (1974) defined R² as:

$$R^2 = 1 - \frac{\log L_{ur}}{\log L_r}$$
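A likelihood-based pseudo R² of the McFadden type, 1 − log Lur / log Lr, can be computed from fitted probabilities alone. In the sketch below the outcomes and fitted probabilities are hypothetical, and the restricted model is taken to be the intercept-only model, whose fitted probability is the sample share of ones.

```python
import math

def bernoulli_loglik(y, p):
    """Log-likelihood of 0/1 outcomes y given fitted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def mcfadden_r2(y, p_fitted):
    """1 - log L_ur / log L_r, with the intercept-only model as restricted."""
    y_bar = sum(y) / len(y)
    loglik_ur = bernoulli_loglik(y, p_fitted)
    loglik_r = bernoulli_loglik(y, [y_bar] * len(y))
    return 1.0 - loglik_ur / loglik_r

y = [0, 0, 1, 0, 1, 1]                       # hypothetical outcomes
p_fitted = [0.1, 0.3, 0.6, 0.4, 0.8, 0.9]    # hypothetical fitted probabilities

r2 = mcfadden_r2(y, p_fitted)
print(round(r2, 3))
```

The measure is zero when the fitted probabilities are no better than the sample share of ones and moves toward one as the fitted probabilities approach the observed outcomes.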


Another goodness-of-fit measure that is usually reported is the
so-called percent correctly predicted, which is computed as
follows. For each i, we compute the estimated probability Ŷi that
Yi takes on the value one. If Ŷi ≥ 0.5 the prediction of Yi is
unity, and if Ŷi < 0.5, Yi is predicted to be zero. The percentage
of times the predicted Yi matches the actual Yi (which we know to
be zero or one) is the percent correctly predicted.

$$y_i^* = \begin{cases} 1 & \text{if } \hat{y}_i \ge 0.5 \\ 0 & \text{if } \hat{y}_i < 0.5 \end{cases}$$

$$\text{Count } R^2 = \frac{\text{number of correct predictions}}{\text{total number of observations}}$$
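The count R² is straightforward to compute once fitted probabilities are available. The sketch below uses hypothetical outcomes and fitted probabilities; the function name and the `cutoff` parameter are illustrative.

```python
def count_r2(y_actual, p_hat, cutoff=0.5):
    """Share of observations whose predicted class matches the actual one."""
    predicted = [1 if p >= cutoff else 0 for p in p_hat]
    correct = sum(1 for a, b in zip(y_actual, predicted) if a == b)
    return correct / len(y_actual)

y = [0, 0, 1, 0, 1, 1, 1, 0]                          # hypothetical outcomes
p_hat = [0.1, 0.6, 0.7, 0.4, 0.8, 0.3, 0.9, 0.2]      # hypothetical fitted probabilities

print(count_r2(y, p_hat))  # 6 of 8 predictions match
```

Two of the eight observations are misclassified here (one predicted 1 with actual 0, one predicted 0 with actual 1), so the count R² is 0.75.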

THE END