PART 1.
MULTIPLE CHOICE QUESTIONS (4 points)
There are 20 questions in this section. Each question is worth 0.2 points.
1. A change in the unit of measurement of the dependent variable in a model does not lead to a
change in:
A. the standard error of the regression.
B. the sum of squared residuals of the regression.
C. the goodness-of-fit of the regression.
D. the confidence intervals of the regression.
2. In the following equation, GDP refers to gross domestic product, and FDI refers to foreign
direct investment.
log(GDP) = 2.65 + 0.527·log(bankcredit) + 0.222·FDI
          (0.13)  (0.022)                (0.017)
(standard errors in parentheses)
Which of the following statements is then true?
A. If GDP increases by 1%, bank credit increases by 0.527%, the level of FDI remaining
constant.
B. If bank credit increases by 1%, GDP increases by 0.527%, the level of FDI remaining
constant.
C. If GDP increases by 1%, bank credit increases by log (0.527)%, the level of FDI remaining
constant.
D. If bank credit increases by 1%, GDP increases by log (0.527)%, the level of FDI remaining
constant.
3. Which of the following statements is true when the dependent variable, y > 0?
a. Taking the log of a variable often expands its range.
b. Models using log(y) as the dependent variable will satisfy the CLM assumptions
more closely than models using the level of y.
c. Taking the log of variables makes OLS estimates more sensitive to extreme values.
d. Taking the logarithmic form of variables makes the slope coefficients more responsive to
rescaling.
4. Which of the following correctly identifies an advantage of using adjusted R2 over R2?
a. Adjusted R2 corrects the bias in R2.
b. Adjusted R2 is easier to calculate than R2.
c. The penalty of adding new independent variables is better understood through
adjusted R2 than R2.
d. The adjusted R2 can be calculated for models having logarithmic functions while R2
cannot be calculated for such models.
5. Suppose the variable x2 has been omitted from the following regression equation:
y = β0 + β1x1 + β2x2 + u, and β̃1 is the estimator obtained when x2 is omitted from the equation.
If β2 > 0 and x1 and x2 are positively correlated, β̃1 is said to _____.
a. has an upward bias
b. has a downward bias
c. be unbiased
d. be biased toward zero
6. The Gauss-Markov theorem will not hold if _____.
a. the error term has the same variance given any values of the explanatory
variables
b. the error term has an expected value of zero given any values of the independent
variables
c. the independent variables have exact linear relationships among them
d. the regression model relies on the method of random sampling for collection of
data
7. The significance level of a test is:
a. the probability of rejecting the null hypothesis when it is false.
b. one minus the probability of rejecting the null hypothesis when it is
false.
c. the probability of rejecting the null hypothesis when it is true.
d. one minus the probability of rejecting the null hypothesis when it is
true.
8. Which of the following statements is true of confidence intervals?
a. Confidence intervals in a CLM are also referred to as point estimates.
b. Confidence intervals in a CLM provide a range of likely values for the population
parameter.
c. Confidence intervals in a CLM do not depend on the degrees of freedom of a
distribution.
d. Confidence intervals in a CLM can be truly estimated when heteroskedasticity is
present.
9. Which of the following statements is true of hypothesis testing?
a. The t test can be used to test multiple linear restrictions.
b. A test of single restriction is also referred to as a joint hypothesis test.
c. A restricted model will always have fewer parameters than its
unrestricted model.
d. OLS estimates maximize the sum of squared residuals.
10. Which of the following is a difference between the White test and the Breusch-Pagan test?
a. The White test is used for detecting heteroskedasticity in a linear regression model
while the Breusch-Pagan test is used for detecting autocorrelation.
b. The White test is used for detecting autocorrelation in a linear regression model while
the Breusch-Pagan test is used for detecting heteroskedasticity.
c. The number of regressors used in the White test is larger than the number of
regressors used in the Breusch-Pagan test.
d. The number of regressors used in the Breusch-Pagan test is larger than the number
of regressors used in the White test.
11. If a regression equation shows that no individual t-tests are significant, but the F-statistic is
significant, the regression probably exhibits:
A) Heteroskedasticity
B) Serial correlation
C) Multicollinearity
D) Perfect collinearity
12. Which of the following is true of heteroskedasticity?
a. Heteroskedasticity causes inconsistency in the Ordinary Least Squares estimators.
b. The population R2 is affected by the presence of heteroskedasticity.
c. The Ordinary Least Squares estimators are not the best linear unbiased
estimators if heteroskedasticity is present.
d. It is not possible to obtain F statistics that are robust to heteroskedasticity of an
unknown form.
13. Which of the following is true of the Regression Specification Error Test (RESET)?
a. It tests if the functional form of a regression model is misspecified.
b. It detects the presence of dummy variables in a regression model.
c. It helps in the detection of heteroskedasticity when the functional form of the model is
correctly specified.
d. It helps in the detection of multicollinearity among the independent variables in a
regression model.
14. Which of the following problems, multicollinearity and/or serial correlation, can bias the
estimates of the slope coefficients?
A. Neither multicollinearity nor serial correlation
B. Both multicollinearity and serial correlation
C. Multicollinearity, but not serial correlation
D. Serial correlation, but not multicollinearity
The following information relates to Questions 15–20
Gary Hansen is a securities analyst for a mutual fund specializing in small-capitalization growth
stocks. The fund regularly invests in initial public offerings (IPOs). If the fund subscribes to an
offer, it is allocated shares at the offer price. Hansen notes that IPOs frequently are underpriced, and
the price rises when open market trading begins. The initial return for an IPO is calculated as the
change in price on the first day of trading divided by the offer price. Hansen is developing a
regression model to predict the initial return for IPOs. Based on past research, he selects the
following independent variables to predict IPO initial returns:
Underwriter rank = 1–10, where 10 is the highest rank
Pre-offer price adjustment* = (Offer price – Initial filing price)/Initial filing price
Offer size ($ millions) = Shares sold × Offer price
Fraction retained* = Fraction of total company shares retained by insiders
* Expressed as a decimal.
Hansen collects a sample of 1,725 recent IPOs for his regression model. Regression results appear
in Exhibit 1.
Hansen wants to use the regression results to predict the initial return for an upcoming IPO. The
upcoming IPO has the following characteristics:
− underwriter rank = 6;
− pre-offer price adjustment = 0.04;
− offer size = $40 million;
− fraction retained = 0.70.
Because he notes that the pre-offer price adjustment appears to have an important effect on initial
return, Hansen wants to construct a 95 percent confidence interval for the coefficient on this
variable. He also believes that for each 1 percent increase in pre-offer price adjustment, the initial
return will increase by less than 0.5 percent, holding other variables constant. Hansen wishes to test
this hypothesis at the 0.05 level of significance. Before applying his model, Hansen asks a
colleague, Phil Chang, to review its specification and results. After examining the model, Chang
concludes that the model suffers from two problems: 1) conditional heteroskedasticity, and 2)
omitted variable bias. Chang makes the following statements:
Statement 1 “Conditional heteroskedasticity will result in consistent coefficient estimates, but
both the t-statistics and F-statistic will be biased, resulting in false inferences.”
Statement 2 “If an omitted variable is correlated with variables already included in the model,
coefficient estimates will be biased and inconsistent and standard errors will also be
inconsistent.”
Selected values for the t-distribution and F-distribution appear in Exhibits 3 and 4, respectively.
15. Based on Hansen’s regression, the predicted initial return for the upcoming IPO is closest to:
A 0.0943. B 0.1064. C 0.1541. D 0.1673.
16. The 95 percent confidence interval for the regression coefficient for the pre-offer price
adjustment is closest to:
A 0.156 to 0.714. B 0.395 to 0.475. C 0.402 to 0.468. D 0.453 to 0.492.
Formula: β̂_pre-offer price adjustment ± c_0.05 × SE(β̂), where c_0.05 is the two-tailed critical t-value at the 0.05 significance level.
17. The most appropriate null hypothesis and the most appropriate conclusion regarding Hansen’s
belief about the magnitude of the initial return relative to that of the pre-offer price adjustment
(reflected by the coefficient bj) are:
    Null Hypothesis      Conclusion about bj (0.05 Level of Significance)
A   H0: bj = 0.5         Reject H0
B   H0: bj ≥ 0.5         Fail to reject H0
C   H0: bj ≥ 0.5         Reject H0
D   H0: bj ≤ 0.5         Reject H0
18. The most appropriate interpretation of the multiple R-squared for Hansen’s model is that:
A. unexplained variation in the dependent variable is 36 percent of total variation.
B. correlation between predicted and actual values of the dependent variable is 0.36.
C. correlation between predicted and actual values of the dependent variable is 0.60.
19. Is Chang’s Statement 1 correct?
A Yes.
B No, because the model’s F-statistic will not be biased.
C No, because the model’s t-statistics will not be biased.
20. Is Chang’s Statement 2 correct?
A Yes.
B No, because the model’s coefficient estimates will be unbiased.
C No, because the model’s coefficient estimates will be consistent.
PART 2. EXERCISES (6 points)
There are two exercises in this section.
Exercise 1. [3 points]
You estimate a wage regression, where the dependent variable is the logarithm of wage
(lnwage) and the independent variables are:
Education – number of years of education
Experience – years of experience
Exp. Squared – square of years of experience.
Black – dummy variable indicating if a person is black.
Other Race - dummy variable indicating if a person is neither white nor black and obtained
the following regression results:
(Note that white people are the base category here.)
The model:
log(wage) = β0 + β1·Education + β2·Experience + β3·Exp.Squared + β4·Black + β5·OtherRace + u
a. Interpret the slope coefficient for Education. What is the predicted effect of accumulating an
additional 10 years of experience for a person with 5 years of experience? [0.5 points]
Interpretation of Education:
Holding other factors constant, one additional year of education is associated with an approximate 8.5960% increase in wage.
Predicted effect:
Holding the other variables constant, the approximate percentage change in wage is
%Δwage ≈ 100 × (β̂2 × ΔExperience + β̂3 × ΔExp.Squared)
Moving from 5 to 15 years of experience gives ΔExperience = 10 and ΔExp.Squared = 15² − 5² = 200, so
%Δwage ≈ 100 × (0.0367624 × 10 − 0.0005596 × 200) = 25.5704%
That is, a person with 5 years of experience who accumulates 10 additional years of experience is predicted to earn about 25.57% more.
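This arithmetic can be checked numerically; a minimal sketch in Python, using only the coefficient estimates reported in the output (variable names here are mine, and note that ΔExp.Squared = 15² − 5² = 200, not 10²):

```python
# Approximate % change in wage when experience rises from 5 to 15 years,
# using the log-wage coefficient estimates from the regression output.
b_exp = 0.0367624    # coefficient on Experience
b_exp2 = -0.0005596  # coefficient on Exp.Squared

exp0, exp1 = 5, 15
d_exp = exp1 - exp0           # 10
d_exp2 = exp1**2 - exp0**2    # 225 - 25 = 200 (not 10**2)

pct_change = 100 * (b_exp * d_exp + b_exp2 * d_exp2)
print(round(pct_change, 4))   # 25.5704
```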
b. Test the hypothesis that all the slope coefficients in the regression are zero at the 1% significance
level. [0.5 points]
Hypothesis:
H0: β1 = β2 = β3 = β4 = β5 = 0
H1: at least one of β1, …, β5 is different from zero
Unrestricted model:
log(wage) = β0 + β1·Education + β2·Experience + β3·Exp.Squared + β4·Black + β5·OtherRace + u
R²ur = 0.2553
Restricted model:
log(wage) = β0 + u
R²r = 0
q (number of dropped independent variables) = 5
n = 678017
k = 5
df_ur = n – k – 1 = 678017 – 5 – 1 = 678011
F-stat = [(R²ur − R²r)/q] / [(1 − R²ur)/df_ur] = [(0.2553 − 0)/5] / [(1 − 0.2553)/678011] ≈ 46487.50
F-crit(1%, 5, 678011) ≈ 3.02
Because F-stat > F-crit (46487.50 > 3.02), reject H0 at the 1% significance level.
Conclusion: the independent variables are jointly statistically significant; the regressors jointly explain variation in log(wage).
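The F-statistic and the 1% critical value can be reproduced with a short script; a sketch assuming scipy is available (with 678,011 denominator degrees of freedom the critical value is essentially the large-sample limit, about 3.02):

```python
from scipy import stats

# Overall F-test of joint significance: H0: beta1 = ... = beta5 = 0
r2_ur, r2_r = 0.2553, 0.0   # unrestricted and restricted R-squared
n, k = 678017, 5
q = 5                        # number of restrictions
df_ur = n - k - 1            # 678011

f_stat = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)
f_crit = stats.f.ppf(0.99, q, df_ur)     # 1% critical value, approx. 3.02
p_value = stats.f.sf(f_stat, q, df_ur)   # far below 0.01

print(round(f_stat, 1), round(f_crit, 2))
```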
c. What is the interpretation of the coefficient on Black? Is there a (statistically) significant
difference between wages of black and non-black pupils? State and test the appropriate
hypothesis at the 5% significance level. [1 point]
Interpretation of Black:
Holding other factors constant, being black is associated with an approximate 13.56374% lower wage. (The exact effect implied by the log-level model is 100 × (e^(−0.1356374) − 1) ≈ −12.68%.)
Test for significance:
Holding the other variables constant:
Black: log(wage) = β0 + β4 × 1 + β5 × 0 = β0 + β4
Non-black (white, the base category): log(wage) = β0 + β4 × 0 + β5 × 0 = β0
The difference is β4 = −0.1356374.
Two-sided t-test on the coefficient of Black:
H0: β4 = 0
H1: β4 ≠ 0
t-stat = −73.76
Two-sided critical values at the 10%, 5%, and 1% levels: ±1.645, ±1.96, ±2.576
|t-stat| = 73.76 exceeds all three critical values, so reject H0 at the 5% level (indeed, even at the 1% level).
Conclusion: Black has a statistically significant (negative) effect on log(wage); there is a statistically significant difference between the wages of black and non-black individuals.
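In a log-level model a dummy coefficient is only an approximate percentage effect; a sketch of the exact effect and of the two-sided p-value implied by the t-statistic (scipy assumed available; with this many degrees of freedom the t-distribution is essentially standard normal):

```python
import math
from scipy import stats

b_black = -0.1356374   # coefficient on Black
t_stat = -73.76
df = 678017 - 5 - 1    # residual degrees of freedom

# Exact percentage difference implied by a log-level dummy coefficient
exact_pct = 100 * (math.exp(b_black) - 1)   # approx. -12.68%

# Two-sided p-value for H0: beta4 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(round(exact_pct, 2), p_value < 0.01)  # -12.68 True
```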
d. How would you test that experience has no effect on wages? State the appropriate hypothesis
and discuss all the necessary steps required to obtain the result. [1 point]
Step 1: State the original (unrestricted) model:
log(wage) = β0 + β1·Education + β2·Experience + β3·Exp.Squared + β4·Black + β5·OtherRace + u
Step 2: State the hypotheses.
H0: β2 = β3 = 0
H1: H0 is not true (at least one of β2, β3 is different from zero)
Step 3: Specify and estimate the restricted model (drop Experience and Exp.Squared):
log(wage) = β0 + β1·Education + β4·Black + β5·OtherRace + u
Step 4: Obtain the R² of the restricted model, R²r.
Step 5: Compute the F-statistic.
n = 678017
k = 5
df_ur = n – k – 1 = 678017 – 5 – 1 = 678011
q = 2
F-stat = [(R²ur − R²r)/q] / [(1 − R²ur)/df_ur] = [(0.2553 − R²r)/2] / [(1 − 0.2553)/678011]
Step 6: Find the critical value F-crit(α, 2, 678011) in the F table.
Step 7: Compare and conclude.
If F-stat > F-crit, reject H0: experience has a statistically significant effect on wage.
If F-stat < F-crit, fail to reject H0: there is no statistically significant evidence that experience affects wage.
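These steps can be rehearsed end-to-end on simulated data; a sketch with made-up coefficients and a simplified regressor set (numpy only; in practice you would run the restricted and unrestricted regressions on the actual wage data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated wage data with a genuine experience effect (hypothetical parameters)
educ = rng.integers(8, 20, n).astype(float)
exper = rng.uniform(0, 40, n)
black = rng.integers(0, 2, n).astype(float)
lwage = (0.8 + 0.09 * educ + 0.04 * exper - 0.0006 * exper**2
         - 0.13 * black + rng.normal(0, 0.4, n))

def ssr(y, X):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

ones = np.ones(n)
X_ur = np.column_stack([ones, educ, exper, exper**2, black])  # unrestricted
X_r = np.column_stack([ones, educ, black])                    # restricted

q = 2                                # restrictions: exper and exper^2 dropped
df_ur = n - (X_ur.shape[1] - 1) - 1  # n - k - 1
f_stat = ((ssr(lwage, X_r) - ssr(lwage, X_ur)) / q) / (ssr(lwage, X_ur) / df_ur)
print(f_stat > 10)  # experience matters in this simulation, so F is large
```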
Exercise 2. [3 points]
1. Consider the following rice production model,
ln(prod) = β0 + β1·ln(area) + β2·ln(labour) + β3·ln(fert) + u,
where prod = the tons of freshly threshed rice; area = hectares planted; labour = person-days
of hired and family labor; fert = kilograms of fertilizer.
If the variable ln(fert) is omitted from the model, in what direction are the biases in the
estimates of β1 and β2? Given the magnitude of the correlation of ln(fert) with ln(area) and
ln(labor), would you expect the bias to be relatively large or relatively small? Why or why not?
[1.5 points]
The correlation between ln(fert) and ln(area) is likely positive, because a larger planted area
requires more fertilizer.
The correlation between ln(fert) and ln(labour) is likely positive, because applying more
fertilizer requires more labour.
If ln(fert) is omitted from the model, the biases in the estimated coefficients on ln(area) and
ln(labour) are therefore expected to be positive (upward): the omitted variable presumably has a
positive coefficient and is positively correlated with both included regressors.
Because area, labour, and fertilizer use all scale with farm size, this correlation is likely strong,
so the bias is expected to be relatively large.
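The direction of the bias can be illustrated with a small simulation (hypothetical parameters, numpy only): when a regressor with a positive coefficient is omitted and is positively correlated with the included regressors, the short-regression slope estimates come out too high.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# ln(area), ln(labour), ln(fert) all move together with farm size
l_area = rng.normal(0, 1, n)
l_labour = 0.7 * l_area + rng.normal(0, 0.5, n)
l_fert = 0.6 * l_area + 0.3 * l_labour + rng.normal(0, 0.5, n)

# Hypothetical production function with a positive fertilizer effect (0.3)
l_prod = (1.0 + 0.4 * l_area + 0.3 * l_labour + 0.3 * l_fert
          + rng.normal(0, 0.3, n))

def ols(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

ones = np.ones(n)
b_full = ols(l_prod, np.column_stack([ones, l_area, l_labour, l_fert]))
b_short = ols(l_prod, np.column_stack([ones, l_area, l_labour]))

# Omitting ln(fert) inflates both slope estimates (upward bias)
print(b_short[1] > b_full[1], b_short[2] > b_full[2])  # True True
```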
2. Recently the Polish Education Research Institute investigated the effect of introducing chess
lessons in elementary schools on students’ performance. The schools applied for participation
in the program, and pupils’ test scores were measured after the implementation of the
program both in schools that participated and in those that opted out. Using these data, you
estimate the following regression model:
TestScores = β0 + β1·Chess + u,
where TestScores is the average score on a standardised test and Chess indicates whether an
individual participated in the program (Chess = 1) or not (Chess = 0). You find that β̂1 > 0 and
that the effect of participation in the program is statistically significant. We know that schools
that applied for the program are more dynamic and provide better quality of education than
the other schools. Given this information, does β̂1 provide an unbiased and consistent
estimator of the causal effect of the program? If not, what is the direction of the bias? Explain
your answer in detail. [1.5 points]
School quality is an omitted variable here: the schools that applied for the chess program are
more dynamic and provide better-quality education. This omitted quality variable is positively
correlated with Chess (participants come from the better schools) and has a positive effect on
TestScores. Because an omitted variable is correlated with the included regressor, β̂1 is a biased
and inconsistent estimator of the causal effect of the program. The direction of the bias is
positive (upward): β̂1 picks up both the true effect of the chess lessons and the effect of the
better schools’ quality, so it overstates the causal effect of the program.
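The same mechanism shows up in a short simulation (hypothetical numbers, numpy only): even when the true effect of the chess lessons is zero, self-selection of better schools into the program makes the estimated coefficient positive.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10000

# Unobserved school quality; better schools are more likely to apply (Chess = 1)
quality = rng.normal(0, 1, n)
chess = (quality + rng.normal(0, 1, n) > 0).astype(float)

# True causal effect of the program set to zero; quality raises scores
test_scores = 50 + 5 * quality + 0 * chess + rng.normal(0, 3, n)

# Short regression of TestScores on Chess alone (quality omitted)
X = np.column_stack([np.ones(n), chess])
beta, *_ = np.linalg.lstsq(X, test_scores, rcond=None)

print(beta[1] > 1)  # upward-biased: clearly positive despite a zero true effect
```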