Problem Set 1
Practice Problems from the Textbook
Introductory Econometrics: A Modern Approach, Jeffrey M. Wooldridge, 5th edition
Problem C4, page 64
Use the data in WAGE2.RAW to estimate a simple regression explaining monthly salary (wage)
in terms of IQ score (IQ).
(i) Find the average salary and average IQ in the sample. What is the sample standard
deviation of IQ? (IQ scores are standardized so that the average in the population
is 100 with a standard deviation equal to 15.) [Use functions available in R]
(ii) Estimate a simple regression model where a one-point increase in IQ changes
wage by a constant dollar amount. Use this model to find the predicted increase
in wage for an increase in IQ of 15 points. (Hint: Check out the function ‘predict’ in
R) Does IQ explain most of the variation in wage?
(iii) Now, estimate a model where each one-point increase in IQ has the same
percentage effect on wage. If IQ increases by 15 points, what is the approximate
percentage increase in predicted wage?
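As a reminder of how the two functional forms in parts (ii) and (iii) are read, the following sketch may help. It uses the standard level-level and log-level interpretations; the hats denote OLS estimates, and the notation is introduced here for illustration rather than taken from the problem statement.

```latex
\begin{align*}
\text{(ii) level-level:}\quad
  & wage = \beta_0 + \beta_1\, IQ + u,
  & \Delta\widehat{wage} &= \hat{\beta}_1\, \Delta IQ \\
\text{(iii) log-level:}\quad
  & \log(wage) = \beta_0 + \beta_1\, IQ + u,
  & \%\Delta\widehat{wage} &\approx (100\,\hat{\beta}_1)\, \Delta IQ
\end{align*}
```

In both parts the question sets $\Delta IQ = 15$. The log-level expression is an approximation that works best when the implied change in $\log(wage)$ is small.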
Problem C8, pages 65 and 66
To complete this exercise you need a software package that allows you to generate data
from the uniform and normal distributions.
(i) Start by generating 500 observations $x_i$ – the explanatory variable – from the
uniform distribution with range [0,10]. (Most statistical packages have a command
for the Uniform [0,1] distribution; just multiply those observations by 10.) What
are the sample mean and sample standard deviation of the $x_i$?
(ii) Randomly generate 500 errors, $u_i$, from the Normal [0,36] distribution. (If you
generate a Normal [0,1], as is commonly available, simply multiply the outcomes
by six.) Is the sample average of the $u_i$ exactly zero? Why or why not? What is the
sample standard deviation of the $u_i$?
(iii) Now generate the $y_i$ as $y_i = 1 + 2x_i + u_i \equiv \beta_0 + \beta_1 x_i + u_i$; that is, the population
intercept is one and the population slope is two. Use the data to run the regression
of $y_i$ on $x_i$. What are your estimates of the intercept and slope? Are they equal to
the population values in the above equation? Explain.
(iv) Obtain the OLS residuals, $\hat{u}_i$, and verify that equation (2.60) holds (subject to
rounding error).
(v) Compute the same quantities in equation (2.60) but use the errors $u_i$ in place of
the residuals. Now what do you conclude?
(vi) Repeat parts (i), (ii), and (iii) with a new sample of data, starting with generating
the $x_i$. Now what do you obtain for $\hat{\beta}_0$ and $\hat{\beta}_1$? Why are these different from
what you obtained in part (iii)?
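The simulation steps in parts (i) to (v) could be set up along the following lines. This is only a minimal sketch: the seed value and the variable names (x, u, y, fit, uhat) are arbitrary choices made here, and it assumes that equation (2.60) refers to the two OLS first-order conditions, namely that the residuals sum to zero and that the products of $x_i$ with the residuals sum to zero. Check the textbook before relying on that reading.

```r
set.seed(123)               # arbitrary seed, used only so the run is reproducible

n <- 500
x <- 10 * runif(n)          # (i)  Uniform[0,1] draws scaled to [0,10]
u <- 6 * rnorm(n)           # (ii) Normal(0,1) draws scaled so the variance is 36
y <- 1 + 2 * x + u          # (iii) population intercept 1, population slope 2

c(mean_x = mean(x), sd_x = sd(x))   # sample moments asked for in (i)
c(mean_u = mean(u), sd_u = sd(u))   # sample moments asked for in (ii)

fit <- lm(y ~ x)            # OLS regression of y on x
coef(fit)                   # estimated intercept and slope

uhat <- resid(fit)          # (iv) OLS residuals
c(sum_uhat = sum(uhat), sum_x_uhat = sum(x * uhat))   # assumed content of (2.60)

# (v) the same two sums, computed with the true errors instead of the residuals
c(sum_u = sum(u), sum_x_u = sum(x * u))
```

For part (vi), rerunning the same lines with a different seed (or without set.seed) produces a fresh sample.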
Problem 3
Based on dataset ceosal1.csv:
(i) What is the impact of an increase in ‘roe’ by 1 percentage point on the salary of
the CEOs?
(ii) Generate a dataframe named ‘xdata’ with 209 rows and 3 columns of normally
distributed random variables (rnorm). Name the columns “roe1”, “roe2” and
“roe3”. [Hint: colnames(xdata) <- c("name1", "name2", "name3")]
(iii) Bind this dataframe to ceosal1 using cbind and store the result (for example, as ceosal)
(iv) Estimate the model: lm(salary ~ roe + roe1 + roe2 + roe3, data = ceosal)
(v) Comment on the R-squared (Multiple R-squared) of the models estimated in (i) and (iv)
above. Do the additional variables “roe1”, “roe2” and “roe3” increase the
explanatory power of the model? Why or why not?
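The mechanics of steps (ii) to (v) can be tried out before loading the real file. The sketch below is only an illustration under stated assumptions: the data frame toy is a simulated stand-in (it is not ceosal1, and its numbers mean nothing), and the names toy, toy_all, m1 and m2 are hypothetical. With the actual data, toy would be replaced by something like read.csv("ceosal1.csv").

```r
set.seed(1)                 # arbitrary seed, used only so the run is reproducible
n <- 209

# Simulated stand-in for ceosal1 (NOT the real data; values are made up)
toy <- data.frame(roe = runif(n, min = 0, max = 50))
toy$salary <- 1000 + 20 * toy$roe + rnorm(n, sd = 1000)

# (ii) 209 x 3 data frame of standard normal draws with the requested column names
xdata <- as.data.frame(matrix(rnorm(n * 3), nrow = n, ncol = 3))
colnames(xdata) <- c("roe1", "roe2", "roe3")

# (iii) bind the noise columns to the main data frame
toy_all <- cbind(toy, xdata)

# (i) and (iv): the simple model and the model with the extra regressors
m1 <- lm(salary ~ roe, data = toy_all)
m2 <- lm(salary ~ roe + roe1 + roe2 + roe3, data = toy_all)

# (v) quantities to compare across the two models
r2  <- c(simple = summary(m1)$r.squared,     extended = summary(m2)$r.squared)
adj <- c(simple = summary(m1)$adj.r.squared, extended = summary(m2)$adj.r.squared)
r2
adj
```

Printing the adjusted R-squared next to the Multiple R-squared is a choice made here; it is often a useful second number when regressors are added.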