Report
Xinyi Hu
2024.08.08
0.1 Introduction
The objective of this project is to understand and analyze the factors that influence housing values in the Chicago metropolitan area. These factors include the structural characteristics of the house, geographic location, accessibility, and local government expenditure policies. By constructing an econometric model, we study how these factors individually and collectively affect the market price of houses.
This topic is worth researching because housing is the largest asset for most households, and understanding the determinants of housing prices is crucial for families, investors, and policymakers. In metropolitan areas especially, fluctuations in housing prices can have broad economic impacts.
• Independent Variables:
– OHARE (Located Near O’Hare Airport)
Note: Some variables are transformed into logarithmic form for regression analysis.
\[ \epsilon \sim N(0, \sigma^2) \tag{2} \]
0.3.4 Exogeneity Assumption
It is assumed that the explanatory variables are exogenous, meaning that the error term has a conditional mean of zero given the explanatory variables and is therefore uncorrelated with them:
\[ E(\epsilon \mid X) = 0 \tag{3} \]
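The step from a zero conditional mean to zero correlation can be written out explicitly. The short derivation below is a standard textbook argument based on the law of iterated expectations; it is added here for clarity and is not taken from the original report.

```latex
\begin{aligned}
E(\epsilon) &= E\big[E(\epsilon \mid X)\big] = 0, \\
\operatorname{Cov}(X, \epsilon) &= E(X\epsilon) - E(X)\,E(\epsilon)
  = E\big[X\,E(\epsilon \mid X)\big] - 0 = 0 .
\end{aligned}
```

The converse does not hold in general: zero correlation alone does not guarantee a zero conditional mean, which is why the assumption is stated in its conditional form.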
\[ \epsilon \sim N(0, \sigma^2) \tag{4} \]
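Estimated full model, with coefficients taken from the regression output in the appendix:
\[
\begin{aligned}
\widehat{\text{ln\_SPRICE}} ={}& \underset{(0.5565)}{4.8104} + \underset{(0.0421)}{0.1740}\,\text{ln\_NROOMS} + \underset{(0.0384)}{0.2434}\,\text{ln\_LVAREA} - \underset{(0.0098)}{0.0844}\,\text{ln\_HAGEEFF} \\
&+ \underset{(0.0114)}{0.1066}\,\text{ln\_LSIZE} - \underset{(0.0591)}{0.4652}\,\text{ln\_PTAXES} + \underset{(0.0286)}{0.4240}\,\text{ln\_MEDINC} - \underset{(0.0161)}{0.2345}\,\text{ln\_DFCL} \\
&+ \underset{(0.0519)}{0.1158}\,\text{ln\_SSPEND} - \underset{(0.0211)}{0.0538}\,\text{ln\_MSPEND} + \underset{(0.0068)}{0.0421}\,\text{AIRCON} + \underset{(0.0138)}{0.0356}\,\text{NBATH} \\
&+ \underset{(0.0043)}{0.0137}\,\text{GARAGE} + \underset{(0.0003)}{0.0029}\,\text{PCTWHT} - \underset{(0.0040)}{0.0023}\,\text{DFNI} + \underset{(0.0282)}{0.0492}\,\text{COOK} + \underset{(0.0317)}{0.1025}\,\text{OHARE}
\end{aligned}
\]
Note: The numbers in parentheses are the standard errors of the estimated coefficients.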
The F-statistic and R² of the estimated model are as follows:
• F-statistic: 141.09
• R²: 0.5324
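As a cross-check, the overall F-statistic can be recomputed from the sums of squares printed in the appendix output. The sketch below is illustrative only: the variable names are hypothetical, and it assumes 16 slope coefficients and 2,000 observations, as implied by the degrees of freedom in that output.

```python
# Sums of squares and degrees of freedom as printed in the appendix output.
total_ss = 242.679755      # total sum of squares (df = 1,999)
residual_ss = 113.487647   # residual sum of squares of the full model
df_model = 16              # number of slope coefficients (assumed from the table)
df_resid = 1983            # n - k - 1 with n = 2,000

# Explained sum of squares is whatever the residuals do not account for.
model_ss = total_ss - residual_ss

# Overall F-statistic: explained variance per regressor over residual variance.
f_stat = (model_ss / df_model) / (residual_ss / df_resid)
print(f"F({df_model}, {df_resid}) = {f_stat:.2f}")
```

Rounded to two decimals, the result agrees with the reported value of 141.09.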
0.5.2 Interpreting R²
The R² value of 0.5324 indicates that the explanatory variables explain 53.24% of the variation in the dependent variable, the log of the sale price (ln_SPRICE). The model therefore captures a little over half of the variation in housing prices, while about 47% remains unexplained.
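The R² above, and the adjusted R² printed in the appendix output, both follow from the residual and total sums of squares. A minimal sketch, assuming 2,000 observations and 16 regressors as implied by the degrees of freedom in that output (the variable names are hypothetical):

```python
# Values copied from the full-model regression output in the appendix.
total_ss = 242.679755
residual_ss = 113.487647
n_obs = 2000          # total df of 1,999 implies 2,000 observations
n_regressors = 16     # slope coefficients in the full model

# R-squared: share of total variation not left in the residuals.
r2 = 1 - residual_ss / total_ss

# Adjusted R-squared penalizes the fit for the number of regressors.
adj_r2 = 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)

print(f"R-squared = {r2:.4f}, adjusted R-squared = {adj_r2:.4f}")
```

The small gap between the two values indicates that the fit is not being inflated much by the number of regressors.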
• F(1, 1983) = 74.14 (test of the single restriction ln_HAGEEFF = 0; see the appendix)
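For a single restriction, the F-statistic equals the square of the corresponding t-statistic. The sketch below checks this for ln_HAGEEFF using the coefficient and standard error printed in the full-model output in the appendix; the variable names are hypothetical.

```python
# Coefficient and standard error of ln_HAGEEFF from the full-model output.
coef = -0.0843861
std_err = 0.0098007

# t-statistic for H0: coefficient = 0.
t_stat = coef / std_err

# With one restriction, the F-statistic is the squared t-statistic.
f_stat = t_stat ** 2

print(f"t = {t_stat:.2f}, F(1, 1983) = {f_stat:.2f}")
```

The squared t-statistic matches the reported F(1, 1983) = 74.14 up to rounding of the printed inputs.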
0.6 Variable Selection
A stepwise regression removes the insignificant variables DFNI (p = 0.559) and COOK (p = 0.082) and yields the final regression equation.
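The selection rule can be illustrated with the p-values printed in the full-model output in the appendix. This is only a one-pass screen under an assumed 5% threshold (the report does not state the threshold it used); a real stepwise procedure re-estimates the model after each removal, so p-values can change along the way.

```python
# p-values from the full-model regression output in the appendix.
p_values = {
    "ln_NROOMS": 0.000, "ln_LVAREA": 0.000, "ln_HAGEEFF": 0.000,
    "ln_LSIZE": 0.000, "ln_PTAXES": 0.000, "ln_MEDINC": 0.000,
    "ln_DFCL": 0.000, "ln_SSPEND": 0.026, "ln_MSPEND": 0.011,
    "AIRCON": 0.000, "NBATH": 0.010, "GARAGE": 0.001,
    "PCTWHT": 0.000, "DFNI": 0.559, "COOK": 0.082, "OHARE": 0.001,
}

ALPHA = 0.05  # assumed significance threshold, not stated in the report

# Variables whose p-value exceeds the threshold are candidates for removal.
dropped = [name for name, p in p_values.items() if p > ALPHA]
kept = [name for name, p in p_values.items() if p <= ALPHA]

print("dropped:", dropped)
print("kept:", len(kept), "variables")
```

Under this threshold the screen flags exactly the two variables removed in the report.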
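The final regression equation, with coefficients and standard errors (in parentheses) taken from the final regression output in the appendix, is as follows:
\[
\begin{aligned}
\widehat{\text{ln\_SPRICE}} ={}& \underset{(0.5111)}{4.3797} + \underset{(0.0421)}{0.1714}\,\text{ln\_NROOMS} + \underset{(0.0384)}{0.2433}\,\text{ln\_LVAREA} - \underset{(0.0098)}{0.0842}\,\text{ln\_HAGEEFF} \\
&+ \underset{(0.0114)}{0.1065}\,\text{ln\_LSIZE} - \underset{(0.0410)}{0.3840}\,\text{ln\_PTAXES} + \underset{(0.0283)}{0.4173}\,\text{ln\_MEDINC} - \underset{(0.0158)}{0.2406}\,\text{ln\_DFCL} \\
&+ \underset{(0.0439)}{0.1704}\,\text{ln\_SSPEND} - \underset{(0.0198)}{0.0685}\,\text{ln\_MSPEND} + \underset{(0.0068)}{0.0424}\,\text{AIRCON} + \underset{(0.0138)}{0.0359}\,\text{NBATH} \\
&+ \underset{(0.0043)}{0.0135}\,\text{GARAGE} + \underset{(0.0003)}{0.0029}\,\text{PCTWHT} + \underset{(0.0314)}{0.1084}\,\text{OHARE}
\end{aligned}
\]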
• F-statistic: 160.77
• R²: 0.5314
The subset test shows that removing the insignificant variables DFNI and COOK does not significantly change the fit of the model (R² falls only from 0.5324 to 0.5314), indicating that our final model still explains the variation in the dependent variable well.
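One way to quantify this comparison is the textbook restricted-versus-unrestricted F-test, computed from the residual sums of squares of the two models printed in the appendix. The sketch below is an illustration under that assumption and may differ from the exact command used in the appendix; the 5% critical value of about 3.00 for F(2, 1983) is an approximation, and the variable names are hypothetical.

```python
# Residual sums of squares and residual degrees of freedom from the appendix.
rss_full, df_full = 113.487647, 1983     # model including DFNI and COOK
rss_final, df_final = 113.726345, 1985   # model after dropping them

# Number of restrictions equals the number of dropped variables.
n_restrictions = df_final - df_full

# F = (increase in RSS per restriction) / (residual variance of the full model).
f_subset = ((rss_final - rss_full) / n_restrictions) / (rss_full / df_full)

CRITICAL_5PCT = 3.00  # approximate 5% critical value of F(2, 1983)
reject_null = f_subset > CRITICAL_5PCT

print(f"F({n_restrictions}, {df_full}) = {f_subset:.3f}, reject H0: {reject_null}")
```

The statistic is roughly 2.09, below the approximate critical value, which is consistent with the conclusion that DFNI and COOK can be dropped jointly.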
0.8 Conclusion
2. Living Area (ln_LVAREA): Living area has a significant positive impact on housing prices. Because both price and living area are in logs, the coefficient is an elasticity: each 1% increase in living area increases the housing price by approximately 0.24%.
3. Effective Age of House (ln_HAGEEFF): House age has a significant negative impact on housing prices. Each 1% increase in house age decreases the housing price by approximately 0.08%.
4. Land Size (ln_LSIZE): Land size has a significant positive impact on housing prices. Each 1% increase in land size increases the housing price by approximately 0.11%.
5. Property Taxes (ln_PTAXES): Property taxes have a significant negative impact on housing prices. Each 1% increase in property taxes decreases the housing price by approximately 0.38%.
6. Median Income (ln_MEDINC): Median income in the neighborhood has a significant positive impact on housing prices. Each 1% increase in median income increases the housing price by approximately 0.42%.
7. Distance to City Center (ln_DFCL): The farther the house is from the city center, the lower the housing price. Each 1% increase in distance decreases the housing price by approximately 0.24%.
8. School Expenditure (ln_SSPEND): School expenditure has a significant positive impact on housing prices. Each 1% increase in school expenditure increases the housing price by approximately 0.17%.
9. Municipal Expenditure (ln_MSPEND): Municipal expenditure has a significant negative impact on housing prices. Each 1% increase in municipal expenditure decreases the housing price by approximately 0.07%.
10. Air Conditioning (AIRCON): Houses with air conditioning have higher prices, by approximately 4.24%.
11. Number of Bathrooms (NBATH): An increase in the number of bathrooms has a significant positive impact on housing prices. Each additional bathroom increases the housing price by approximately 3.59%.
12. Garage (GARAGE): Houses with a garage have higher prices, by approximately 1.35%.
13. Percentage of White Population (PCTWHT): The percentage of the white population has a significant positive impact on housing prices. Each one-percentage-point increase in this share increases the housing price by approximately 0.29%.
14. Located Near O’Hare Airport (OHARE): Houses located near O’Hare Airport have higher prices, by approximately 10.84% (the exact effect implied by the coefficient, exp(0.1084) − 1, is about 11.5%).
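The interpretations above rest on two rules: for a logged regressor the coefficient is an elasticity, and for a regressor in levels (such as a dummy) the exact percentage effect is exp(β) − 1. The sketch below applies both rules to two coefficients from the final model; the helper function names are hypothetical and introduced only for illustration.

```python
import math


def pct_change_log_log(beta: float, pct_change_x: float) -> float:
    """Exact % change in y when x rises by pct_change_x % in a log-log model."""
    return ((1 + pct_change_x / 100) ** beta - 1) * 100


def pct_effect_level_regressor(beta: float) -> float:
    """Exact % change in y for a one-unit increase in a regressor kept in levels."""
    return (math.exp(beta) - 1) * 100


# Coefficients taken from the final regression output in the appendix.
living_area_effect = pct_change_log_log(0.2432909, 1.0)   # 1% more living area
ohare_effect = pct_effect_level_regressor(0.108432)       # OHARE switches 0 -> 1

print(f"1% more living area: {living_area_effect:.3f}% higher price")
print(f"Near O'Hare: {ohare_effect:.2f}% higher price")
```

For small coefficients, 100 × β is a close approximation to the exact level-regressor effect, which is why the figures for AIRCON, NBATH, and GARAGE barely change; for OHARE the gap between the approximation and the exact value is more visible.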
0.8.2 Advantages and Disadvantages of the Model and Results
Advantages
• High Explanatory Power: The model’s R² value is 0.5314, indicating that the explanatory variables explain 53.14% of the variation in the dependent variable (the log of the sale price), a reasonable fit for cross-sectional housing data.
Disadvantages
• Potential Omitted Variables: There may be factors that affect housing prices
not included in the model, leading to incomplete explanatory power.
• Negative Impact of House Age on Housing Prices: Newer houses are more valuable than older ones. A 1% increase in effective age is associated with a price about 0.08% lower, which suggests that a newer house might be the better investment choice.
• High Tax Rates Lower Housing Prices: Higher property taxes are associated with significantly lower housing prices (a 1% increase in property taxes corresponds to a price about 0.38% lower). Houses in high-tax areas therefore tend to be cheaper, which could affect home-buying decisions.
• Higher Prices Near City Center: Houses closer to the city center have higher prices, suggesting that the convenience of the city center makes these houses more desirable.
• Higher Prices Near O’Hare Airport: Houses near O’Hare Airport have higher prices, possibly due to convenient transportation and the concentration of economic activity.
These findings provide valuable information for home buyers, investors, and policymakers, helping them make more informed decisions.
0.9 Appendix
Loading the data:
describe
Because the logarithm of zero is undefined and some variables take zero values, only strictly positive variables are transformed into logarithmic form. The Stata commands are as follows:
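The transformation rule can be sketched as follows. This is an illustrative Python version rather than the Stata commands used in the report; the function name `add_log_columns` and the toy values are hypothetical, and only the `ln_` naming convention is taken from the regression output.

```python
import math


def add_log_columns(data: dict[str, list[float]]) -> dict[str, list[float]]:
    """Return a copy of data with ln_<name> added for every strictly positive column."""
    result = dict(data)
    for name, values in data.items():
        # Skip any column containing zero or negative values: log is undefined there.
        if all(value > 0 for value in values):
            result[f"ln_{name}"] = [math.log(value) for value in values]
    return result


# Hypothetical toy values, not taken from the report's dataset.
sample = {
    "SPRICE": [150000.0, 98000.0, 210000.0],  # strictly positive, so it is logged
    "AIRCON": [1.0, 0.0, 1.0],                # contains zeros, so it stays in levels
}

transformed = add_log_columns(sample)
print(sorted(transformed))
```

Columns that contain zeros are left in levels, which mirrors how AIRCON, NBATH, GARAGE, and the other untransformed variables enter the regression.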
• F-statistic: 141.09
• R²: 0.5324
Run the regression model and view the results using the following Stata commands:
Residual | 113.487647 1,983 .057230281 R-squared = 0.5324
-------------+---------------------------------- Adj R-squared = 0.5286
Total | 242.679755 1,999 .121400578 Root MSE = .23923
------------------------------------------------------------------------------
ln_SPRICE | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
ln_NROOMS | .1740158 .0421356 4.13 0.000 .0913811 .2566506
ln_LVAREA | .2433724 .0383658 6.34 0.000 .1681309 .3186139
ln_HAGEEFF | -.0843861 .0098007 -8.61 0.000 -.1036069 -.0651653
ln_LSIZE | .1066477 .0114179 9.34 0.000 .0842553 .1290401
ln_PTAXES | -.465165 .0591439 -7.86 0.000 -.5811558 -.3491742
ln_MEDINC | .4239539 .0285668 14.84 0.000 .3679299 .479978
ln_DFCL | -.2344655 .0161426 -14.52 0.000 -.2661239 -.2028073
ln_SSPEND | .1158414 .051852 2.23 0.026 .0141512 .2175316
ln_MSPEND | -.0537627 .021136 -2.54 0.011 -.0952138 -.0123115
AIRCON | .0420512 .0067691 6.21 0.000 .0287759 .0553265
NBATH | .0356002 .013793 2.58 0.010 .0085498 .0626506
GARAGE | .0137483 .0043079 3.19 0.001 .0052998 .0221969
PCTWHT | .0029339 .0003053 9.61 0.000 .0023352 .0035326
DFNI | -.0023108 .0039563 -0.58 0.559 -.0100698 .0054482
COOK | .0491633 .0282415 1.74 0.082 -.0062228 .1045494
OHARE | .1025445 .0316506 3.24 0.001 .0404726 .1646164
_cons | 4.810413 .5564745 8.64 0.000 3.719077 5.901749
------------------------------------------------------------------------------
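The confidence intervals in the table follow the usual form of coefficient ± critical value × standard error. The sketch below backs out the implied critical value from the ln_NROOMS row and rebuilds the interval; the variable names are hypothetical and the numbers are copied from the table.

```python
# ln_NROOMS row from the full-model output above.
coef = 0.1740158
std_err = 0.0421356
ci_low, ci_high = 0.0913811, 0.2566506

# Half-width of the interval divided by the standard error gives the
# critical t-value used for the 95% interval (1,983 residual df).
t_crit = (ci_high - coef) / std_err

# Rebuild the interval symmetrically around the coefficient.
rebuilt = (coef - t_crit * std_err, coef + t_crit * std_err)

print(f"implied critical value = {t_crit:.4f}")
print(f"rebuilt interval = [{rebuilt[0]:.7f}, {rebuilt[1]:.7f}]")
```

The implied critical value is close to 1.96, as expected for a t-distribution with nearly 2,000 degrees of freedom.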
( 8) ln_SSPEND = 0
( 9) ln_MSPEND = 0
(10) AIRCON = 0
(11) NBATH = 0
(12) GARAGE = 0
(13) PCTWHT = 0
(14) DFNI = 0
(15) COOK = 0
(16) OHARE = 0
test ln_HAGEEFF
( 1) ln_HAGEEFF = 0
F( 1, 1983) = 74.14
Prob > F = 0.0000
The stepwise regression removes insignificant variables one at a time. The Stata command and results are as follows:
-------------+---------------------------------- Adj R-squared = 0.5281
Total | 242.679755 1,999 .121400578 Root MSE = .23936
------------------------------------------------------------------------------
ln_SPRICE | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
ln_NROOMS | .1713756 .042117 4.07 0.000 .0887774 .2539739
ln_LVAREA | .2432909 .0383839 6.34 0.000 .1680139 .3185679
ln_HAGEEFF | -.0842359 .0097641 -8.63 0.000 -.1033848 -.065087
ln_LSIZE | .1064532 .0113601 9.37 0.000 .0841743 .1287321
ln_PTAXES | -.3839956 .0409537 -9.38 0.000 -.4643124 -.3036788
ln_MEDINC | .4173116 .0283497 14.72 0.000 .3617132 .4729099
ln_DFCL | -.2405759 .0158333 -15.19 0.000 -.2716275 -.2095242
ln_SSPEND | .1704485 .0439271 3.88 0.000 .0843004 .2565966
ln_MSPEND | -.0685232 .0198029 -3.46 0.001 -.1073598 -.0296866
AIRCON | .0424246 .0067674 6.27 0.000 .0291526 .0556966
NBATH | .0358732 .0137957 2.60 0.009 .0088176 .0629288
GARAGE | .0134979 .0043023 3.14 0.002 .0050603 .0219354
PCTWHT | .0028833 .0002941 9.80 0.000 .0023066 .00346
OHARE | .108432 .0313736 3.46 0.001 .0469034 .1699606
_cons | 4.379715 .5111181 8.57 0.000 3.37733 5.382099
------------------------------------------------------------------------------
Save the complete model and final model, and conduct a subset test:
Model | 128.953411 14 9.21095791 Prob > F = 0.0000
Residual | 113.726345 1,985 .057292869 R-squared = 0.5314
-------------+---------------------------------- Adj R-squared = 0.5281
Total | 242.679755 1,999 .121400578 Root MSE = .23936
------------------------------------------------------------------------------
ln_SPRICE | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
ln_NROOMS | .1713756 .042117 4.07 0.000 .0887774 .2539739
ln_LVAREA | .2432909 .0383839 6.34 0.000 .1680139 .3185679
ln_HAGEEFF | -.0842359 .0097641 -8.63 0.000 -.1033848 -.065087
ln_LSIZE | .1064532 .0113601 9.37 0.000 .0841743 .1287321
ln_PTAXES | -.3839956 .0409537 -9.38 0.000 -.4643124 -.3036788
ln_MEDINC | .4173116 .0283497 14.72 0.000 .3617132 .4729099
ln_DFCL | -.2405759 .0158333 -15.19 0.000 -.2716275 -.2095242
ln_SSPEND | .1704485 .0439271 3.88 0.000 .0843004 .2565966
ln_MSPEND | -.0685232 .0198029 -3.46 0.001 -.1073598 -.0296866
AIRCON | .0424246 .0067674 6.27 0.000 .0291526 .0556966
NBATH | .0358732 .0137957 2.60 0.009 .0088176 .0629288
GARAGE | .0134979 .0043023 3.14 0.002 .0050603 .0219354
PCTWHT | .0028833 .0002941 9.80 0.000 .0023066 .00346
OHARE | .108432 .0313736 3.46 0.001 .0469034 .1699606
_cons | 4.379715 .5111181 8.57 0.000 3.37733 5.382099
------------------------------------------------------------------------------
(9) full_model: ln_MSPEND = 0
(10) full_model: AIRCON = 0
(11) full_model: NBATH = 0
(12) full_model: GARAGE = 0
(13) full_model: PCTWHT = 0
(14) full_model: OHARE = 0
(15) final_model: ln_NROOMS = 0
(16) final_model: ln_LVAREA = 0
(17) final_model: ln_HAGEEFF = 0
(18) final_model: ln_LSIZE = 0
(19) final_model: ln_PTAXES = 0
(20) final_model: ln_MEDINC = 0
(21) final_model: ln_DFCL = 0
(22) final_model: ln_SSPEND = 0
(23) final_model: ln_MSPEND = 0
(24) final_model: AIRCON = 0
(25) final_model: NBATH = 0
(26) final_model: GARAGE = 0
(27) final_model: PCTWHT = 0
(28) final_model: OHARE = 0