Clustered standard errors in statsmodels · Issue #15941 · numpy/numpy · GitHub

Closed
tomdavidoff opened this issue Apr 8, 2020 · 4 comments

Comments

@tomdavidoff

When I try to fit a model with clustered standard errors, I get "ValueError: only two groups are supported". But clustered standard errors should work fine with many clusters. I have about 300 clusters with roughly 30 observations each.

Reproducing code example:

print(sm.OLS.from_formula('price_change ~ log_F_permits + log_A_permits + C(year) + C(metro)',data=data_frame).fit(cov_type="cluster",cov_kwds={"groups":metro}).summary())

The regression works fine when .fit() is called with no arguments. But when run with the cov_type and cov_kwds arguments (the data are a panel of metro areas with about 30 observations per cluster), I get the traceback below.

data_frame contains the variables listed in the formula above.

Traceback (most recent call last):
  File "Documents/williams/scripts/annual.py", line 102, in <module>
    print(sm.OLS.from_formula('price_change ~ log_F_permits + log_A_permits + C(year) + C(metro)',data=data_frame).fit(cov_type="cluster",cov_kwds={"groups":metro}).summary())
  File "/usr/lib/python3.8/site-packages/statsmodels/regression/linear_model.py", line 342, in fit
    lfit = OLSResults(
  File "/usr/lib/python3.8/site-packages/statsmodels/regression/linear_model.py", line 1556, in __init__
    self.get_robustcov_results(cov_type=cov_type, use_self=True,
  File "/usr/lib/python3.8/site-packages/statsmodels/regression/linear_model.py", line 2464, in get_robustcov_results
    raise ValueError('only two groups are supported')
ValueError: only two groups are supported

@tomdavidoff changed the title from "Clustered standard errors" to "Clustered standard errors in statsmodels" on Apr 8, 2020
@rossbar
Contributor
rossbar commented Apr 8, 2020

I'm not clear on how this issue relates to numpy. Did you intend to report to statsmodels?

@charris
Member
charris commented Apr 8, 2020

Statsmodels is at https://github.com/statsmodels/statsmodels.

@tomdavidoff
Author

Fair enough -- I'll move the comment over there. Thanks.

@lbyiuou0329

In cov_kwds, groups needs to be the actual data (one group label per observation), i.e. data_frame['metro'] in your case.
