PoissonRegressor lbfgs solver giving coefficients of 0 and Runtime Warning #27016
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@lorentzenchr, wondering if you may have some insight on this.
Quick update on this: it seems the issue is not scikit-learn specific, but rather specific to the lbfgs solver. Even with statsmodels, fitting a Poisson GLM using lbfgs:

```python
# Fit Poisson regression using the formula interface WITH the lbfgs solver
import statsmodels.api as sm
import statsmodels.formula.api as smf

data = sm.datasets.get_rdataset('Insurance', package='MASS').data
formula = "Claims ~ C(District, Treatment(1)) + C(Group, Treatment('<1l')) + C(Age, Treatment('<25')) + Holders"
model_smf_lbfgs = smf.poisson(formula=formula, data=data).fit(method="lbfgs")
print(type(model_smf_lbfgs))
print(model_smf_lbfgs.summary())
```

gives the following error too:
@akaashp2000 Could you run it with `verbose` set?
@lorentzenchr sorry for the late response on this. With the same code:

```python
# one-hot encode the categorical columns, and drop the baseline column
# with lbfgs solver
from sklearn.linear_model import PoissonRegressor

verbose = 10
model_sklearn_lbfgs_verbose = PoissonRegressor(alpha=0, verbose=verbose).fit(X_train, y_train)
print("verbose", verbose)
print(model_sklearn_lbfgs_verbose.intercept_)
print(model_sklearn_lbfgs_verbose.coef_)
```

Output:

For different values of `verbose`:
Could you print the value at each iteration?
I tried the reproducer with
Note that the warning happens after the CAUCHY subroutine. I printed the intermediate values.
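As a side note (not from the thread itself): the CAUCHY trace mentioned above comes from scipy's L-BFGS-B implementation. A minimal standalone sketch that surfaces the solver's iteration output, assuming a toy quadratic objective rather than scikit-learn's actual Poisson loss:

```python
# Minimal sketch (toy objective, not scikit-learn's loss): scipy's L-BFGS-B
# prints its iteration trace when disp=True, which is where the CAUCHY
# subroutine output mentioned above comes from.
import numpy as np
from scipy.optimize import minimize

def objective(w):
    # simple convex quadratic with known minimum at w = [3, 3];
    # returns (value, gradient) so jac=True can be used
    return np.sum((w - 3.0) ** 2), 2.0 * (w - 3.0)

res = minimize(objective, x0=np.zeros(2), jac=True,
               method="L-BFGS-B", options={"disp": True})
print(res.x)  # ≈ [3.0, 3.0]
```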
I have tried to change the penalty. Not sure what the fix would be, but we could at least raise an error or a convergence warning with a more user-friendly message if intermediate computations produce non-finite gradient values.
I suspect that the column "Holders", with a max value of 3e3, produces an overflow when exponentiated. If that's true, there is not much we can do about it.
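To illustrate the suspected mechanism (a sketch with made-up numbers, not the actual fitted values): `np.exp` overflows float64 once its argument exceeds roughly 709.78, and a feature of magnitude 3e3 reaches that even with a modest intermediate coefficient:

```python
import numpy as np

# exp overflows float64 once its argument exceeds log(float64 max) ≈ 709.78
print(np.log(np.finfo(np.float64).max))  # ≈ 709.78

holders = 3000.0   # magnitude of the "Holders" feature
coef = 0.3         # hypothetical intermediate coefficient during the fit
with np.errstate(over="ignore"):
    mu = np.exp(coef * holders)  # exp(900) overflows to inf
print(mu)  # inf
```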
We could detect the non-finite gradient values and issue a `ConvergenceWarning`.
Following your suggestion to print the value at each iteration, I added a print statement. I have now tried fitting the GLM with LBFGS, on the same X_train but with the "Holders" column scaled, i.e.

```python
X_train_scaled = X_train.copy()
X_train_scaled["Holders"] = X_train_scaled["Holders"] / X_train_scaled["Holders"].max()
```

and it does indeed converge now! And to the same coefficients as from `newton-cholesky`.
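A more reusable variant of the manual rescaling above (a sketch; the transformer choice and toy data are assumptions, not from the thread): scaling the wide-ranged column inside a pipeline keeps train/test scaling consistent:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import PoissonRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MaxAbsScaler

# MaxAbsScaler divides by the column's max absolute value, which for a
# non-negative column matches the X / X.max() rescaling above
pre = ColumnTransformer([("scale", MaxAbsScaler(), ["Holders"])],
                        remainder="passthrough")
model = make_pipeline(pre, PoissonRegressor(alpha=0))

# toy data standing in for the thread's X_train / y_train
X = pd.DataFrame({"Holders": np.linspace(1, 3000, 200)})
y = np.random.default_rng(0).poisson(lam=np.log1p(X["Holders"]))
model.fit(X, y)
print(model[-1].coef_)
```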
I would be happy to help by adding this check. My initial thought is that this could be done in the gradient computation.
An explicit check for finiteness could impact the fit time. Catching the runtime warning could be a way to go.
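A sketch of the warning-catching idea (the function name and message are illustrative, not from scikit-learn): promote numpy's overflow `RuntimeWarning` to an exception only around the sensitive computation, avoiding an `np.isfinite` pass over the result on every iteration:

```python
import warnings
import numpy as np

def guarded_exp(z):
    # Promote numpy's overflow RuntimeWarning to an exception locally,
    # instead of scanning the output with np.isfinite afterwards.
    with warnings.catch_warnings():
        warnings.simplefilter("error", RuntimeWarning)
        try:
            return np.exp(z)
        except RuntimeWarning:
            raise ValueError("Overflow in exp; try scaling the features")

print(guarded_exp(np.array([1.0, 2.0])))   # computes normally
try:
    guarded_exp(np.array([1000.0]))        # overflow -> ValueError
except ValueError as e:
    print("caught:", e)
```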
We could indeed force numpy to raise an exception during the matmul and rewrap it with a more informative error message:

```python
with np.errstate(all='raise'):
    # do the matmul here
    ...
```

https://numpy.org/doc/stable/reference/generated/numpy.errstate.html#numpy.errstate
I am trying the approach you suggested above. From what I understand, using:

```python
with np.errstate(all="raise"):
    try:
        grad[:n_features] = X.T @ grad_pointwise + l2_reg_strength * weights
    except FloatingPointError:
        raise ValueError(
            "Overflow detected. Try scaling the target variable or"
            " features, or using a different solver"
        ) from None
```

(I believe the `from None` suppresses the chained `FloatingPointError` in the traceback.)

Is this what you meant? Could something similar be done for the multiclass case too?
@akaashp2000 Yes, such an error message is much more informative. Could you also check that scaling first actually solves the problem? If so, then a PR with the improved error message would be much appreciated.
I gave this another shot and made a suggestion in #29681, based on @akaashp2000's PR, but including tests. However, I found another edge case that is not detectable (see https://github.com/scikit-learn/scikit-learn/pull/29681/files#r1719809543). To repro:

```python
import sys

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from sklearn.linear_model import PoissonRegressor
from sklearn.preprocessing import OneHotEncoder

data = sm.datasets.get_rdataset('Insurance', package='MASS').data
X_train_ohe = OneHotEncoder(sparse_output=False, drop=[1, "<1l", "<25"]).fit(data[["District", "Group", "Age"]])
X_train_ohe = pd.DataFrame(
    X_train_ohe.transform(data[["District", "Group", "Age"]]),
    columns=X_train_ohe.get_feature_names_out(),
)

# NOTE: Use just the last column here
X_train = data[["Holders"]]
y_train = data["Claims"]

model_sklearn_lbfgs = PoissonRegressor(alpha=0, tol=1e-8, max_iter=10000, verbose=True).fit(X_train, y_train)
print(model_sklearn_lbfgs.intercept_)
print(model_sklearn_lbfgs.coef_)

# with newton-cholesky solver
model_sklearn_nc = PoissonRegressor(alpha=0, solver='newton-cholesky', tol=1e-8, max_iter=1000).fit(X_train, y_train)
print(model_sklearn_nc.intercept_)
print(model_sklearn_nc.coef_)
```

which gives:

How to proceed?
Describe the bug
See the following Stack Exchange post (the solution to my original issue was to use the newton-cholesky solver).
When fitting a Poisson Regression (without regularization) to some dummy data I encounter:
Some people on StackExchange have mentioned it is worth submitting an issue (there was a similar one faced with Logistic Regression).
Steps/Code to Reproduce
Expected Results
Expected intercept/coefficients (from statsmodels and sklearn with newton-cholesky solver):
Actual Results
Result with lbfgs:
Versions