MAINT: Fix the sign when using Newton's method #9466
This PR cleans up a small inconsistency in the code used when fitting a generic maximum-likelihood estimator with Newton's method.
statsmodels/statsmodels/base/model.py, line 542 in ddeaa63
The code flips the sign of the gradient and Hessian for Newton's method, but not for any other minimization method (BFGS, Nelder-Mead, etc.). All other methods minimize the negative log likelihood and receive derivatives consistent with that objective, but Newton's method gets the gradient and Hessian of the (positive) log likelihood, even though it is still handed the negative log likelihood as its objective function.
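To make the asymmetry concrete, here is a simplified sketch of the pattern (a hypothetical helper with illustrative names; the actual code in model.py is structured differently):

```python
def get_objective_funcs(model, method):
    # Sketch only: illustrates the sign convention described above,
    # not the actual statsmodels code.
    f = lambda params: -model.loglike(params)  # every solver minimizes -loglike
    if method == 'newton':
        # Existing behavior: derivatives of the *positive* log likelihood
        score = lambda params: model.score(params)
        hess = lambda params: model.hessian(params)
    else:
        # All other solvers: derivatives consistent with the objective
        score = lambda params: -model.score(params)
        hess = lambda params: -model.hessian(params)
    return f, score, hess
```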
Newton's Method
While the gradient and Hessian don't match the objective function, Newton's method works fine because it never evaluates the objective at all. Instead, it finds zeros of the gradient, which correspond to critical points: local minima, local maxima, and saddle points. In the existing implementation, Newton's method converges to a critical point of the log likelihood, which is the same point that minimizes the negative log likelihood (assuming every critical point is a global maximum, as when the log likelihood is concave). Multiplying the gradient and Hessian by -1, or by any nonzero scalar, leaves the Newton iteration unchanged.
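This invariance is easy to check numerically, since the -1 factors cancel inside the linear solve:

```python
import numpy as np

g = np.array([1.5, -0.3])               # gradient at the current point
H = np.array([[2.0, 0.1], [0.1, 1.0]])  # Hessian at the current point
x = np.zeros(2)

# One Newton step with (g, H) and one with (-g, -H) are identical,
# because (-H)^{-1}(-g) = H^{-1}g.
step = x - np.linalg.solve(H, g)
step_flipped = x - np.linalg.solve(-H, -g)
assert np.allclose(step, step_flipped)
```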
For clarity, this PR removes the sign flip for Newton's method, since it will work equally well in either case. Without the sign flip, I think the code is easier to read.
Ridge Regularization
The sign change has a slight impact on a ridge factor applied in _fit_newton:

statsmodels/statsmodels/base/optimizer.py, line 448 in 3be76b8
I changed the sign of the Hessian but not the ridge factor; without a matching change, the resulting matrix would differ by more than the intended sign flip. I think adding rather than subtracting the ridge factor is proper here. We'd want a squared penalty when maximizing the log likelihood, which corresponds to subtracting the ridge factor; since we're now working with the negative log likelihood, the sign flips and the ridge factor is added.
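A toy example of the bookkeeping (illustrative values, not the actual _fit_newton code): subtracting the ridge factor from the log-likelihood Hessian is the exact sign flip of adding it to the negative-log-likelihood Hessian.

```python
import numpy as np

k = 3
ridge = 1e-10
H_ll = -2.0 * np.eye(k)  # Hessian of +loglike, negative definite near the maximum

# Old convention: damp the +loglike Hessian by subtracting the ridge factor.
H_old = H_ll - ridge * np.eye(k)

# New convention: the Hessian of -loglike is -H_ll, so the same damping
# now means adding the ridge factor (keeping it positive definite).
H_new = -H_ll + ridge * np.eye(k)

assert np.allclose(H_new, -H_old)  # an exact sign flip, nothing more
```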
(Also, the code doesn't adjust the gradient, but I think it should. I haven't made that change in this PR.)
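For reference, if the ridge factor were treated as a true quadratic penalty 0.5 * ridge * x'x on the negative log likelihood, both derivatives would pick up a matching term. A sketch of what a fully consistent step might look like (illustrative names only; this PR does not make the gradient change):

```python
import numpy as np

def penalized_newton_step(x, g, H, ridge=1e-10):
    # g, H: gradient and Hessian of the negative log likelihood at x.
    g_pen = g + ridge * x               # the gradient adjustment not made in this PR
    H_pen = H + ridge * np.eye(len(x))  # the Hessian adjustment this PR makes
    return x - np.linalg.solve(H_pen, g_pen)
```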