8000 TST / DEBUG: better warnings and tests when facing singular hessian problems by ogrisel · Pull Request #6 · lorentzenchr/scikit-learn · GitHub

Conversation

ogrisel (Collaborator) commented Jun 1, 2022:

Here is an attempt (failed or successful depending on the viewpoint) at improving the tests for the fallback inner solver for the singular Hessian case in the context of the "newton-cholesky" solver introduced in scikit-learn#23314.

I think in its current state it's pretty useless:

  • with either lstsq or indefinite_factorization, the model never converges: all line searches fail, the deviance stays very high, and we issue a ton of useless warnings.
  • lstsq seems to be a bit faster, but that hardly matters since neither method helps the model converge.

Possible things to try:

  • a) do not attempt a line search on the singular-Hessian solution; keep warning, but take a simple gradient step with a small learning rate instead. It is not clear how small the learning rate should be, though. Or maybe we could run the line search along the raw gradient direction?
  • b) when facing a singular Hessian, just stop the solver with a helpful convergence warning that suggests removing collinear features or using stronger regularization
  • c) alternatively, we could warn and automatically refit the linear model with stronger regularization (without ever trying to solve the first encountered singular-Hessian problem)
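Option a) could be sketched roughly as follows. This is a hypothetical illustration, not the actual scikit-learn code: a backtracking line search along the negative gradient, halving the step until the loss decreases.

```python
import numpy as np

def gradient_fallback_step(coef, grad, loss, max_halvings=21):
    """Backtracking step along -grad (hypothetical sketch).

    Intended for when the Newton direction is unusable because the
    Hessian is singular: halve the step size until the loss decreases.
    """
    direction = -grad
    t = 1.0
    f0 = loss(coef)
    for _ in range(max_halvings):
        candidate = coef + t * direction
        if loss(candidate) < f0:
            return candidate
        t *= 0.5
    # No decrease found: the caller could warn and stop,
    # as in options b) / c).
    return coef
```

The open question from option a) remains visible here: the initial step size `t = 1.0` is an arbitrary choice, and nothing guarantees it is at the right scale for the problem.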

Note that the suggestion to slightly increase the regularization works, as shown in the last section of the test.
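Why a small regularization bump helps can be seen on the Hessian itself; here is a minimal numpy illustration (the singular matrix is made up for the example):

```python
import numpy as np

# Perfectly collinear features produce a singular Hessian (rank 1 here).
H = np.array([[1.0, 1.0],
              [1.0, 1.0]])

# Cholesky fails on the singular matrix...
try:
    np.linalg.cholesky(H)
except np.linalg.LinAlgError:
    pass  # expected: H is not positive definite

# ...but even a small L2 penalty alpha shifts the whole spectrum
# away from zero and makes the factorization well defined again.
alpha = 1e-2
L = np.linalg.cholesky(H + alpha * np.eye(2))
```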

- f"Line search of Newton solver {self.__class__.__name__} did not "
- "converge after 21 line search refinement iterations.",
+ f"Line search of Newton solver {self.__class__.__name__} at iteration"
+ f" #{self.iteration} did not converge after 21 line search refinement"
ogrisel (Collaborator, Author) commented:

I think it can be helpful to include the iteration number in the warning, but I am not sure whether we should use the one-based convention "#{self.iteration}" or the zero-based convention used elsewhere in the code, i.e. #{self.iteration - 1}.

@@ -31,37 +31,47 @@
from .._linear_loss import LinearModelLoss


- def _solve_singular(H, g):
+ def _solve_singular(H, g, method="lstsq"):
ogrisel (Collaborator, Author) commented Jun 1, 2022:

I decided to privately expose the method parameter to quickly switch between the two approaches, to make it easier to investigate the singular-Hessian problem. Since neither seems to be helpful, I do not know which one to keep. If we can actually make use of the indefinite_factorization variant, I think we should unit-test it individually.
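For reference, the two fallbacks could be sketched like this. This is a simplified stand-in, not the PR code: the second branch uses eigenvalue clipping, one of the Nocedal & Wright modification strategies, rather than the actual indefinite factorization.

```python
import numpy as np

def solve_singular_sketch(H, g, method="lstsq", eps=1e-8):
    """Solve H @ x = -g when H may be singular (illustrative only)."""
    if method == "lstsq":
        # Minimum-norm least-squares solution; defined even for singular H.
        x, *_ = np.linalg.lstsq(H, -g, rcond=None)
        return x
    if method == "modified_eigh":
        # Modified-Newton strategy (Nocedal & Wright, Ch. 3.4 family):
        # replace tiny or negative eigenvalues so the resulting step
        # is always a descent direction.
        w, V = np.linalg.eigh(H)
        w = np.maximum(np.abs(w), eps)
        return -V @ ((V.T @ g) / w)
    raise ValueError(f"unknown method: {method!r}")
```

Both branches return a descent direction (g @ x < 0) even when H is exactly singular, which is the property the fallback needs for the subsequent line search.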

lorentzenchr (Owner) commented:

Nocedal & Wright write in Chapter 3.4:

As this discussion shows, there is a great deal of freedom in devising modification strategies, and there is currently no agreement on which strategy is best.

Let's see if we find a satisfying solution.

ogrisel (Collaborator, Author) commented Jun 3, 2022:

I moved to a new place and my copy of N & W is still somewhere buried in a cardboard box :)

lorentzenchr (Owner) commented:

I finally went with option a) simple gradient steps, see scikit-learn#23314 (comment).
I think I can close with scikit-learn@8a108bb and scikit-learn@82287af.
