Add IRLS solver for linear models · Issue #16634 · scikit-learn/scikit-learn · GitHub

Add IRLS solver for linear models #16634


Closed
rth opened this issue Mar 4, 2020 · 3 comments · Fixed by #24637 · May be fixed by #25462

Comments

rth (Member) commented Mar 4, 2020

The Iteratively reweighted least squares (IRLS) solver, initially proposed in #9405, could be used in several linear models, for instance:

  • LogisticRegression
  • TweedieRegression (including Poisson & Gamma), etc.

Preliminary benchmarks for TweedieRegression were done in #9405 (comment) and demonstrated that IRLS could be competitive with (or faster than) other solvers, including LBFGS, when n_features ≤ 10-20. More detailed benchmarks, with careful consideration of the stopping criterion, would be necessary.
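For illustration, a minimal timing scaffold for such benchmarks could look like the sketch below. It assumes scikit-learn's PoissonRegressor (LBFGS-based) as the baseline, and the problem sizes are placeholders, not the actual benchmark settings from #9405.

```python
# Hedged sketch of a timing harness: PoissonRegressor (LBFGS-based) is
# assumed as the baseline; sizes and hyperparameters are placeholders.
import numpy as np
from time import perf_counter
from sklearn.linear_model import PoissonRegressor

rng = np.random.RandomState(0)
for n_features in (5, 20, 100):
    X = rng.standard_normal((50_000, n_features))
    coef = rng.uniform(-0.5, 0.5, n_features)
    y = rng.poisson(np.exp(X @ coef))
    model = PoissonRegressor(alpha=1e-4, max_iter=1000, tol=1e-8)
    tic = perf_counter()
    model.fit(X, y)
    print(f"n_features={n_features:4d}  lbfgs fit time: {perf_counter() - tic:.3f}s")
```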

Also see: #16635 #16637

rth added the New Feature label on Mar 4, 2020
rth (Member, Author) commented Mar 4, 2020

that IRLS could be competitive with (or faster than) other solvers, including LBFGS, when n_features ≤ 10-20.

Benchmarks should additionally be done for 32-bit input, as one current limitation of LBFGS is that it only supports 64-bit floats (and upcasts 32-bit input).
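As a small illustration of that upcast (assuming scipy's L-BFGS-B routine, which works in double precision internally, as the backend):

```python
# Small check: scipy's L-BFGS-B works in double precision, so a
# float32 starting point comes back as a float64 solution.
import numpy as np
from scipy.optimize import minimize

x0 = np.zeros(3, dtype=np.float32)
res = minimize(lambda w: float(np.sum((w - 1.0) ** 2)), x0, method="L-BFGS-B")
print(x0.dtype, res.x.dtype)  # float32 float64
```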

ogrisel (Member) commented Mar 4, 2020

In addition to the ratio n_samples / n_features, one should also evaluate the impact of the condition number / rank deficiency of the input feature matrix, as discussed in #15583 (comment).
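For instance, one hypothetical way to build a near rank-deficient design matrix for such a benchmark, and to see why forming the normal equations is riskier in that regime:

```python
# Hypothetical setup: a near rank-deficient design matrix.  Forming the
# normal equations roughly squares its condition number.
import numpy as np

rng = np.random.RandomState(0)
X = rng.standard_normal((1000, 5))
# append a near-duplicate of the first column
X = np.hstack([X, X[:, :1] + 1e-6 * rng.standard_normal((1000, 1))])

print(np.linalg.cond(X))        # large: X is close to rank deficient
print(np.linalg.cond(X.T @ X))  # roughly the square of the above
```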

lorentzenchr (Member) commented Mar 4, 2020

IRLS is, in the end, the very same thing as (quasi-)Newton convex optimization: an outer loop approximates the objective with a 2nd-order Taylor series, and an inner loop minimizes that quadratic approximation, which is the same as solving a (weighted) least squares problem. There are then several ways to solve the inner loop's least squares problem:

  1. Use dedicated linear algebra least squares solvers (often based on QR or divide & conquer SVD). This is very stable and usually the preferred approach.
  2. Solve the normal equations. This is faster for small n_features but potentially numerically unstable (it squares the condition number). This approach was implemented in [MRG] META Add Generalized Linear Models #9405.

Depending on the approach, including the L2 penalty and the intercept can be tricky.
We should decide which approach(es) to try (see the sketch below for the overall loop structure). Also, for solving the normal equations, I would change some parts of the implementation of #9405.
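For concreteness, here is a minimal sketch of the outer/inner loop structure for an unpenalized Poisson GLM with log link, using a dedicated least-squares solver for the inner problem (approach 1). It is only an illustration of the idea, not the implementation from #9405, and omits the intercept and the L2 penalty.

```python
# Minimal IRLS sketch for an unpenalized Poisson GLM with log link.
# Outer loop: 2nd-order approximation via working weights/response.
# Inner loop: weighted least squares solved with an SVD-based solver.
import numpy as np

def irls_poisson(X, y, max_iter=100, tol=1e-8):
    n_samples, n_features = X.shape
    coef = np.zeros(n_features)
    for _ in range(max_iter):
        eta = X @ coef                      # linear predictor
        mu = np.exp(eta)                    # inverse log link
        W = mu                              # IRLS weights (canonical link => W = mu)
        z = eta + (y - mu) / mu             # working response
        sqrt_W = np.sqrt(W)
        # inner loop: solve the weighted least squares problem
        coef_new, *_ = np.linalg.lstsq(sqrt_W[:, None] * X, sqrt_W * z, rcond=None)
        if np.max(np.abs(coef_new - coef)) < tol:
            coef = coef_new
            break
        coef = coef_new
    return coef

# hypothetical synthetic data for a quick sanity check
rng = np.random.RandomState(0)
X = rng.standard_normal((5000, 3))
true_coef = np.array([0.5, -0.25, 0.1])
y = rng.poisson(np.exp(X @ true_coef))
print(irls_poisson(X, y))  # should be close to true_coef
```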

Some references:

I'd like to mention that more recent scipy versions have added some very good trust-region optimizers (also 2nd-order Taylor, but the inner loop is solved with trust-region methods), e.g. trust-constr in scipy 1.1.0.
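As a hedged illustration of that route, one could minimize the Poisson negative log-likelihood directly with scipy's trust-constr, supplying gradient and Hessian explicitly (the data setup here is hypothetical, the same kind of synthetic Poisson problem as above):

```python
# Sketch: Poisson negative log-likelihood minimized with scipy's
# trust-region method 'trust-constr' (available since scipy 1.1).
import numpy as np
from scipy.optimize import minimize

rng = np.random.RandomState(0)
X = rng.standard_normal((5000, 3))
true_coef = np.array([0.5, -0.25, 0.1])
y = rng.poisson(np.exp(X @ true_coef))

def neg_log_lik(coef):
    eta = X @ coef
    return np.sum(np.exp(eta) - y * eta)   # up to a constant in y

def grad(coef):
    mu = np.exp(X @ coef)
    return X.T @ (mu - y)

def hess(coef):
    mu = np.exp(X @ coef)
    return X.T @ (mu[:, None] * X)

res = minimize(neg_log_lik, np.zeros(3), jac=grad, hess=hess, method="trust-constr")
print(res.x)  # should be close to true_coef
```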
