
ENH scaling of LogisticRegression loss as 1/n * LinearModelLoss #26721


Merged: 24 commits from linear_loss_normalize into scikit-learn:main on Oct 6, 2023

Conversation

@lorentzenchr (Member) commented Jun 28, 2023

Reference Issues/PRs

Fixes #24752, #18074.
Alternative to #27191.

What does this implement/fix? Explain your changes.

This PR changes LinearModelLoss to always use a 1/n factor in front, such that the loss is
$\frac{1}{n} \sum_i \mathrm{loss}(i) + \mathrm{penalty}$.

This is a purely internal change; no API is touched. Coefficients of logistic regression models may nevertheless change, because the "newton-cg" and "lbfgs" solvers converge and stop differently under the rescaled objective.
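
As a toy sketch of why the rescaling is behavior-preserving (my code, not the PR's implementation): multiplying the whole objective by the positive constant 1/n leaves the minimizer unchanged, while keeping loss and gradient magnitudes O(1), which is what the gradient-based stopping criteria of "lbfgs" and "newton-cg" react to.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(0)
n, p, C = 10_000, 5, 1.0
X = rng.standard_normal((n, p))
y = (rng.uniform(size=n) < expit(X @ rng.standard_normal(p))).astype(float)

def obj_and_grad(w, scale):
    z = X @ w
    loss = (np.logaddexp(0.0, z) - y * z).sum()   # summed logistic loss
    grad = X.T @ (expit(z) - y) + w / C           # gradient incl. L2 term
    return scale * (loss + 0.5 / C * (w @ w)), scale * grad

opts = {"gtol": 1e-10, "ftol": 1e-15, "maxiter": 10_000}
w_sum = minimize(obj_and_grad, np.zeros(p), args=(1.0,), jac=True,
                 method="L-BFGS-B", options=opts).x
w_mean = minimize(obj_and_grad, np.zeros(p), args=(1.0 / n,), jac=True,
                  method="L-BFGS-B", options=opts).x
print(np.allclose(w_sum, w_mean, atol=1e-4))      # True: same minimizer
```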

Any other comments?

This PR is incomplete and tests do not pass. If someone else wants to continue, please go ahead.
With this PR, the default setting of LogisticRegression might produce less accurate coefficients. So we might consider increasing tol.

@lorentzenchr lorentzenchr marked this pull request as draft June 28, 2023 06:11
github-actions bot commented Jun 28, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: d5dea86.

@lorentzenchr lorentzenchr force-pushed the linear_loss_normalize branch from b6254c5 to fc0f0e7 Compare June 28, 2023 06:59
@lorentzenchr lorentzenchr marked this pull request as ready for review September 15, 2023 18:27
@lorentzenchr lorentzenchr changed the title WIP 1/n * loss in LinearModelLoss ENH scaling of LogisticRegression loss as 1/n * LinearModelLoss Sep 15, 2023
@lorentzenchr (Member, Author) commented:

@ogrisel friendly ping. This might interest you; I consider it an alternative to #27191.

@lorentzenchr (Member, Author) commented Sep 18, 2023

As of ead068f, with the same benchmark code as in #24752 (comment), I get:

[Two benchmark figures (suboptimality vs. tol per solver); images omitted.]

Note that suboptimality is capped at 1e-12.

To make newton-cg pass, I added e1c2128. Nevertheless, newton-cg seems broken for high accuracy (low tol).
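
For readers without the script from #24752 at hand, a rough harness along these lines would produce rows like those in the dumps further down (an illustrative reconstruction, not the original benchmark; it assumes scikit-learn ≥ 1.2 for the "newton-cholesky" solver and omits "lbfgs2", which is not a public solver option):

```python
import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=20_000, n_features=100, random_state=0)

for solver in ["lbfgs", "newton-cg", "newton-cholesky"]:
    for tol in [1e-1, 1e-4, 1e-10]:
        clf = LogisticRegression(solver=solver, tol=tol, max_iter=10_000)
        tic = time.perf_counter()
        clf.fit(X, y)
        n_iter = int(clf.n_iter_[0])
        print({
            "solver": solver,
            "tol": tol,
            "train_time": time.perf_counter() - tic,
            "train_loss": log_loss(y, clf.predict_proba(X)),  # unpenalized
            "n_iter": n_iter,
            "converged": n_iter < 10_000,
            "coef_sq_norm": float((clf.coef_ ** 2).sum()),
        })
```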

@lorentzenchr (Member, Author) commented:

As of fdc0fa3, with the fixed curvature criterion in the conjugate-gradient computation:

[Two benchmark figures (suboptimality vs. tol per solver); images omitted.]
Note that suboptimality is capped at 1e-12.


@lorentzenchr (Member, Author) commented:

I could put fdc0fa3 into a separate small PR.

@lorentzenchr (Member, Author) commented:

For a better comparison, here is main vs. this PR:

[Two figures comparing main and this PR; images omitted.]

Numbers in detail:

MAIN:

{'solver': 'lbfgs', 'tol': 0.1, 'train_time': 0.9280499740000003, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 0.1, 'train_time': 0.0053529480000023, 'train_loss': 0.14734670997261817, 'n_iter': 1, 'converged': True, 'coef_sq_norm': 6.64684135078569e-07}
{'solver': 'newton-cg', 'tol': 0.1, 'train_time': 0.3840892850000017, 'train_loss': 0.14042987625601847, 'n_iter': 27, 'converged': True, 'coef_sq_norm': 10.88314051844814}
{'solver': 'newton-cholesky', 'tol': 0.1, 'train_time': 0.05441616799999949, 'train_loss': 0.14043025337938708, 'n_iter': 3, 'converged': True, 'coef_sq_norm': 4.631295154288293}
{'solver': 'lbfgs', 'tol': 0.01, 'train_time': 0.9241804390000006, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 0.01, 'train_time': 0.006164442000002879, 'train_loss': 0.1473286985544048, 'n_iter': 2, 'converged': True, 'coef_sq_norm': 1.828115482134254e-06}
{'solver': 'newton-cg', 'tol': 0.01, 'train_time': 0.4239984650000004, 'train_loss': 0.14042984699893613, 'n_iter': 30, 'converged': True, 'coef_sq_norm': 10.928226802785039}
{'solver': 'newton-cholesky', 'tol': 0.01, 'train_time': 0.05675666000000135, 'train_loss': 0.14043025337938708, 'n_iter': 3, 'converged': True, 'coef_sq_norm': 4.631295154288293}
{'solver': 'lbfgs', 'tol': 0.001, 'train_time': 0.9546782530000009, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 0.001, 'train_time': 0.04396353799999986, 'train_loss': 0.1410037853237034, 'n_iter': 28, 'converged': True, 'coef_sq_norm': 2.2072662996482353}
{'solver': 'newton-cg', 'tol': 0.001, 'train_time': 0.44321807600000085, 'train_loss': 0.14042984696849187, 'n_iter': 31, 'converged': True, 'coef_sq_norm': 10.928352061376081}
{'solver': 'newton-cholesky', 'tol': 0.001, 'train_time': 0.07081544800000117, 'train_loss': 0.14042984697134275, 'n_iter': 4, 'converged': True, 'coef_sq_norm': 4.6253367908381}
{'solver': 'lbfgs', 'tol': 0.0001, 'train_time': 1.1397508389999977, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 0.0001, 'train_time': 0.3360903180000001, 'train_loss': 0.14044675942032672, 'n_iter': 210, 'converged': True, 'coef_sq_norm': 4.700817129197633}
{'solver': 'newton-cg', 'tol': 0.0001, 'train_time': 0.5054758409999991, 'train_loss': 0.1404298469669997, 'n_iter': 32, 'converged': True, 'coef_sq_norm': 10.928575906731304}
{'solver': 'newton-cholesky', 'tol': 0.0001, 'train_time': 0.0731701890000025, 'train_loss': 0.14042984697134275, 'n_iter': 4, 'converged': True, 'coef_sq_norm': 4.6253367908381}
{'solver': 'lbfgs', 'tol': 1e-05, 'train_time': 0.9715823770000007, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 1e-05, 'train_time': 0.712943328999998, 'train_loss': 0.14043009862811637, 'n_iter': 438, 'converged': True, 'coef_sq_norm': 5.259620136224546}
{'solver': 'newton-cg', 'tol': 1e-05, 'train_time': 0.49284152599999587, 'train_loss': 0.140429846966983, 'n_iter': 33, 'converged': True, 'coef_sq_norm': 10.928557742500834}
{'solver': 'newton-cholesky', 'tol': 1e-05, 'train_time': 0.09648041899999527, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.6253205214094635}
{'solver': 'lbfgs', 'tol': 1e-06, 'train_time': 1.001717947000003, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 1e-06, 'train_time': 1.2391906020000008, 'train_loss': 0.14042984933998484, 'n_iter': 786, 'converged': True, 'coef_sq_norm': 5.3230611460017}
{'solver': 'newton-cg', 'tol': 1e-06, 'train_time': 0.9129552300000014, 'train_loss': 0.14042984696383132, 'n_iter': 36, 'converged': True, 'coef_sq_norm': 4.625320094695539}
{'solver': 'newton-cholesky', 'tol': 1e-06, 'train_time': 0.08922410899999988, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.6253205214094635}
{'solver': 'lbfgs', 'tol': 1e-07, 'train_time': 1.0032121049999958, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 1e-07, 'train_time': 1.958135256999995, 'train_loss': 0.14042984715738735, 'n_iter': 989, 'converged': True, 'coef_sq_norm': 5.320317088094557}
{'solver': 'newton-cg', 'tol': 1e-07, 'train_time': 0.9189918009999971, 'train_loss': 0.14042984696383132, 'n_iter': 36, 'converged': True, 'coef_sq_norm': 4.625320094695539}
{'solver': 'newton-cholesky', 'tol': 1e-07, 'train_time': 0.08916410199999802, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.6253205214094635}
{'solver': 'lbfgs', 'tol': 1e-08, 'train_time': 0.9249113079999987, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 1e-08, 'train_time': 1.5889090729999964, 'train_loss': 0.14042984715738735, 'n_iter': 989, 'converged': True, 'coef_sq_norm': 5.320317088094557}
{'solver': 'newton-cg', 'tol': 1e-08, 'train_time': 0.9037554300000039, 'train_loss': 0.14042984696383132, 'n_iter': 36, 'converged': True, 'coef_sq_norm': 4.625320094695539}
{'solver': 'newton-cholesky', 'tol': 1e-08, 'train_time': 0.09718773900000599, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.6253205214094635}
{'solver': 'lbfgs', 'tol': 1e-09, 'train_time': 1.0128275730000027, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 1e-09, 'train_time': 1.5980588339999997, 'train_loss': 0.14042984715738735, 'n_iter': 989, 'converged': True, 'coef_sq_norm': 5.320317088094557}
{'solver': 'newton-cg', 'tol': 1e-09, 'train_time': 0.9853368599999968, 'train_loss': 0.14042984696383132, 'n_iter': 37, 'converged': True, 'coef_sq_norm': 4.625320094498774}
{'solver': 'newton-cholesky', 'tol': 1e-09, 'train_time': 0.08738435000000067, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.6253205214094635}
{'solver': 'lbfgs', 'tol': 1e-10, 'train_time': 0.9442086889999999, 'train_loss': 0.14043095980396708, 'n_iter': 371, 'converged': True, 'coef_sq_norm': 10.681106815883926}
{'solver': 'lbfgs2', 'tol': 1e-10, 'train_time': 1.5569018680000042, 'train_loss': 0.14042984715738735, 'n_iter': 989, 'converged': True, 'coef_sq_norm': 5.320317088094557}
{'solver': 'newton-cg', 'tol': 1e-10, 'train_time': 82.164663167, 'train_loss': 0.14042984696383135, 'n_iter': 10000, 'converged': False, 'coef_sq_norm': 4.625320094498916}
{'solver': 'newton-cholesky', 'tol': 1e-10, 'train_time': 0.10802290200001607, 'train_loss': 0.14042984696383132, 'n_iter': 6, 'converged': True, 'coef_sq_norm': 4.625320521470346}

PR:

{'solver': 'lbfgs', 'tol': 0.1, 'train_time': 0.0547862280000011, 'train_loss': 0.16615178354221358, 'n_iter': 6, 'converged': True, 'coef_sq_norm': 0.0032200617446862126}
{'solver': 'lbfgs2', 'tol': 0.1, 'train_time': 0.006202304999998631, 'train_loss': 0.14734670997261817, 'n_iter': 1, 'converged': True, 'coef_sq_norm': 6.64684135078569e-07}
{'solver': 'newton-cg', 'tol': 0.1, 'train_time': 0.05506661599999774, 'train_loss': 0.1661796185551342, 'n_iter': 4, 'converged': True, 'coef_sq_norm': 0.003174488192600458}
{'solver': 'newton-cholesky', 'tol': 0.1, 'train_time': 0.06028625900000151, 'train_loss': 0.14043025337938708, 'n_iter': 3, 'converged': True, 'coef_sq_norm': 4.631295154289646}
{'solver': 'lbfgs', 'tol': 0.01, 'train_time': 0.04548156500000289, 'train_loss': 0.1661475348029601, 'n_iter': 7, 'converged': True, 'coef_sq_norm': 0.0032013835969754412}
{'solver': 'lbfgs2', 'tol': 0.01, 'train_time': 0.007011985999998416, 'train_loss': 0.1473286985544048, 'n_iter': 2, 'converged': True, 'coef_sq_norm': 1.8281154821342873e-06}
{'solver': 'newton-cg', 'tol': 0.01, 'train_time': 0.10966338899999784, 'train_loss': 0.14253932350531165, 'n_iter': 10, 'converged': True, 'coef_sq_norm': 4.54448379889186}
{'solver': 'newton-cholesky', 'tol': 0.01, 'train_time': 0.05508541299999692, 'train_loss': 0.14043025337938708, 'n_iter': 3, 'converged': True, 'coef_sq_norm': 4.631295154289646}
{'solver': 'lbfgs', 'tol': 0.001, 'train_time': 0.1575039270000005, 'train_loss': 0.1410926596996775, 'n_iter': 62, 'converged': True, 'coef_sq_norm': 7.215018478958477}
{'solver': 'lbfgs2', 'tol': 0.001, 'train_time': 0.04400413899999833, 'train_loss': 0.14100378532371347, 'n_iter': 28, 'converged': True, 'coef_sq_norm': 2.2072662996761547}
{'solver': 'newton-cg', 'tol': 0.001, 'train_time': 0.16056455000000014, 'train_loss': 0.1406298353755613, 'n_iter': 13, 'converged': True, 'coef_sq_norm': 8.354132471375014}
{'solver': 'newton-cholesky', 'tol': 0.001, 'train_time': 0.07720229700000303, 'train_loss': 0.14042984697134275, 'n_iter': 4, 'converged': True, 'coef_sq_norm': 4.6253367908387455}
{'solver': 'lbfgs', 'tol': 0.0001, 'train_time': 0.400554906, 'train_loss': 0.14048531587933652, 'n_iter': 144, 'converged': True, 'coef_sq_norm': 9.526369333845913}
{'solver': 'lbfgs2', 'tol': 0.0001, 'train_time': 0.26052846500000015, 'train_loss': 0.14045747921245566, 'n_iter': 163, 'converged': True, 'coef_sq_norm': 4.269596245598988}
{'solver': 'newton-cg', 'tol': 0.0001, 'train_time': 0.26503037400000196, 'train_loss': 0.14042984703087433, 'n_iter': 18, 'converged': True, 'coef_sq_norm': 10.929916601433632}
{'solver': 'newton-cholesky', 'tol': 0.0001, 'train_time': 0.07589458500000035, 'train_loss': 0.14042984697134275, 'n_iter': 4, 'converged': True, 'coef_sq_norm': 4.6253367908387455}
{'solver': 'lbfgs', 'tol': 1e-05, 'train_time': 1.0971080340000015, 'train_loss': 0.14043008004308652, 'n_iter': 424, 'converged': True, 'coef_sq_norm': 10.937908493250825}
{'solver': 'lbfgs2', 'tol': 1e-05, 'train_time': 0.7380069820000017, 'train_loss': 0.14043002383049066, 'n_iter': 448, 'converged': True, 'coef_sq_norm': 5.338573122916383}
{'solver': 'newton-cg', 'tol': 1e-05, 'train_time': 0.2655814160000034, 'train_loss': 0.14042984703087433, 'n_iter': 18, 'converged': True, 'coef_sq_norm': 10.929916601433632}
{'solver': 'newton-cholesky', 'tol': 1e-05, 'train_time': 0.08978460400000188, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.625320521410856}
{'solver': 'lbfgs', 'tol': 1e-06, 'train_time': 1.8921726939999957, 'train_loss': 0.14042984970062802, 'n_iter': 790, 'converged': True, 'coef_sq_norm': 10.939877674597483}
{'solver': 'lbfgs2', 'tol': 1e-06, 'train_time': 1.3669803999999957, 'train_loss': 0.1404298484389439, 'n_iter': 862, 'converged': True, 'coef_sq_norm': 5.322413335884059}
{'solver': 'newton-cg', 'tol': 1e-06, 'train_time': 0.26148147599999305, 'train_loss': 0.14042984703087433, 'n_iter': 18, 'converged': True, 'coef_sq_norm': 10.929916601433632}
{'solver': 'newton-cholesky', 'tol': 1e-06, 'train_time': 0.09173132100000458, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.625320521410856}
{'solver': 'lbfgs', 'tol': 1e-07, 'train_time': 2.754424798999999, 'train_loss': 0.14042984731509203, 'n_iter': 1031, 'converged': True, 'coef_sq_norm': 10.93184458138736}
{'solver': 'lbfgs2', 'tol': 1e-07, 'train_time': 1.703804587999997, 'train_loss': 0.14042984702462633, 'n_iter': 1063, 'converged': True, 'coef_sq_norm': 5.323130518770033}
{'solver': 'newton-cg', 'tol': 1e-07, 'train_time': 0.3141715430000005, 'train_loss': 0.1404298469669831, 'n_iter': 19, 'converged': True, 'coef_sq_norm': 10.92855846362707}
{'solver': 'newton-cholesky', 'tol': 1e-07, 'train_time': 0.09161471999999549, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.625320521410856}
{'solver': 'lbfgs', 'tol': 1e-08, 'train_time': 2.468692041000004, 'train_loss': 0.14042984731509203, 'n_iter': 1031, 'converged': True, 'coef_sq_norm': 10.93184458138736}
{'solver': 'lbfgs2', 'tol': 1e-08, 'train_time': 1.6634506729999998, 'train_loss': 0.14042984702462633, 'n_iter': 1063, 'converged': True, 'coef_sq_norm': 5.323130518770033}
{'solver': 'newton-cg', 'tol': 1e-08, 'train_time': 0.3170536709999965, 'train_loss': 0.1404298469669831, 'n_iter': 19, 'converged': True, 'coef_sq_norm': 10.92855846362707}
{'solver': 'newton-cholesky', 'tol': 1e-08, 'train_time': 0.09082423600000311, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.625320521410856}
{'solver': 'lbfgs', 'tol': 1e-09, 'train_time': 2.485558386000001, 'train_loss': 0.14042984731509203, 'n_iter': 1031, 'converged': True, 'coef_sq_norm': 10.93184458138736}
{'solver': 'lbfgs2', 'tol': 1e-09, 'train_time': 1.765275618000004, 'train_loss': 0.14042984702462633, 'n_iter': 1063, 'converged': True, 'coef_sq_norm': 5.323130518770033}
{'solver': 'newton-cg', 'tol': 1e-09, 'train_time': 0.5012236150000007, 'train_loss': 0.14042984696383132, 'n_iter': 20, 'converged': True, 'coef_sq_norm': 4.625320095188761}
{'solver': 'newton-cholesky', 'tol': 1e-09, 'train_time': 0.08879031500000423, 'train_loss': 0.14042984696383132, 'n_iter': 5, 'converged': True, 'coef_sq_norm': 4.625320521410856}
{'solver': 'lbfgs', 'tol': 1e-10, 'train_time': 2.4986560609999984, 'train_loss': 0.14042984731509203, 'n_iter': 1031, 'converged': True, 'coef_sq_norm': 10.93184458138736}
{'solver': 'lbfgs2', 'tol': 1e-10, 'train_time': 1.7365250249999988, 'train_loss': 0.14042984702462633, 'n_iter': 1063, 'converged': True, 'coef_sq_norm': 5.323130518770033}
{'solver': 'newton-cg', 'tol': 1e-10, 'train_time': 0.4797208270000013, 'train_loss': 0.14042984696383132, 'n_iter': 20, 'converged': True, 'coef_sq_norm': 4.625320095188761}
{'solver': 'newton-cholesky', 'tol': 1e-10, 'train_time': 0.10664123600000153, 'train_loss': 0.14042984696383132, 'n_iter': 6, 'converged': True, 'coef_sq_norm': 4.62532052146911}
@agramfort @TomDLT @rth @ogrisel ping for linear model fans.

@agramfort (Member) commented:

I am confused here @lorentzenchr.

Is this a bug fix for newton-cg?

Two thoughts:

  • SVM and Ridge with L2 regularization do not have the 1/n factor. This makes the C=1 default much more reasonable across different n_samples: basically, as n_samples grows, you need less regularization (see the sketch below).
  • Are you suggesting an API-breaking change?
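
To illustrate the first point with a toy run (my example, not from this thread): with the summed data term, the penalty $\frac{1}{2C}\lVert w\rVert^2$ stays fixed while sum(loss) grows with n_samples, so the same C shrinks the coefficients less on larger datasets. You would typically see the squared coefficient norm grow with n here:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100_000, n_features=20, random_state=0)
for n in (1_000, 100_000):
    clf = LogisticRegression(C=1.0, max_iter=10_000).fit(X[:n], y[:n])
    # Fixed penalty vs. growing data term: less effective shrinkage for large n.
    print(n, float((clf.coef_ ** 2).sum()))
```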

@lorentzenchr (Member, Author) commented Sep 30, 2023

@agramfort This PR does not change any API. It just internally translates to, and optimizes, the objective 1/n · sum(loss) + penalty for lbfgs and newton-cg (newton-cholesky already uses this form). This has very favorable properties for these optimizers, as the graphs illustrate.

Compare the stuck lbfgs in issue #24752 (essentially a single point) with the curve it traces under this PR: tol now has an effect on the suboptimality, which it does not have on main.

The performance improvements of newton-cg are more of a nice side effect.
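
In formulas (my notation; this restates the 1/n form from the PR description): the documented C-parameterized problem and the internally used one differ only by the positive factor 1/(Cn), so they share the same minimizer,

$$\min_w \; C\sum_{i=1}^{n}\ell_i(w) + \frac{1}{2}\lVert w\rVert_2^2 \quad\Longleftrightarrow\quad \min_w \; \frac{1}{n}\sum_{i=1}^{n}\ell_i(w) + \frac{1}{2Cn}\lVert w\rVert_2^2,$$

but the gradients of the right-hand form are O(1) in n, so a fixed tol corresponds to a comparable relative accuracy across dataset sizes.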

@agramfort (Member) left a comment:

got it!

thx @lorentzenchr

@glemaitre glemaitre self-requested a review October 3, 2023 16:12
@glemaitre glemaitre removed their request for review October 3, 2023 16:54
This is needed for more reliable convergence.
Tests like test_logistic_regressioncv_class_weights then don't raise a convergence error.
@lorentzenchr (Member, Author) commented:

@glemaitre Thank you so much.

Comment on lines 69 to 71
if ret[0] is None:
    # line search failed: try different one.
    ret = line_search_wolfe2(
@lorentzenchr (Member, Author) commented:

Info: Codecov complains about no test coverage of those lines. It seems that the new convergence check just above makes this call to line_search_wolfe2 obsolete.
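
For context, the fallback under discussion has roughly this shape (a simplified sketch of _line_search_wolfe12 in sklearn/utils/optimize.py, not a verbatim copy; the scipy import path is version-dependent):

```python
from scipy.optimize._linesearch import line_search_wolfe1, line_search_wolfe2

def _line_search_wolfe12(f, fprime, xk, pk, gfk, old_fval, old_old_fval, **kwargs):
    # Try the fast Fortran-based Wolfe line search first ...
    ret = line_search_wolfe1(f, fprime, xk, pk, gfk, old_fval, old_old_fval, **kwargs)
    if ret[0] is None:
        # ... and only on failure fall back to the slower pure-Python variant.
        ret = line_search_wolfe2(f, fprime, xk, pk, gfk, old_fval, old_old_fval)
    return ret
```

The new convergence check added just above that call apparently returns early, so the wolfe2 fallback is no longer reached.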

Comment on lines +121 to 125
if 0 <= curv <= 16 * np.finfo(np.float64).eps * psupi_norm2:
    # See https://arxiv.org/abs/1803.02924, Algo 1 Capped Conjugate Gradient.
    break
elif curv < 0:
    if i > 0:
@lorentzenchr (Member, Author) commented:

Info: Codecov complains about no test coverage of those lines.
We have only one very simple test for _newton_cg in test_optimize.py. Also the tests for LogisticRegression with newton-cg don't hit these branches.

I don't think blowing up the tests is beneficial for this PR. I could add a TODO note in the code, though.
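
For readers without the diff context, the test above classifies the curvature curv = p^T H p of the current conjugate direction p (with squared norm psupi_norm2), following Algorithm 1 (capped conjugate gradient) of https://arxiv.org/abs/1803.02924. A standalone sketch of the three cases (names illustrative, not the actual implementation):

```python
import numpy as np

def classify_curvature(curv, psupi_norm2, eps=np.finfo(np.float64).eps):
    """Classify p^T H p for a CG direction p with squared norm psupi_norm2."""
    if 0 <= curv <= 16 * eps * psupi_norm2:
        return "flat"       # numerically zero curvature: terminate CG here
    elif curv < 0:
        return "negative"   # negative-curvature direction encountered
    return "positive"       # regular CG step is safe
```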

@lorentzenchr lorentzenchr added this to the 1.4 milestone Oct 6, 2023
@@ -39,8 +40,37 @@ def _line_search_wolfe12(f, fprime, xk, pk, gfk, old_fval, old_old_fval, **kwargs
"""
ret = line_search_wolfe1(f, fprime, xk, pk, gfk, old_fval, old_old_fval, **kwargs)

if ret[0] is None:
A reviewer (Member) commented:

I think that this change should be reflected in an entry in the changelog as well.

@lorentzenchr (Member, Author) commented Oct 6, 2023:

You mean more details than just

...have much better convergence for solvers "lbfgs" and "newton-cg"

I can add a changelog later today.

The reviewer (Member) replied:

I meant to have an entry in the sklearn.linear_model section specifically tagged as an enhancement.

@lorentzenchr (Member, Author) replied:

Do you mean a second entry, or adding that detail to the existing entry (of this PR)? The latter?

The reviewer (Member) replied:

I meant a second entry.

@glemaitre (Member) left a comment:

I am convinced by the benchmark and the reasoning here.

@lorentzenchr (Member, Author) commented:

@glemaitre Ready for merge.

@glemaitre glemaitre merged commit dda6337 into scikit-learn:main Oct 6, 2023
@glemaitre (Member) commented:

Thanks @lorentzenchr

@agramfort (Member) commented:

thx @lorentzenchr !
