ENH scaling of LogisticRegression loss as 1/n * LinearModelLoss #26721
Conversation
Force-pushed from b6254c5 to fc0f0e7.
As of ead068f, with the same benchmark code as in #24752 (comment), I get the following results. Note that suboptimality is capped at 1e-12. To make newton-cg pass, I added e1c2128. Nevertheless, newton-cg seems broken for high accuracy (low tol).
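The full benchmark script and result plots live in #24752 and this thread; purely as an illustrative sketch of the measurement described above (suboptimality of the fitted coefficients against a tightly converged reference, capped at 1e-12), with dataset, solvers, and tolerances chosen arbitrarily here:

```python
# Sketch only: dataset, solvers and tolerances are illustrative assumptions,
# not the settings of the actual benchmark in #24752.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=5_000, n_features=50, random_state=0)
C, n = 1.0, X.shape[0]

def objective(clf):
    # 1/n * sum_i loss_i + 1/(2*C*n) * ||coef||^2, i.e. the scaling of this PR.
    p = clf.predict_proba(X)
    return log_loss(y, p) + 0.5 / (C * n) * np.sum(clf.coef_ ** 2)

# Tightly converged reference solution.
ref = LogisticRegression(C=C, solver="newton-cg", tol=1e-14, max_iter=10_000).fit(X, y)
obj_ref = objective(ref)

for solver in ("lbfgs", "newton-cg"):
    for tol in (1e-1, 1e-4, 1e-8):
        clf = LogisticRegression(C=C, solver=solver, tol=tol, max_iter=10_000).fit(X, y)
        # Suboptimality, floored at 1e-12 as in the comment above.
        print(solver, tol, max(objective(clf) - obj_ref, 1e-12))
```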
As of fdc0fa3, with the fixed curvature criterion in the conjugate gradient computation (results summary in the collapsed details).
I could put fdc0fa3 into a separate small PR.
For a better comparison, here is main vs this PR. Numbers are in the collapsed details (MAIN and PR).
I am confused here @lorentzenchr: is it a bug fix for Newton CG? Two thoughts:
@agramfort This PR does not change any API. It just translates to and uses the objective $\frac{1}{n} \sum_i loss(i) + penalty$. Please observe the stuck lbfgs (~single point) in issue #24752 and the curve it has with this PR. The performance improvements of newton-cg are more of a nice side effect.
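For context (notation mine, writing sklearn's usual L2 term as $\lVert w\rVert_2^2 / (2C)$): multiplying the objective by the constant $1/n$ does not change the minimizer, only the scale of the objective and gradient values that the solvers' stopping criteria see:

$$
\arg\min_{w}\left(\sum_{i=1}^{n} loss_i(w) + \frac{1}{2C}\lVert w\rVert_2^2\right)
= \arg\min_{w}\left(\frac{1}{n}\sum_{i=1}^{n} loss_i(w) + \frac{1}{2Cn}\lVert w\rVert_2^2\right).
$$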
got it! thx @lorentzenchr
This is needed for more reliable convergence. Tests like test_logistic_regressioncv_class_weights then don't raise a convergence error.
@glemaitre Thank you so much.
This reverts commit ef26813.
sklearn/utils/optimize.py
```python
if ret[0] is None:
    # line search failed: try different one.
    ret = line_search_wolfe2(
```
Info: Codecov complains about no test coverage of those lines. It seems that the new convergence check just above makes this call to line_search_wolfe2 obsolete.
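To illustrate the fallback-on-None pattern being discussed (this is not the sklearn helper itself; it uses SciPy's public strong-Wolfe line search, whereas _line_search_wolfe12 uses the private wolfe1/wolfe2 routines):

```python
# Minimal sketch of "retry if the line search returns None as step size".
import numpy as np
from scipy.optimize import line_search

def f(x):
    return 0.5 * np.dot(x, x)

def fprime(x):
    return x

xk = np.array([3.0, -4.0])
pk = -fprime(xk)                      # a descent direction
ret = line_search(f, fprime, xk, pk)  # strong Wolfe line search
if ret[0] is None:
    # This is the case handled above: the first line search failed, and
    # sklearn's helper retries with a second, more permissive one.
    print("line search failed")
else:
    print("step size:", ret[0])
```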
```python
if 0 <= curv <= 16 * np.finfo(np.float64).eps * psupi_norm2:
    # See https://arxiv.org/abs/1803.02924, Algo 1 Capped Conjugate Gradient.
    break
elif curv < 0:
    if i > 0:
```
Info: Codecov complains about no test coverage of those lines.
We have only one very simple test for _newton_cg in test_optimize.py. Also the tests for LogisticRegression with newton-cg don't hit these branches.
I don't think blowing up the tests is beneficial for this PR. I could add a TODO note in the code, though.
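For readers unfamiliar with those branches, here is a rough standalone sketch of the capped conjugate gradient safeguard from the cited paper (my own illustration, not the sklearn internals): inside Newton-CG, each CG iteration computes the curvature curv = pᵀHp, and zero or negative curvature means the Hessian is not positive definite along p.

```python
import numpy as np

def newton_cg_direction(hess_vec_prod, grad, maxiter=100, tol=1e-8):
    """Approximately solve H d = -grad by conjugate gradient, guarding
    against zero or negative curvature (H not positive definite)."""
    d = np.zeros_like(grad)
    r = grad.copy()            # residual of H d + grad, starts at grad
    p = -r                     # first conjugate direction: steepest descent
    rs_old = r @ r
    for i in range(maxiter):
        Hp = hess_vec_prod(p)
        curv = p @ Hp
        psupi_norm2 = p @ p
        if 0 <= curv <= 16 * np.finfo(np.float64).eps * psupi_norm2:
            # Curvature numerically zero: keep the direction built so far.
            break
        elif curv < 0:
            # Negative curvature along p.
            if i > 0:
                break          # use the previously accumulated direction
            d = -grad          # first iteration: fall back to steepest descent
            break
        alpha = rs_old / curv
        d += alpha * p
        r = r + alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol:
            break
        p = -r + (rs_new / rs_old) * p
        rs_old = rs_new
    return d

# Tiny usage example with an explicit positive definite Hessian:
H = np.array([[3.0, 1.0], [1.0, 2.0]])
g = np.array([1.0, -1.0])
d = newton_cg_direction(lambda v: H @ v, g)
print(d, np.linalg.solve(H, -g))  # the two should roughly agree
```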
```diff
@@ -39,8 +40,37 @@ def _line_search_wolfe12(f, fprime, xk, pk, gfk, old_fval, old_old_fval, **kwarg
    """
    ret = line_search_wolfe1(f, fprime, xk, pk, gfk, old_fval, old_old_fval, **kwargs)

    if ret[0] is None:
```
I think that this change should be reflected in an entry in the changelog as well.
You mean more details than just "...have much better convergence for solvers 'lbfgs' and 'newton-cg'"? I can add a changelog later today.
I meant to have an entry in the sklearn.linear_model section specifically tagged as an enhancement.
Do you mean a second entry, or adding that detail to the existing entry (of this PR)? The latter?
I meant a second entry.
I am convinced by the benchmark and the reasoning here.
@glemaitre Ready for merge.
Thanks @lorentzenchr
thx @lorentzenchr!
…it-learn#26721) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Reference Issues/PRs
Fixes #24752, #18074.
Alternative to #27191.
What does this implement/fix? Explain your changes.
This PR changes LinearModelLoss to always use the 1/n factor in front, such that the loss is $\frac{1}{n} \sum_i loss(i) + penalty$.
This is a pure internal change; no API is touched. But coefficients of log reg models may change due to different convergence / stopping of the "newton-cg" and "lbfgs" solvers.
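As a plain NumPy illustration of why the stopping behaviour can differ (my own example, not sklearn code): dividing the objective by n also divides its gradient by n, so any absolute gradient-norm threshold is reached at a different point.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
X = rng.normal(size=(n, 5))
y = rng.integers(0, 2, size=n)
w = rng.normal(size=5)

p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
grad_sum = X.T @ (p - y)           # gradient of sum_i loss_i (binary log loss)
grad_mean = grad_sum / n           # gradient of (1/n) * sum_i loss_i

print(np.linalg.norm(grad_sum), np.linalg.norm(grad_mean))  # differ by a factor n
```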
Any other comments?
This PR is incomplete and tests do not pass. If someone else wants to continue, please go on.
With this PR, the default setting of LogisticRegression might produce less accurate coefficients. So we might consider increasing tol.
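A minimal usage sketch (dataset and values are arbitrary assumptions): regardless of what default is chosen, users who want a specific convergence behaviour can set tol and max_iter explicitly on the estimator.

```python
# Illustrative only: tightening the stopping tolerance per estimator.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
clf = LogisticRegression(solver="lbfgs", tol=1e-6, max_iter=1_000).fit(X, y)
print(clf.n_iter_, clf.coef_[0, :3])
```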