[MRG] Add elastic net penalty to LogisticRegression #11646
Conversation
So you want to set it to None but have the behavior be 0.5? I don't have a strong feeling about it. Otherwise looks like it's going in the right direction :)
It seems to be working for binary classification and for multiclass when multi_class='ovr'. I'm having a hard time figuring out the intricacies of multi_class='multinomial'.
I just committed a first shot at LogisticRegressionCV. It looks OK for binary classification and for multiclass when multi_class='ovr', but does not work for multi_class='multinomial'. I'm pretty sure the problem comes from this line, which was fine when there was only one parameter to optimize (C), but now it messes up the format of the scores array. So the computation of best_index (the index of the best parameters), which depends on the scores array, is incorrect. I tried various ways of dealing with this, but every change I tried broke other parts of the code. I'll focus on the doc and the examples for now, but any help / insight would be welcome :)
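The best_index problem described above can be sketched with plain numpy (hypothetical shapes, not the actual scikit-learn code): with only C to optimize the per-fold scores were effectively 1-D over the C grid, but with both C and l1_ratio they become 2-D, so a flat argmax has to be unraveled back into the two parameter indices.

```python
import numpy as np

# Hypothetical sketch: one score per (fold, C, l1_ratio) combination.
rng = np.random.RandomState(0)
n_folds, n_Cs, n_l1_ratios = 5, 4, 3

scores = rng.rand(n_folds, n_Cs, n_l1_ratios)
mean_scores = scores.mean(axis=0)   # average over folds -> (n_Cs, n_l1_ratios)

# A flat argmax no longer indexes the C grid directly; it must be
# unraveled into separate C and l1_ratio indices.
best_flat = np.argmax(mean_scores)
best_C_idx, best_l1_idx = np.unravel_index(best_flat, mean_scores.shape)
print(best_C_idx, best_l1_idx)
```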
added warning message if user sets l1_ratio while penalty is not elastic-net
One of the doctests in 3.6 is failing simply because of a different formatting for floats in numpy arrays... Is there any clean way to ignore it? I guess changing the expected output string would break the other versions' tests...
we only do those doctests with the latest numpy, so you should conform
Also added some comments that are hopefully clear. Still need to fix refit=False
Marking this as MRG for a first review, looks like LogisticRegressionCV is finally working as expected \o/
please ping after the upcoming release
doc/whats_new/v0.21.rst
Outdated
with the 'saga' solver. :issue:`11646` by :user:`Nicolas Hug <NicolasHug>`.

- |Fix| Fixed a bug in the 'saga' solver where the weights would not be
  correctly updated in some cases :issue:`11646` by `Tom Dupre la Tour`_.
Thanks for the review @TomDLT, please feel free to modify this entry for a more accurate description. I don't understand the bug fix in detail.
Add a . after "some cases"
Thanks!
@@ -799,8 +813,8 @@ In a nutshell, the following table summarizes the penalties supported by each solver
| Robust to unscaled datasets | yes | yes | yes | no | no |
+------------------------------+-----------------+-------------+-----------------+-----------+------------+

The "saga" solver is often the best choice but requires scaling. The "liblinear" solver is
used by default for historical reasons.
The "saga" solver is often the best choice but requires scaling. The
Unrelated but we're moving the default to lbfgs and there is no mention of this in the summary. That's a different PR though.
doc/modules/linear_model.rst
Outdated
"multinomial", an optimal C is obtained by minimizing the cross-entropy
loss.
multiclass case, if `multi_class` option is set to "ovr", an optimal C and
l1_ratio is obtained for each class and if the `multi_class` option is set to
Wait, we're not using accuracy here or scoring? That seems odd. Is that wording really correct?
LogisticRegressionCV selects the best parameters according to its scoring parameter. The default scoring option is 'accuracy'. We should clarify the doc.
@NicolasHug can you please fix this, even though it wasn't your error?
sklearn/linear_model/logistic.py
Outdated
it can be changed using the cv parameter. In the case of newton-cg and
lbfgs solvers, we warm start along the path i.e guess the initial
coefficients of the present fit to be the coefficients got after
convergence in the previous fit, so it is supposed to be faster for
The language here is not that great but not your fault...
A nitpick...
looks good apart from nitpicks.
@pytest.mark.parametrize('C', np.logspace(-3, 2, 4))
@pytest.mark.parametrize('l1_ratio', [.1, .5, .9])
def test_elastic_net_versus_sgd(C, l1_ratio):
    # Compare elasticnet penatly in LogisticRegression() and SGD(loss='log')
penalty
not fixed
Sorry yeah, will do it tomorrow!
    cv=n_folds, l1_ratios=l1_ratios,
    random_state=0)
lrcv.fit(X, y)
coefs_paths = np.asarray(list(lrcv.coefs_paths_.values()))
if l1_ratios is not specified, do we squeeze? Otherwise this would be a backward-incompatible change, right?
not fixed yet
We don't need to squeeze, because if l1_ratios is None then we just don't add the new dimension (see here). The existing test that checks the shapes of the attributes (test_logistic_cv) is left unchanged.
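The backward-compatibility point can be illustrated with a small numpy sketch (hypothetical helper and shapes, not the real attribute construction): the extra axis only appears when l1_ratios is given, so the old shape is preserved.

```python
import numpy as np

def build_coefs_paths(n_folds, n_Cs, n_features, l1_ratios=None):
    # Hypothetical sketch: only add the l1_ratio axis when l1_ratios is
    # given, so the l1_ratios=None shape is unchanged from before.
    if l1_ratios is None:
        return np.zeros((n_folds, n_Cs, n_features))
    return np.zeros((n_folds, n_Cs, len(l1_ratios), n_features))

print(build_coefs_paths(3, 5, 4).shape)            # old shape, unchanged
print(build_coefs_paths(3, 5, 4, [.2, .8]).shape)  # new axis only here
```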
# make sure LogisticRegressionCV gives same best params (l1 and C) as
# GridSearchCV when penalty is elasticnet

if multi_class == 'ovr':
so binary?
yes, multiclass ovr is treated below in test_LogisticRegressionCV_GridSearchCV_elastic_net_ovr, just added a comment
sklearn/linear_model/logistic.py
Outdated
# _log_reg_scoring_path will output different shapes depending on the
# multi_class param, so we need to reshape the outputs accordingly.
# After reshaping,
# - scores is of shape (n_classes X n_folds X n_Cs . n_l1_ratios)
What's the shape of Cs and why do we always have to take C[0]?
Cs is of shape (n_classes . n_folds . n_l1_ratios, n_Cs) and all the rows are equal, so we just take the first one. I added a comment.
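The point about Cs can be shown with a tiny numpy sketch (hypothetical values, not the real array): every (class, fold, l1_ratio) combination was scored on the same C grid, so the stacked rows are identical and keeping only the first loses nothing.

```python
import numpy as np

Cs_grid = np.logspace(-2, 2, 5)
# shape (n_classes * n_folds * n_l1_ratios, n_Cs), all rows identical
Cs = np.tile(Cs_grid, (6, 1))

assert np.all(Cs == Cs[0])  # every row is the same C grid...
Cs = Cs[0]                  # ...so the first row carries all the information
print(Cs.shape)
```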
)
# equiv to coefs_paths = np.moveaxis(coefs_paths, (0, 1, 2, 3),
#                                    (1, 2, 0, 3))
coefs_paths = np.swapaxes(coefs_paths, 0, 1)
you find two swapaxes easier to read than one rollaxis? I don't actually have an opinion as the comment above makes it pretty clear.
I did at the time when I was trying to break it down but not anymore ^^
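For reference, the equivalence mentioned in the code comment can be checked directly; that particular moveaxis permutation corresponds to two successive swapaxes calls (the snippet quoted above shows only the first of them):

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(2, 3, 4, 5)

moved = np.moveaxis(x, (0, 1, 2, 3), (1, 2, 0, 3))

# the same axis permutation written as two swapaxes calls
swapped = np.swapaxes(np.swapaxes(x, 0, 1), 0, 2)

print(moved.shape, np.array_equal(moved, swapped))  # (4, 2, 3, 5) True
```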
test failing...
Aaah... Sorry
thanks!
Hurray!!
Great! 🎊
Exciting!
* First draft on elasticnet penaly for LogisticRegression
* Some basic tests
* Doc update
* First draft for LogisticRegressionCV. It seems to be working for binary classification and for multiclass when multi_class='ovr'. I'm having a hard time figuring out the intricacies of multi_class='multinomial'.
* Changed default to None for l1_ratio. added warning message is user sets l1_ratio while penalty is not elastic-net
* Some more doc
* Updated example to plot elastic net sparsity
* Fixed flake8
* Fixed test by not modifying attribute in fit
* Fixed doc issues
* WIP
* Partially fixed logistic_reg_CV for multinomial. Also added some comments that are hopefully clear. Still need to fix refit=False
* Fixed doc issue
* WIP
* Fixed test for refit=False in LogisticRegressionCV
* Fixed Python 2 numpy version issue
* minor doc updates
* Weird doc error...
* Added test to ensure that elastic net is at least as good as L1 or L2 once l1_ratio has been optimized with grid search. Also addressed minor reviews
* Fixed test
* addressed comments
* Added back ignore warning on tests
* Added a functional test
* Scale data in test... Now failing
* elastic-net --> elasticnet
* Updated doc for some attributes and checked their shape in tests
* Added l1_ratio dimension to coefs_paths and scores attr
* improve example + fix test
* FIX incorrect lagged_update in SAGA
* Add non-regression test for SAGA's bug
* FIX flake8 and warning
* Re fixed warning
* Updated some tests
* Addressed comments
* more comments and added dimension to LogisticRegressionCV.n_iter_ attribute
* Updated whatsnew for 0.21
* better doc shape looks
* Fixed whatnew entry after merges
* Added dot
* Addressed comments + standardized optional default param docstrings
* Addessed comments
* use swapaxes instead of unsupported moveaxis (hopefully fixes tests)
…earn#11646)" This reverts commit ff2b430.
Reference Issues/PRs
Closes #8288
What does this implement/fix? Explain your changes.
This PR allows the use of `penalty='elastic-net'` for the `LogisticRegression` class. Elastic net is already available in the saga solver, it's just not exposed yet.

Any other comments?

I set the `l1_ratio` parameter to 0.5 by default, because that's the default in `ElasticNet`. Should I set it to `None`? This would allow to raise an error if it's set to something else while `penalty != 'elastic-net'`.

EDIT: I set it to None and issue a warning if penalty != elastic-net

TODO
- `penalty='elastic-net'` in `LogisticRegression`
- `penalty='elastic-net'` in `LogisticRegressionCV`
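A minimal usage sketch of the feature this PR adds (note: per the commit log above, the penalty string ended up as 'elasticnet' in the merged version; synthetic data for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)

# elastic net is only supported by the saga solver, and l1_ratio is required
clf = LogisticRegression(penalty='elasticnet', solver='saga',
                         l1_ratio=0.5, C=1.0, max_iter=2000)
clf.fit(X, y)
print(clf.score(X, y))
```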