8000 DEP loss "log" in favor of "log loss" in SGDClassifier by lorentzenchr · Pull Request #23046 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

DEP loss "log" in favor of "log loss" in SGDClassifier #23046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 7, 2022

Conversation

lorentzenchr
Copy link
Member
@lorentzenchr lorentzenchr commented Apr 4, 2022

Reference Issues/PRs

Partially addresses #18248

What does this implement/fix? Explain your changes.

This PR introduces loss="log_loss" for SGDClassifier and deprecates loss="log".

Any other comments?

@lorentzenchr lorentzenchr added this to the 1.1 milestone Apr 4, 2022
@ArturoAmorQ
Copy link
Member
ArturoAmorQ commented Apr 6, 2022

There are some tests that still pass loss="log" to the SGDClassifier and fail when trying to match the expected error and obtaining our new deprecation warning. For example:

diff --git a/sklearn/tests/test_multioutput.py b/sklearn/tests/test_multioutput.py
index ded47c4fe..17b9bd0d7 100644
--- a/sklearn/tests/test_multioutput.py
+++ b/sklearn/tests/test_multioutput.py
@@ -257,7 +257,7 @@ def test_multi_output_classification_partial_fit():
 
 
 def test_multi_output_classification_partial_fit_no_first_classes_exception():
-    sgd_linear_clf = SGDClassifier(loss="log", random_state=1, max_iter=5)
+    sgd_linear_clf = SGDClassifier(loss="log_loss", random_state=1, max_iter=5)
     multi_target_linear = MultiOutputClassifier(sgd_linear_clf)
     msg = "classes must be passed on the first call to partial_fit."
     with pytest.raises(ValueError, match=msg):
A possibly non-exhaustive list of failing tests:
test_elastic_net_versus_sgd[0.1-0.046415888336127795]
test_elastic_net_versus_sgd[0.1-2.1544346900318843]
test_elastic_net_versus_sgd[0.1-100.0]
test_elastic_net_versus_sgd[0.5-0.001]
test_elastic_net_versus_sgd[0.5-0.046415888336127795]
test_elastic_net_versus_sgd[0.5-2.1544346900318843]
test_elastic_net_versus_sgd[0.5-100.0]
test_elastic_net_versus_sgd[0.9-0.001]
test_elastic_net_versus_sgd[0.9-0.046415888336127795]
test_elastic_net_versus_sgd[0.9-2.1544346900318843]
test_elastic_net_versus_sgd[0.9-100.0]
test_stochastic_gradient_loss_param
test_cross_val_predict_method_checking
test_hasattr_multi_output_predict_proba
test_multi_output_classification_partial_fit
test_multi_output_classification_partial_fit_no_first_classes_exception

Another possibility is to use the following line to find the pattern of such tests:
$ git grep -A 10 'SGDClassifier(' | grep -B 10 'loss="log"'

@jeremiedbb
Copy link
Member
jeremiedbb commented Apr 6, 2022

What's new after #23036 is merged.

#23036 is merged :)

Copy link
Member
@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member
@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM up to a few modifications.

Comment on lines +101 to +103
- For :class:`linear_model.SGDClassifier`, the `loss` parameter name
"log" is deprecated in favor of the new name "log_loss".
:pr:`23046` by :user:`Christian Lorentzen <lorentzenchr>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- For :class:`linear_model.SGDClassifier`, the `loss` parameter name
"log" is deprecated in favor of the new name "log_loss".
:pr:`23046` by :user:`Christian Lorentzen <lorentzenchr>`.
- For :class:`linear_model.SGDClassifier`, the `loss` parameter name
`"log"` is deprecated in favor of the new name `"log_loss"`.
:pr:`23046` by :user:`Christian Lorentzen <lorentzenchr>`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We merged #23036 without the backticks, let's keep local consistency

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Copy link
Member
@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jeremiedbb jeremiedbb merged commit 0c20ba7 into scikit-learn:main Apr 7, 2022
@lorentzenchr
Copy link
Member Author

@jeremiedbb Again, thanks for finishing this.

jjerphan added a commit to jjerphan/scikit-learn that referenced this pull request Apr 29, 2022
…23046)

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
@lorentzenchr lorentzenchr deleted the call_it_log_loss_sgd branch October 25, 2022 14:14
eddiebergman added a commit to automl/auto-sklearn that referenced this pull request Nov 15, 2022
rusty1s pushed a commit to pyg-team/pytorch_geometric that referenced this pull request Mar 27, 2024
scikit-learn/scikit-learn/pull
82F8
/23046 Introduces `'log_loss'` in
`SGDClassifier` and deprecates `'log'`. This PR addresses this issue in
the repository (no other files in the repo need to be modified for this
issue)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants
0