DEP Deprecate n_classes_ in GradientBoostingRegressor #17702
I think that there is a cleaner solution, which might require a bit more refactoring.
```diff
diff --git a/sklearn/ensemble/_gb.py b/sklearn/ensemble/_gb.py
index 179eb81816..f3223c2d36 100644
--- a/sklearn/ensemble/_gb.py
+++ b/sklearn/ensemble/_gb.py
@@ -165,6 +165,10 @@ class BaseGradientBoosting(BaseEnsemble, metaclass=ABCMeta):
self.n_iter_no_change = n_iter_no_change
self.tol = tol
@@ -265,11 +271,9 @@ class BaseGradientBoosting(BaseEnsemble, metaclass=ABCMeta):
@@ -405,7 +409,11 @@ class BaseGradientBoosting(BaseEnsemble, metaclass=ABCMeta):
@@ -711,15 +719,6 @@ class BaseGradientBoosting(BaseEnsemble, metaclass=ABCMeta):
@@ -1575,6 +1574,11 @@ class GradientBoostingRegressor(RegressorMixin, BaseGradientBoosting):
diff --git a/sklearn/ensemble/_gb_losses.py b/sklearn/ensemble/_gb_losses.py
@@ -445,8 +442,8 @@ class QuantileLossFunction(RegressionLossFunction):
```
So since …

We have an additional change for …

We need to update the test of the loss function, where we should not specify the parameter …
Let's go for the next step
It starts to look good. We will need an entry in `whats_new/v_0_24.rst`, in the ensemble section, to announce the deprecation.
Where is this file? I cannot find it.
It's the `whats_new` file mentioned above. If you use VS Code, type Ctrl-P followed by "24" to navigate to it quickly.
Thank you for the PR @simonamaggio !
LGTM. @thomasjpfan are all changes fine with you?
```python
@deprecated("Attribute n_classes_ was deprecated "  # type: ignore
            "in version 0.24 and will be removed in 0.26.")
@property
def n_classes_(self):
```
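The quoted snippet stacks scikit-learn's `deprecated` utility on top of `@property`. As a rough, self-contained sketch of what that pattern does (the `deprecated` helper below is a hypothetical stand-in, not scikit-learn's actual implementation), accessing the property emits a `FutureWarning` before delegating to the original getter:

```python
import warnings


def deprecated(message):
    """Hypothetical stand-in for sklearn.utils.deprecated applied to a
    property: wrap the getter so every access warns, then delegate."""
    def decorate(prop):
        fget = prop.fget

        def wrapped(self):
            warnings.warn(message, FutureWarning)
            return fget(self)

        return property(wrapped, doc=prop.__doc__)
    return decorate


class Estimator:
    @deprecated("Attribute n_classes_ was deprecated "
                "in version 0.24 and will be removed in 0.26.")
    @property
    def n_classes_(self):
        return 1


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = Estimator().n_classes_

print(value)                        # 1
print(caught[0].category.__name__)  # FutureWarning
```

Because the decorator receives the already-built `property` object, the order of the two decorators matters: `@deprecated` must sit above `@property`, as in the snippet under review.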
This passed our common tests? It seems like `n_classes_` is defined without calling `fit`.
What is the common test? I tested with `pytest -v sklearn/ensemble/tests/test_gradient_boosting.py` and all tests passed.
We test all our estimators, including `GradientBoostingRegressor`, with the common tests (in `sklearn/tests/test_common.py`). The one I am thinking about is `sklearn/utils/estimator_checks.py`, lines 2425 to 2426 at e97fd14:

```python
def check_no_attributes_set_in_init(name, estimator_orig):
    """Check setting during init. """
```
More specifically, `n_classes_` is defined even if the estimator is not fitted:

```python
from sklearn.ensemble import GradientBoostingRegressor
gb = GradientBoostingRegressor()
gb.n_classes_
# 1
```
From looking at the test, it looks like `vars(estimator)` does not pick up the `n_classes_` property.
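That matches how `vars` works: it returns the instance `__dict__`, while a `property` is a descriptor stored on the class, so a check that only inspects `vars(estimator)` never sees it. A minimal illustration with a toy class (not scikit-learn code):

```python
class Estimator:
    def __init__(self, alpha=0.1):
        self.alpha = alpha  # instance attribute: lives in vars(est)

    @property
    def n_classes_(self):   # class-level descriptor: lives in vars(type(est))
        return 1


est = Estimator()
print(vars(est))                          # {'alpha': 0.1}
print('n_classes_' in vars(est))          # False
print('n_classes_' in vars(type(est)))    # True
```

This is why `check_no_attributes_set_in_init` passes even though the attribute is reachable before `fit`.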
GradientBoostingRegressor passed all common tests. In particular:

```
sklearn/tests/test_common.py::test_estimators[GradientBoostingRegressor()-check_no_attributes_set_in_init] PASSED [ 26%]
```
I think it is incorrectly passing. `n_classes_` should not be defined before `fit` is called.

As for this PR, let's update the method:

```python
@property
def n_classes_(self):
    try:
        check_is_fitted(self)
    except NotFittedError as nfe:
        raise AttributeError(
            "{} object has no n_classes_ attribute."
            .format(self.__class__.__name__)
        ) from nfe
```

and then have a test to make sure the `AttributeError` is raised.
We can update `check_no_attributes_set_in_init` to catch these cases in another PR.
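The requested behavior can be sketched without scikit-learn using a toy estimator that applies the same guard (hypothetical names; the actual PR uses `check_is_fitted` and `NotFittedError` from scikit-learn): raising `AttributeError` from the getter makes the attribute look absent to `hasattr` until `fit` has run.

```python
class ToyRegressor:
    """Toy estimator, not scikit-learn code: n_classes_ raises
    AttributeError until fit() marks the instance as fitted."""

    def fit(self, X, y):
        self.is_fitted_ = True  # stand-in for the state check_is_fitted looks for
        return self

    @property
    def n_classes_(self):
        if not getattr(self, "is_fitted_", False):
            raise AttributeError(
                f"{type(self).__name__} object has no n_classes_ attribute."
            )
        return 1


est = ToyRegressor()
print(hasattr(est, "n_classes_"))  # False: getter raised AttributeError
est.fit([[0.0]], [0.0])
print(est.n_classes_)              # 1
```

A test then only needs to assert that accessing `n_classes_` on an unfitted instance raises `AttributeError` (e.g. with `pytest.raises(AttributeError)`).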
I updated the method as suggested. However, I'm not sure how to create the issue regarding `check_no_attributes_set_in_init`.
I will look into `check_no_attributes_set_in_init`; this PR is almost ready.

Can we add a test to make sure `AttributeError` is raised when `GradientBoostingRegressor` is not fitted yet?
Thank you, this PR is almost ready!
Done it now. Sorry I didn't get it the first time.
Minor comment.
Otherwise LGTM
```python
msg = f"{GradientBoostingRegressor.__name__} object " \
      "has no n_classes_ attribute."
```
Suggested change:

```diff
-msg = f"{GradientBoostingRegressor.__name__} object " \
-      "has no n_classes_ attribute."
+msg = ("GradientBoostingRegressor object "
+       "has no n_classes_ attribute.")
```
I solved the conflict and made the small change requested by @thomasjpfan.
Thanks @simonamaggio
…7702) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Fix #17673.

In the base class, use a private `_n_classes`, and for the classifier create a public attribute `n_classes_`.