-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
BUG Adds attributes back to check_is_fitted #15947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
8c5252b
9c970fa
699dcfe
9d776ff
3122da3
129d261
851b319
6dc82d7
34505cb
10d8e4d
6208f48
24e5011
111ee04
be86ffa
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,6 +60,14 @@ Changelog | |
- |Fix| :func:`utils.check_array` now correctly converts pandas DataFrame with | ||
boolean columns to floats. :pr:`15797` by `Thomas Fan`_. | ||
|
||
- |Fix| :func:`utils.check_is_fitted` accepts back an explicit ``attributes`` | ||
argument to check for specific attributes as explicit markers of a fitted | ||
estimator. When no explicit ``attributes`` are provided, only the attributes | ||
ending with a single "_" are used as "fitted" markers. The ``all_or_any`` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this should be |
||
argument is also no longer deprecated. This change is made to | ||
restore some backward compatibility with the behavior of this utility in | ||
version 0.21. :pr:`15947` by `Thomas Fan`_. | ||
|
||
.. _changes_0_22: | ||
|
||
Version 0.22.0 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -852,18 +852,29 @@ def check_symmetric(array, tol=1E-10, raise_warning=True, | |
return array | ||
|
||
|
||
def check_is_fitted(estimator, msg=None): | ||
def check_is_fitted(estimator, attributes=None, msg=None, all_or_any=all): | ||
"""Perform is_fitted validation for estimator. | ||
|
||
Checks if the estimator is fitted by verifying the presence of | ||
fitted attributes (ending with a trailing underscore) and otherwise | ||
raises a NotFittedError with the given message. | ||
|
||
This utility is meant to be used internally by estimators themselves, | ||
typically in their own predict / transform methods. | ||
|
||
Parameters | ||
---------- | ||
estimator : estimator instance. | ||
estimator instance for which the check is performed. | ||
|
||
attributes : str, list or tuple of str, default=None | ||
Attribute name(s) given as string or a list/tuple of strings | ||
Eg.: ``["coef_", "estimator_", ...], "coef_"`` | ||
|
||
If `None`, `estimator` is considered fitted if there exist an | ||
attribute that ends with a underscore and does not start with double | ||
underscore. | ||
|
||
msg : string | ||
The default error message is, "This %(name)s instance is not fitted | ||
yet. Call 'fit' with appropriate arguments before using this | ||
|
@@ -874,6 +885,9 @@ def check_is_fitted(estimator, msg=None): | |
|
||
Eg. : "Estimator, %(name)s, must be fitted before sparsifying". | ||
|
||
all_or_any : callable, {all, any}, default all | ||
Specify whether all or any of the given attributes must exist. | ||
|
||
Returns | ||
------- | ||
None | ||
|
@@ -892,9 +906,13 @@ def check_is_fitted(estimator, msg=None): | |
if not hasattr(estimator, 'fit'): | ||
raise TypeError("%s is not an estimator instance." % (estimator)) | ||
|
||
attrs = [v for v in vars(estimator) | ||
if (v.endswith("_") or v.startswith("_")) | ||
and not v.startswith("__")] | ||
if attributes is not None: | ||
if not isinstance(attributes, (list, tuple)): | ||
attributes = [attributes] | ||
attrs = all_or_any([hasattr(estimator, attr) for attr in attributes]) | ||
else: | ||
attrs = [v for v in vars(estimator) | ||
if v.endswith("_") and not v.startswith("__")] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why do we keep There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because all dunder attributes end with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And more generally, anything that starts with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
You're right, but perhaps it's better to remove, because we do not forbid users to create attributes start with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we remove it we will have false positives. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. perhaps a clearer solution is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 to keep excluding based on There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
but @rth we're talking about double underscore, not single underscore (private variables). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Anyway I guess this is not so important. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I mean technically private variables as in |
||
|
||
if not attrs: | ||
raise NotFittedError(msg % {'name': type(estimator).__name__}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
attributes
andall_or_any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's correct syntax highlighting, there should be no need to write each occurrence of attributes as
attribute
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see your point, added a mention about
all_or_any
to what's new.