-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Defines missing base.is_transformer API #30368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Defines missing base.is_transformer API #30368
Conversation
CC @adrinjalali @glemaitre I'm putting this on the 1.6.1 milestone for now, but I'll be okay to have it in 1.6. |
So in one of the previous PR, I could detect that some vectorizer in scikit-learn implementing some Should we change those considering that since they have the right method, they should be considered as transformer? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove the mention of is_transformer
instead, or have it check for the presence of the methods instead of relying on the tags.
Alternatively, the tag can be a list of extimator types.
Another reason why I haven't thought of doing a is_transformer
is that I'm not sure what is a transformer. We have estimators which only implement a fit_transform
and no transform
, are they a transformer? Not really, but they kinda are(?)
@@ -857,6 +857,7 @@ class TransformerMixin(_SetOutputMixin): | |||
|
|||
def __sklearn_tags__(self): | |||
tags = super().__sklearn_tags__() | |||
8000 | tags.estimator_type = "transformer" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is gonna break many things, since we have a bunch of estimators that are transformers, and another type.
I had noticed this before, and I think it makes sense to only check for presence of transform
and/or fit_transform
to see if something is a transformer.
If an estimator can have multiple types, then it feels better to have an enum like Closing in favor of #30374, which removes the reference to |
Reference Issues/PRs
Related to #29677
What does this implement/fix? Explain your changes.
This
is_transformer
is referenced in theapi_reference
but never defined:scikit-learn/doc/api_reference.py
Line 126 in 1f593bf
This PR defines it.
Any other comments?
I do not think we had a
_estimator_type="transformer"
, so there is no need to check for it.