[MRG] Allow only fit_transform to be present in pipeline#16714
[MRG] Allow only fit_transform to be present in pipeline#16714jnboehm wants to merge 43 commits intoscikit-learn:mainfrom
Conversation
This makes it possible to plug in algorithms like TSNE in the pipeline, although they do not support `transform`. A warnign will be raised that the pipeline is not reusable.
There was a problem hiding this comment.
Thanks! It would be nice if you could add some tests in the sklearn/tests/test_pipeline.py file to make sure that the warning and the error are raised in relevant cases.
Sure! Didn't see that earlier. |
According to scikit-learn#16714 (comment) changed the syntax but not semantics
According to the style guide
Also change the test for FeatureUnion
Should be taken care of by the user
flake8 E127
As per scikit-learn#16714 (comment)
|
I'm a bit unsure why the doc test failed, since I didn't change anything there. Could it be related to #17051? |
There was a problem hiding this comment.
Merging with master may fix your issue.
Update doc/whats_new/v0.23.rst Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com>
|
The CI crashed again, but now everything seems to be working. Let me know if anything needs fixing. |
|
ping @cmarmo this might be worth a 'waiting for reviewer' tag |
There was a problem hiding this comment.
So my only concern here is someone training a pipeline with a tsne in the middle, expecting that they'll then be able to predict with it. They wait an hour for the model to fit and then ... bam! No predictions for you. I don't want to waste a user's time like that.
The only way I can see around that is to have a parameter control whether this feature is available, or going further, to introduce a TransductivePipeline.
Pipeline is very much designed to solve needs around train/test setups, so I don't want to weaken its usability for that use-case.
Your thoughts?
Yes. |
|
Persons interested in such a usage pattern should open an issue for discussion (not me). Also pinging @cathelineparisot. |
This makes it possible to plug in algorithms like TSNE in the
pipeline, although they do not support
transform. A warning will beraised that the pipeline is not reusable.
Reference Issues/PRs
Fixes #16710
What does this implement/fix? Explain your changes.
This allows algorithms that only implement a
fit_transformfunction, but nottransform(such as TSNE) to be used in a Pipeline in sklearn.