-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Allow the creation of a pipeline for non-parametric functions like TSNE #22162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
35d4cff
to
4046912
Compare
4046912
to
0a6edc9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR @cathelineparisot ! I think we still need an answer to #16714 (review) before we can continue. I currently agree with the comment:
Pipeline is very much designed to solve needs around train/test setups, so I don't want to weaken its usability for that use-case.
A user can spend a long time fitting a pipeline with a TNSE
in the middle which works because TSNE
has fit_transform
. pipe.predict
would fail because a step in the pipeline does not define transform
. This is a poor user experience because they can spend a long time time in fitting but only finds out at the end that
8000
they can not use it for prediction.
Thank you for your feedback. I understand what you mean. In this case, setting a warning might already be a solution (as it was done in #16714). It is also necessary to see if creating a Pipeline with a TNSE is useful in other cases or only to make a prediction. I don't have the knowledge for that. |
Please update the title to better describe what the PR does. (Same comment for your other #22165 PR) |
I changed the title of my two pull requests. I hope it suits you. I apologize for the bad titles. |
I am closing this PR in favor of #16714 since that PR contains tests and has more discussion around the topic. Thank you for looking into this issue! |
Reference Issues/PRs
Fixes #16710. See also #16714.
What does this implement/fix? Explain your changes.
Calling a pipeline for a non-parametric function like TSNE() returns an error because the transform() function is missing. However, for non-parametric functions, there is no transform() method because there is no projection or mapping, yet we can use dimensionality reduction.
So, we must be able to create a pipeline even if the transform() method does not exist. This PR allows it
Any other comments?
However, I have the impression that the constructor of the current library no longer calls "validate_step()" as I added it in the "init". What is the problem? It was this function that was creating the error and I have modified this function. I am waiting for your feedback.