-
-
Notifications
You must be signed in to change notification settings - Fork 26.2k
FEA Implement classical MDS #31322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEA Implement classical MDS #31322
Conversation
Pinging @antoinebaker for a review :-) I have been doing some updates to this PR but am now finished with it. |
Thanks for the PR @dkobak ! Could you maybe use this PR to implement ClassicalMDS only, and do the enhancements of MDS in a separate / follow up PR ? It will ease the reviewing / merging process I think. |
All right. I took all changes to the |
Could you please add scikit-learn/sklearn/utils/_test_common/instance_generator.py Lines 185 to 187 in 4493f86
Then it will be tested for common checks on estimators in |
Hi @antoinebaker. I think the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here a first round of reviews
Ah my bad, I think you're right. If |
doc/whats_new/upcoming_changes/sklearn.manifold/31322.feature.rst
Outdated
Show resolved
Hide resolved
I'm not sure if we discussed this, but I favor this comment (#22330 (comment)) (except the default value) as an overall API, instead of introducing a new class. Any blockers for doing so? |
@adrinjalali: Yes, we did discuss it. Just below the comment you linked to, @antoinebaker gave detailed reasons for why he prefers a separate class, please see here: #22330 (comment). He convinced me, and you wrote "I'm happy with the suggestions here" (#22330 (comment)), which is why I implemented a separate class... |
Hmm. Yeah fair enough. Just to avoid future surprises, maybe @lorentzenchr @GaelVaroquaux could also give their opinion? |
The TLDR of #22330 (comment) is that EDIT: removed outdated comment. |
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Regarding the user guide, it is already there. Regarding examples, good point, I have now added classical MDS (and also non-metric MDS) to (Edit: checked the rendered docs and examples are fine.) |
@adrinjalali See above -- I have added examples. Let me know if there is anything else I can do on my side. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the examples, and overall this LGTM !
The nonmetric MDS seems to perform badly on S-curve and sphere (plot_compare_methods
and plot_manifold_sphere
). Is this an expected limitation of this method on such dataset ? Or is there something wrong in term of hyperparameters (eg convergence controlled by eps
) ? Note that it's not directly related to this PR (which is on ClassicalMDS which seems to perform okay), so I still approve the changes. But it could be nice to fix this nonmetric mds example if possible.
Co-authored-by: antoinebaker <antoinebaker@users.noreply.github.com>
Yes, I also noticed that. It is not expected and should be fixed. Optimization of non-metric MDS is finicky and I am not sure what exactly is going wrong there. But I would like to switch both metric and non-metric MDS to use classical MDS as default initialization (rather than random initialization), and I think this should take care of this issue. Once this PR is merged, I will prepare a follow-up PR that
|
Hi @adrinjalali, just a gentle ping to take another look here. But if you are on holidays then enjoy your holidays :-) |
Thanks for the work so far. I'm happy with what we have, except the docstring / userguide for this class. In none of the two places the formulation is explained well enough I feel. Could you please improve the small section in the user guide and also the docstring please? |
@adrinjalali Fair enough. I added some details, take a look if this is sufficient now. |
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks. Since it's a new estimator, I'd like another pair of eyes on this before merging. @adam2392 or @OmarManzoor could have a look maybe?
Dear @adam2392, hope you are having a nice summer! Just a gentle reminder that it would be amazing if you could look at this PR :-) Cheers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @dkobak
Fixes #15272. Supersedes #22330.
This PR implements classical MDS, also known as principal coordinates analysis (PCoA) or Torgerson's scaling, see https://en.wikipedia.org/wiki/Multidimensional_scaling#Classical_multidimensional_scaling. As discussed in #22330, it is implemented as a new class
ClassicalMDS
.Simple demonstration:
Classical MDS is also set as default initialization for metric/non-metric MDS in theMDS()
class.For consistency, this PR also adds support for non-Euclidean metrics to theMDS
class.