FEA Implement classical MDS#31322
Conversation
|
Pinging @antoinebaker for a review :-) I have been doing some updates to this PR but am now finished with it. |
|
Thanks for the PR @dkobak ! Could you maybe use this PR to implement ClassicalMDS only, and do the enhancements of MDS in a separate / follow up PR ? It will ease the reviewing / merging process I think. |
All right. I took all changes to the |
|
Could you please add scikit-learn/sklearn/utils/_test_common/instance_generator.py Lines 185 to 187 in 4493f86 Then it will be tested for common checks on estimators in |
|
Hi @antoinebaker. I think the |
There was a problem hiding this comment.
Here a first round of reviews
Ah my bad, I think you're right. If |
doc/whats_new/upcoming_changes/sklearn.manifold/31322.feature.rst
Outdated
Show resolved
Hide resolved
|
I'm not sure if we discussed this, but I favor this comment (#22330 (comment)) (except the default value) as an overall API, instead of introducing a new class. Any blockers for doing so? |
|
@adrinjalali: Yes, we did discuss it. Just below the comment you linked to, @antoinebaker gave detailed reasons for why he prefers a separate class, please see here: #22330 (comment). He convinced me, and you wrote "I'm happy with the suggestions here" (#22330 (comment)), which is why I implemented a separate class... |
|
Hmm. Yeah fair enough. Just to avoid future surprises, maybe @lorentzenchr @GaelVaroquaux could also give their opinion? |
|
The TLDR of #22330 (comment) is that EDIT: removed outdated comment. |
|
@adrinjalali See above -- I have added examples. Let me know if there is anything else I can do on my side. |
There was a problem hiding this comment.
I read the examples, and overall this LGTM !
The nonmetric MDS seems to perform badly on S-curve and sphere (plot_compare_methods and plot_manifold_sphere). Is this an expected limitation of this method on such dataset ? Or is there something wrong in term of hyperparameters (eg convergence controlled by eps) ? Note that it's not directly related to this PR (which is on ClassicalMDS which seems to perform okay), so I still approve the changes. But it could be nice to fix this nonmetric mds example if possible.
Co-authored-by: antoinebaker <antoinebaker@users.noreply.github.com>
Yes, I also noticed that. It is not expected and should be fixed. Optimization of non-metric MDS is finicky and I am not sure what exactly is going wrong there. But I would like to switch both metric and non-metric MDS to use classical MDS as default initialization (rather than random initialization), and I think this should take care of this issue. Once this PR is merged, I will prepare a follow-up PR that
|
|
Hi @adrinjalali, just a gentle ping to take another look here. But if you are on holidays then enjoy your holidays :-) |
|
Thanks for the work so far. I'm happy with what we have, except the docstring / userguide for this class. In none of the two places the formulation is explained well enough I feel. Could you please improve the small section in the user guide and also the docstring please? |
|
@adrinjalali Fair enough. I added some details, take a look if this is sufficient now. |
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
There was a problem hiding this comment.
LGTM. Thanks. Since it's a new estimator, I'd like another pair of eyes on this before merging. @adam2392 or @OmarManzoor could have a look maybe?
|
Dear @adam2392, hope you are having a nice summer! Just a gentle reminder that it would be amazing if you could look at this PR :-) Cheers. |
There was a problem hiding this comment.
LGTM. Thanks @dkobak
|
Dear all, as discussed above, now that this PR is merged, I prepared a follow-up PR that modifies the |
Fixes #15272. Supersedes #22330.
This PR implements classical MDS, also known as principal coordinates analysis (PCoA) or Torgerson's scaling, see https://en.wikipedia.org/wiki/Multidimensional_scaling#Classical_multidimensional_scaling. As discussed in #22330, it is implemented as a new class
ClassicalMDS.Simple demonstration:
Classical MDS is also set as default initialization for metric/non-metric MDS in theMDS()class.For consistency, this PR also adds support for non-Euclidean metrics to theMDSclass.