-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
ENH Add eigh as a solver in MDS #22330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
One minor note is that since we default On that note, re: @smarie 's comment The term
I do, however, think that the term is confusing at best since it's an overloaded name. Perhaps we ought to consider rephrasing the condition, or clarifying what Euclidean means in this specific context. Also, regarding why this method uses the eigen-solver, the classical metric MDS strategy directly uses the eigenvalue decomposition. |
Ignore all of that. I have removed |
@NicolasHug @adrinjalali I wonder if either of you would be interested in reviewing this |
We recently update our version of black to the stable versionersino. Installing For merge with main and pre-commit should detect the new config and install the new version of scikit-learn/.pre-commit-config.yaml Lines 8 to 9 in abbee57
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this called the "Classical MDS algorithm"?
Can we run a benchmark on a decently sized dataset to compare the new solver and SMACOF?
Sorry for an ignorant question, is this PR missing something or any chance it will make it in 1.2? Excitedly following from the sidelines :) |
Sorry for the late reply -- it is currently waiting on reviewers, so it is not likely to make it into 1.2 unfortunately. |
Has this been integrated into any beta versions to use? |
While I support this PR in principle, I don't like the suggested API and some of the suggested wording in the docs. This PR implements "classical MDS" (aka "principal coordinates analysis", PCoA). Classical MDS optimizes a different loss function compared to metric MDS. The former optimizes what is called "strain" while the latter optimizes what is called "stress". For the "strain" there is an exact solution in terms of eigendecomposition. For the stress there is not, which is why SMACOF algorithm is used instead. These two things are not approximations of each other, and are not supposed to give the same result. For that reason, I don't like Alternatively, we could make classical MDS available via |
@antoinebaker would you be able to have a look here maybe? |
I favor the |
@Micky774 IMHO the default should be |
@Micky774 Are you still planning on updating this PR according to what was discussed above, or should somebody take over? |
Thanks @Micky774 for the PR! For the API, I think we should consider implementing a new estimator As correctly pointed out by @dkobak
I feel that adding a
Keeping What do you think @Micky774, @dkobak and @adrinjalali ? |
Good points @antoinebaker. I don't have a strong opinion here. I find that many people first think of classical MDS when I say "MDS", so that would be an argument in favor of having classical MDS inside the Regarding the name of a new class, I think in the ML community more people are familiar with the term "MDS" than with the term "PCoA", but on the other hand |
|
cc @scikit-learn/core-devs I think this is a nice opportunity to clean up our MDS support a bit so I want to get some opinions on the way forwards here |
I'm happy with the suggestions here. |
Hi @Micky774. Are you still planning to get back to this PR and rework it to implement a new class |
@dkobak I think you can go ahead and have a PR superseding this one. |
@adrinjalali Thanks, I will try to find time for this after my pending PR #31117 gets merged. If you could take a look there, it'd be much appreciated by the way! |
@adrinjalali @antoinebaker I took a stab at it at #31322. |
Reference Issues/PRs
Fixes #15272
Resolves #16067 (Stalled)
What does this implement/fix? Explain your changes.
PR #16067:
Adds implementation of Multidimensional scaling (MDS), which uses Singular Value Decompostion (SVD) method. SVD works much faster and far more accurate for Euclidean matrixes than SMACOF algorithm (current implementation of MDS in sklearn/manifold module).
This PR:
Addresses review comments and provides further forum for discussion regarding API.
Any other comments?