Clarification on Kruskal Stress as an Optimization Target in Metric and Non-metric MDS #30240
Comments
No -- the stress formula is optimized by metric MDS! Note that the square root and the normalization by the summed squared original distances do not change where the minimum of the loss lies (the square root is monotonic and the normalization is a constant), so essentially the loss is simply the squared error between high-dim and low-dim distances. That is what metric MDS minimizes. Non-metric MDS allows an arbitrary monotonic transformation of the low-dim distances; please see here: https://en.wikipedia.org/wiki/Multidimensional_scaling#Non-metric_multidimensional_scaling_(NMDS). Note that I think non-metric MDS in sklearn is broken: #27028
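To illustrate the point about the square root and the normalization, here is a minimal sketch in plain Python. The toy points, the two candidate embeddings, and the function names `raw_stress` and `normalized_stress` are made up for illustration and are not scikit-learn's API. Because the normalized value is a monotonic function of the raw squared error for fixed original distances, both criteria should pick the same candidate.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Euclidean distances for all pairs i < j, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def raw_stress(d_high, d_low):
    """Sum of squared differences between original and embedded distances."""
    return sum((a - b) ** 2 for a, b in zip(d_high, d_low))

def normalized_stress(d_high, d_low):
    """sqrt(raw stress / summed squared original distances)."""
    return math.sqrt(raw_stress(d_high, d_low) / sum(a ** 2 for a in d_high))

# Hypothetical toy data: four points in 3-D and two candidate 2-D embeddings.
high = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
candidates = {
    "a": [(0, 0), (1, 0), (0, 1), (0.7, 0.7)],
    "b": [(0, 0), (2, 0), (0, 2), (1, 1)],
}

d_high = pairwise_distances(high)
scores = {}
for name, emb in candidates.items():
    d_low = pairwise_distances(emb)
    scores[name] = (raw_stress(d_high, d_low), normalized_stress(d_high, d_low))

# The denominator depends only on the original distances, so the ranking
# of candidate embeddings is the same under both criteria.
best_raw = min(scores, key=lambda k: scores[k][0])
best_norm = min(scores, key=lambda k: scores[k][1])
print(best_raw, best_norm)
```

This only compares two fixed candidates; an actual optimizer would search over all configurations, but the same monotonicity argument applies.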
Thank you very much for your detailed explanation of metric and non-metric MDS. I have reviewed Kruskal's seminal papers from 1964: "Nonmetric multidimensional scaling: A numerical method" and "Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis". These works emphasize minimizing stress as the core optimization criterion for non-metric MDS (as shown in the figure below), and this has also been cited in the sklearn documentation. Your explanation mentioned that metric MDS minimizes stress. Should I understand this as Kruskal stress being originally proposed for non-metric MDS but having a formula applicable to metric MDS as well? Additionally, you mentioned that non-metric MDS optimizes for rank preservation instead of directly minimizing Kruskal stress. My understanding was that Kruskal introduced the stress metric specifically as a key optimization target for non-metric MDS. Furthermore, I noticed that the Wikipedia page you shared describes a stress formula for non-metric MDS, replacing dij [...]. I apologize for my limited understanding and truly value your clarification on this matter. Thank you for your time and expertise!
Wikipedia's formula is the same as the formulas in Kruskal's 1964 papers, but the notation is different. In Kruskal's papers, [...]. In Wikipedia, [...]. Either way, this is non-metric MDS. If you leave out the monotonic transformation, you get "metric MDS". But I don't think it is explicitly called that in Kruskal's 1964 papers. In fact, I don't know who first coined the term "metric MDS".
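For reference, the distinction can be written in one notation. This is a sketch: the symbols are chosen here for illustration and match neither source exactly, so check them against the papers and the Wikipedia article. Here $\delta_{ij}$ are the input dissimilarities, $d_{ij}(X)$ are the distances in the embedding $X$, and $\hat{d}_{ij}$ are the disparities, i.e. values constrained to be monotone in $\delta_{ij}$ (Wikipedia writes these as $f(x)$).

```latex
% Non-metric stress: the embedded distances are compared with disparities
% \hat{d}_{ij}, which only have to respect the rank order of the \delta_{ij}.
S_{\text{non-metric}}(X)
  = \sqrt{\frac{\sum_{i<j} \bigl(d_{ij}(X) - \hat{d}_{ij}\bigr)^2}
               {\sum_{i<j} d_{ij}(X)^2}},
\qquad
\delta_{ij} < \delta_{kl} \;\Rightarrow\; \hat{d}_{ij} \le \hat{d}_{kl}

% Metric stress: no monotone transformation, the embedded distances are
% compared with the dissimilarities themselves.
S_{\text{metric}}(X)
  = \sqrt{\frac{\sum_{i<j} \bigl(d_{ij}(X) - \delta_{ij}\bigr)^2}
               {\sum_{i<j} \delta_{ij}^2}}
```

In the non-metric case the disparities are re-fitted for each candidate $X$, which is why the normalization there uses the embedded distances rather than a constant.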
Thank you so much for your detailed explanation! It has significantly clarified my understanding. I have one further question: When we refer to "Kruskal Stress" today, does it implicitly assume the use of the monotonic transformations required for non-metric MDS? Or can the term "Kruskal Stress" now broadly refer to the squared error between the original high-dimensional distances and the low-dimensional embedding distances, regardless of whether the original distances undergo monotonic transformation? I deeply appreciate your insights and apologize for my limited understanding. Your clarification is invaluable to me!
I am not sure. I would say that "stress" (without Kruskal's name attached to it) often refers to the loss without monotonic transformation, i.e. the metric MDS loss. I guess if you say "Kruskal's stress", then maybe it assumes the monotonic transformation in there? In general, I find that the terminology surrounding MDS can be very confusing and ambiguous, so I would recommend being very precise and giving a formula or explaining in words exactly what you mean.
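One way to be precise is to state the computation itself. The sketch below, in plain Python, contrasts the two readings of "stress" on the same set of distances. The function names are hypothetical, the monotone fit is a simple pool-adjacent-violators routine that ignores ties, and this is not how scikit-learn computes its `stress_` value.

```python
import math

def isotonic_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted

def metric_stress(dissim, d_embed):
    """Stress without a monotone transformation (normalized by dissimilarities)."""
    num = sum((a - b) ** 2 for a, b in zip(dissim, d_embed))
    return math.sqrt(num / sum(a ** 2 for a in dissim))

def nonmetric_stress(dissim, d_embed):
    """Stress against disparities: the monotone fit of the embedded
    distances taken in order of increasing dissimilarity."""
    order = sorted(range(len(dissim)), key=lambda k: dissim[k])
    fitted = isotonic_fit([d_embed[k] for k in order])
    disparities = [0.0] * len(dissim)
    for k, f in zip(order, fitted):
        disparities[k] = f
    num = sum((d - h) ** 2 for d, h in zip(d_embed, disparities))
    return math.sqrt(num / sum(d ** 2 for d in d_embed))

dissim = [1.0, 2.0, 3.0, 4.0]
same_ranks = [1.0, 4.0, 9.0, 16.0]   # rank order preserved, values distorted
swapped = [1.0, 3.0, 2.0, 4.0]       # one rank violation

print(metric_stress(dissim, same_ranks), nonmetric_stress(dissim, same_ranks))
print(metric_stress(dissim, swapped), nonmetric_stress(dissim, swapped))
```

With `same_ranks`, the non-metric value is zero while the metric value is large, which is the ambiguity discussed above: the same word can describe two quite different quantities.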
Describe the issue linked to the documentation
I am working on research involving the optimization targets used in metric and non-metric MDS, and I have some questions regarding how scikit-learn's implementation of MDS defines and calculates stress, particularly Kruskal Stress. While reviewing the official documentation, I noticed that specific formulas for stress calculations are not explicitly provided, and I would appreciate some clarification.
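Non-metric MDS: My understanding is that non-metric MDS typically minimizes Kruskal Stress, defined as:
$$\text{Stress} = \sqrt{\frac{\sum_{i<j} (d_{ij} - \hat{d}_{ij})^2}{\sum_{i<j} d_{ij}^2}}$$
where $d_{ij}$ is the original distance between points $i$ and $j$, and $\hat{d}_{ij}$ is the corresponding distance in the reduced space. Could you confirm if scikit-learn's non-metric MDS implementation uses this definition, or if it employs an alternative method?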
Metric MDS: Does metric MDS in scikit-learn also optimize for Kruskal Stress, or does it use a different stress formula? If a different approach is used, would it be possible to provide some insight or references on the stress function applied here?
Suggest a potential alternative/fix
Documentation Clarification: It would be incredibly helpful if the documentation could include specific details on the stress formulas used in both metric and non-metric MDS. This addition would help researchers and users better understand the theoretical underpinnings of the algorithm in scikit-learn.
Thank you very much for your guidance and clarification on these points. Your insights would be instrumental in my work with MDS.