[MRG] Feature: calculate normed stress (Stress-1) in sklearn.manifold.MDS #13042

matthieu-pa · 2019-01-24T08:31:22Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This is a follow-up on the stale PRs referenced above. The main change is the fix for the previously failing unit test:

https://travis-ci.org/scikit-learn/scikit-learn/jobs/437566342#L2818

        stress1 = mds.smacof(sim, normalize=True)[1]
        stress2 = mds.smacof(k * sim, normalize=True)[1]
    
        # Normed stress should be the same for
        # values multiplied by some factor "k"
>       assert_allclose(stress1, stress2)
k          = 2
sim        = array([[0, 5, 3, 4],
       [5, 0, 2, 2],
       [3, 2, 0, 1],
       [4, 2, 1, 0]])
stress1    = 0.025998852705994606
stress2    = 0.033375197002665315
/home/travis/build/scikit-learn/scikit-learn/sklearn/manifold/tests/test_mds.py:78: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib/python2.7/dist-packages/numpy/testing/utils.py:1183: in assert_allclose
    verbose=verbose, header=header)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

To my understanding, even with normalized stress, smacof() needs to be initialized at the same configuration for the property "normed stress should be the same for values multiplied by some factor k" to hold, so I set random_state of smacof() to a fixed value. The dissimilarity matrix also needs to be large enough.
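As a rough illustration of the scale-invariance property the test relies on, here is a minimal NumPy sketch. It is not the code from this PR: the helper names `pairwise_distances`, `raw_stress` and `stress1` are hypothetical, and it assumes Stress-1 is defined as the square root of the raw stress divided by the sum of squared dissimilarities. The embedding `X` is just a fixed random configuration, not a fitted one.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def raw_stress(D, X):
    """Sum of squared residuals, each pair counted once."""
    return ((pairwise_distances(X) - D) ** 2).sum() / 2


def stress1(D, X):
    """Assumed Stress-1: raw stress normalized by squared dissimilarities."""
    return np.sqrt(raw_stress(D, X) / ((D ** 2).sum() / 2))


# Dissimilarity matrix taken from the failing test shown above.
sim = np.array([[0, 5, 3, 4],
                [5, 0, 2, 2],
                [3, 2, 0, 1],
                [4, 2, 1, 0]], dtype=float)

k = 2
X = np.random.default_rng(0).uniform(size=(4, 2))  # arbitrary fixed embedding

baseline = stress1(sim, X)
# Scaling the dissimilarities AND the embedding by k leaves Stress-1 unchanged,
# while the raw stress grows by a factor of k**2.
scaled = stress1(k * sim, k * X)
```

The invariance only holds when the two embeddings are scaled copies of each other, which is why two smacof() runs started from unrelated random configurations are not expected to report the same value.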

Any other comments?

The previous reviewer was @glemaitre. As far as I can tell, the review comments have been addressed, but if something is missing, I'll do my best to fix it.

ADEscobar · 2020-04-19T14:57:31Z

Is there anything still missing for the merge?
Stress-1 is quite fundamental for judging whether a fit is meaningful or not.

matthieu-pa · 2020-04-20T00:43:17Z

I need to rebase on the latest version of sklearn. If anything else is needed, please let me know ☺️ I have some time to do it within the next 24 hours.

matthieu-pa · 2020-04-21T00:52:06Z

I merged the latest master version into this branch and solved the merge conflict.

ADEscobar · 2020-04-21T10:19:57Z

> I merged the latest master version into this branch and solved the merge conflict.

Is it required to compute the Stress-1 in every iteration, or can it be just done for the final (returned) Stress value?

Maybe it could simply be a new returned value instead of a new option: keep stress and add stress_one or stress_normalized.

jnothman · 2020-04-22T01:45:42Z

> Is it required to compute the Stress-1 in every iteration, or can it be just done for the final (returned) Stress value?

This is a very good question. Does norming in every iteration affect the result too?

ADEscobar · 2020-04-22T08:02:29Z

> > Is it required to compute the Stress-1 in every iteration, or can it be just done for the final (returned) Stress value?
>
> This is a very good question. Does norming in every iteration affect the result too?

The stopping condition (eps) is checked using the normalized stress, so it might stop prematurely and perform fewer iterations, since the epsilon in the normalized stress is comparatively smaller.

Not a big deal, one could just decrease the eps if using the normalized option, but I think it can anyway be more efficient doing the normalization just at the end.
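To make the "normalize only at the end" idea concrete, here is a hedged sketch of a metric SMACOF loop in plain NumPy. It is not scikit-learn's implementation: `smacof_sketch` and its parameters are hypothetical names, unit weights are assumed, and the stopping rule simply compares successive raw stress values against eps. Stress-1 is computed once, after the loop.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def smacof_sketch(D, n_components=2, max_iter=300, eps=1e-6, seed=0):
    """Metric SMACOF with unit weights; Stress-1 is computed only at the end."""
    n = D.shape[0]
    X = np.random.default_rng(seed).uniform(size=(n, n_components))
    old_stress = np.inf
    for n_iter in range(1, max_iter + 1):
        dis = pairwise_distances(X)
        # Guttman transform: X <- B(X) @ X / n
        ratio = D / np.where(dis == 0, 1e-5, dis)
        B = -ratio
        B[np.arange(n), np.arange(n)] += ratio.sum(axis=1)
        X = B @ X / n
        # Convergence is checked on the raw (unnormalized) stress.
        stress = ((pairwise_distances(X) - D) ** 2).sum() / 2
        if old_stress - stress < eps:
            break
        old_stress = stress
    # Single normalization step after the loop.
    stress_one = np.sqrt(stress / ((D ** 2).sum() / 2))
    return X, stress_one, n_iter


sim = np.array([[0, 5, 3, 4],
                [5, 0, 2, 2],
                [3, 2, 0, 1],
                [4, 2, 1, 0]], dtype=float)
X, stress_one, n_iter = smacof_sketch(sim)
```

In this sketch the raw stress scales with the square of the dissimilarities, so a fixed eps is effectively stricter or looser depending on the scale of the input. That is the trade-off discussed here: normalizing inside the loop makes the criterion scale-free at the cost of an extra reduction per iteration, while normalizing at the end keeps the loop cheap.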

matthieu-pa · 2020-04-22T08:40:57Z

> Not a big deal, one could just decrease the eps if using the normalized option, but I think it can anyway be more efficient doing the normalization just at the end.

Thank you for raising a very good point. I agree: checking the normalized stress at every iteration is not very likely to cause MDS to stop early, but it is considerably more compute-intensive.

Next week, to be thorough, I could benchmark a version calculating the normalized stress at every iteration and one just at the end over a few randomly generated distance matrices. I would mainly focus on comparing execution time and the number of iterations required to converge.

cmarmo · 2022-05-02T19:12:08Z

Closing as superseded by #22562.

Łukasz Borchmann and others added 11 commits November 23, 2017 20:58

Calculate Stress-1 instead of raw Stress if normalize=True

9d7f60f

Assert that normed stress is the same for values multiplied by some factor "k"

a8256dd

Use predefined matrix instead of randomly generated

bef194a

add whats_new note about new MDS parameter and use assert_allclose

99bb03f

Merge branch 'master' into Borchmann-stress1

fea77bb

add versionadded directive

8b263b7

fix typo in import statement

aa51079

fix test_normed_stress for MDS

681a4e8

make normalize parameter accessible from init

163b715

Merge remote-tracking branch 'upstream/master' into stress1

4d8cef2

fixed sklearn/manifold/tests/test_mds.py

27bd20e

amueller added the Needs Benchmarks label Aug 5, 2019

jnothman added the Waiting for Reviewer label Oct 17, 2019

github-actions bot added the module:manifold label Mar 2, 2020

Merge latest master into stress1

6e6a923

joshuacwnewton mentioned this pull request Aug 7, 2020

Update _mds.py #18094

Closed

Base automatically changed from master to main January 22, 2021 10:50

cmarmo added the Superseded label and removed the Waiting for Reviewer label Mar 25, 2021

Micky774 mentioned this pull request Feb 20, 2022

ENH Calculate normed stress (Stress-1) in manifold.MDS #22562

Merged

thomasjpfan mentioned this pull request Apr 27, 2022

DOC Detail superseded workflow for PRs #23220

Merged

cmarmo closed this May 2, 2022
