MCD Exact Fit: Fixes Issue #3367 by ThatGeoGuy · Pull Request #4635 · scikit-learn/scikit-learn

Status: Closed · wants to merge 6 commits

Conversation

ThatGeoGuy (Contributor)

The following patch fixes issue #3367, to the best of my knowledge. However, while testing various "exact fit" scenarios, I encountered an issue that I believe originates somewhere else in the code. The data I used to test this is a simple plane (all Z = 0) with a few outliers present. The exact-fit scenario works and no issues arise.

However, when I take the same data and convert it into the affine subspace spanned by the data, which is essentially a rotation about the principal axes plus a translation by the mean, the fit occasionally raises an error reporting that the determinant is larger than the previous determinant. Both datasets can be found at https://gist.github.com/ThatGeoGuy/713bd1355b87ea2d5d07, where good_data.txt is the original data and bad_data.txt is the same data transformed into its affine subspace.

The final result is still correct in the end even though a couple of the trials trigger the above error, and it only happens with the bad_data.txt dataset. I am unsure what is causing this, but it appears to be a different problem from the one behind #3367.
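For anyone who wants to reproduce the setup without downloading the gist, here is a rough sketch; the sample sizes, random seed, and outlier placement below are illustrative rather than the exact contents of good_data.txt / bad_data.txt:

```python
# Rough reproduction of the exact-fit scenario: inliers on the plane Z = 0
# plus a few off-plane outliers. Sizes, seed, and ranges are assumptions.
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.RandomState(42)

# 95 inliers lying exactly on the plane Z = 0 (degenerate / exact-fit data).
inliers = np.column_stack([rng.uniform(-10, 10, size=(95, 2)), np.zeros(95)])

# 5 outliers scattered off the plane.
outliers = rng.uniform(-10, 10, size=(5, 3))

X = np.vstack([inliers, outliers])          # "good_data"-style input

# "bad_data"-style input: the same points rotated into the affine subspace
# they span (principal axes) after centering on the mean.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_rotated = X_centered @ Vt.T

# With the patch both fits should handle the exact-fit case; on master the
# planar data triggers the error described in #3367.
mcd_good = MinCovDet(random_state=0).fit(X)
mcd_bad = MinCovDet(random_state=0).fit(X_rotated)
print(mcd_good.covariance_)
print(mcd_bad.covariance_)
```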

@amueller (Member)

Thanks for the PR. I don't suppose @VirgileFritsch is around to review?
I'll try to find time to study the paper, but it might be a while, sorry.

@ThatGeoGuy (Contributor, Author)

Hey all, just wondering whether any problems have been raised with this commit. I'm in no rush, as it has worked flawlessly for me so far (the results match LIBRA almost exactly, apart from some floating-point differences), but I'm wondering whether there is any additional material or resources this needs in order to move forward.

As far as I can see the project is quite busy, so I don't mean to rush anyone, but it has been a few months and I wanted to check in.

@agramfort (Member)

Any chance you could add a test?

@ThatGeoGuy (Contributor, Author)

I could add a test, but I'm not sure how. I've never used continuous integration before, and I don't know how to commit a test to the repo. Is there a guide somewhere that can point me in the right direction?

@amueller (Member)

The test should use a small synthetic dataset (such as good_data.txt) for which there was an error before your fix but not after.
You should add a function that generates that data and checks that the result is correct to `sklearn/covariance/tests/test_robust_covariance.py`. The function name should start with `test_`.

Run the tests using `nosetests -sv sklearn/covariance/tests/test_robust_covariance.py`. It should fail on master and pass on your branch.
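Something along these lines might work as a starting point; the data generation and the assertion below are only illustrative, and the real test should use a dataset that actually fails on master:

```python
# Sketch of a test for sklearn/covariance/tests/test_robust_covariance.py
import numpy as np
from sklearn.covariance import MinCovDet


def test_mcd_exact_fit_plane():
    # Points lying exactly on the plane Z = 0, plus a few outliers.
    # Before the fix this kind of degenerate data raised an error (#3367).
    rng = np.random.RandomState(0)
    inliers = np.column_stack([rng.uniform(-10, 10, size=(95, 2)), np.zeros(95)])
    outliers = rng.uniform(-10, 10, size=(5, 3))
    X = np.vstack([inliers, outliers])

    mcd = MinCovDet(random_state=0).fit(X)

    # The robust location should land (approximately) on the plane.
    assert abs(mcd.location_[2]) < 1e-7
```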

@ThatGeoGuy (Contributor, Author)

Unfortunately I let this slip while working through quite a bit during my Master's. My patch has since diverged from upstream quite significantly, so I need to take another look and tackle the issue fresh; I'm closing this PR. However, I now have a much better idea of how to get the tests built and running, so I expect my next attempt will be much better. I sincerely apologize for letting this sit for so long.
