8000 Robust covaraince (fast_mcd) does not handle singular covariance matrices · Issue #3367 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

Robust covaraince (fast_mcd) does not handle singular covariance matrices #3367

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ojy opened this issue Jul 11, 2014 · 3 comments · Fixed by #8328
Closed

Robust covaraince (fast_mcd) does not handle singular covariance matrices #3367

ojy opened this issue Jul 11, 2014 · 3 comments · Fixed by #8328
Labels
Milestone

Comments

@ojy
Copy link
ojy commented Jul 11, 2014

When the c_step of fast_mcd encounters a covariance matrix with a 0 determinant, it raises an error. According to the paper (Rousseeuw & Van Driessen 1999), it shouldn't be the case. Technically, 0-determinant covariance is the optimal solution. A singular case requires a special treatment. (Section 5 of the paper). I'm new here, but I can contribute to this.

@amueller amueller added the Bug label Jan 22, 2015
@amueller
Copy link
Member

Sorry for the long silence. If you are still interested a fix would be very welcome.

@ThatGeoGuy
Copy link
Contributor

Hey, sorry for the long wait, I haven't taken the time to commit to this issue since I have been busy with final projects / exams. I submitted a PR which I believe fixes this issue (PR #4635), but does manage to produce some warnings with one of the sample datasets I used to test the fix.

I believe the issue with that sample dataset (described in the PR) is different than what caused this issue. Overall, I think this issue is solved, as despite the warnings on some of the trials, this still produces the correct answer for exact fit scenarios.

@amueller amueller modified the milestone: 0.19 Sep 29, 2016
@ThatGeoGuy
Copy link
Contributor

I'm starting fresh to try and tackle this again, two years later (sorry, it seems I really delayed my old fix, which is now incompatible to an extent with the old code). I'm gonna try and write some test cases and submit a new PR. Sorry for being such a wreck about actually fixing this, I think I was quite close to getting everything correct last time.

For the record, I will also test whatever data I generate against the reference implementation in LIBRA as well, which should let me know if I change things too much.

agramfort pushed a commit that referenced this issue Jun 8, 2017
…ce matrix (#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request #8328

* Adds entry for PR #8328 to what's new

* Adds review changes from @jnothman
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this issue Jun 14, 2017
…lar covariance matrix (scikit-learn#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\scikit-learn#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request scikit-learn#8328

* Adds entry for PR scikit-learn#8328 to what's new

* Adds review changes from @jnothman
dmohns pushed a commit to dmohns/scikit-learn that referenced this issue Aug 7, 2017
…lar covariance matrix (scikit-learn#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\scikit-learn#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request scikit-learn#8328

* Adds entry for PR scikit-learn#8328 to what's new

* Adds review changes from @jnothman
dmohns pushed a commit to dmohns/scikit-learn that referenced this issue Aug 7, 2017
…lar covariance matrix (scikit-learn#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\scikit-learn#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request scikit-learn#8328

* Adds entry for PR scikit-learn#8328 to what's new

* Adds review changes from @jnothman
NelleV pushed a commit to NelleV/scikit-learn that referenced this issue Aug 11, 2017
…lar covariance matrix (scikit-learn#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\scikit-learn#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request scikit-learn#8328

* Adds entry for PR scikit-learn#8328 to what's new

* Adds review changes from @jnothman
paulha pushed a commit to paulha/scikit-learn that referenced this issue Aug 19, 2017
…lar covariance matrix (scikit-learn#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\scikit-learn#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request scikit-learn#8328

* Adds entry for PR scikit-learn#8328 to what's new

* Adds review changes from @jnothman
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this issue Aug 29, 2017
…lar covariance matrix (scikit-learn#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\scikit-learn#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request scikit-learn#8328

* Adds entry for PR scikit-learn#8328 to what's new

* Adds review changes from @jnothman
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this issue Nov 15, 2017
…lar covariance matrix (scikit-learn#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\scikit-learn#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request scikit-learn#8328

* Adds entry for PR scikit-learn#8328 to what's new

* Adds review changes from @jnothman
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this issue Dec 18, 2017
…lar covariance matrix (scikit-learn#8328)

* Adds failing test for issue 3367

* Adds extra comments to describe what I'm testing

Basically in this instance I discovered this bug independently from
\scikit-learn#3367 because I was trying to use principle components to estimate
plane normals / geometry of points in 3D space. When you have a set of
points that specify a perfect plane though (or in the case of wanting to
use MCD, you have a subset of your points that specify a perfect plane),
then the code fails because the covariance matrix is singular. However,
if your covariance matrix is singular, you've already found the set of
points with the lowest determinant. As per Rousseeuw & Van Driessen
1999, at this point you can stop searching.

The code did stop searching, however, it raised a ValueError on singular
matrices for no reason. So the correct fix should be to remove that.

* Fixes issue 3367

This should work with the test case provided.

* Adds missing argument to test

* Style corrections to pass flake runner

Implements the style corrections as mentioned in pull request scikit-learn#8328

* Adds entry for PR scikit-learn#8328 to what's new

* Adds review changes from @jnothman
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants
0