-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
FIX numerical rounding issue causing test_poisson_zero_nodes to fail at random #18737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX numerical rounding issue causing test_poisson_zero_nodes to fail at random #18737
Conversation
Actually this might be a real bug @lorentzenchr. We are never supposed to get negative values, whatever the seed of the dataset right? |
@ogrisel That's right. Very concerning. |
I can reproduce on my machine. The error could still be in the criterion or in the splitter logic. Maybe the splitter does not like INFINITY? |
I think the problem sits in scikit-learn/sklearn/tree/_criterion.pyx Line 1366 in 4d7e611
When I replace it with the code from children_impurity , i.e. without the algebraic simplifications and throwing constant terms apart, the test test_poisson_zero_nodes passes (but 2 other tests fail).
What do we do if we can't fix it in the next 2 days? |
We will revert it. |
Which one are the failing tests? Are these critical tests? |
|
I agree this is fishy, but here we compare on the held-out test set. Maybe it's just a consequence of the finite size of the training and test sets? Edit: actually the way the test is written, we don't know if this is the training set or the testing set that causes the failure. |
Here is what happens:
The Criterion internally uses a Example Debug output:
Why does it call |
Do we have a test that the implicitly used The code is quite none-trivial. Maybe, I just misunderstood part of it. |
If I get it right, it means that at this node/leaf all samples have |
@lorentzenchr if you wish, we could make a quick call. I could catch up with your debugging and invest some time helping debugging. |
I think this is only a numerical error.
I will investigate a bit more to check the way we compute this mean to ensure that this is the case. |
I would not be surprised that we don't get scikit-learn/sklearn/tree/_criterion.pyx Line 859 in 51acc9d
|
This is the test in @glemaitre feel free to push directly into this PR to share your experiments to fix this. |
When writing this test, I didn't think about rounding errors in |
I made the changes with 1 * EPS. Let see if this is enough. It was working locally. |
You can also increase the range of seeds to 1000 or 10000 in this PR just to make sure that we fix the problem. |
Yep but it was still failing for some seed. |
around 280 times / 10000 times |
It seems better now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, the fix looks right to me and the extensive seed test look stable. We can now reduce the number of seeds and merge this if @lorentzenchr agrees.
Well done @glemaitre!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after reducing then range in the test.
Great work @glemaitre, thank you very much!
I cannot approve this PR because I created it but +1 for merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks all, LGTM when https://github.com/scikit-learn/scikit-learn/pull/18737/files#r517464475 and https://github.com/scikit-learn/scikit-learn/pull/18737/files#r517460123 are addressed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve my changes :)
As I've introduced this bug 👀, I'm happy to merge the fix😏 @glemaitre and @ogrisel 👏 |
test_poisson_zero_nodes
is failing at random on test data more complex than usual.Here is an example of the original failure observed in #18727: