Test test_weighted_vs_repeated is somehow flaky #11236
Sounds like this test fails for some random seeds. A short-term solution could be to fix the random state.
Hey, just taken a look at this and I think the error is coming from the index sampling of the mini-batches. Here we have two datasets that we fit the same model to, with the same parameters, and we check that the output is the same. One is of length n and the other (with the samples repeated) is longer. If we have different sample sizes, the same seeded RNG draws different indices:

```python
>>> import numpy as np
>>> rng1 = np.random.RandomState(42)
>>> rng2 = np.random.RandomState(42)
>>> rng1.randint(0, 100, 10)
array([51, 92, 14, 71, 60, 20, 82, 86, 74, 74])
>>> rng2.randint(0, 100, 10)
array([51, 92, 14, 71, 60, 20, 82, 86, 74, 74])
>>> rng3 = np.random.RandomState(42)
>>> rng3.randint(0, 250, 10)
array([102, 179,  92,  14, 106,  71, 188,  20, 102, 121])
```

So if the indices aren't the same, there is no reason to think that the two fits should always converge on the same solution, which explains why they don't always assign the same labels. As a solution we can either remove the exact-labels check or fix the random state.

I also had a go at changing some of the model parameters, such as a very small tolerance and increased batch sizes, but it still shows instabilities eventually.
A few things to do:

Ping @jnhansen
Cheers, I hadn't seen this issue before the ping. Will work on a fix, though possibly not before next week.
Thanks! As it is a blocker for an upcoming release, someone might beat you to it, but we'll see.
Is there any set of steps to reliably make the test fail? I haven't been able to reproduce the error so far; I have a feeling, though, that this is platform dependent. Fixing the random state to a value that is known to deliver correct results feels a bit like cheating, and I'd raise the question of whether the test is at all appropriate for MiniBatchKMeans.
An alternative is to check that it passes for a reasonable percentage of random states. But it may be simpler to check that by hand, then fix one value here.
@jnhansen Here is the script I used to reproduce the error (on Windows)
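The original script is not preserved in this thread. A hypothetical sketch of such a reproduction loop (the dataset, weights, and parameters below are my own illustrative assumptions, not the original ones) might look like:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Build a small integer-weighted dataset and its "repeated" equivalent,
# mirroring what test_weighted_vs_repeated compares.
rng = np.random.RandomState(0)
X = rng.rand(100, 2)
sample_weight = rng.randint(1, 5, size=100)
X_repeated = np.repeat(X, sample_weight, axis=0)

n_trials = 20
mismatches = 0
for seed in range(n_trials):
    est_w = MiniBatchKMeans(n_clusters=5, random_state=seed).fit(
        X, sample_weight=sample_weight)
    est_r = MiniBatchKMeans(n_clusters=5, random_state=seed).fit(X_repeated)
    # The flaky assertion: the weighted fit's labels, repeated per weight,
    # should match the labels of the fit on the physically repeated data.
    if not np.array_equal(np.repeat(est_w.labels_, sample_weight),
                          est_r.labels_):
        mismatches += 1
print(f"{mismatches}/{n_trials} seeds produced differing labelings")
```

Running a loop like this over many seeds shows how often the exact-labels comparison fails rather than relying on whichever seed CI happens to draw.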
At a glance, I tend to agree with @g-walsh's comment (#11236 (comment)). Fixing the random state seems an acceptable solution from my side.
It might be reasonable to skip the per-point label check as well, so removing that check for MiniBatchKMeans also seems acceptable from my side. (It seems reasonable for MiniBatchKMeans to reach suboptimal results sometimes.)
I would rather fix the random seed and maybe relax the condition to have a v_measure over 0.9 instead of exact label equality.
Would we expect exact consistency with an infinite batch_size? Is this an appropriate test?
For MiniBatchKMeans, maybe not from my side. I've proposed #11523, which uses a fixed random state to avoid the test failure. To further make the test appropriate, a possible way might be @ogrisel's suggestion: require a v_measure over 0.9 for the MiniBatchKMeans case.
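The relaxed check could look roughly like the sketch below (the blobs dataset and parameters are illustrative assumptions; v_measure_score is invariant to permuted cluster indices, which also sidesteps label-numbering mismatches):

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import v_measure_score

# Well-separated blobs, so both fits recover essentially the same clustering.
X, _ = make_blobs(n_samples=150, centers=[[0, 0], [10, 10], [-10, 10]],
                  cluster_std=0.5, random_state=0)
weights = np.random.RandomState(0).randint(1, 4, size=150)
X_repeated = np.repeat(X, weights, axis=0)

km_w = MiniBatchKMeans(n_clusters=3, random_state=42).fit(
    X, sample_weight=weights)
km_r = MiniBatchKMeans(n_clusters=3, random_state=42).fit(X_repeated)

# Relaxed check: instead of exact label equality, require strong agreement
# between the two clusterings.
score = v_measure_score(np.repeat(km_w.labels_, weights), km_r.labels_)
print(f"v_measure: {score:.3f}")
```

A test would then assert `score > 0.9` rather than elementwise label equality, tolerating the occasional suboptimal mini-batch run.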
@jnothman I think we wouldn't expect consistency here, as I think the issue with the sampling of the mini-batches comes from the difference in sample sizes. Also, with an infinite batch size, I think it would be very similar to testing standard KMeans anyway, rather than MiniBatchKMeans.
We might have a test which is somehow flaky:
This is the second time that I got a CI failure on this one. I could not find any other issue related to that.