Test test_weighted_vs_repeated is somehow flaky · Issue #11236 · scikit-learn/scikit-learn · GitHub
Closed
glemaitre opened this issue Jun 11, 2018 · 12 comments · Fixed by #11523

@glemaitre
Member

We might have a test which is somehow flaky:

[00:11:40] ================================== FAILURES ===================================
[00:11:40] __________________________ test_weighted_vs_repeated __________________________
[00:11:40] 
[00:11:40]     def test_weighted_vs_repeated():
[00:11:40]         # a sample weight of N should yield the same result as an N-fold
[00:11:40]         # repetition of the sample
[00:11:40]         sample_weight = np.random.randint(1, 5, size=n_samples)
[00:11:40]         X_repeat = np.repeat(X, sample_weight, axis=0)
[00:11:40]         estimators = [KMeans(init="k-means++", n_clusters=n_clusters,
[00:11:40]                              random_state=42),
[00:11:40]                       KMeans(init="random", n_clusters=n_clusters,
[00:11:40]                              random_state=42),
[00:11:40]                       KMeans(init=centers.copy(), n_clusters=n_clusters,
[00:11:40]                              random_state=42),
[00:11:40]                       MiniBatchKMeans(n_clusters=n_clusters, batch_size=10,
[00:11:40]                                       random_state=42)]
[00:11:40]         for estimator in estimators:
[00:11:40]             est_weighted = clone(estimator).fit(X, sample_weight=sample_weight)
[00:11:40]             est_repeated = clone(estimator).fit(X_repeat)
[00:11:40]             repeated_labels = np.repeat(est_weighted.labels_, sample_weight)
[00:11:40]             assert_almost_equal(v_measure_score(est_repeated.labels_,
[00:11:40] >                                               repeated_labels), 1.0)
[00:11:40] E           AssertionError: 
[00:11:40] E           Arrays are not almost equal to 7 decimals
[00:11:40] E            ACTUAL: 0.95443625305609903
[00:11:40] E            DESIRED: 1.0
[00:11:40] 
[00:11:40] X_repeat   = array([[ 0.1777796 ,  0.24368721,  0.24496657,  4.49305682,  0.52896169],
[00:11:40]        [ 0.41278093,  5.82206016,  1.8967929...367, -0.56629773,  0.09965137, -0.50347565],
[00:11:40]        [ 2.19045563,  4.00946367, -0.56629773,  0.09965137, -0.50347565]])
[00:11:40] est_repeated = MiniBatchKMeans(batch_size=10, compute_labels=True, init='k-means++',
[00:11:40]         init_size=None, max_iter=100, max_no_improvement=10, n_clusters=3,
[00:11:40]         n_init=3, random_state=42, reassignment_ratio=0.01, tol=0.0,
[00:11:40]         verbose=0)
[00:11:40] est_weighted = MiniBatchKMeans(batch_size=10, compute_labels=True, init='k-means++',
[00:11:40]         init_size=None, max_iter=100, max_no_improvement=10, n_clusters=3,
[00:11:40]         n_init=3, random_state=42, reassignment_ratio=0.01, tol=0.0,
[00:11:40]         verbose=0)
[00:11:40] estimator  = MiniBatchKMeans(batch_size=10, compute_labels=True, init='k-means++',
[00:11:40]         init_size=None, max_iter=100, max_no_improvement=10, n_clusters=3,
[00:11:40]         n_init=3, random_state=42, reassignment_ratio=0.01, tol=0.0,
[00:11:40]         verbose=0)
[00:11:40] estimators = [KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
[00:11:40]     n_clusters=3, n_init=10, n_jobs=1, precompu..._improvement=10, n_clusters=3,
[00:11:40]         n_init=3, random_state=42, reassignment_ratio=0.01, tol=0.0,
[00:11:40]         verbose=0)]
[00:11:40] repeated_labels = array([1, 2, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1,
[00:11:40]        1, 0, 0, 0, 0, 2, 1, 1, 0, 2, 2, 2,...1, 2, 2, 2, 2, 2,
[00:11:40]        2, 0, 0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1,
[00:11:40]        0, 0, 2, 2, 2, 2])
[00:11:40] sample_weight = array([1, 1, 4, 1, 1, 2, 2, 2, 2, 4, 4, 4, 1, 2, 1, 4, 3, 4, 1, 1, 4, 2, 3,
[00:11:40]        3, 1, 4, 3, 1, 2, 4, 1, 4, 2, 4, 4,...3, 4, 4, 3,
[00:11:40]        4, 2, 1, 4, 2, 4, 4, 2, 2, 3, 3, 1, 4, 1, 3, 1, 2, 2, 3, 2, 2, 4, 3,
[00:11:40]        3, 3, 4, 3, 2, 4, 2, 4])
[00:11:40] 
[00:11:40] c:\python36\lib\site-packages\sklearn\cluster\tests\test_k_means.py:935: AssertionError

This is the second time I've seen CI fail on this one. I could not find any other issue related to it.

@rth
Member
rth commented Jun 11, 2018

Sounds like this test fails for some sample_weight values (the test doesn't use a fixed RNG seed)?

A short-term solution could be to generate sample_weight with RandomState.randint. Do you think it's a numerical instability issue that makes it fail, or a genuine test failure?
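
A minimal sketch of that short-term fix, in the test's own terms (n_samples is the module-level constant in test_k_means.py, here 100; the seed value 42 is just illustrative):

import numpy as np

n_samples = 100  # module-level constant in sklearn/cluster/tests/test_k_means.py

# Draw the weights from a seeded RandomState instead of the global RNG,
# so the test sees the same sample_weight on every run.
rng = np.random.RandomState(42)
sample_weight = rng.randint(1, 5, size=n_samples)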

@g-walsh
Contributor
g-walsh commented Jun 19, 2018

Hey, I've just taken a look at this and I think the error is coming from the MiniBatchKMeans case, specifically from the way MiniBatchKMeans draws its mini-batch samples.

Here we have two datasets that we fit the same model to, with the same parameters, and we check that the output is the same. One is of length n_samples (here 100) and the other is longer, its length extended by sample_weight (here from 100 up to 400 samples, depending on the number of repetitions).

If the sample sizes differ, MiniBatchKMeans will draw its mini-batch samples differently even with the same RandomState. For example:

rng1 = np.random.RandomState(42)
rng2 = np.random.RandomState(42)
rng3 = np.random.RandomState(42)

rng1.randint(0, 100, 10)
array([51, 92, 14, 71, 60, 20, 82, 86, 74, 74])

rng2.randint(0, 100, 10)
array([51, 92, 14, 71, 60, 20, 82, 86, 74, 74])

rng3.randint(0, 250, 10)
array([102, 179,  92,  14, 106,  71, 188,  20, 102, 121])

So if the sampled indices aren't the same, there is no reason to expect the two fits to always converge on the same solution, which explains why they don't always produce the same labels.

As a solution we can either remove MiniBatchKMeans from this test and test it separately, or, as @rth says, just pass in a known sample_weight that works. I've tested generating it with RandomState(42).randint, which works for all the tests. I'm happy to put in a PR to add this if that is what is decided.

I also had a go at changing some of the model parameters, such as a very small tolerance and increased batch sizes, but instabilities still show up eventually.

@jnothman
Member
jnothman commented Jul 4, 2018

A few things to do:

  • work out what in KMeans might distinguish between sample_weight and repetition
  • if we expect a small amount of variation, we should reconsider whether v_measure is too brittle a metric
  • fix the random_state

Ping @jnhansen

@jnhansen
Contributor
jnhansen commented Jul 4, 2018

Cheers, I hadn't seen this issue before the ping. Will work on a fix, though possibly not before next week.

@jnothman
Member
jnothman commented Jul 4, 2018 via email

@jnhansen
Contributor
jnhansen commented Jul 8, 2018

Is there any set of steps to reliably make the test fail? I haven't been able to reproduce the error so far.

I have a feeling, though, that this is because MiniBatchKMeans is essentially an approximation of KMeans and is therefore less reproducible if, e.g., the weighted and repeated samples end up in different batches.

Fixing the random state to a value that is known to deliver correct results feels a bit like cheating, and I'd question whether the test is appropriate for MiniBatchKMeans at all if it is not expected to pass consistently. Thoughts?

@jnothman
Member
jnothman commented Jul 8, 2018 via email

@qinhanmin2014
Member

@jnhansen Here is the script I used to reproduce the error (on Windows):

import numpy as np
from numpy.testing import assert_almost_equal
from sklearn.base import clone
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import v_measure_score
# X, n_samples, n_clusters: module-level fixtures in sklearn/cluster/tests/test_k_means.py
from sklearn.cluster.tests.test_k_means import X, n_clusters, n_samples

np.random.seed(973)
sample_weight = np.random.randint(1, 5, size=n_samples)
X_repeat = np.repeat(X, sample_weight, axis=0)
estimator = MiniBatchKMeans(n_clusters=n_clusters, batch_size=10, random_state=42)
est_weighted = clone(estimator).fit(X, sample_weight=sample_weight)
est_repeated = clone(estimator).fit(X_repeat)
repeated_labels = np.repeat(est_weighted.labels_, sample_weight)
assert_almost_equal(v_measure_score(est_repeated.labels_, repeated_labels), 1.0)
AssertionError: 
Arrays are not almost equal to 7 decimals
 ACTUAL: 0.9391497686321659
 DESIRED: 1.0

At a glance, I tend to agree with @g-walsh's comment (#11236 (comment)). Fixing the random state seems an acceptable solution from my side.
However, since you've already skipped the check for cluster centers:

if not isinstance(estimator, MiniBatchKMeans):
    assert_almost_equal(_sort_centers(est_weighted.cluster_centers_),
                        _sort_centers(est_repeated.cluster_centers_))

It might be reasonable to skip the check for the labels of each point as well, so removing MiniBatchKMeans also seems acceptable from my side. (It seems reasonable for MiniBatchKMeans to reach a suboptimal result sometimes.)
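
A minimal sketch of that option, mirroring the existing guard used for the cluster centers (the loop variables are those of the current test):

if not isinstance(estimator, MiniBatchKMeans):
    # Only require exact label agreement for the exact KMeans variants;
    # MiniBatchKMeans would be exercised separately (or not at all here).
    assert_almost_equal(v_measure_score(est_repeated.labels_,
                                        repeated_labels), 1.0)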

@ogrisel
Member
ogrisel commented Jul 14, 2018

I would rather fix the random seed and maybe relax the condition to have a v_measure over 0.9 for the MiniBatchKMeans case. I think it's still useful to test that repeated samples and sample weights are approximately equivalent with MiniBatchKMeans.
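
A minimal sketch of that relaxed check inside the test loop (the 0.9 threshold comes from the comment above; the loop variables are those of the existing test body):

for estimator in estimators:
    est_weighted = clone(estimator).fit(X, sample_weight=sample_weight)
    est_repeated = clone(estimator).fit(X_repeat)
    repeated_labels = np.repeat(est_weighted.labels_, sample_weight)
    score = v_measure_score(est_repeated.labels_, repeated_labels)
    if isinstance(estimator, MiniBatchKMeans):
        # MiniBatchKMeans is a stochastic approximation of KMeans, so only
        # require approximate equivalence of weighting and repetition.
        assert score > 0.9
    else:
        assert_almost_equal(score, 1.0)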

@jnothman
Member
jnothman commented Jul 16, 2018 via email

@qinhanmin2014
Member

would we expect exact consistency with an infinite batch_size? is this an appropriate test?

For MiniBatchKMeans, maybe not from my side. I've proposed #11523, which uses a fixed random state to avoid the test failure. To make the test more appropriate, a possible way might be @ogrisel's suggestion: require a v_measure over 0.9 for the MiniBatchKMeans case.

@g-walsh
Contributor
g-walsh commented Jul 16, 2018

@jnothman I don't think we would expect consistency here, as the issue with the mini-batch sampling comes from the difference in n_samples between the weighted and repeated datasets rather than from the batch size. There will always be a difference in n_samples (unless there is no weighting at all), which means the sampling will never be identical (hence the occasional test failures).

Also, with an infinite batch size I think it would effectively be testing standard KMeans rather than MiniBatchKMeans anyway.
