Test test_weighted_vs_repeated is somehow flaky · Issue #11236 · scikit-learn/scikit-learn

Test test_weighted_vs_repeated is somehow flaky #11236
Closed
@glemaitre

Description

We might have a test which is somehow flaky:

[00:11:40] ================================== FAILURES ===================================
[00:11:40] __________________________ test_weighted_vs_repeated __________________________
[00:11:40] 
[00:11:40]     def test_weighted_vs_repeated():
[00:11:40]         # a sample weight of N should yield the same result as an N-fold
[00:11:40]         # repetition of the sample
[00:11:40]         sample_weight = np.random.randint(1, 5, size=n_samples)
[00:11:40]         X_repeat = np.repeat(X, sample_weight, axis=0)
[00:11:40]         estimators = [KMeans(init="k-means++", n_clusters=n_clusters,
[00:11:40]                              random_state=42),
[00:11:40]                       KMeans(init="random", n_clusters=n_clusters,
[00:11:40]                              random_state=42),
[00:11:40]                       KMeans(init=centers.copy(), n_clusters=n_clusters,
[00:11:40]                              random_state=42),
[00:11:40]                       MiniBatchKMeans(n_clusters=n_clusters, batch_size=10,
[00:11:40]                                       random_state=42)]
[00:11:40]         for estimator in estimators:
[00:11:40]             est_weighted = clone(estimator).fit(X, sample_weight=sample_weight)
[00:11:40]             est_repeated = clone(estimator).fit(X_repeat)
[00:11:40]             repeated_labels = np.repeat(est_weighted.labels_, sample_weight)
[00:11:40]             assert_almost_equal(v_measure_score(est_repeated.labels_,
[00:11:40] >                                               repeated_labels), 1.0)
[00:11:40] E           AssertionError: 
[00:11:40] E           Arrays are not almost equal to 7 decimals
[00:11:40] E            ACTUAL: 0.95443625305609903
[00:11:40] E            DESIRED: 1.0
[00:11:40] 
[00:11:40] X_repeat   = array([[ 0.1777796 ,  0.24368721,  0.24496657,  4.49305682,  0.52896169],
[00:11:40]        [ 0.41278093,  5.82206016,  1.8967929...367, -0.56629773,  0.09965137, -0.50347565],
[00:11:40]        [ 2.19045563,  4.00946367, -0.56629773,  0.09965137, -0.50347565]])
[00:11:40] est_repeated = MiniBatchKMeans(batch_size=10, compute_labels=True, init='k-means++',
[00:11:40]         init_size=None, max_iter=100, max_no_improvement=10, n_clusters=3,
[00:11:40]         n_init=3, random_state=42, reassignment_ratio=0.01, tol=0.0,
[00:11:40]         verbose=0)
[00:11:40] est_weighted = MiniBatchKMeans(batch_size=10, compute_labels=True, init='k-means++',
[00:11:40]         init_size=None, max_iter=100, max_no_improvement=10, n_clusters=3,
[00:11:40]         n_init=3, random_state=42, reassignment_ratio=0.01, tol=0.0,
[00:11:40]         verbose=0)
[00:11:40] estimator  = MiniBatchKMeans(batch_size=10, compute_labels=True, init='k-means++',
[00:11:40]         init_size=None, max_iter=100, max_no_improvement=10, n_clusters=3,
[00:11:40]         n_init=3, random_state=42, reassignment_ratio=0.01, tol=0.0,
[00:11:40]         verbose=0)
[00:11:40] estimators = [KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
[00:11:40]     n_clusters=3, n_init=10, n_jobs=1, precompu..._improvement=10, n_clusters=3,
[00:11:40]         n_init=3, random_state=42, reassignment_ratio=0.01, tol=0.0,
[00:11:40]         verbose=0)]
[00:11:40] repeated_labels = array([1, 2, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1,
[00:11:40]        1, 0, 0, 0, 0, 2, 1, 1, 0, 2, 2, 2,...1, 2, 2, 2, 2, 2,
[00:11:40]        2, 0, 0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1,
[00:11:40]        0, 0, 2, 2, 2, 2])
[00:11:40] sample_weight = array([1, 1, 4, 1, 1, 2, 2, 2, 2, 4, 4, 4, 1, 2, 1, 4, 3, 4, 1, 1, 4, 2, 3,
[00:11:40]        3, 1, 4, 3, 1, 2, 4, 1, 4, 2, 4, 4,...3, 4, 4, 3,
[00:11:40]        4, 2, 1, 4, 2, 4, 4, 2, 2, 3, 3, 1, 4, 1, 3, 1, 2, 2, 3, 2, 2, 4, 3,
[00:11:40]        3, 3, 4, 3, 2, 4, 2, 4])
[00:11:40] 
[00:11:40] c:\python36\lib\site-packages\sklearn\cluster\tests\test_k_means.py:935: AssertionError

This is the second time that CI has failed on this test for me. I could not find any other issue related to it.
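One thing visible in the traceback is that `sample_weight` is drawn with the global `np.random.randint`, with no seed set inside the test body, so the weights may differ from run to run (unless a seed is set elsewhere in the module, which the log does not show). A minimal sketch of drawing the weights from a local `RandomState` instead, so that a failing run can be replayed. The seed, `n_samples = 100`, and the synthetic `X` are assumptions for illustration only, not values taken from the test file:

```python
import numpy as np

# Assumed values for illustration; the real test module defines its own.
n_samples, n_features = 100, 5

# Draw the weights from a local, seeded generator instead of the global one.
rng = np.random.RandomState(42)
sample_weight = rng.randint(1, 5, size=n_samples)

# A second generator with the same seed reproduces exactly the same weights.
sample_weight_again = np.random.RandomState(42).randint(1, 5, size=n_samples)
same_weights = np.array_equal(sample_weight, sample_weight_again)

# Synthetic stand-in for the test's X; each row is repeated weight-many times.
X = rng.normal(size=(n_samples, n_features))
X_repeat = np.repeat(X, sample_weight, axis=0)
```

This only pins down the weights. It would not by itself guarantee that the weighted and repeated `MiniBatchKMeans` fits agree, since `X` and `X_repeat` have different numbers of rows.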
