FIX Test failures in MacPython nightly builds · Issue #14192 · scikit-learn/scikit-learn

Closed · rth opened this issue Jun 26, 2019 · 4 comments
rth (Member) commented Jun 26, 2019

There are a few test failures in the https://github.com/MacPython/scikit-learn-wheels/commits/master cron job that need fixing before the 0.21.3 release (#14188). Currently, at least the following fail:

=================================== FAILURES ===================================
_______________________________ test_extract_xi ________________________________
    def test_extract_xi():
        # small and easy test (no clusters around other clusters)
        # but with a clear noise data.
        rng = np.random.RandomState(0)
        n_points_per_cluster = 5
    
        C1 = [-5, -2] + .8 * rng.randn(n_points_per_cluster, 2)
        C2 = [4, -1] + .1 * rng.randn(n_points_per_cluster, 2)
        C3 = [1, -2] + .2 * rng.randn(n_points_per_cluster, 2)
        C4 = [-2, 3] + .3 * rng.randn(n_points_per_cluster, 2)
        C5 = [3, -2] + .6 * rng.randn(n_points_per_cluster, 2)
        C6 = [5, 6] + .2 * rng.randn(n_points_per_cluster, 2)
    
        X = np.vstack((C1, C2, C3, C4, C5, np.array([[100, 100]]), C6))
        expected_labels = np.r_[[2] * 5, [0] * 5, [1] * 5, [3] * 5, [1] * 5,
                                -1, [4] * 5]
        X, expected_labels = shuffle(X, expected_labels, random_state=rng)
    
        clust = OPTICS(min_samples=3, min_cluster_size=2,
                       max_eps=20, cluster_method='xi',
                       xi=0.4).fit(X)
        assert_array_equal(clust.labels_, expected_labels)
    
        X = np.vstack((C1, C2, C3, C4, C5, np.array([[100, 100]] * 2), C6))
        expected_labels = np.r_[[1] * 5, [3] * 5, [2] * 5, [0] * 5, [2] * 5,
                                -1, -1, [4] * 5]
        X, expected_labels = shuffle(X, expected_labels, random_state=rng)
    
        clust = OPTICS(min_samples=3, min_cluster_size=3,
                       max_eps=20, cluster_method='xi',
                       xi=0.1).fit(X)
        # this may fail if the predecessor correction is not at work!
>       assert_array_equal(clust.labels_, expected_labels)
E       AssertionError: 
E       Arrays are not equal
E       
E       Mismatch: 18.8%
E       Max absolute difference: 3
E       Max relative difference: nan
E        x: array([ 0,  0, -1, -1,  1,  3,  3,  2,  0,  3,  3, -1,  1,  1, -1,  2, -1,
E               4,  0, -1,  4,  0,  4,  2, -1,  1,  1,  4,  2,  3,  4, -1])
E        y: array([ 0,  0,  2,  2,  1,  3,  3,  2,  0,  3,  3,  2,  1,  1,  2,  2, -1,
E               4,  0,  2,  4,  0,  4,  2, -1,  1,  1,  4,  2,  3,  4,  2])
C1         = array([[-3.58875812, -1.67987423],
       [-4.21700961, -0.20728544],
       [-3.50595361, -2.7818223 ],
       [-4.23992927, -2.12108577],
       [-5.08257508, -1.6715212 ]])
C2         = array([[ 4.01440436, -0.85457265],
       [ 4.07610377, -0.9878325 ],
       [ 4.04438632, -0.96663257],
       [ 4.14940791, -1.02051583],
       [ 4.03130677, -1.08540957]])
C3         = array([[ 0.48940204, -1.86927628],
       [ 1.17288724, -2.148433  ],
       [ 1.45395092, -2.29087313],
       [ 1.0091517 , -2.03743677],
       [ 1.30655584, -1.70612825]])
C4         = array([[-1.95351577,  3.11344876],
       [-2.26633572,  2.40576106],
       [-2.10437364,  3.04690469],
       [-1.6309128 ,  3.36071395],
       [-2.11619805,  2.90930917]])
C5         = array([[ 2.37086822, -2.85201076],
       [ 1.97623789, -0.82953476],
       [ 2.69420869, -2.26284458],
       [ 2.24832278, -1.53350579],
       [ 2.03166129, -2.12764417]])
C6         = array([[4.82090669, 6.0773805 ],
       [4.89783897, 5.76387356],
       [4.99436355, 6.08566637],
       [5.01330344, 6.06049438],
       [4.87313558, 5.92745177]])
X          = array([[ -2.10437364,   3.04690469],
       [ -1.95351577,   3.11344876],
       [  2.24832278,  -1.53350579],
       ...43677],
       [  4.14940791,  -1.02051583],
       [  4.89783897,   5.76387356],
       [  2.03166129,  -2.12764417]])
clust      = OPTICS(algorithm='auto', cluster_method='xi', eps=None, leaf_size=30,
       max_eps=20, metric='minkowski', metric_params=None, min_cluster_size=3,
       min_samples=3, n_jobs=None, p=2, predecessor_correction=True, xi=0.1)
expected_labels = array([ 0,  0,  2,  2,  1,  3,  3,  2,  0,  3,  3,  2,  1,  1,  2,  2, -1,
        4,  0,  2,  4,  0,  4,  2, -1,  1,  1,  4,  2,  3,  4,  2])
n_points_per_cluster = 5
rng        = <mtrand.RandomState object at 0xe7d444dc>
/venv/lib/python3.6/site-packages/sklearn/cluster/tests/test_optics.py:114: AssertionError
___________________ test_zero_variance_floating_point_error ____________________
    def test_zero_variance_floating_point_error():
        # Test that VarianceThreshold(0.0).fit eliminates features that have
        # the same value in every sample, even when floating point errors
        # cause np.var not to be 0 for the feature.
        # See #13691
    
        data = [[-0.13725701]] * 10
>       assert np.var(data) != 0
E       assert 0.0 != 0
E        +  where 0.0 = <function var at 0xf37f58e4>([[-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], ...])
E        +    where <function var at 0xf37f58e4> = np.var
data       = [[-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], ...]

(I haven't checked all the builds, so this list might need updating.)

It might also be good (though less critical) to wait until the ARM and PPC builds succeed at conda-forge/scikit-learn-feedstock#98 and fix any tests that fail there.
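For the second failure: whether `np.var` of identical values is exactly zero depends on how the platform rounds the intermediate mean (the failing build is a 32-bit Python; note the `0xe7d444dc`-style addresses in the traceback above). A minimal demonstration of the effect, not tied to any particular platform:

```python
import numpy as np

# 10 identical samples of one feature, as in the failing test.
data = [[-0.13725701]] * 10

# Summing n identical doubles and dividing by n need not round back
# to the exact input value, so the variance can be a tiny nonzero
# number -- or exactly zero -- depending on the platform's summation
# and rounding behaviour (e.g. x87 extended precision on 32-bit x86).
v = np.var(data)
print(v)  # tiny (on the order of 1e-34) on some platforms, exactly 0.0 on others

# Either way it is non-negative and negligibly small.
assert 0.0 <= v < 1e-30
```

This is exactly why the test's precondition `assert np.var(data) != 0` cannot hold on every platform.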

@rth rth added this to the 0.20.4 milestone Jun 26, 2019
rth (Member, Author) commented Jun 26, 2019

Some of these are failures on master that may not affect 0.21.3, but it would still be good to fix them.

jnothman (Member) commented:

Thanks for this. In test_zero_variance_floating_point_error, assert np.var(data) != 0 just checks a baseline condition without which the test is useless. It could become if np.var(data) == 0: raise SkipTest(), so that code coverage of the test files would let us ensure the condition is at least true on some platforms?
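A sketch of that suggestion (using pytest.skip here rather than raising SkipTest directly; the skip message is illustrative, not from the issue):

```python
import numpy as np
import pytest

def test_zero_variance_floating_point_error():
    # Constant features should be removed by VarianceThreshold(0.0)
    # even when floating point error makes np.var slightly nonzero.
    # See scikit-learn #13691.
    data = [[-0.13725701]] * 10

    # On platforms where the variance happens to be exactly 0, the
    # interesting code path is not exercised, so skip rather than fail.
    if np.var(data) == 0:
        pytest.skip("np.var(data) is exactly 0 on this platform; "
                    "nothing to test here")

    # ... the rest of the original test would follow here ...
```

With this shape, coverage reports over the test files would show whether the non-skipping branch ever runs in CI.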

rth modified the milestones: 0.20.4, 0.21.3 Jun 26, 2019
qinhanmin2014 (Member) commented:
test_extract_xi is known to be unstable and we've skipped it in the 0.20.X branch, so this won't block the release; see #13739.
I should take some time to look into it, but I'm still under treatment o(╥﹏╥)o
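For context, a minimal standalone fit of the kind that test exercises (the cluster geometry here is invented for illustration; the real test uses the six clusters shown in the traceback above):

```python
import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.RandomState(0)
# Two tight, well-separated blobs plus one distant point that the xi
# extraction should treat as noise (invented data, not the test's).
X = np.vstack([
    [0, 0] + 0.2 * rng.randn(20, 2),
    [5, 5] + 0.2 * rng.randn(20, 2),
    [[100.0, 100.0]],
])

# predecessor_correction=True is the setting the failing assertion
# depends on ("this may fail if the predecessor correction is not at
# work!" in the test above).
clust = OPTICS(min_samples=5, cluster_method='xi', xi=0.05,
               predecessor_correction=True).fit(X)

# One integer label per sample; -1 marks noise in the xi extraction.
assert clust.labels_.shape == (41,)
```

The instability discussed here comes from the xi cluster extraction being sensitive to the ordering of points with near-equal reachability, which is why the expected labels can differ across platforms.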

qinhanmin2014 (Member) commented:

I guess we can close?
