FIX Test failures in MacPython nightly builds · Issue #14192 · scikit-learn/scikit-learn

Closed · rth opened this issue Jun 26, 2019 · 4 comments
rth (Member) commented Jun 26, 2019

There are a few test failures in the https://github.com/MacPython/scikit-learn-wheels/commits/master cron job that need fixing before the 0.21.3 release (#14188). Currently, at least the following fail:

=================================== FAILURES ===================================
_______________________________ test_extract_xi ________________________________
    def test_extract_xi():
        # small and easy test (no clusters around other clusters)
        # but with a clear noise data.
        rng = np.random.RandomState(0)
        n_points_per_cluster = 5
    
        C1 = [-5, -2] + .8 * rng.randn(n_points_per_cluster, 2)
        C2 = [4, -1] + .1 * rng.randn(n_points_per_cluster, 2)
        C3 = [1, -2] + .2 * rng.randn(n_points_per_cluster, 2)
        C4 = [-2, 3] + .3 * rng.randn(n_points_per_cluster, 2)
        C5 = [3, -2] + .6 * rng.randn(n_points_per_cluster, 2)
        C6 = [5, 6] + .2 * rng.randn(n_points_per_cluster, 2)
    
        X = np.vstack((C1, C2, C3, C4, C5, np.array([[100, 100]]), C6))
        expected_labels = np.r_[[2] * 5, [0] * 5, [1] * 5, [3] * 5, [1] * 5,
                                -1, [4] * 5]
        X, expected_labels = shuffle(X, expected_labels, random_state=rng)
    
        clust = OPTICS(min_samples=3, min_cluster_size=2,
                       max_eps=20, cluster_method='xi',
                       xi=0.4).fit(X)
        assert_array_equal(clust.labels_, expected_labels)
    
        X = np.vstack((C1, C2, C3, C4, C5, np.array([[100, 100]] * 2), C6))
        expected_labels = np.r_[[1] * 5, [3] * 5, [2] * 5, [0] * 5, [2] * 5,
                                -1, -1, [4] * 5]
        X, expected_labels = shuffle(X, expected_labels, random_state=rng)
    
        clust = OPTICS(min_samples=3, min_cluster_size=3,
                       max_eps=20, cluster_method='xi',
                       xi=0.1).fit(X)
        # this may fail if the predecessor correction is not at work!
>       assert_array_equal(clust.labels_, expected_labels)
E       AssertionError: 
E       Arrays are not equal
E       
E       Mismatch: 18.8%
E       Max absolute difference: 3
E       Max relative difference: nan
E        x: array([ 0,  0, -1, -1,  1,  3,  3,  2,  0,  3,  3, -1,  1,  1, -1,  2, -1,
E               4,  0, -1,  4,  0,  4,  2, -1,  1,  1,  4,  2,  3,  4, -1])
E        y: array([ 0,  0,  2,  2,  1,  3,  3,  2,  0,  3,  3,  2,  1,  1,  2,  2, -1,
E               4,  0,  2,  4,  0,  4,  2, -1,  1,  1,  4,  2,  3,  4,  2])
C1         = array([[-3.58875812, -1.67987423],
       [-4.21700961, -0.20728544],
       [-3.50595361, -2.7818223 ],
       [-4.23992927, -2.12108577],
       [-5.08257508, -1.6715212 ]])
C2         = array([[ 4.01440436, -0.85457265],
       [ 4.07610377, -0.9878325 ],
       [ 4.04438632, -0.96663257],
       [ 4.14940791, -1.02051583],
       [ 4.03130677, -1.08540957]])
C3         = array([[ 0.48940204, -1.86927628],
       [ 1.17288724, -2.148433  ],
       [ 1.45395092, -2.29087313],
       [ 1.0091517 , -2.03743677],
       [ 1.30655584, -1.70612825]])
C4         = array([[-1.95351577,  3.11344876],
       [-2.26633572,  2.40576106],
       [-2.10437364,  3.04690469],
       [-1.6309128 ,  3.36071395],
       [-2.11619805,  2.90930917]])
C5         = array([[ 2.37086822, -2.85201076],
       [ 1.97623789, -0.82953476],
       [ 2.69420869, -2.26284458],
       [ 2.24832278, -1.53350579],
       [ 2.03166129, -2.12764417]])
C6         = array([[4.82090669, 6.0773805 ],
       [4.89783897, 5.76387356],
       [4.99436355, 6.08566637],
       [5.01330344, 6.06049438],
       [4.87313558, 5.92745177]])
X          = array([[ -2.10437364,   3.04690469],
       [ -1.95351577,   3.11344876],
       [  2.24832278,  -1.53350579],
       ...43677],
       [  4.14940791,  -1.02051583],
       [  4.89783897,   5.76387356],
       [  2.03166129,  -2.12764417]])
clust      = OPTICS(algorithm='auto', cluster_method='xi', eps=None, leaf_size=30,
       max_eps=20, metric='minkowski', metric_params=None, min_cluster_size=3,
       min_samples=3, n_jobs=None, p=2, predecessor_correction=True, xi=0.1)
expected_labels = array([ 0,  0,  2,  2,  1,  3,  3,  2,  0,  3,  3,  2,  1,  1,  2,  2, -1,
        4,  0,  2,  4,  0,  4,  2, -1,  1,  1,  4,  2,  3,  4,  2])
n_points_per_cluster = 5
rng        = <mtrand.RandomState object at 0xe7d444dc>
/venv/lib/python3.6/site-packages/sklearn/cluster/tests/test_optics.py:114: AssertionError
___________________ test_zero_variance_floating_point_error ____________________
    def test_zero_variance_floating_point_error():
        # Test that VarianceThreshold(0.0).fit eliminates features that have
        # the same value in every sample, even when floating point errors
        # cause np.var not to be 0 for the feature.
        # See #13691
    
        data = [[-0.13725701]] * 10
>       assert np.var(data) != 0
E       assert 0.0 != 0
E        +  where 0.0 = <function var at 0xf37f58e4>([[-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], ...])
E        +    where <function var at 0xf37f58e4> = np.var
data       = [[-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], [-0.13725701], ...]

(I haven't checked all the builds, so this list might need updating.)

It might also be good (though less critical) to wait until the ARM and PPC builds succeed at conda-forge/scikit-learn-feedstock#98 and fix any tests that fail there.
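For the second failure: whether `np.var` of identical values is exactly zero depends on how the platform rounds the intermediate mean (the failing build is a 32-bit Python; note the `0xe7d444dc`-style addresses in the traceback above). A minimal demonstration of the effect, not tied to any particular platform:

```python
import numpy as np

# 10 identical samples of one feature, as in the failing test.
data = [[-0.13725701]] * 10

# Summing n identical doubles and dividing by n need not round back
# to the exact input value, so the variance can be a tiny nonzero
# number -- or exactly zero -- depending on the platform's summation
# and rounding behaviour (e.g. x87 extended precision on 32-bit x86).
v = np.var(data)
print(v)  # tiny (on the order of 1e-34) on some platforms, exactly 0.0 on others

# Either way it is non-negative and negligibly small.
assert 0.0 <= v < 1e-30
```

This is exactly why the test's precondition `assert np.var(data) != 0` cannot hold on every platform.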

@rth rth added this to the 0.20.4 milestone Jun 26, 2019
rth (Member, Author) commented Jun 26, 2019

Some of these are failures on master that may not affect 0.21.3, but it would still be good to fix them.

jnothman (Member) commented:

Thanks for this. In test_zero_variance_floating_point_error, assert np.var(data) != 0 just checks a baseline condition without which the test is useless. It could become if np.var(data) == 0: raise SkipTest(), so that code coverage of the test files would let us ensure the condition is at least true on some platforms?
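A sketch of that suggestion (using pytest.skip here rather than raising SkipTest directly; the skip message is illustrative, not from the issue):

```python
import numpy as np
import pytest

def test_zero_variance_floating_point_error():
    # Constant features should be removed by VarianceThreshold(0.0)
    # even when floating point error makes np.var slightly nonzero.
    # See scikit-learn #13691.
    data = [[-0.13725701]] * 10

    # On platforms where the variance happens to be exactly 0, the
    # interesting code path is not exercised, so skip rather than fail.
    if np.var(data) == 0:
        pytest.skip("np.var(data) is exactly 0 on this platform; "
                    "nothing to test here")

    # ... the rest of the original test would follow here ...
```

With this shape, coverage reports over the test files would show whether the non-skipping branch ever runs in CI.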

rth modified the milestones: 0.20.4, 0.21.3 Jun 26, 2019
qinhanmin2014 (Member) commented:
test_extract_xi is known to be unstable and we've skipped it in the 0.20.X branch, so this won't block the release; see #13739.
I should take some time to look into it, but I'm still under treatment o(╥﹏╥)o
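For context, a minimal standalone fit of the kind that test exercises (the cluster geometry here is invented for illustration; the real test uses the six clusters shown in the traceback above):

```python
import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.RandomState(0)
# Two tight, well-separated blobs plus one distant point that the xi
# extraction should treat as noise (invented data, not the test's).
X = np.vstack([
    [0, 0] + 0.2 * rng.randn(20, 2),
    [5, 5] + 0.2 * rng.randn(20, 2),
    [[100.0, 100.0]],
])

# predecessor_correction=True is the setting the failing assertion
# depends on ("this may fail if the predecessor correction is not at
# work!" in the test above).
clust = OPTICS(min_samples=5, cluster_method='xi', xi=0.05,
               predecessor_correction=True).fit(X)

# One integer label per sample; -1 marks noise in the xi extraction.
assert clust.labels_.shape == (41,)
```

The instability discussed here comes from the xi cluster extraction being sensitive to the ordering of points with near-equal reachability, which is why the expected labels can differ across platforms.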

qinhanmin2014 (Member) commented:

I guess we can close?
