Instability in test_ridge.py::test_ridge_sample_weights #11200


Closed

rth opened this issue Jun 4, 2018 · 4 comments · Fixed by #11587

Comments

@rth
Member
rth commented Jun 4, 2018

The test sklearn/linear_model/tests/test_ridge.py::test_ridge_sample_weights passes on master; however, when it was parametrized as part of #11074, failures were observed (#11074 (comment)).

The relevant diff can be found in master...rth:test_ridge_sample_weights-parametrization (whitespace is ignored in this diff), where the following runs fail,

alpha = 1.0, intercept = False, solver = 'lsqr', n_samples = 5, n_features = 10
alpha = 1.0, intercept = False, solver = 'sparse_cg', n_samples = 5, n_features = 10
alpha = 0.01, intercept = False, solver = 'sparse_cg', n_samples = 5, n_features = 10

(out of a total of 32 runs).

This means that the test is brittle and depends on the RNG state. Increasing the numerical tolerance might be a solution, or possibly increasing the number of samples?
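For context, the parametrized form of the test looks roughly like this (a sketch of the shape only, not the exact diff from #11074; it yields the 32 runs mentioned above):

import numpy as np
import pytest
from sklearn.linear_model import Ridge

@pytest.mark.parametrize('solver', ['svd', 'cholesky', 'lsqr', 'sparse_cg'])
@pytest.mark.parametrize('intercept', [True, False])
@pytest.mark.parametrize('alpha', [1.0, 1e-2])
@pytest.mark.parametrize('n_samples, n_features', [(6, 5), (5, 10)])
def test_ridge_sample_weights(n_samples, n_features, alpha, intercept, solver):
    # In this sketch each case draws its own data from a fresh RNG; the
    # actual diff may handle the RNG state differently.
    rng = np.random.RandomState(0)
    y = rng.randn(n_samples)
    X = rng.randn(n_samples, n_features)
    sample_weight = 1.0 + rng.rand(n_samples)
    est = Ridge(alpha=alpha, fit_intercept=intercept, solver=solver)
    est.fit(X, y, sample_weight=sample_weight)
    # ...followed by the comparison against the closed-form solution, as in
    # the snippet below.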

@lesteve
Member
lesteve commented Jun 6, 2018

Just a stand-alone snippet reproducing the problem; it is a small variation on test_ridge_sample_weights to estimate how sensitive the test is to its random state:

from itertools import product

import numpy as np

from sklearn.utils.testing import (assert_array_almost_equal,
                                   assert_almost_equal)
from sklearn.linear_model import Ridge
from scipy import linalg


def test_ridge_sample_weights(rng):
    param_grid = product((1.0, 1e-2), (True, False),
                         ('svd', 'cholesky', 'lsqr', 'sparse_cg'))

    for n_samples, n_features in ((6, 5), (5, 10)):

        y = rng.randn(n_samples)
        X = rng.randn(n_samples, n_features)
        sample_weight = 1.0 + rng.rand(n_samples)

        for (alpha, intercept, solver) in param_grid:

            # Ridge with explicit sample_weight
            est = Ridge(alpha=alpha, fit_intercept=intercept, solver=solver)
            est.fit(X, y, sample_weight=sample_weight)
            coefs = est.coef_
            inter = est.intercept_

            # Closed form of the weighted regularized least square
            # theta = (X^T W X + alpha I)^(-1) * X^T W y
            W = np.diag(sample_weight)
            if intercept is False:
                X_aug = X
                I = np.eye(n_features)
            else:
                dummy_column = np.ones(shape=(n_samples, 1))
                X_aug = np.concatenate((dummy_column, X), axis=1)
                I = np.eye(n_features + 1)
                I[0, 0] = 0

            cf_coefs = linalg.solve(X_aug.T.dot(W).dot(X_aug) + alpha * I,
                                    X_aug.T.dot(W).dot(y))

            if intercept is False:
                assert_array_almost_equal(coefs, cf_coefs)
            else:
                assert_array_almost_equal(coefs, cf_coefs[1:])
                assert_almost_equal(inter, cf_coefs[0])


rng = np.random.RandomState(0)

for i in range(100):
    try:
        test_ridge_sample_weights(rng)
    except AssertionError:
        print('failed')

On my machine I get 26 failures out of 100 runs.

@jnothman
Member
jnothman commented Jun 6, 2018 via email

@sergulaydore
Contributor

I generated a histogram of mismatch percentages for decimal=4 using the script https://gist.github.com/sergulaydore/6767aa908d051cb5d11417600f6161a1. The default test uses decimal=6, but it was hard to see the differences at that value. The issue only happens when the solver is lsqr or sparse_cg. I used only one set of parameters from param_grid.

(figure: mismatch_histogram — histogram of mismatch percentages)
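For illustration, the mismatch percentage for a given decimal could be computed along these lines (my own sketch, not necessarily what the gist does; mismatch_percentage is a hypothetical helper):

import numpy as np

def mismatch_percentage(actual, desired, decimal=4):
    # Percentage of entries that assert_array_almost_equal would flag,
    # i.e. entries with abs(desired - actual) >= 1.5 * 10**(-decimal).
    diff = np.abs(np.asarray(desired) - np.asarray(actual))
    return 100.0 * (diff >= 1.5 * 10.0 ** (-decimal)).mean()

One would call this on (coefs, cf_coefs) for each failing (alpha, intercept, solver) case over many random seeds, then histogram the results.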

@sergulaydore
Contributor

So the problem was the default tolerance in Ridge. If we want to use the default precision of assert_array_almost_equal, which is 1e-6, we need to make sure Ridge has the same tolerance. When I changed the tolerance in Ridge to 1e-6 (the default was 1e-3), I did not get any errors anymore. Here is the code I ran: https://gist.github.com/sergulaydore/313bbcfa17d287fd97d492d0f62cea59. Of course, the test runs a little slower with a lower tolerance. Here is the timing comparison:

Took 1.089055061340332 seconds with tol=1e-3, failed tests = 26
Took 1.3238649368286133 seconds with tol=1e-6, failed tests = 0

I am creating a PR to fix this in the test.
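A minimal sketch of the kind of change (not necessarily the exact PR): the same closed-form comparison as in the snippet above, but with the Ridge tolerance tightened to match the assertion precision.

import numpy as np
from scipy import linalg
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
n_samples, n_features = 5, 10
y = rng.randn(n_samples)
X = rng.randn(n_samples, n_features)
sample_weight = 1.0 + rng.rand(n_samples)
alpha = 1e-2

# Tightening tol makes the iterative solvers (lsqr, sparse_cg) converge to
# within the 1e-6 precision that assert_array_almost_equal uses by default.
est = Ridge(alpha=alpha, fit_intercept=False, solver='sparse_cg', tol=1e-6)
est.fit(X, y, sample_weight=sample_weight)

# Closed form of the weighted regularized least squares, as above.
W = np.diag(sample_weight)
cf_coefs = linalg.solve(X.T.dot(W).dot(X) + alpha * np.eye(n_features),
                        X.T.dot(W).dot(y))
np.testing.assert_array_almost_equal(est.coef_, cf_coefs)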
