`median_absolute_error` fails `test_regression_sample_weight_invariance` · Issue #30781 · scikit-learn/scikit-learn

Closed
lucyleeow opened this issue Feb 7, 2025 · 1 comment
Labels: Bug · Needs Triage (Issue requires triage)

Comments

lucyleeow (Member) commented Feb 7, 2025

Describe the bug

sample_weight support was added to median_absolute_error in 0.24, but median_absolute_error was never removed from METRICS_WITHOUT_SAMPLE_WEIGHT, so the sample-weight invariance check is silently skipped for it.

(Noticed while trying to fix an unrelated problem in median_absolute_error)

Steps/Code to Reproduce

On main, remove median_absolute_error from METRICS_WITHOUT_SAMPLE_WEIGHT and run test_regression_sample_weight_invariance. In particular, the check that a sample weight of all ones gives the same result as sample_weight=None fails.
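One plausible source of the mismatch (a hedged guess, not a confirmed diagnosis of scikit-learn's internals): an unweighted median of an even number of values averages the two middle values, while a weighted-median implementation that picks the sample at the 50th weighted percentile with "lower"-style selection returns only the lower of the two, even when all weights are one. A minimal NumPy sketch of how those two definitions diverge for the 50 samples used by the test:

```python
import numpy as np

# Mirror the test setup: 50 random y_true / y_pred pairs, seed 0.
rng = np.random.RandomState(0)
y_true = rng.random_sample(size=(50,))
y_pred = rng.random_sample(size=(50,))
errors = np.abs(y_pred - y_true)

# Unweighted median: average of the two middle values (n is even).
unweighted = np.median(errors)

# "Lower"-percentile median: takes the lower middle value only,
# the kind of result a step-function weighted percentile can produce
# even with unit weights.
lower_like = np.percentile(errors, 50, method="lower")

print(unweighted, lower_like)  # the two disagree for even n with distinct middles
```

With 50 distinct random errors the two middle order statistics almost surely differ, so `lower_like` comes out strictly below `unweighted`, which is the same shape of discrepancy the assertion reports (two medians a fraction of a percent apart).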

Expected Results

No error

Actual Results

name = 'median_absolute_error'

    @pytest.mark.parametrize(
        "name",
        sorted(
            set(ALL_METRICS).intersection(set(REGRESSION_METRICS))
            - METRICS_WITHOUT_SAMPLE_WEIGHT
        ),
    )
    def test_regression_sample_weight_invariance(name):
        n_samples = 50
        random_state = check_random_state(0)
        # regression
        y_true = random_state.random_sample(size=(n_samples,))
        y_pred = random_state.random_sample(size=(n_samples,))
        metric = ALL_METRICS[name]
>       check_sample_weight_invariance(name, metric, y_true, y_pred)

sklearn/metrics/tests/test_common.py:1558: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/metrics/tests/test_common.py:1458: in check_sample_weight_invariance
    assert_allclose(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

actual = array(0.36388614), desired = array(0.35997069), rtol = 1e-07, atol = 0.0, equal_nan = True
err_msg = 'For median_absolute_error sample_weight=None is not equivalent to sample_weight=ones', verbose = True

    def assert_allclose(
        actual, desired, rtol=None, atol=0.0, equal_nan=True, err_msg="", verbose=True
    ):
        """dtype-aware variant of numpy.testing.assert_allclose
    
        This variant introspects the least precise floating point dtype
        in the input argument and automatically sets the relative tolerance
        parameter to 1e-4 float32 and use 1e-7 otherwise (typically float64
        in scikit-learn).
    
        `atol` is always left to 0. by default. It should be adjusted manually
        to an assertion-specific value in case there are null values expected
        in `desired`.
    
        The aggregate tolerance is `atol + rtol * abs(desired)`.
    
        Parameters
        ----------
        actual : array_like
            Array obtained.
        desired : array_like
            Array desired.
        rtol : float, optional, default=None
            Relative tolerance.
            If None, it is set based on the provided arrays' dtypes.
        atol : float, optional, default=0.
            Absolute tolerance.
        equal_nan : bool, optional, default=True
            If True, NaNs will compare equal.
        err_msg : str, optional, default=''
            The error message to be printed in case of failure.
        verbose : bool, optional, default=True
            If True, the conflicting values are appended to the error message.
    
        Raises
        ------
        AssertionError
            If actual and desired are not equal up to specified precision.
    
        See Also
        --------
        numpy.testing.assert_allclose
    
        Examples
        --------
        >>> import numpy as np
        >>> from sklearn.utils._testing import assert_allclose
        >>> x = [1e-5, 1e-3, 1e-1]
        >>> y = np.arccos(np.cos(x))
        >>> assert_allclose(x, y, rtol=1e-5, atol=0)
        >>> a = np.full(shape=10, fill_value=1e-5, dtype=np.float32)
        >>> assert_allclose(a, 1e-5)
        """
        dtypes = []
    
        actual, desired = np.asanyarray(actual), np.asanyarray(desired)
        dtypes = [actual.dtype, desired.dtype]
    
        if rtol is None:
            rtols = [1e-4 if dtype == np.float32 else 1e-7 for dtype in dtypes]
            rtol = max(rtols)
    
>       np_assert_allclose(
            actual,
            desired,
            rtol=rtol,
            atol=atol,
            equal_nan=equal_nan,
            err_msg=err_msg,
            verbose=verbose,
        )
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=0
E       For median_absolute_error sample_weight=None is not equivalent to sample_weight=ones
E       Mismatched elements: 1 / 1 (100%)
E       Max absolute difference among violations: 0.00391544
E       Max relative difference among violations: 0.01087712
E        ACTUAL: array(0.363886)
E        DESIRED: array(0.359971)

Versions

main
lucyleeow added the Bug and Needs Triage labels on Feb 7, 2025
lucyleeow (Member, Author) commented:

I forgot about #17370; closing in favour of the original issue.
