Refactor tests for sample weights · Issue #11316 · scikit-learn/scikit-learn · GitHub

Refactor tests for sample weights #11316
Closed
@jnothman

Description

In various parts of the code, we have tests for sample_weight support, including in metrics and for individual estimators. We have some common estimator checks for class_weight, but none that really cover sample_weight functionality (only invariance to the type of the weights).

Recent implementations of sample_weight include #10933 (KMeans) and #10803 (density estimation). Beyond estimators, we also have things like common tests for evaluation metrics.

Invariance testing for sample weights should include:

  • sample_weight=np.ones(len(X)) makes the same model as sample_weight=None
  • sample_weight=random can make a different model to sample_weight=None
  • sample_weight=s for integer array s makes the same model as X=np.repeat(X, s, axis=0), y=np.repeat(y, s, axis=0) (although there may be exceptions to this depending on how the estimator defines iteration, convergence, etc., as in #11236, "Test test_weighted_vs_repeated is somehow flaky")
  • sample_weight=s * k for array s and positive constant k makes the same model as sample_weight=s
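
The four invariances above can be checked by hand for a simple model. The following is a minimal sketch, assuming weighted least squares as a stand-in estimator; `fit_weighted_least_squares` is a hypothetical helper written with NumPy only and is not part of scikit-learn.

```python
import numpy as np


def fit_weighted_least_squares(X, y, sample_weight=None):
    """Hypothetical stand-in for an estimator's fit: weighted least squares."""
    if sample_weight is None:
        sample_weight = np.ones(len(X))
    # Scaling each row by sqrt(w_i) minimises sum_i w_i * (y_i - x_i @ coef)**2.
    root_w = np.sqrt(np.asarray(sample_weight, dtype=float))
    coef, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
    return coef


rng = np.random.RandomState(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=20)
s = rng.randint(1, 4, size=20)  # integer weights in {1, 2, 3}

coef_none = fit_weighted_least_squares(X, y)
coef_ones = fit_weighted_least_squares(X, y, np.ones(len(X)))
coef_random = fit_weighted_least_squares(X, y, rng.uniform(0.1, 5.0, size=20))
coef_int = fit_weighted_least_squares(X, y, s)
coef_repeated = fit_weighted_least_squares(
    np.repeat(X, s, axis=0), np.repeat(y, s, axis=0)
)
coef_scaled = fit_weighted_least_squares(X, y, s * 2.5)

assert np.allclose(coef_ones, coef_none)        # unit weights == no weights
assert not np.allclose(coef_random, coef_none)  # random weights change the fit here
assert np.allclose(coef_int, coef_repeated)     # integer weights == repetition
assert np.allclose(coef_scaled, coef_int)       # positive rescaling changes nothing
```

A closed-form model like this one satisfies the repetition property up to floating-point error; iterative estimators may need looser tolerances or may legitimately differ, as the third bullet notes.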

I wonder if it is possible to establish a generic test for this, e.g. something like:

def check_sample_weight_invariance(data_args, fit, is_equal):
    """
    Parameters
    ----------
    data_args : dict
        Keyword arguments to pass to fit, and which would need to be repeated
        to test equivalence to integer sample weights.
    fit : callable
        Passed data args, returns a model that can be compared with is_equal
    is_equal : callable
        Passed two models returned from fit, returns a bool to indicate equality
        between models
    """

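One way the proposed check might be filled in is sketched below. It assumes that `fit` accepts the entries of `data_args` as keyword arguments plus an optional `sample_weight` keyword, that every value in `data_args` has one row per sample, and that the random-weights case is reported as a flag rather than asserted; none of these choices is specified in the proposal above. `weighted_accuracy` is a hypothetical metric used only for illustration.

```python
import numpy as np


def check_sample_weight_invariance(data_args, fit, is_equal):
    """Sketch of a possible body for the proposed check (not scikit-learn code).

    Returns True if random weights produced a different model.
    """
    n_samples = len(next(iter(data_args.values())))
    rng = np.random.RandomState(0)
    s = rng.randint(1, 4, size=n_samples)  # integer weights in {1, 2, 3}

    unweighted = fit(**data_args)
    weighted = fit(sample_weight=s, **data_args)

    # Unit weights should match no weights.
    assert is_equal(fit(sample_weight=np.ones(n_samples), **data_args), unweighted)

    # Integer weights should match repeating each sample that many times.
    repeated_args = {k: np.repeat(v, s, axis=0) for k, v in data_args.items()}
    assert is_equal(weighted, fit(**repeated_args))

    # Rescaling all weights by a positive constant should change nothing.
    assert is_equal(fit(sample_weight=s * 3.0, **data_args), weighted)

    # Random weights only *can* change the model, so report instead of asserting.
    random_weights = rng.uniform(0.1, 5.0, size=n_samples)
    return not is_equal(fit(sample_weight=random_weights, **data_args), unweighted)


def weighted_accuracy(y_true, y_pred, sample_weight=None):
    """Hypothetical metric: weighted fraction of correct predictions."""
    return np.average(y_true == y_pred, weights=sample_weight)


y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 0, 0, 1, 1, 1, 0])
weights_mattered = check_sample_weight_invariance(
    {"y_true": y_true, "y_pred": y_pred}, weighted_accuracy, np.isclose
)
```

Because `fit` and `is_equal` are plain callables, the same helper could in principle wrap an estimator (comparing fitted attributes or predictions) or a metric (comparing scalar scores), which is the kind of reuse the issue asks about.
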
Metadata

    Labels

    Moderate (anything that requires some knowledge of conventions and best practices), help wanted, module:test-suite (everything related to our tests)
