Helpers for managing different set of classes · Issue #8100 · scikit-learn/scikit-learn · GitHub

Helpers for managing different set of classes #8100
Open
@jnothman

Description

It seems like half the bugs we've solved in the past couple of months come down to different sets of classes showing up in different places, in the context of:

  • cross-validation splitting that yields different subsets of classes in different training or testing subsets (and hence issues in alignment of class-wise outputs from predict_proba, decision_function or metrics, or in normalising macro-averaged scores)
  • partial_fit where classes are specified upfront, but the labels seen in each repeated call then need matching to those classes
  • warm_start where classes_ from the first fit must be identical to the set of classes in y in each call to fit()
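To make the first case concrete, here is a minimal sketch of the column-alignment problem, not an existing scikit-learn utility. `align_proba` is a hypothetical helper name, and the hard-coded `proba` array stands in for what `predict_proba` would return from a classifier whose training fold happened to contain only classes 0 and 2. The sketch assumes the full class list is sorted, as `classes_` is.

```python
import numpy as np


def align_proba(proba, fitted_classes, all_classes):
    """Hypothetical helper: expand class-wise columns to cover all_classes.

    Columns for classes the estimator never saw are filled with zeros.
    Assumes all_classes is sorted and contains every fitted class.
    """
    proba = np.asarray(proba)
    fitted_classes = np.asarray(fitted_classes)
    all_classes = np.asarray(all_classes)

    if not np.isin(fitted_classes, all_classes).all():
        raise ValueError("fitted_classes must be a subset of all_classes")

    aligned = np.zeros((proba.shape[0], len(all_classes)), dtype=proba.dtype)
    # Position of each fitted class within the full (sorted) class list.
    cols = np.searchsorted(all_classes, fitted_classes)
    aligned[:, cols] = proba
    return aligned


# Stand-in for clf.predict_proba(X_test) where clf.classes_ == [0, 2].
proba = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
fitted_classes = np.array([0, 2])
all_classes = np.array([0, 1, 2])

aligned = align_proba(proba, fitted_classes, all_classes)
print(aligned.shape)  # one column per class in all_classes
```

Without a step like this, outputs from different folds have different numbers of columns and cannot be stacked or scored against a common label set.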

These are all subtly different problems, but at the moment it seems like we're handling them on an ad-hoc (and too often a post-hoc) basis.

It would be amazing if someone could review these issues and identify where either API changes (classes as a constructor parameter to classifiers has been suggested) or helper utilities might help avoid these issues in the future.
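As one possible shape for such a helper utility, the sketch below checks the labels in `y` against a known set of classes. The name `check_consistent_classes` and its `require_equal` flag are hypothetical and only illustrate the idea: the subset check roughly matches the partial_fit case, and the stricter equality check roughly matches the warm_start case described above.

```python
import numpy as np


def check_consistent_classes(known_classes, y, require_equal=False):
    """Hypothetical helper: validate labels in y against known_classes.

    - Always rejects labels in y that are not in known_classes
      (the partial_fit-style requirement).
    - With require_equal=True, also rejects a y that is missing any
      known class (the warm_start-style requirement).
    Returns the sorted array of known classes.
    """
    known = np.unique(known_classes)
    seen = np.unique(y)

    unseen = np.setdiff1d(seen, known)
    if unseen.size:
        raise ValueError(
            f"y contains labels not in the known classes: {unseen.tolist()}"
        )

    if require_equal:
        missing = np.setdiff1d(known, seen)
        if missing.size:
            raise ValueError(
                f"y is missing known classes: {missing.tolist()}"
            )
    return known


# A later batch may contain only a subset of the classes declared upfront.
classes = check_consistent_classes([0, 1, 2], y=[0, 1, 1])
```

Whether a check like this belongs in a shared utility or behind a `classes` constructor parameter is exactly the design question raised above; the sketch takes no position on that.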
