It seems like half the bugs we've solved in the past couple of months stem from having different sets of classes, in the context of:
- cross-validation splitting that yields different subsets of classes in different training or testing subsets (and hence issues in alignment of class-wise outputs from `predict_proba`, `decision_function` or metrics, or in normalising macro-averaged scores)
- `partial_fit` where `classes` are specified upfront, but then repeated calls need matching to those classes
- `warm_start` where `classes_` from the first fit must be identical to the set of classes in `y` in each call to `fit()`
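
To make the first case concrete, here is a minimal sketch of the misalignment. The toy data and the choice of `LogisticRegression` are only illustrative assumptions; any classifier that exposes `classes_` and `predict_proba` should behave the same way.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data with three classes (illustrative only).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 1, 1, 2, 2])

# A training subset that happens to contain only classes 0 and 1,
# as can happen with an unstratified cross-validation split.
train = np.array([0, 1, 2, 3])
clf = LogisticRegression().fit(X[train], y[train])

# The estimator only knows the classes it saw during fit ...
print(clf.classes_)

# ... so predict_proba has two columns, not three, and cannot be
# compared column-wise with a model fitted on a fold containing class 2.
proba = clf.predict_proba(X)
print(proba.shape)
```

Nothing in `proba` itself says which class is missing; the caller has to consult `clf.classes_` to interpret the columns.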
These are all subtly different problems, but at the moment it seems like we're handling them on an ad-hoc (and too often a post-hoc) basis.
It would be amazing if someone could review these issues and identify where either API changes (`classes` as a constructor parameter to classifiers has been suggested) or helper utilities might help avoid these issues in the future.
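
As one possible shape for such a helper utility, here is a rough sketch that pads class-wise output to a fixed, known set of classes. The name `align_proba` and its signature are hypothetical and not an existing scikit-learn function; filling unseen classes with zero is also an assumption that suits probabilities but may not suit `decision_function` output.

```python
import numpy as np


def align_proba(proba, fitted_classes, all_classes):
    """Hypothetical helper: expand class-wise output to cover all_classes.

    proba has one column per entry of fitted_classes (e.g. clf.classes_).
    The result has one column per entry of all_classes, with zeros in the
    columns of classes the estimator never saw.
    """
    proba = np.asarray(proba)
    fitted = np.asarray(fitted_classes).tolist()
    full = np.asarray(all_classes).tolist()

    # Map each known label to its target column.
    column_of = {label: i for i, label in enumerate(full)}
    missing = [label for label in fitted if label not in column_of]
    if missing:
        raise ValueError(f"labels not in all_classes: {missing}")

    aligned = np.zeros((proba.shape[0], len(full)), dtype=proba.dtype)
    aligned[:, [column_of[label] for label in fitted]] = proba
    return aligned


# Usage: a model fitted on classes {0, 2} out of {0, 1, 2}.
proba = np.array([[0.7, 0.3], [0.2, 0.8]])
aligned = align_proba(proba, fitted_classes=[0, 2], all_classes=[0, 1, 2])
print(aligned)
```

A helper like this only covers output alignment; the `partial_fit` and `warm_start` cases would still need validation that the labels in each call match the classes declared or learned earlier.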