Open
Description
Many classification applications have to deal with skewed (class-imbalanced) input data. On several recent projects I have had to implement techniques that re-weight samples during training to get the best results. Ideally scikit-learn would support this generically, for example in https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py
In my use case, I got significantly better results by assuming a uniform prior over the skewed labels during training. It would still make sense to have a generic way to weight the sampled training distribution, for cases where researchers have good reason to incorporate a specific prior.
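
To make the request concrete, here is a minimal sketch of the kind of re-weighting described above. It is an illustration under stated assumptions, not an existing scikit-learn API and not the code from the projects mentioned: the helper name `prior_sample_weights` and its `prior` argument are hypothetical. Each sample gets the weight `prior[class] / empirical_frequency[class]`, so the weighted class frequencies match the requested prior, and a uniform prior is assumed when none is given.

```python
import math
from collections import Counter


def prior_sample_weights(labels, prior=None):
    """Hypothetical helper: one weight per sample so that the weighted
    class frequencies match `prior` (uniform when `prior` is None)."""
    counts = Counter(labels)
    n_samples = len(labels)

    if prior is None:
        # Assumption: default to a uniform prior over the observed classes.
        prior = {c: 1.0 / len(counts) for c in counts}

    missing = set(counts) - set(prior)
    if missing:
        raise ValueError(f"prior is missing classes: {sorted(missing)}")
    if not math.isclose(sum(prior[c] for c in counts), 1.0):
        raise ValueError("prior over the observed classes must sum to 1")

    # weight = prior[c] / (counts[c] / n_samples); the weights sum to n_samples.
    per_class = {c: prior[c] * n_samples / counts[c] for c in counts}
    return [per_class[y] for y in labels]


# Skewed labels: three samples of class "a", one of class "b".
y = ["a", "a", "a", "b"]
uniform_weights = prior_sample_weights(y)
custom_weights = prior_sample_weights(y, prior={"a": 0.25, "b": 0.75})
```

Many scikit-learn estimators accept a `sample_weight` argument in `fit`, so weights like these could be passed there where the estimator supports it. The request in this issue is for a generic way to apply such weights rather than a per-project helper.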