Balanced/Weighted Sampling

Many classification applications need to deal with skewed input data - recently for several projects I've had to implement techniques to re-weight samples during training to get the best results - this can ideally be supported generically by scikit-learn in https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py

In my use case I was able to get significantly better results by assuming a uniform prior during training with the skewed labels- but it makes sense to have a generic way to add weights to the sampled training distribution for cases where researchers have good reason to incorporate a certain prior.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions