Add sample_weight support to binning in HGBT · Issue #27117 · scikit-learn/scikit-learn · GitHub

Add sample_weight support to binning in HGBT #27117


Open

lorentzenchr opened this issue Aug 20, 2023 · 7 comments

@lorentzenchr
Member

Use sample_weight in the binning of HistGradientBoostingClassifier and HistGradientBoostingRegressor, or allow it via an option.

Currently, sample weights are ignored in the _BinMapper.

Some more context and history summarized by @NicolasHug here:

> I agree that it would make sense to support SW in the binner (although this is nuanced, see note below). Reading back the original PR, this was discussed extensively:
>
> The estimators were still experimental at the time so we had more flexibility. Now that they're stable, bringing SW support in the Binner might require BC mitigations.
>
> Also as a side note, #15657 (comment) may be relevant here: it's not super clear to me how we should handle SW in an estimator that performs some sort of subsampling during the training process (as is the case here during Binning).

@betatim
Member
betatim commented Aug 21, 2023

More of a "for my education" question but also maybe useful to think about: is there a way to determine a "correct" or "incorrect" set of bin edges for a dataset?

I think there is no way to define what a "correct" set of bin edges is. There are smarter and less smart ways to choose the edges, and some sets of edges will lead to better performance than others. But I find it hard to come up with a way to define that a set of edges is "incorrect".

When you compute how many samples fall between two edges, you do need to take the weights into account; at least I think it is "incorrect" not to.

The point being: maybe it is simpler to ignore the weights when determining the edges but take them into account when calculating how many samples are in each bin.
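
As a purely illustrative NumPy sketch of that split (not the actual HGBT binning code): compute the edges from the unweighted data, but measure the per-bin content as a sum of weights.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
w = rng.uniform(0.1, 10.0, size=1000)

# Edges chosen without looking at the weights (10 quantile-based bins)...
edges = np.percentile(x, np.linspace(0, 100, 11))

# ...but the "content" of each bin is the sum of weights, not the raw count.
weighted_counts, _ = np.histogram(x, bins=edges, weights=w)
print(weighted_counts)
```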

@lorentzenchr
Member Author

> The point being: maybe it is simpler to ignore the weights when determining the edges but take them into account when calculating how many samples are in each bin.

I don't follow.

The general point is more that if you interpret the weights as frequency weights (which is allowed), then you might very well want to take them into account when calculating the quantiles used as bin thresholds, so that roughly equal sums of weights end up in each bin.
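
For integer-valued frequency weights, the expected behaviour is easy to state: a sample with weight w should influence the thresholds exactly as if it appeared w times in the data. A minimal NumPy sketch of that equivalence (again, not scikit-learn's actual binning code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.lognormal(size=1000)
w = rng.integers(1, 5, size=1000)  # integer frequency weights

quantiles = np.linspace(0, 100, 11)[1:-1]

# Unweighted thresholds: what you get if the binner ignores sample_weight.
edges_unweighted = np.percentile(x, quantiles)

# Frequency-weight semantics: repeat each sample according to its weight.
# A weighted-quantile implementation should reproduce (roughly) these edges.
edges_weighted = np.percentile(np.repeat(x, w), quantiles)

print(edges_unweighted)
print(edges_weighted)  # differs whenever the weights are not all equal
```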

@betatim
Member
betatim commented Aug 22, 2023

> The general point is more that if you interpret the weights as frequency weights (which is allowed), then you might very well want to take them into account when calculating the quantiles used as bin thresholds, so that roughly equal sums of weights end up in each bin.

I agree that it is smart to do this. What I was wondering is whether, if weighted quantiles are too hard, it would be easier to ignore the weights when computing the edges and only take them into account when computing the contents of each bin. Hence the question of whether ignoring the weights when computing the edges is actually incorrect/wrong, or whether it (potentially) just leads to lower performance.

For example, using equally spaced bins between the feature minimum and maximum isn't wrong, but it is likely to lead to lower performance than quantile-based binning.

@lorentzenchr
Member Author

The _BinMapper already uses quantiles. So it would simply be a matter of using weighted quantiles instead.

I think we should also expose the subsample parameter of the bin mapper to give users more control.

@betatim
Member
betatim commented Aug 28, 2023

> The _BinMapper already uses quantiles. So it would simply be a matter of using weighted quantiles instead.

IIRC we use the percentile function from numpy, which doesn't support weights. I assume the reason they don't support it is that an efficient algorithm is difficult to implement, but that is speculation.

@lorentzenchr
Member Author

We have our own (quite simple) weighted percentile function in utils.
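
That helper is private, so its exact signature may differ, but the idea is a simple inverse-CDF weighted percentile. A self-contained sketch in the same spirit (not a copy of the actual utility in sklearn.utils):

```python
import numpy as np

def weighted_percentile(values, weights, percentile=50.0):
    """Inverse-CDF weighted percentile: the smallest value whose cumulative
    weight reaches the requested fraction of the total weight."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cdf = np.cumsum(weights)
    target = percentile / 100.0 * cdf[-1]
    return values[np.searchsorted(cdf, target)]

x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([1.0, 1.0, 1.0, 10.0])
print(weighted_percentile(x, w, 50))  # 4.0: the heavily weighted sample pulls the median up
```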

@ogrisel
Member
ogrisel commented Jul 5, 2024

Note that when the training set is large, we subsample it and find the bin thresholds only on the subsample. In that case we should probably use the weights only when subsampling, as proposed in #14696 (comment).

However, subsampling does not happen if the number of data points is smaller than int(2e5). In that case we could make _find_binning_thresholds respect sample_weight by calling our own _weighted_percentile instead of np.percentile internally. The case where len(distinct_values) <= max_bins might be tricky to get right though. Maybe we should no longer special-case it and instead call np.unique on the thresholds and accept that the resulting number of bins can be lower than 256.
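
A rough sketch of how both code paths could look, reusing the weighted_percentile helper sketched above; the function name, defaults, and structure are illustrative only and not the actual _find_binning_thresholds implementation:

```python
import numpy as np

def find_binning_thresholds_weighted(col, sample_weight, max_bins=255,
                                     subsample=int(2e5), random_state=None):
    """Illustrative only: weighted variant of the two binning code paths."""
    rng = np.random.default_rng(random_state)

    if col.shape[0] > subsample:
        # Large data: subsample with probability proportional to sample_weight
        # (the idea from #14696), then treat the subsample as unweighted.
        p = sample_weight / sample_weight.sum()
        idx = rng.choice(col.shape[0], size=subsample, replace=True, p=p)
        col = col[idx]
        sample_weight = np.ones(subsample)

    # Small (or already subsampled) data: weighted quantiles as thresholds.
    percentiles = np.linspace(0, 100, num=max_bins + 1)[1:-1]
    thresholds = np.array(
        [weighted_percentile(col, sample_weight, q) for q in percentiles]
    )

    # Instead of special-casing len(distinct_values) <= max_bins, deduplicate
    # the thresholds and accept that fewer than max_bins bins may result.
    return np.unique(thresholds)
```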
