Description
Use sample_weight in the binning of HistGradientBoostingClassifier and HistGradientBoostingRegressor, or allow it via an option.
Currently, sample weights are ignored in the _BinMapper.
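To illustrate what weight-aware binning could look like, here is a minimal sketch using NumPy only. It is not scikit-learn's implementation: weighted_bin_thresholds is a hypothetical helper, and the inverse-CDF quantile rule it uses is an assumption chosen for simplicity. It picks thresholds at weighted quantiles, so that an integer weight k behaves like repeating the sample k times.

```python
import numpy as np


def weighted_bin_thresholds(x, sample_weight=None, n_bins=4):
    """Hypothetical helper: thresholds at weighted quantiles of x.

    Returns at most n_bins - 1 distinct thresholds. Not the _BinMapper
    logic, only an illustration of weight-aware quantiles.
    """
    x = np.asarray(x, dtype=float)
    if sample_weight is None:
        # Unweighted case: every sample counts once.
        sample_weight = np.ones_like(x)
    sample_weight = np.asarray(sample_weight, dtype=float)

    # Sort the values and accumulate the weights in the same order.
    order = np.argsort(x, kind="stable")
    x_sorted = x[order]
    cum_w = np.cumsum(sample_weight[order])

    # Cumulative weight that each interior quantile must reach.
    targets = cum_w[-1] * np.arange(1, n_bins) / n_bins

    # First sorted value whose cumulative weight reaches each target.
    idx = np.searchsorted(cum_w, targets, side="left")
    return np.unique(x_sorted[idx])


x = [1, 2, 3]
print(weighted_bin_thresholds(x, n_bins=2))                           # weights ignored
print(weighted_bin_thresholds(x, sample_weight=[3, 1, 1], n_bins=2))  # weights used
print(weighted_bin_thresholds([1, 1, 1, 2, 3], n_bins=2))             # repeated samples
```

With this rule, the weighted call and the repeated-samples call give the same threshold, while the unweighted call on the original data gives a different one. That difference is the behavior this issue is about.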
Some more context and history summarized by @NicolasHug here:
I agree that it would make sense to support SW in the binner (although this is nuanced, see note below). Reading back the original PR, this was discussed extensively:
LightGBM's implementation doesn't take weights into account when binning: ENH Support sample weights in HGBT #14696 (comment)
I had a proposal to support SW in the binning: ENH Support sample weights in HGBT #14696 (comment). Olivier and Andy seemed to be happy with it (ENH Support sample weights in HGBT #14696 (comment)); there were some concerns from Adrin. To unblock the rest of the work, we were all happy to just not implement SW support in the Binner and leave that as potential future work.
The estimators were still experimental at the time so we had more flexibility. Now that they're stable, bringing SW support in the Binner might require BC mitigations.
Also, as a side note, #15657 (comment) may be relevant here: it's not super clear to me how we should handle SW in an estimator that performs some sort of subsampling during the training process (as is the case here during binning).