FIX Corrects negative gradient of AdaBoost loss in GBDT (#22050) · scikit-learn/scikit-learn@3bb7286

Commit 3bb7286

FIX Corrects negative gradient of AdaBoost loss in GBDT (#22050)
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
1 parent c46ac60 commit 3bb7286

File tree: 3 files changed, +39 −2 lines changed

doc/whats_new/v1.1.rst

Lines changed: 22 additions & 0 deletions

```diff
@@ -128,6 +128,28 @@ Changelog
   ``bootstrap=False`` and ``max_samples`` is not ``None``.
   :pr:`21295` :user:`Haoyin Xu <PSSF23>`.
 
+- |API| Changed the default of :func:`max_features` to 1.0 for
+  :class:`ensemble.RandomForestRegressor` and to `"sqrt"` for
+  :class:`ensemble.RandomForestClassifier`. Note that these give the same fit
+  results as before, but are much easier to understand. The old default value
+  `"auto"` has been deprecated and will be removed in version 1.3. The same
+  changes are also applied for :class:`ensemble.ExtraTreesRegressor` and
+  :class:`ensemble.ExtraTreesClassifier`.
+  :pr:`20803` by :user:`Brian Sun <bsun94>`.
+
+- |Fix| Solve a bug in :class:`ensemble.GradientBoostingClassifier` where the
+  exponential loss was computing the positive gradient instead of the
+  negative one.
+  :pr:`22050` by :user:`Guillaume Lemaitre <glemaitre>`.
+
+:mod:`sklearn.feature_extraction.text`
+......................................
+
+- |Fix| :class:`feature_extraction.text.TfidfVectorizer` now does not create
+  a :class:`feature_extraction.text.TfidfTransformer` at `__init__` as required
+  by our API.
+  :pr:`21832` by :user:`Guillaume Lemaitre <glemaitre>`.
 
 :mod:`sklearn.impute`
 .....................
 
```
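
As an aside on the |API| entry above (an illustration, not part of the commit): for the classifiers the old `"auto"` default was equivalent to `"sqrt"`, and for the regressors to `1.0` (all features), so switching to the new spelling leaves fitted models unchanged. A minimal sketch:

```python
# Illustration (not part of the commit): the new explicit max_features
# spelling reproduces the old "auto" behaviour for classifiers.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=16, random_state=0)

# "auto" (deprecated in 1.1, removal planned for 1.3) meant "sqrt" for
# classifiers, so both forests draw identical feature subsets.
old = RandomForestClassifier(max_features="auto", random_state=0).fit(X, y)
new = RandomForestClassifier(max_features="sqrt", random_state=0).fit(X, y)

# Same randomness, same feature subsampling => identical predictions.
assert (old.predict(X) == new.predict(X)).all()
```
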
sklearn/ensemble/_gb_losses.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -935,8 +935,8 @@ def negative_gradient(self, y, raw_predictions, **kargs):
         The raw predictions (i.e. values from the tree leaves) of the
         tree ensemble at iteration ``i - 1``.
         """
-        y_ = -(2.0 * y - 1.0)
-        return y_ * np.exp(y_ * raw_predictions.ravel())
+        y_ = 2.0 * y - 1.0
+        return y_ * np.exp(-y_ * raw_predictions.ravel())
 
     def _update_terminal_region(
         self,
```
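
For context (not part of the commit), here is why the new expression is the negative gradient. With labels recoded as $\tilde{y} = 2y - 1 \in \{-1, +1\}$ (the `y_` above) and raw prediction $f$, the exponential (AdaBoost) loss per sample satisfies

```latex
L(y, f) = \exp\left(-\tilde{y} f\right),
\qquad
\frac{\partial L}{\partial f} = -\tilde{y} \exp\left(-\tilde{y} f\right),
\qquad
-\frac{\partial L}{\partial f} = \tilde{y} \exp\left(-\tilde{y} f\right).
```

The old code set `y_ = -(2.0 * y - 1.0)`, i.e. $-\tilde{y}$, and returned $-\tilde{y} \exp(-\tilde{y} f)$, which is the positive gradient $\partial L / \partial f$. Gradient boosting fits each stage to the negative gradient, which is what the corrected line returns.
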

sklearn/ensemble/tests/test_gradient_boosting_loss_functions.py

Lines changed: 15 additions & 0 deletions
```diff
@@ -320,3 +320,18 @@ def test_lad_equals_quantiles(seed, alpha):
         y_true, raw_predictions, sample_weight=weights, alpha=alpha
     )
     assert pbl_weighted_loss == approx(ql_weighted_loss)
+
+
+def test_exponential_loss():
+    """Check that we compute the negative gradient of the exponential loss.
+
+    Non-regression test for:
+    https://github.com/scikit-learn/scikit-learn/issues/9666
+    """
+    loss = ExponentialLoss(n_classes=2)
+    y_true = np.array([0])
+    y_pred = np.array([0])
+    # we expect to have loss = exp(0) = 1
+    assert loss(y_true, y_pred) == pytest.approx(1)
+    # we expect to have negative gradient = -1 * (1 * exp(0)) = -1
+    assert_allclose(loss.negative_gradient(y_true, y_pred), -1)
```
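
Beyond the committed test, one way to sanity-check the fix is to compare `negative_gradient` against a central finite-difference estimate of $-\partial L / \partial f$. A sketch (not part of the commit; it relies on the private `sklearn.ensemble._gb_losses` module that the test file itself imports from):

```python
# Sketch: numerically cross-check the fix by comparing negative_gradient
# against -dL/df obtained from central finite differences.
import numpy as np
from sklearn.ensemble._gb_losses import ExponentialLoss

loss = ExponentialLoss(n_classes=2)
y_true = np.array([0.0, 1.0, 1.0, 0.0])
raw = np.array([0.3, -0.2, 0.5, -1.0])

eps = 1e-6
n = len(raw)
# loss(...) returns a mean over samples, so scale the per-sample partial
# derivatives by n to undo the 1/n factor before comparing.
num_grad = n * np.array([
    (loss(y_true, raw + eps * np.eye(n)[i]) - loss(y_true, raw - eps * np.eye(n)[i]))
    / (2 * eps)
    for i in range(n)
])

np.testing.assert_allclose(loss.negative_gradient(y_true, raw), -num_grad, rtol=1e-5)
```

With the pre-fix code this check fails with the sign flipped, which is exactly the bug the commit corrects.
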

0 commit comments