Commit 2b77584 · scikit-learn/scikit-learn

Do not use the term discriminative power which is overloaded/ambiguous

1 parent 3c3b59f

File tree: 1 file changed (+23 −19 lines)

examples/linear_model/plot_poisson_regression_non_normal_loss.py

Lines changed: 23 additions & 19 deletions
@@ -310,18 +310,18 @@ def score_estimator(estimator, df_test):
 # The experimental data presents a long tail distribution for ``y``. In all
 # models, we predict the expected frequency of a random variable, so we will
 # have necessarily fewer extreme values than for the observed realizations of
-# that random variable. Additionally, the normal conditional distribution used
-# in ``Ridge`` and ``HistGradientBoostingRegressor`` has a constant variance,
-# while for the Poisson distribution used in ``PoissonRegressor``, the variance
-# is proportional to the predicted expected value.
+# that random variable. Additionally, the normal distribution used in ``Ridge``
+# and ``HistGradientBoostingRegressor`` has a constant variance, while for the
+# Poisson distribution used in ``PoissonRegressor``, the variance is
+# proportional to the predicted expected value.
 #
 # Thus, among the considered estimators, ``PoissonRegressor`` is a priori
 # better suited for modeling the long tail distribution of the non-negative
 # data as compared to its ``Ridge`` counterpart.
 #
 # The ``HistGradientBoostingRegressor`` estimator has more flexibility and is
 # able to predict higher expected values while still assuming a normal
-# conditional distribution with constant variance for the response variable.
+# distribution with constant variance for the response variable.
 #
 # Evaluation of the calibration of predictions
 # --------------------------------------------
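As a side note for readers: the variance assumptions contrasted in this hunk can be checked on synthetic data. A minimal sketch, separate from the commit (the data and regularization strengths are arbitrary):

import numpy as np
from sklearn.linear_model import PoissonRegressor, Ridge

rng = np.random.RandomState(0)
X = rng.uniform(0, 2, size=(1000, 1))
# Poisson-distributed target: the variance equals, and thus grows with,
# the expected value exp(1 + 2 * x).
y = rng.poisson(lam=np.exp(1 + 2 * X[:, 0]))

# Least-squares fit: implicitly assumes constant-variance normal noise.
ridge = Ridge(alpha=1.0).fit(X, y)
# Poisson-deviance fit: assumes variance proportional to the predicted
# mean and always predicts positive values.
poisson = PoissonRegressor(alpha=1.0).fit(X, y)

print(ridge.predict(X[:3]))
print(poisson.predict(X[:3]))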
@@ -389,6 +389,8 @@ def _mean_frequency_by_risk_group(y_true, y_pred, sample_weight=None,
     q, y_true_seg, y_pred_seg = _mean_frequency_by_risk_group(
         y_true, y_pred, sample_weight=exposure, n_bins=10)
 
+    # Name of the model after the class of the estimator used in the last step
+    # of the pipeline.
     model_name = model.steps[-1][1].__class__.__name__
     print(f"Predicted number of claims by {model_name}: "
           f"{np.sum(y_pred * exposure):.1f}")
@@ -407,7 +409,8 @@ def _mean_frequency_by_risk_group(y_true, y_pred, sample_weight=None,
 
 ###############################################################################
 # The dummy regression model predicts a constant frequency. This model is not
-# discriminative at all but is none-the-less well calibrated.
+# able to rank samples (it attributes the same tied rank to all of them) but
+# is nonetheless well calibrated.
 #
 # The ``Ridge`` regression model can predict very low expected frequencies that
 # do not match the data. It can therefore severely under-estimate the risk for
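The constant prediction described in the hunk above matches the behavior of ``sklearn.dummy.DummyRegressor``. A small sketch on assumed synthetic data, independent of the example's dataset:

import numpy as np
from sklearn.dummy import DummyRegressor

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 2))
y = rng.poisson(lam=1.0, size=100).astype(float)

# Always predicts the training mean: every sample receives the same (tied)
# rank, yet the mean of the predictions matches the observed mean exactly.
dummy = DummyRegressor(strategy="mean").fit(X, y)
print(np.unique(dummy.predict(X)), y.mean())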
@@ -422,13 +425,13 @@ def _mean_frequency_by_risk_group(y_true, y_pred, sample_weight=None,
 # claims in the test set while the other three models can approximately recover
 # the total number of claims of the test portfolio.
 #
-# Evaluation of the discriminative power
-# --------------------------------------
+# Evaluation of the ranking power
+# -------------------------------
 #
 # For some business applications, we are interested in the ability of the model
-# to discriminate the riskiest from the safest policyholders, irrespective of
-# the absolute value of the prediction. In this case, the model evaluation
-# would cast the problem as a ranking problem rather than a regression problem.
+# to rank the riskiest from the safest policyholders, irrespective of the
+# absolute value of the prediction. In this case, the model evaluation would
+# cast the problem as a ranking problem rather than a regression problem.
 #
 # To compare the 3 models from this perspective, one can plot the fraction of
 # the number of claims vs the fraction of exposure for test samples ordered by
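The next hunk touches prose around the example's ``lorenz_curve`` helper. As a rough sketch of what such a helper computes (an illustration of the idea, not the commit's exact code):

import numpy as np

def lorenz_curve_sketch(y_true, y_pred, exposure):
    # Order test samples from safest to riskiest according to the model.
    ranking = np.argsort(y_pred)
    ranked_exposure = exposure[ranking]
    # Observed number of claims per sample: frequency times exposure.
    ranked_claims = y_true[ranking] * ranked_exposure
    # Cumulative share of exposure vs. cumulative share of claims.
    cum_exposure = np.cumsum(ranked_exposure) / ranked_exposure.sum()
    cum_claims = np.cumsum(ranked_claims) / ranked_claims.sum()
    return cum_exposure, cum_claims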
@@ -485,8 +488,8 @@ def lorenz_curve(y_true, y_pred, exposure):
 ax.legend(loc="upper left")
 
 ##############################################################################
-# As expected, the dummy regressor is unable to discriminate and therefore
-# performs the worst on this plot.
+# As expected, the dummy regressor is unable to correctly rank the samples and
+# therefore performs the worst on this plot.
 #
 # The tree-based model is significantly better at ranking policyholders by risk
 # while the two linear models perform similarly.
@@ -507,11 +510,12 @@ def lorenz_curve(y_true, y_pred, exposure):
 # Main takeaways
 # --------------
 #
-# - A ideal model is both well-calibrated and discriminative.
+# - The performance of the models can be evaluated by their ability to yield
+#   well-calibrated predictions and a good ranking.
 #
 # - The Gini index reflects the ability of a model to rank predictions
 #   irrespective of their absolute values, and therefore only assesses their
-#   discriminative power.
+#   ranking power.
 #
 # - The calibration of the model can be assessed by plotting the mean observed
 #   value vs the mean predicted value on groups of test samples binned by
@@ -524,16 +528,16 @@ def lorenz_curve(y_true, y_pred, exposure):
 # - Using the Poisson loss can correct this problem and lead to a
 #   well-calibrated linear model.
 #
-# - Despite the improvement in calibration, the discriminative power of both
-#   linear models are comparable and well below the discriminative power of the
-#   Gradient Boosting Regression Trees.
+# - Despite the improvement in calibration, the ranking power of both linear
+#   models is comparable and well below the ranking power of the Gradient
+#   Boosting Regression Trees.
 #
 # - The non-linear Gradient Boosting Regression Trees model does not seem to
 #   suffer from significant mis-calibration issues (despite the use of a least
 #   squares loss).
 #
 # - The Poisson deviance computed as an evaluation metric reflects both the
-#   calibration and the discriminative power of the model but makes a linear
+#   calibration and the ranking power of the model but makes a linear
 #   assumption on the ideal relationship between the expected value and the
 #   variance of the response variable.
 #
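The Gini index mentioned in the takeaways can be derived from such a Lorenz curve. A hedged sketch using one common convention (twice the area between the diagonal of equality and the curve; not code from this commit):

import numpy as np

def gini_index_sketch(cum_exposure, cum_claims):
    # Area under the Lorenz curve via the trapezoidal rule; with samples
    # ordered from safest to riskiest the curve lies below the diagonal,
    # so the result is positive for a model that ranks better than chance.
    area_under_curve = np.trapz(cum_claims, cum_exposure)
    return 1.0 - 2.0 * area_under_curve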
