FIX Ignore zero sample weights in precision recall curve #18328
Conversation
Thanks a lot @albertvillanova! A few comments are below.
if len(np.unique(y_true)) != 2:
    raise ValueError("Only one class present in y_true. Detection error "
                     "tradeoff curve is not defined in that case.")
Why this change? I would imagine it's better to check for it before computing `_binary_clf_curve`?
As it was, if you passed a multiclass `y_true`, it would raise the "Only one class present..." ValueError. With the reorder, a multiclass ValueError is raised by `_binary_clf_curve`.
this needs to be in whats_new since it's changing a behavior.
@adrinjalali are you sure?
When passing a multiclass `y_true`, a ValueError is raised both before and after this change; that does not change. The only change is the error message:
- Before: "Only one class present...", which was wrong.
- Now: "multiclass format is not supported"
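For illustration only (not code from the PR), a minimal sketch of the behavior change, assuming the snippet above comes from `det_curve`:

```python
# Sketch: with the reordered check, a multiclass y_true reaches
# _binary_clf_curve first, which rejects it with the generic
# "multiclass format is not supported" message instead of the
# misleading "Only one class present" error.
import numpy as np
from sklearn.metrics import det_curve

y_true = np.array([0, 1, 2, 2])            # three classes: multiclass input
y_score = np.array([0.1, 0.4, 0.35, 0.8])

try:
    det_curve(y_true, y_score)
except ValueError as exc:
    print(exc)  # expected: "multiclass format is not supported"
```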
I've a few minor comments.
sklearn/metrics/_ranking.py (outdated)
sample_weight = column_or_1d(sample_weight)
_check_sample_weight(sample_weight, y_true)
Suggested change:
- sample_weight = column_or_1d(sample_weight)
- _check_sample_weight(sample_weight, y_true)
+ sample_weight = _check_sample_weight(sample_weight, y_true)
`_check_sample_weight` internally calls `check_array`. This proposed change is the only one I consider important rather than merely cosmetic.
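For reference, a minimal sketch of what `_check_sample_weight` does (it is a private helper in `sklearn.utils.validation`, so exact semantics may vary across versions):

```python
import numpy as np
from sklearn.utils.validation import _check_sample_weight

y_true = np.array([0, 1, 1, 0])

# None is expanded to unit weights of shape (n_samples,).
print(_check_sample_weight(None, y_true))          # [1. 1. 1. 1.]
# A list is validated against n_samples and returned as a float array.
print(_check_sample_weight([1, 2, 3, 4], y_true))  # [1. 2. 3. 4.]
```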
Done. Thank you for pointing this out.
I think you can remove the line `sample_weight = column_or_1d(sample_weight)`.
After that change, I think it'll be an LGTM from my side :smirk:
In this case, we can't call `_check_sample_weight`.
@scikit-learn/core-devs Could someone help out?
According to the docstrings, `_binary_clf_curve` accepts only 1d array-likes of shape `(n_samples,)`. Nevertheless, the code for `_binary_clf_curve`, dating back more than 7 years to #2251 d8f90d1, calls `column_or_1d` on `y_true`, `y_score` and `sample_weight`, and therefore also accepts shape `(n_samples, 1)`.
Shall we adapt the docstrings, or be stricter about the inputs and actually require shape `(n_samples,)`?
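A quick check of the permissive behavior in question (illustrative only):

```python
import numpy as np
from sklearn.utils import column_or_1d

# column_or_1d ravels a column vector, so shape (n_samples, 1) is
# silently accepted even though the docstrings say (n_samples,).
col = np.array([[0.1], [0.4], [0.8]])   # shape (3, 1)
print(column_or_1d(col))                # [0.1 0.4 0.8], shape (3,)
```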
I don't think we should change the behavior. When there's a discrepancy between the code and the docs, we tend to fix the docs. In this case, I think I'd be happy with having `column_or_1d`, since it makes it more convenient for people to work with the API, and it doesn't have much of a consequence. In terms of the docs, I'm not sure if we need to update them, but I wouldn't object to an update, I think. I'd need to think more about it.
@adrinjalali Thanks for your expert judgement. In this case, I'd leave the docs as they are.
The only open question is then whether to use `_check_sample_weight`, which enforces shape `(n_samples,)`, or `column_or_1d` for `sample_weight`, but not both.
@albertvillanova For me, the choice is yours :smirk:
Oh, I realize: the current solution, with `column_or_1d` before `_check_sample_weight`, works as expected.
I propose to open a new issue for the renaming and deprecation in …
LGTM. Thanks for your patience.
Thanks @albertvillanova
sample_weight = _check_sample_weight(sample_weight, y_true)
nonzero_weight_mask = sample_weight != 0
y_true = y_true[nonzero_weight_mask]
y_score = y_score[nonzero_weight_mask]
sample_weight = sample_weight[nonzero_weight_mask]
Would it not be better, instead of filtering out zero sample weights, to do the computation of the scores in a way that zero sample weights are not taken into account? This proposed change results in a difference between 0. and epsilon, which I don't think is desired.
For whatever reason you have some zeros in `sample_weight`, you always want to get arrays of the same length. Imagine `y_score = clf.predict_proba(X)` with `y_score.shape[0] == X.shape[0] == sample_weight.shape[0]`. Then, filtering out zero-weight samples here in `_binary_clf_curve` is quite convenient.

> This proposed change results in a difference between 0. and epsilon, which I don't think is desired.

Could you elaborate?
Sorry @adrinjalali, I do not understand your point either...
Let's say we want a weighted average of a bunch of numbers. One way is to first filter out data with zero weight; the other is to have something like `sum(w[i]*x[i]) / sum(w[i])` and ignore the fact that some of those weights are zero. The calculation itself takes care of the zero sample weights.
Now in this case, there's an issue with having samples with zero weight in the data. My question is: if we just filter out the zero weights, then what happens to a sample with weight equal to 1e-32, for instance? That's practically zero, and should be treated [almost] the same as the ones with zero sample weight.
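A toy comparison of the two approaches (illustrative only, not code from the PR):

```python
import numpy as np

x = np.array([1.0, 2.0, 10.0])
w = np.array([1.0, 1.0, 0.0])

# Approach 1: filter out zero-weight samples first.
keep = w != 0
filtered = np.sum(w[keep] * x[keep]) / np.sum(w[keep])

# Approach 2: let the weighted sum take care of zero weights.
direct = np.sum(w * x) / np.sum(w)

print(filtered, direct)  # both 1.5 -- equivalent for a plain mean

# With w = [1.0, 1.0, 1e-32] both approaches still give ~1.5; the
# 0-versus-epsilon concern is about quantities that are not smooth
# in the weights, such as which thresholds appear in a curve.
```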
AFAIK, `_binary_clf_curve` is only used here in _ranking.py. It calculates counts `cumsum(y * weight)`, which are the ingredients for precision, recall and the like. You never have `... / sum(weight)`; that would cancel out in ratios of counts anyway. Therefore, I would have expected that filtering out zero-weight samples is equivalent.
The real problem mentioned in #16065, however, seems to be the different values of the thresholds, i.e. thresholds with only zero weights should be dropped. This PR solves that problem (by removing those samples).
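A small sketch of that ingredient (modeled on, not copied from, `_binary_clf_curve`):

```python
import numpy as np

# Samples already sorted by decreasing score; the last has zero weight.
y_true = np.array([1, 0, 1])
weight = np.array([1.0, 1.0, 0.0])

tps = np.cumsum(y_true * weight)         # [1. 1. 1.]
fps = np.cumsum((1 - y_true) * weight)   # [0. 1. 1.]
print(tps, fps)

# The zero-weight sample leaves the cumulative counts flat, but it
# would still contribute a threshold entry unless filtered out first.
```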
Suggestion: should we just add a test with `weight1=[1, 1, 1e-30]` and `weight2=[1, 1, 0]` for several relevant scores/errors/losses?
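A sketch of such a test (hypothetical values, not taken from the PR):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([0, 1, 1])
y_score = np.array([0.2, 0.6, 0.9])

for w in (np.array([1.0, 1.0, 1e-30]), np.array([1.0, 1.0, 0.0])):
    precision, recall, thresholds = precision_recall_curve(
        y_true, y_score, sample_weight=w)
    print(w[-1], thresholds, precision, recall)

# With weight 0 the third sample is dropped and its threshold
# disappears; with 1e-30 the threshold is kept -- exactly the
# 0-versus-epsilon difference discussed above.
```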
Can we reach a consensus? I would like to get this one closed.
I played around a little bit, and I'm happy with this solution as is :)
@adrinjalali I guess you're busy with other important tasks. Do you still intend to review this one, or should we look out for another reviewer?
I did what I wanted to do (#18328 (comment)), I'm happy for this to be merged :)
@adrinjalali Thanks for confirming. A green check mark would have been enough :smirk:
@albertvillanova Could you move the whatsnew entry from 0.24 to 1.0 and thereby resolve the merge conflicts (after merging main)? After that, I'm eager to hit the green button.
Done @lorentzenchr 😉
…n#18328)
Co-authored-by: Alonso Silva Allende <alonsosilva@gmaiil.com>
Fixes #16065.
Supersedes and closes #16319.