-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG + 3] ENH linearly scale to (-0.5, 0.5) instead of [-0.4, 0.4] #4295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
facdfa6
to
7826375
Compare
return votes + sum_of_confidences * \ | ||
(0.4 / max(abs(max_confidences), abs(min_confidences))) | ||
# Map the sum_of_confidences to (0, 1) using the sigmoid function | ||
return votes + (expit(0.5*sum_of_confidences)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need a 0.5 here?
@mblondel the sigmoid is not <1 for large scores for numeric reasons. |
The other option would be to use a very small value (0.001 instead of the 0.5) and stretch the sigmoid further till it can handle all sane values for |
well why change one magic value to another? |
True that ;) |
or maybe |
With both options there is a minor problem that when score_1 = 38, score_2 = 40 the output in both cases will be 0.9. There are two more ways we could deal with it :
Both these methods are quite weird as @amueller pointed out :) Perhaps we could simply replace
EDIT: I am suggesting |
If the original 0.4 magic constant is considered too magic then I think a linear scaling to |
7826375
to
29af585
Compare
Thanks for the comment! Done! |
29af585
to
0d41b33
Compare
return votes + sum_of_confidences * \ | ||
(0.4 / max(abs(max_confidences), abs(min_confidences))) | ||
(scaling_factor / max(abs(max_confidences), abs(min_confidences))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would write that as:
# Scale the sum_of_confidences to (-0.5, 0.5) and add it with votes.
# The motivation is to use confidence levels as a way to break ties in the
# votes without switching any decision made based on a difference of 1
# vote.
eps = np.finfo(sum_of_confidences.dtype).eps
max_abs_confidence = max(abs(max_confidences), abs(min_confidences)))
scale = (0.5 - eps) / max_abs_confidence
return votes + sum_of_confidences * scale
0d41b33
to
d6e389c
Compare
@ogrisel done! |
Fine with me. |
Should I consider that a +1 ? @mblondel Could you also please take a look at this? |
d6e389c
to
342e6e4
Compare
@ogrisel Could you take a look at this? |
LGTM as well. @mblondel no objection to using linear scaling instead of a logistic sigmoid? The ranking induced by any monotonic function to the |
Nope, no objection. |
Alright, merging. Thanks @ragv! |
[MRG + 3] ENH linearly scale to (-0.5, 0.5) instead of [-0.4, 0.4]
Initially started off addressing #3891 (comment) by using
expit
(sigmoid) to map thesum_of_confidences
to(0, 1)
...Currently uses linear scaling to map the
sum_of_confidences
to(-0.5, 0.5)
before adding it to thevotes
. (scaling factor0.4
-->5 - np.finfo(sum_of_confidences.dtype).eps
)@mblondel @amueller kindly take a look!