ENH: Add Array API support to NDCG/DCG score #31152
Conversation
```python
def _get_device_arr(arr_np):
    # Gets the equivalent device array for input numpy array.
    # Downcasts to a lower float precision type if float64 isn't
    # supported (e.g. on MPS).
    if np.isdtype(arr_np.dtype, "real floating"):
        max_float_dtype = _max_precision_float_dtype(xp, device)
        arr_xp = xp.asarray(arr_np, dtype=max_float_dtype, device=device)
        arr_np = _convert_to_numpy(arr_xp, xp)
        return arr_np, arr_xp
    arr_xp = xp.asarray(arr_np, device=device)
    return arr_np, arr_xp

a_np, a_xp = _get_device_arr(a_np)
b_np, b_xp = _get_device_arr(b_np)
```
✨ AI Suggestion: The helper `_get_device_arr` modifies the original numpy array (`arr_np`) based on device capabilities before it's used for the reference calculation (`metric_np`). This couples the reference calculation to the test device's limitations (e.g., float precision). It's preferable to calculate the reference metric using the unmodified original numpy arrays and create the array API arrays separately, adjusting assertion tolerances later if necessary due to potential downcasting.
Suggested change:

```diff
-def _get_device_arr(arr_np):
-    # Gets the equivalent device array for input numpy array
-    # Downcasts to a lower float precision type if float64 isn't
-    # supported (e.g. on MPS)
-    if np.isdtype(arr_np.dtype, "real floating"):
-        max_float_dtype = _max_precision_float_dtype(xp, device)
-        arr_xp = xp.asarray(arr_np, dtype=max_float_dtype, device=device)
-        arr_np = _convert_to_numpy(arr_xp, xp)
-        return arr_np, arr_xp
-    arr_xp = xp.asarray(arr_np, device=device)
-    return arr_np, arr_xp
-
-a_np, a_xp = _get_device_arr(a_np)
-b_np, b_xp = _get_device_arr(b_np)
+def _create_xp_arr(arr_np_orig):
+    # Creates the equivalent device array, downcasting if needed.
+    # Returns the xp array and a flag indicating if downcasting occurred.
+    if np.isdtype(arr_np_orig.dtype, "real floating"):
+        max_float_dtype = _max_precision_float_dtype(xp, device)
+        needs_downcast = np.dtype(max_float_dtype).itemsize < arr_np_orig.dtype.itemsize
+        arr_xp = xp.asarray(arr_np_orig, dtype=max_float_dtype, device=device)
+        return arr_xp, needs_downcast
+    arr_xp = xp.asarray(arr_np_orig, device=device)
+    return arr_xp, False
+
+# Create Array API arrays, potentially downcasted
+a_xp, a_needs_downcast = _create_xp_arr(a_np)
+b_xp, b_needs_downcast = _create_xp_arr(b_np)
+
+# Note: Use the original a_np, b_np for the reference metric_np calculation later.
+# Adjust assertion tolerance based on *_needs_downcast flags if necessary.
```
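Following the suggestion's closing note, the assertion might then look like this (hypothetical sketch; `metric_xp`, `metric_np`, and the tolerance values are illustrative, not from the PR):

```python
from numpy.testing import assert_allclose

# Loosen the tolerance only when the device forced a downcast.
rtol = 1e-4 if (a_needs_downcast or b_needs_downcast) else 1e-7
assert_allclose(_convert_to_numpy(metric_xp, xp), metric_np, rtol=rtol)
```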
Thanks @lithomas1! Just some fly-by comments.
```diff
-ranking = np.argsort(y_score)[:, ::-1]
-ranked = y_true[np.arange(ranking.shape[0])[:, np.newaxis], ranking]
-cumulative_gains = discount.dot(ranked.T)
+ranking = _flip(xp.argsort(y_score), axis=1)
```
could you do?
```diff
-ranking = _flip(xp.argsort(y_score), axis=1)
+ranking = xp.argsort(y_score, axis=1, descending=True)
```
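(For reference, the Array API standard defines `argsort(x, /, *, axis=-1, descending=False, stable=True)`, so the `descending=True` form would also drop the extra `_flip` pass.)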
```diff
@@ -1487,20 +1495,27 @@ def _dcg_sample_scores(y_true, y_score, k=None, log_base=2, ignore_ties=False):
     Cumulative Gain (the DCG obtained for a perfect ranking), in order to
     have a score between 0 and 1.
     """
-    discount = 1 / (np.log(np.arange(y_true.shape[1]) + 2) / np.log(log_base))
+    xp, _, device = get_namespace_and_device(y_true, y_score)
+    max_float_dtype = _max_precision_float_dtype(xp, device)
```
Similar question as in #30878: if we should be using max precision for metrics, let's make this consistent for all metrics and move away from `_find_matching_floating_dtype`.
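For context, a rough sketch of the difference between the two private helpers (semantics assumed from the current `sklearn.utils._array_api`; names and behavior are not part of this diff):

```python
import numpy as np
from sklearn.utils._array_api import (
    _find_matching_floating_dtype,
    _max_precision_float_dtype,
    get_namespace_and_device,
)

a = np.asarray([1.0, 2.0], dtype=np.float32)
b = np.asarray([3.0, 4.0], dtype=np.float32)
xp, _, device = get_namespace_and_device(a, b)

# Input-driven: float32 inputs yield float32.
print(_find_matching_floating_dtype(a, b, xp=xp))
# Device-driven: the highest precision the device supports,
# e.g. float64 for NumPy on CPU but float32 on MPS.
print(_max_precision_float_dtype(xp, device))
```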
```python
max_float_dtype = _max_precision_float_dtype(xp, device)
log_base = xp.asarray(log_base, device=device, dtype=max_float_dtype)
discount = 1 / (
    xp.log(xp.arange(y_true.shape[1], dtype=max_float_dtype, device=device) + 2)
```
Just a question for my education: is this to make sure `arange` gives an array with `max_float_dtype`? Does that matter if `log_base` already has `max_float_dtype`?
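For what it's worth, the `arange` dtype matters independently of `log_base`: the standard only defines `log` for floating-point arrays, and the division by `xp.log(log_base)` happens after the inner `xp.log`. A minimal sketch, assuming `array_api_strict`'s dtype enforcement:

```python
import array_api_strict as xp

try:
    # arange defaults to an integer dtype; strict namespaces
    # reject integer input to log.
    xp.log(xp.arange(5) + 2)
except TypeError as exc:
    print(exc)

# Requesting the float dtype up front keeps the call in-spec.
xp.log(xp.arange(5, dtype=xp.float64) + 2)
```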
```diff
-cumulative_gains = discount.dot(ranked.T)
+ranking = _flip(xp.argsort(y_score), axis=1)
+ranked = xp.take_along_axis(y_true, ranking, axis=1)
+cumulative_gains = discount @ xp.asarray(ranked.T, dtype=max_float_dtype)
```
Shouldn't `ranked` already be of `max_float_dtype`?
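One possible reason (my reading, not confirmed in the thread): `take_along_axis` preserves the dtype of `y_true`, which is often integer for relevance grades, so `ranked` is not necessarily floating. In plain NumPy:

```python
import numpy as np

y_true = np.asarray([[3, 2, 1]])  # integer relevance grades
ranking = np.argsort(np.asarray([[0.1, 0.5, 0.2]]), axis=1)[:, ::-1]
ranked = np.take_along_axis(y_true, ranking, axis=1)
print(ranked.dtype)  # integer: gathering keeps y_true's dtype, hence the cast
```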
```python
cumulative_gains[i] = _tie_averaged_dcg(
    y_true[i, :], y_score[i, :], discount_cumsum
)
cumulative_gains = xp.asarray(cumulative_gains, device=device)
```
Is this needed?
```python
# TODO: use unique_all when pytorch supports it
# _, _, inv, counts = xp.unique_all(-y_score)
```
Is there an issue we could link to?
```python
_, inv = xp.unique_inverse(-y_score)
_, counts = xp.unique_counts(-y_score)
```
How much of a performance hit do we get performing 'unique' twice? Would it be worth making a helper that uses `unique_all` unless the xp is torch?
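A hypothetical helper along those lines (not in this PR; `unique_all` is in the standard, while torch's wrapper reportedly cannot implement it because `torch.unique` doesn't return first-occurrence indices):

```python
def _unique_inverse_counts(xp, x):
    # Single pass via unique_all where supported, falling back to
    # two passes for namespaces (e.g. torch) that lack it.
    try:
        _, _, inverse, counts = xp.unique_all(x)
    except (AttributeError, NotImplementedError):
        _, inverse = xp.unique_inverse(x)
        _, counts = xp.unique_counts(x)
    return inverse, counts
```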
Reference Issues/PRs
xref #26024
supersedes #29339
What does this implement/fix? Explain your changes.
Makes `ndcg_score`/`dcg_score` Array API compatible.
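With this change, usage would look something like the following sketch (assuming the PR as described and a supported backend such as PyTorch; the example values are from the `ndcg_score` docstring):

```python
import torch
from sklearn import config_context
from sklearn.metrics import ndcg_score

y_true = torch.tensor([[10.0, 0.0, 0.0, 1.0, 5.0]])
y_score = torch.tensor([[0.1, 0.2, 0.3, 4.0, 70.0]])

# Array API dispatch routes the computation through torch instead of NumPy.
with config_context(array_api_dispatch=True):
    score = ndcg_score(y_true, y_score)
print(score)
```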
Any other comments?