ENH: Add Array API support to NDCG/DCG score #31152
Conversation
```python
def _get_device_arr(arr_np):
    # Gets the equivalent device array for input numpy array.
    # Downcasts to a lower float precision type if float64 isn't
    # supported (e.g. on MPS).
    if np.isdtype(arr_np.dtype, "real floating"):
        max_float_dtype = _max_precision_float_dtype(xp, device)
        arr_xp = xp.asarray(arr_np, dtype=max_float_dtype, device=device)
        arr_np = _convert_to_numpy(arr_xp, xp)
        return arr_np, arr_xp
    arr_xp = xp.asarray(arr_np, device=device)
    return arr_np, arr_xp

a_np, a_xp = _get_device_arr(a_np)
b_np, b_xp = _get_device_arr(b_np)
```
✨ AI Suggestion: The helper `_get_device_arr` modifies the original numpy array (`arr_np`) based on device capabilities before it's used for the reference calculation (`metric_np`). This couples the reference calculation to the test device's limitations (e.g., float precision). It's preferable to calculate the reference metric using the unmodified original numpy arrays and create the array API arrays separately, adjusting assertion tolerances later if necessary due to potential downcasting.
Suggested change:

```diff
-def _get_device_arr(arr_np):
-    # Gets the equivalent device array for input numpy array
-    # Downcasts to a lower float precision type if float64 isn't
-    # supported (e.g. on MPS)
-    if np.isdtype(arr_np.dtype, "real floating"):
-        max_float_dtype = _max_precision_float_dtype(xp, device)
-        arr_xp = xp.asarray(arr_np, dtype=max_float_dtype, device=device)
-        arr_np = _convert_to_numpy(arr_xp, xp)
-        return arr_np, arr_xp
-    arr_xp = xp.asarray(arr_np, device=device)
-    return arr_np, arr_xp
-
-a_np, a_xp = _get_device_arr(a_np)
-b_np, b_xp = _get_device_arr(b_np)
+def _create_xp_arr(arr_np_orig):
+    # Creates the equivalent device array, downcasting if needed.
+    # Returns the xp array and a flag indicating if downcasting occurred.
+    if np.isdtype(arr_np_orig.dtype, "real floating"):
+        max_float_dtype = _max_precision_float_dtype(xp, device)
+        needs_downcast = np.dtype(max_float_dtype).itemsize < arr_np_orig.dtype.itemsize
+        arr_xp = xp.asarray(arr_np_orig, dtype=max_float_dtype, device=device)
+        return arr_xp, needs_downcast
+    arr_xp = xp.asarray(arr_np_orig, device=device)
+    return arr_xp, False
+
+# Create Array API arrays, potentially downcasted
+a_xp, a_needs_downcast = _create_xp_arr(a_np)
+b_xp, b_needs_downcast = _create_xp_arr(b_np)
+
+# Note: Use the original a_np, b_np for the reference metric_np calculation later.
+# Adjust assertion tolerance based on *_needs_downcast flags if necessary.
```
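Following the suggestion's closing note, the assertion might then look like this (hypothetical sketch; `metric_xp`, `metric_np`, and the tolerance values are illustrative, not from the PR):

```python
from numpy.testing import assert_allclose

# Loosen the tolerance only when the device forced a downcast.
rtol = 1e-4 if (a_needs_downcast or b_needs_downcast) else 1e-7
assert_allclose(_convert_to_numpy(metric_xp, xp), metric_np, rtol=rtol)
```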
Thanks @lithomas1! Just some fly-by comments.
```diff
-ranking = np.argsort(y_score)[:, ::-1]
-ranked = y_true[np.arange(ranking.shape[0])[:, np.newaxis], ranking]
-cumulative_gains = discount.dot(ranked.T)
+ranking = _flip(xp.argsort(y_score), axis=1)
```
could you do?
```diff
-ranking = _flip(xp.argsort(y_score), axis=1)
+ranking = xp.argsort(y_score, axis=1, descending=True)
```
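(For reference, the Array API standard defines `argsort(x, /, *, axis=-1, descending=False, stable=True)`, so the `descending=True` form would also drop the extra `_flip` pass.)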
```diff
@@ -1487,20 +1495,27 @@ def _dcg_sample_scores(y_true, y_score, k=None, log_base=2, ignore_ties=False):
     Cumulative Gain (the DCG obtained for a perfect ranking), in order to
     have a score between 0 and 1.
     """
-    discount = 1 / (np.log(np.arange(y_true.shape[1]) + 2) / np.log(log_base))
+    xp, _, device = get_namespace_and_device(y_true, y_score)
+    max_float_dtype = _max_precision_float_dtype(xp, device)
```
Similar question as in #30878: if we should be using max precision for metrics, let's make this consistent for all metrics and move away from `_find_matching_floating_dtype`.
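For context, a rough sketch of the difference between the two private helpers (semantics assumed from the current `sklearn.utils._array_api`; names and behavior are not part of this diff):

```python
import numpy as np
from sklearn.utils._array_api import (
    _find_matching_floating_dtype,
    _max_precision_float_dtype,
    get_namespace_and_device,
)

a = np.asarray([1.0, 2.0], dtype=np.float32)
b = np.asarray([3.0, 4.0], dtype=np.float32)
xp, _, device = get_namespace_and_device(a, b)

# Input-driven: float32 inputs yield float32.
print(_find_matching_floating_dtype(a, b, xp=xp))
# Device-driven: the highest precision the device supports,
# e.g. float64 for NumPy on CPU but float32 on MPS.
print(_max_precision_float_dtype(xp, device))
```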
```python
max_float_dtype = _max_precision_float_dtype(xp, device)
log_base = xp.asarray(log_base, device=device, dtype=max_float_dtype)
discount = 1 / (
    xp.log(xp.arange(y_true.shape[1], dtype=max_float_dtype, device=device) + 2)
```
Just a question for my education: is this to make sure `arange` gives an array with `max_float_dtype`? Does that matter if `log_base` already has `max_float_dtype`?
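For what it's worth, the `arange` dtype matters independently of `log_base`: the standard only defines `log` for floating-point arrays, and the division by `xp.log(log_base)` happens after the inner `xp.log`. A minimal sketch, assuming `array_api_strict`'s dtype enforcement:

```python
import array_api_strict as xp

try:
    # arange defaults to an integer dtype; strict namespaces
    # reject integer input to log.
    xp.log(xp.arange(5) + 2)
except TypeError as exc:
    print(exc)

# Requesting the float dtype up front keeps the call in-spec.
xp.log(xp.arange(5, dtype=xp.float64) + 2)
```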
```diff
-cumulative_gains = discount.dot(ranked.T)
+ranking = _flip(xp.argsort(y_score), axis=1)
+ranked = xp.take_along_axis(y_true, ranking, axis=1)
+cumulative_gains = discount @ xp.asarray(ranked.T, dtype=max_float_dtype)
```
Shouldn't `ranked` already be of `max_float_dtype`?
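One possible reason (my reading, not confirmed in the thread): `take_along_axis` preserves the dtype of `y_true`, which is often integer for relevance grades, so `ranked` is not necessarily floating. In plain NumPy:

```python
import numpy as np

y_true = np.asarray([[3, 2, 1]])  # integer relevance grades
ranking = np.argsort(np.asarray([[0.1, 0.5, 0.2]]), axis=1)[:, ::-1]
ranked = np.take_along_axis(y_true, ranking, axis=1)
print(ranked.dtype)  # integer: gathering keeps y_true's dtype, hence the cast
```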
```python
cumulative_gains[i] = _tie_averaged_dcg(
    y_true[i, :], y_score[i, :], discount_cumsum
)
cumulative_gains = xp.asarray(cumulative_gains, device=device)
```
Is this needed?
```python
# TODO: use unique_all when pytorch supports it
# _, _, inv, counts = xp.unique_all(-y_score)
```
Is there an issue we could link to?
```python
_, inv = xp.unique_inverse(-y_score)
_, counts = xp.unique_counts(-y_score)
```
How much of a performance hit do we get performing 'unique' twice? Would it be worth making a helper that uses `unique_all` unless the xp is torch?
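A hypothetical helper along those lines (not in this PR; `unique_all` is in the standard, while torch's wrapper reportedly cannot implement it because `torch.unique` doesn't return first-occurrence indices):

```python
def _unique_inverse_counts(xp, x):
    # Single pass via unique_all where supported, falling back to
    # two passes for namespaces (e.g. torch) that lack it.
    try:
        _, _, inverse, counts = xp.unique_all(x)
    except (AttributeError, NotImplementedError):
        _, inverse = xp.unique_inverse(x)
        _, counts = xp.unique_counts(x)
    return inverse, counts
```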
Reference Issues/PRs
xref #26024
supersedes #29339
What does this implement/fix? Explain your changes.
Makes `ndcg_score`/`dcg_score` Array API compatible.
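With this change, usage would look something like the following sketch (assuming the PR as described and a supported backend such as PyTorch; the example values are from the `ndcg_score` docstring):

```python
import torch
from sklearn import config_context
from sklearn.metrics import ndcg_score

y_true = torch.tensor([[10.0, 0.0, 0.0, 1.0, 5.0]])
y_score = torch.tensor([[0.1, 0.2, 0.3, 4.0, 70.0]])

# Array API dispatch routes the computation through torch instead of NumPy.
with config_context(array_api_dispatch=True):
    score = ndcg_score(y_true, y_score)
print(score)
```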
Any other comments?