8000 Allow Python integer return for `count_nonzero(arr, axis=None)` · Issue #932 · data-apis/array-api · GitHub
[go: up one dir, main page]

Skip to content

Allow Python integer return for count_nonzero(arr, axis=None) #932

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
seberg opened this issue Apr 2, 2025 · 13 comments
Open

Allow Python integer return for count_nonzero(arr, axis=None) #932

seberg opened this issue Apr 2, 2025 · 13 comments
Labels
API change Changes to existing functions or objects in the API. Needs Discussion Needs further discussion.

Comments

@seberg
Copy link
Contributor
seberg commented Apr 2, 2025

NumPy returns Python integers in that particular case, and I am not sure I feel it's worth changing it.

The reason is, that if a place only uses axis=None (the default), the result of intp vs. Python integer, changes promotion rules when combined with e.g. float32.

Now, I don't think this will happen a lot, but for count_nonzeros I am not sure I don't think this isn't more user-friendly and an OK and very small inconsistency.
NumPy itself will be fine (because we never only use axis=None), skimage I saw one function which would return float64 rather than a Python float.

(Maybe one of those things where, I feel forcing intp return is great, but I am not sure changing NumPy main namespace for it is worthwhile.)

See numpy/numpy#28615 (comment)

@ev-br
Copy link
Member
ev-br commented Apr 9, 2025

Not sure. So the proposal is to change the spec (https://data-apis.org/array-api/latest/API_specification/generated/array_api.count_nonzero.html) to make the return type a python int is axis=None and an array otherwise?

@rgommers
Copy link
Member
rgommers commented Apr 9, 2025

@seberg I'm not sure I understand the concern. I just checked NumPy, CuPy, PyTorch and JAX - it's a 0-D array in all other libraries, and it's an array scalar for NumPy in case of a 1-D array with axis=0:

>>> import numpy as np
>>> np.count_nonzero(np.arange(5))
4
>>> np.count_nonzero(np.arange(5), axis=0)
np.int64(4)

The axis=None case seems like a simple oversight, which is inconsistent with NumPy's own design and also makes static typing harder. I think it's pretty safe to change that to return an array scalar?

@seberg
Copy link
Contributor Author
seberg commented Apr 14, 2025

The axis=None case seems like a simple oversight, which is inconsistent with NumPy's own design

@rgommers the point is that Python integers tend to more useful, and it is a subtle change. Yeah, it may have been an oversight, I don't know, but I also don't mind too much if axis=None is slightly different.

It is a relatively small change, so I don't care too much. But I am not quite convinced it is a clear improvement. E.g. that sklearn function is certainly better off to return Python floats but will change to return float64.

to make the return type a python int is axis=None and an array otherwise?

I am proposing to amend it to capture the current state: Returning a Python integer is OK (at least for axis=None), i.e. both are acceptable.

@lucascolley
Copy link
Member

Returning a Python integer is OK (at least for axis=None), i.e. both are acceptable.

Are there existing examples in the spec where the return type is underspecified between an array and python scalar like this case?

@seberg
Copy link
Contributor Author
seberg commented Apr 22, 2025

Are there existing examples in the spec where the return type is underspecified between an array and python scalar like this case?

I doubt it, but the question is whether we are comfortable with changing NumPy here, not with whether this makes sense in the array api.

@lucascolley
Copy link
Member

Okay. Either way, I don't think array-api-compat is the right place for this discussion. How does this sound?

  1. Discuss on the NumPy repo whether to make changes in line with the spec or not.
  2. If the answer is yes, great. If the answer is no, open an issue on the array-api repo about potentially changing the spec.

Maybe the answer to (1) will be "we could change it, but we think it makes more sense for the spec to change", in which case the appropriate way forward is probably to see whether other stakeholders agree with that on the array-api repo.

@seberg
Copy link
Contributor Author
seberg commented Apr 22, 2025

Sorry, you are right, I meant to open this issue on array-api, not array-api-compat :/

@lucascolley
Copy link
Member

no worries! We can transfer the issue over. (I would still suggest trying to get a little more consensus from NumPy's side before asking others to weigh in on the array-api side)

@lucascolley lucascolley transferred this issue from data-apis/array-api-compat Apr 22, 2025
@lucascolley lucascolley added API change Changes to existing functions or objects in the API. Needs Discussion Needs further discussion. labels Apr 22, 2025
@leofang
Copy link
Contributor
leofang commented Apr 22, 2025

No Python scalars, please 🙂 This goes against our design principle, and is not compatible with device support. It forces a synchronous d2h copy to happen for no obvious gain.

@lucascolley
Copy link
Member

This goes against our design principle, and is not compatible with device support. It forces a synchronous d2h copy to happen for no obvious gain.

I think the proposal is not to require that python scalars are returned, but rather to allow either python scalars or a 0D-array.

@seberg
Copy link
Contributor Author
seberg commented Apr 22, 2025

@leofang I was asking if the spec should allow Python integers, not suggest that it is better for anyone but NumPy's main namespace (mainly to not change things unnecessarily).

get a little more consensus from NumPy's side

Well, this is hard since the usual answer to any of this is 🤷 as it is very hard to know whether this type of change is a nuisance to users or not.

I.e. to push this to NumPy, I still would like agreement for: Sure, we are happy to just allow this for NumPy's sake (or accept if NumPy diverges subtly).
Otherwise, things tend to be seen as a one-way road even if it should be two-way.

@leofang
Copy link
Contributor
leofang commented Apr 22, 2025

I think the proposal is not to require that python scalars are returned, but rather to allow either python scalars or a 0D-array.

@leofang I was asking if the spec should allow Python integers, not suggest that it is better for anyone but NumPy's main namespace (mainly to not change things unnecessarily).

But we spent a great effort in avoiding polymorphic return values. This change would reinstate polymorphism of some sort, and at least from the typing perspective it'd make type addition harder, no? I have a hard time imagining how this change helps downstream consumption.

@seberg
Copy link
Contributor Author
seberg commented Apr 22, 2025

It doesn't help anyone from an array API perspective but subtly breaking downstream from a NumPy perspective isn't helpful either.

Anyway, chances are nobody will care enough even it may well have some surprising effects downstream (probably not many).
I probably just don't like the dynamic of a subtle change in NumPy going in because of a choice here that was done without the information that it will even cause such change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change Changes to existing functions or objects in the API. Needs Discussion Needs further discussion.
Projects
None yet
Development

No branches or pull requests

5 participants
0