8000 API: Introduce ``correction`` argument for ``np.var`` and ``np.std`` [Array API] by mtsokol · Pull Request #25169 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

API: Introduce correction argument for np.var and np.std [Array API] #25169

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 18, 2024

Conversation

mtsokol
Copy link
Member
@mtsokol mtsokol commented Nov 17, 2023

This PR contains next Array API compatibility change.
It adds correction argument to np.var and np.std (and also np.nanvar and np.nanstd), which is an Array API compatible name for ddof.

@mtsokol mtsokol marked this pull request as ready for review November 17, 2023 13:11
@mtsokol mtsokol self-assigned this Nov 17, 2023
@mtsokol mtsokol added this to the 2.0.0 release milestone Nov 30, 2023
@ngoldbaum
Copy link
Member

One of the jobs failed with

worker 'gw2' crashed while running 'numpy/_core/tests/test_multiarray.py::test_sort_float[d-N127]'

indicating a possible aarch64 mac heisenbug. Commenting here for reference in case it happens again. Restarted both failing CI jobs which are unrelated I think.

Copy link
Member
@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, although this was one of the array API changes that was flagged at the last triage meeting as one that might be controversial because people are not a fan of the name correction.

I personally think it's fine to have duplicate names for better array API compatibility but I'm holding off on merging this until we can have a deeper community discussion around the coming NEP for this.

@mtsokol mtsokol changed the title API: Introduce correction argument for np.var and np.std API: Introduce correction argument for np.var and np.std [Array API] Jan 2, 2024
Copy link
Contributor
@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot say I like the name correction much, but given its existence, this seems the right way forward. Some comments in-line.

return (a, where, out, mean)


@array_function_dispatch(_std_dispatcher)
def std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=np._NoValue, *,
where=np._NoValue, mean=np._NoValue):
where=np._NoValue, mean=np._NoValue, correction=0):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make the default correction=_np.NoValue so that we can truly recognize if some value is put in? Could even consider doing the same for ddof, but that would be an API change for people whose classes have a .std() method, so probably best not.

Copy link
Member Author
@mtsokol mtsokol Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea - done! Then std(x, ddof=1, correction=0) can also be recognized as incorrect set of parameters.
(updated only correction parameters)

)
else:
ddof = correction

if type(a) is not mu.ndarray:
try:
std = a.std
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good reminder: any class that defines .std() gets these arguments passed on. So, there is a choice to be made here: do we pass on correction (and not ddof) if it is passed in, or do we, as implemented now, just set ddof=correction. I think the choice here is fine, but raising the possible issue just in case (I have had to implement .std() in some astropy classes, but the choice here would not prevent me from adding correction; I'd still need to support ddof anyway...).

Copy link
Member Author
@mtsokol mtsokol Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing it out. I also think that the current setup is fine: Array API defines std and var only for the main namespace. There's no std and var for the Array Object, so there's no need to introduce correction in numpy.ndarray class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marten's point is a little bit different though, this path only happens if the array-like is explicitly not an ndarray. That said, he's also right that any array-like implmenting a std method would need to deal with ddof for backward compatibility, so this code is fine.

If, in the future, we ever decide to deprecate ddof in favor of correction, this issue becomes a bit more acute, but we could always do hacks like first pass correction in a try/except block, see if there's an error, and if so try again with ddof.

@charris charris changed the title API: Introduce correction argument for np.var and np.std [Array API] API: Introduce correction argument for np.var and np.std [Array API] Jan 16, 2024
@mtsokol mtsokol force-pushed the var-std-correction branch from 2fb6d10 to 4540923 Compare January 17, 2024 13:33
@ngoldbaum
Copy link
Member

Alright, I think this one is ready to go in since there haven't been any followups since the last updates. Thanks @mtsokol!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0