8000 ENH: Add support for copy modes to NumPy by czgdp1807 · Pull Request #19173 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: Add support for copy modes to NumPy #19173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 65 commits into from
Nov 15, 2021
Merged

Conversation

czgdp1807
Copy link
Member
@czgdp1807 czgdp1807 commented Jun 5, 2021

This PR implements never copy mode in numpy array creation by adding an enum, np.array_api.CopyMode which will have three members, ALWAYS (equivalent to True in current API), IF_NEEDED (equivalent to False in current API) and NEVER (the never copy mode). Related NEP - 47.

Related Issues

#9818
#11884
#8679

Related PRs
#11897

@czgdp1807 czgdp1807 changed the title Supporting Copy modes in NumPy [WIP] Supporting Copy modes in NumPy Jun 5, 2021
@czgdp1807
Copy link
Member Author

ping @seberg I will keep updating this PR. I have added CopyMode enum as of now.

@BvB93 BvB93 added 01 - Enhancement 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes 59 - Needs tests labels Jun 5, 2021
@BvB93 BvB93 changed the title [WIP] Supporting Copy modes in NumPy ENH: Add support for copy modes to NumPy Jun 5, 2021
@czgdp1807 czgdp1807 changed the title ENH: Add support for copy modes to NumPy [WIP] ENH: Add support for copy modes to NumPy Jun 7, 2021
@BvB93
Copy link
Member
BvB93 commented Jun 7, 2021

Pinging the people involved in the previous copy=np.never_copydiscussion: @shoyer and @seberg

@seberg seberg added the 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. label Jun 7, 2021
@BvB93
Copy link
Member
BvB93 commented Jun 7, 2021

As a bit of general (non-C-related) advice: there are quite a few linting related remarks that should ideally be addressed.

@czgdp1807
Copy link
Member Author

As a bit of general (non-C-related) advice: there are quite a few linting related remarks that should ideally be addressed.

Yeah. I will run the linting tests locally to fix such kind of issues. Thanks for pointing out.

@czgdp1807
Copy link
Member Author

I have addressed the reviews. @mattip @seberg

cc: @rgommers @charris

@czgdp1807
Copy link
Member Author

@seberg Is this good to go?

@seberg
Copy link
Member
seberg commented Nov 11, 2021

Probably, I am tempted to push that double checking for the dunder refactor myself. In any case its holiday today, so will only have a close look/do that tomorrow probably.

@czgdp1807
Copy link
Member Author
czgdp1807 commented Nov 11, 2021

No worries. Please feel free to push any changes here. Anything (commits, reviews) that can help in completing this will work for me.

rgommers and others added 6 commits November 12, 2021 16:15
Yes, these may be slightly biased towards my own opinions.
Doing the check before calling `PyArray_FromArrayAttr` means
we look up the attribute twice.
It further fixes two bugs:
1. The reference counting was wrong.
2. In debug mode there was a failure for some deprecation warnings
   (probably due to error propagation)
The first name was `do_copy`, which was confusing because it sounds
like a copy is forced (opposite is the case!)  `allow_copy` would
have been better, but corrects it into the wrong direction.
`never_copy` is correct, and aligns with the Python side names
@seberg
Copy link
Member
seberg commented Nov 12, 2021

Should be OK now from my side. First thought I would put this in and make a new PR, but the refactor is necessary since the tests were currently failing.
I will note that the new flag is public in C, but we do not actually document that fact. But I don't care much about it.

@seberg
Copy link
Member
seberg commented Nov 12, 2021

(There is another missed path above, but I will ignore it, because it is in an ugly deprecated code-path)

@seberg seberg added the 63 - C API Changes or additions to the C API. Mailing list should usually be notified. label Nov 13, 2021
@czgdp1807
Copy link
Member Author

Travis CI is stalled I think. Is there anyway to start the two tests there? Rest of the tests are passing though.
@rgommers

@rgommers
Copy link
Member

Travis CI is stalled I think. Is there anyway to start the two tests there? Rest of the tests are passing though.
@rgommers

Restarted - just ignore if they don't complete, they seem to be hanging again.

@czgdp1807
Copy link
Member Author

Restarted - just ignore if they don't complete, they seem to be hanging again.

Yeah. They appear to be in queue forever.

@rgommers rgommers removed the triage review Issue/PR to be discussed at the next triage meeting label Nov 13, 2021
@charris
Copy link
Member
charris commented Nov 14, 2021

Is everyone happy with this?

@seberg
Copy link
Member
seberg commented Nov 14, 2021

I am happy with this, just think someone should sanity check my fixups.

@czgdp1807
Copy link
Member Author
czgdp1807 commented Nov 15, 2021

The changes here look good to me. @seberg I went through your code and they look good. Thanks for making the style fixes and improvisations. @mattip Any thoughts on this? I am okay with merging this into master and making it a part of 1.22 release. Thanks.

cc: @charris

@mattip mattip merged commit 7125cdf into numpy:main Nov 15, 2021
@mattip
Copy link
Member
mattip commented Nov 15, 2021

Thanks @czgdp1807, @seberg

@czgdp1807 czgdp1807 deleted the never_copy branch November 15, 2021 09:53
@seberg
Copy link
Member
seberg commented Jan 12, 2022

So one battle I decided to ignore the first time around is that while for __array__ clearly a copy may have happened, for the buffer protocol or the __array_struct__ and __array_interface__ we assume a copy did not happen.

For __array_struct__ and __array_interface__ this assumption is not entirely unfounded, because ownership is guaranteed to remain with the exporting object (the implementations rely on this, but gh-20674 wants to change that).
Ownership does not mean "no-copy" of course, in practice it is just fairly likely that is what people are doing.

For the buffer protocol, things are even murkier. NumPy somewhat assumes ownership stays with the exporter, but really, the buffer protocol does not guarantee this at all. Python has a hack to try to ensure ownership in one place, but it is a pretty bad guess. (Again, the caveat applies that ownership does not guarantee no-copy per-se, it just makes workflows that do copy difficult and weird.)

So... maybe we should refuse the "no copy" stuff completely. Small problem: There is no actual "no copy" protocol remaining for use in np.asarray anymore then :).
I had the slight opinion to not do gh-20674 so that we could hope to assume that everyone who uses it does use a copy=NEVER compatible meaning and start specifying that (i.e. if you copy, you move to __array__ or your users may be surprised and you got a bug).

The problem is not impossible to solve: E.g. we can aim for __array__(..., ensure_view=True) to provide a protocol compatible with this meaning and deprecate __array_struct__ anway, because __array__ is really better and more extensible anyway.

So, I am happy to move forward with that other PR, ideally, this one gets restricted down though. Probably together with expanding the __array__ Protocol.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
01 - Enhancement 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. 63 - C API Changes or additions to the C API. Mailing list should usually be notified.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants
0