-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
Expected behaviour regarding new copy
keyword in __array__
implementations
#25941
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Deprecated though, right? The signature should already have included the
What we want to achieve I think is that only a single copy is ever made. We can have two now in many cases, because the other library may copy from its internal implementation into an array, and then numpy will make another copy. The only way to get it right is for the other library to do the copy if The old behavior is:
The intended new behavior is:
It looks like we need to do two things:
This change of |
One additional aspect of the old behaviour is illustrated as follows: class MyArray:
def __init__(self, arr):
self._ndarray = arr
def __array__(self, dtype=None):
return self._ndarray
In [7]: arr = MyArray(np.array([1, 2, 3]))
In [8]: np.asarray(arr, dtype="float64")
Out[8]: array([1., 2., 3.]) So when When a |
It seems that on current main, numpy still does a copy afterwards, so |
Right now it's actually not possible to get to the code path in numpy that calls |
This should probably be fixed:
Right now the We also need to update the docs to cover how implementers should handle this. |
Currently copy=False is commented out due to a bug in numpy. see: numpy/numpy#25941 (comment)
Closing, thanks. |
Currently copy=False is commented out due to a bug in numpy. see: numpy/numpy#25941 (comment)
I have an additional question here that is not fully covered in the migration guide note that closed this issue:
I don't know what "if copy keyword is considered" means exactly (only when handling the copy keyword in your actual implementation, or whenever adding the copy keyword to the signature?), but it seems that adding # no copy keyword -> raises an error (and depr warning) when specifying copy=False
class MyArray:
def __array__(self, dtype=None):
return np.array([1,2,3], dtype=int)
>>> np.array(MyArray(), copy=False)
DeprecationWarning: __array__ should implement the 'dtype' and 'copy' keyword argument
ValueError: Unable to avoid copy while creating an array as requested.
If using `np.array(obj, copy=False)` replace it with `np.asarray(obj)` to allow a copy when needed (no behavior change in NumPy 1.x).
For more details, see https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword.
# adding copy keyword just to signature -> no error
class MyArray:
def __array__(self, dtype=None, copy=None):
return np.array([1,2,3], dtype=int)
>>> np.array(MyArray(), copy=False)
array([1, 2, 3]) So just adding And this is what for example pandas did (pandas-dev/pandas#57739, and I just wanted to do the same for pyarrow). Is the behaviour difference expected? (also in the case of |
This is working as intended. We only raise the warning if the function doesn't support the keyword at all. As soon as you add the keyword (even if you ignore it), numpy has to assume the implementation has done what the user expects when
Hmm, maybe it would be better to reword the error message to say something like "numpy tried passing copy=False, but the Ping @mtsokol, can we get some minor updates to docs and error messages to alleviate the confusion @jorisvandenbossche ran into? |
That message would have helped me as implementor, but is probably not super useful for users? (they don't control the implementation of Update the migration guide / |
And https://numpy.org/devdocs/user/basics.interoperability.html#the-array-method already mentions the expected behaviour. It might be good to add a link to that from https://numpy.org/doc/stable/reference/generated/numpy.ndarray.__array__.html |
If we touch the docs, it would be nice to use "must" rather than "should". Should can (and when it comes to such things does) convey that downstream has a choice: they don't (i.e. is more like a strong recommendation)! So the correct terminology is "must". |
Another point of feedback: for implementing the class MyArray:
def __array__(self, dtype=None):
if dtype is None:
return self._ndarray
return self._ndarray.astype(dtype) To update this to have a EDIT: on second thought, I suppose this can be done much simpler by actually relying on the new behaviour of |
The From your edit, it sounds like you figured out an appropriate workaround. |
@jorisvandenbossche I think this could be the shortest way to support both def __array__(self, dtype=None, copy=None):
dtype = self._ndarray.dtype if dtype is None else dtype
return np.array(self._ndarray, dtype=dtype, copy=copy) |
Currently copy=False is commented out due to a bug in numpy. see: numpy/numpy#25941 (comment)
… compatibility (#41071) ### Rationale for this change Adapting for changes in numpy 2.0 as decribed at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208) ### What changes are included in this PR? Add a `copy=None` to the signatures of our `__array__` methods. This does have impact on the user facing behaviour, though. Questioning that upstream at numpy/numpy#25941 (comment) ### Are these changes tested? Yes ### Are there any user-facing changes? No (compared to usage with numpy<2) * GitHub Issue: #39532 * GitHub Issue: #41098 Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
… compatibility (#41071) ### Rationale for this change Adapting for changes in numpy 2.0 as decribed at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208) ### What changes are included in this PR? Add a `copy=None` to the signatures of our `__array__` methods. This does have impact on the user facing behaviour, though. Questioning that upstream at numpy/numpy#25941 (comment) ### Are these changes tested? Yes ### Are there any user-facing changes? No (compared to usage with numpy<2) * GitHub Issue: #39532 * GitHub Issue: #41098 Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
This has been explained, and a new PR is open to change the behavior, but it will be in 2.1 with a deprecation. Closing this. |
…y 2.0+ compatibility (apache#41071) ### Rationale for this change Adapting for changes in numpy 2.0 as decribed at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208) ### What changes are included in this PR? Add a `copy=None` to the signatures of our `__array__` methods. This does have impact on the user facing behaviour, though. Questioning that upstream at numpy/numpy#25941 (comment) ### Are these changes tested? Yes ### Are there any user-facing changes? No (compared to usage with numpy<2) * GitHub Issue: apache#39532 * GitHub Issue: apache#41098 Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
…y 2.0+ compatibility (apache#41071) ### Rationale for this change Adapting for changes in numpy 2.0 as decribed at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208) ### What changes are included in this PR? Add a `copy=None` to the signatures of our `__array__` methods. This does have impact on the user facing behaviour, though. Questioning that upstream at numpy/numpy#25941 (comment) ### Are these changes tested? Yes ### Are there any user-facing changes? No (compared to usage with numpy<2) * GitHub Issue: apache#39532 * GitHub Issue: apache#41098 Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
Can this comment #25941 (comment) be linked to in the deprecation warning? it was very useful to me. I can make the PR if ok! |
Do you mean adding a link to the comment? Most of it has been put into the migration guide, no? |
Moving from #25922, which added this to the docs:
Thinking through how to update some
__array__
implementations, I am wondering: what are the expectations from numpy how an__array__
implementations handles this keyword fully or partially?When
copy=True
is being passed, does numpy assume that the__array__
implementation already made that copy? But what if the implementation defers dtype casting to numpy (which is currently possible), then it might be a waste to already copy up front?Originally posted by @jorisvandenbossche in #25922 (comment)
The text was updated successfully, but these errors were encountered: