-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
1.20.0 RC: change in array coercion for scalar objects with __array_interface__ #17965
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thanks for the heads up. Yeah, I didn't really expect objects like this, that are partial array-likes. We could probably add a tiny hack to restore a most of this for now:
I could probably inject an Do you think the behaviour should be retained indefinitely, or should we look into trying for a |
It is too bad that I don't have public API for dtypes, yet... Otherwise we could probably even create a
|
I think eventually removing that capability is certainly reasonable, but if it could be done with a deprecation cycle instead of directly changing it, that would be great (and give us a bit more time to get Shapely 2.0 ready). |
I made PR at gh-17973 which can be used to see how things should end up, still needs tests though. The one issue is that I can't think of good way for opting into the future behaviour (aside from the type pretending to be a Python sequence, which may not always be correct?). Some rare users may just have to ignore the warning :(. This does not change the fact that if the object is a sequence (or pretends to be one), the shape is always taken from the array-interface and not by unpacking the sequence. (Although I think that was partially changed in 1.19.x already.) |
This fixes issue numpygh-17965. The slightly annoying thing is that there is no simple way to opt-in to the new behaviour and the old behaviour is a bit quirky to begin with (honoring the dtype, but not the shape).
This fixes issue numpygh-17965. The slightly annoying thing is that there is no simple way to opt-in to the new behaviour and the old behaviour is a bit quirky to begin with (honoring the dtype, but not the shape).
I'm going to close this. If the issue persists after 1.20.k0rc2 is released, please reopen. |
Late reply, but thanks a lot for the quick follow-up and nice deprecation! Justed tested this, and this should work for us. There is still a difference compared to < 1.20: now |
Actually, I spoke too soon. The latest numpy gives a lot of failures in the geopandas test suite ... A small example without relying on Shapely, adapted from above:
So it's not anymore about the direct array creation with So the above raises with rc2:
while this worked in rc1. |
In one of the tests added in #17973, there is a note:
So this doesn't seem to be true in practice? If the result would actually be ignored (or the error catched and ignored), then the issue would be fixed I think? |
Heh, there was another issue with shapely, which lead us to ignore less errors when calling I am wondering now if I should go the 1.19.x route and only catch clearly "bad" errors (recursion and memory error), or keep it as is here and catch all errors but we add catch |
Oh, I didn't fully parse your last comment. So I guess raising the error may be fine for shapely/you if we ensure there is a special case for |
@jorisvandenbossche to decide how to best fix this:
Maybe @eric-wieser will also have a quick opinion. The question is whether this
must not error, but if we ignore the error in the first two examples, that will fall out without any special handling (although technically, it might be better to special case it anyway, due to the |
For the first two cases:
I think it would be nice if they don't start erroring now (since it works in released numpy). But if numpy would prefer to bubble up the error, maybe something similar could be done as for the case that Now, on the preferred long-term behaviour: never show such a NotImplementedError or not, I am not fully sure. On the one hand, it seems consistent with the current deprecation warning that objects with one of such protocols will "be coerced as if it was first converted using |
Closes (the later point) in numpygh-17965 and reverts parts of numpygh-17817. Shapely did rely on being able to raise a NotImplementedError which then got ignored in the attribute lookup. Arguably, this should probably just raise an AttributeError to achieve that behaviour, but it means we can't just rip the band-aid off here. Since 1.20 is practically released, just reverting most of the change (leaving only recursion and memory error which are both arguably pretty fatal). Ignoring most errors should be deprecated (and I am happy to do so), but it is not important enough for 1.20 or very important in itself. Closes numpygh-17965
OK, just opened a small PR to just revert it. I am very happy to add the deprecation (or even the more complex logic), but it just isn't very important, so doing anything for 1.20 already seems not even worth back porting release note additions :). |
Closes (the later point) in numpygh-17965 and reverts parts of numpygh-17817. Shapely did rely on being able to raise a NotImplementedError which then got ignored in the attribute lookup. Arguably, this should probably just raise an AttributeError to achieve that behaviour, but it means we can't just rip the band-aid off here. Since 1.20 is practically released, just reverting most of the change (leaving only recursion and memory error which are both arguably pretty fatal). Ignoring most errors should be deprecated (and I am happy to do so), but it is not important enough for 1.20 or very important in itself. Closes numpygh-17965
I checked that the latest master, and geopandas tests are passing again now (didn't check with the 1.20.x maint branch, but assuming that that will be OK as well since the fix is backported). Thanks! |
Uh oh!
There was an error while loading. Please reload this page.
The Shapely package defines a set of scalar geometry objects, but some of those objects also expose its underlying data using the array interface.
Up to now,
np.array([..], dtype=object)
coercion seemingly didn't check for an__array_interface__
on the elements in the list, so that code like the following to create an object array of such elements worked:but with the 1.20.0 RC, we now get the following:
I suppose this might be an intentional change (https://numpy.org/devdocs/release/1.20.0-notes.html#array-coercion-restructure, #16200). But still wanted to raise this, so you are at least aware of it (the release notes say "We are not aware of any such case").
This can be reproduced without shapely with the following example:
Although for Shapely itself there isn't any direct broken behaviour (no failing tests with 1.20.0 RC), everybody who is putting Shapely geometries in arrays will get troubles. And specifcially, the GeoPandas package is doing this a lot.
Some notes:
I know that having an
__array_interface__
on an object that doesn't pretend to be an array, is probably bad practice (it's still a way to expose its data to other libraries using numpy). This is long-standing behaviour of Shapely, but something we are actually planning to deprecate and remove in the near future (among other things, exactly because of playing better together with numpy).We are actively working on a "Shapely 2.0" which will remove this array interface (see this Shapely 2.0 RFC section about this aspect). However, a final release for this is at a minimum a few months away.
In addition to having an array_interface, a subset of the Shapely geometries are also iterable. For this reason, in GeoPandas, we have already been quite consistently using this pattern to create numpy arrays:
With this pattern, the issue described here can be avoided. However, we didn't yet do this everywhere in GeoPandas (eg in places we were sure we only had simple geometries like points, which are not iterable), so we still get some failures.
The text was updated successfully, but these errors were encountered: