NumPy ABI does not successfully maintain forward compatibility -- should it? #5888


Closed

njsmith opened this issue May 17, 2015 · 20 comments

@njsmith (Member) commented May 17, 2015

Not sure whether this is something we consider a bug, but flagging for potential discussion and so it doesn't get lost: I hadn't realized until today that NumPy in practice does not provide a forward compatible ABI, i.e., if you build with numpy 1.9 then try to run against 1.8, this may not work. Apparently packages that care about this like sklearn are actively working around this by carefully installing old versions of numpy before building wheels.

In particular, we have several times added extra fields to the dtype struct. In practice this is normally fine b/c no-one actually accesses these fields, but Cython in particular does struct size checking. For backwards compat -- build against 1.8 and then run against 1.9 -- the struct appears to get larger, and Cython merely issues a warning (which we suppress). For forward compat -- build against 1.9 and then run against 1.8 -- the struct appears to get smaller, and in this case Cython issues a hard error.
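For readers unfamiliar with the mechanism: roughly, Cython compares the struct size it compiled against with the size reported by the running numpy. A simplified illustration of that logic (not Cython's actual generated code; the function and names here are invented):

```c
#include <stdio.h>
#include <stddef.h>

/* Invented illustration of the kind of check Cython's generated code
 * performs when cimporting a type; not Cython's actual code. */
static int
check_struct_size(const char *name, size_t compiled, size_t runtime)
{
    if (runtime > compiled) {
        /* built against 1.8, running on 1.9: struct grew -> warning
         * only (and numpy suppresses this warning) */
        fprintf(stderr,
                "warning: %s size changed, may indicate binary "
                "incompatibility (compiled %zu, runtime %zu)\n",
                name, compiled, runtime);
        return 0;
    }
    if (runtime < compiled) {
        /* built against 1.9, running on 1.8: struct shrank -> hard
         * error, since compiled code may read past the end of the
         * real runtime object */
        fprintf(stderr,
                "error: %s has the wrong size (compiled %zu, runtime %zu)\n",
                name, compiled, runtime);
        return -1;
    }
    return 0;
}
```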

We could work around this by simply exposing a truncated struct to user code, so that Cython sees a small struct when doing sizeof, and the actual object is always larger than this, meaning that we always hit the warning path rather than the error path.
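A minimal sketch of that workaround, with invented names (this is the idea, not numpy's actual headers):

```c
/* Public header: user code only ever sees the stable prefix, so a
 * sizeof() taken on the user's side never exceeds the real object
 * size, and only the warning path above can trigger. Invented names. */
typedef struct {
    int kind;
    int itemsize;
    /* ...only fields present in the oldest supported ABI... */
} PublicDescr;

/* Internal definition: the same prefix, plus fields added later.
 * The real object is always at least as large as PublicDescr. */
typedef struct {
    PublicDescr base;   /* layout-compatible prefix */
    void *added_in_1_9; /* hypothetical field added in a newer release */
} InternalDescr;
```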

I don't know if this is the only problem we would have to fix in order to achieve forward compatibility, e.g. I haven't checked the C API import code to see what import_array or import_umath do when they find themselves running against an older version of numpy.

If we want to take ABI compatibility seriously I guess we should probably also start requiring C API users to explicitly state which version of the ABI they expect, and enforce that they don't get access to anything newer than that. This would at least give us the option then in the future to provide different versions of the same function to old-users and new-users.

@rgommers (Member)

I'm pretty sure that numpy itself also raises an error when running against an older version than compiled against: https://github.com/numpy/numpy/blob/master/numpy/core/code_generators/generate_numpy_api.py#L90
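The relevant check in the generated import code looks roughly like this (paraphrased from the generator linked above, not the verbatim generated C):

```c
/* Paraphrased from the C that generate_numpy_api.py emits into the
 * generated _import_array(); not the verbatim code. */
if (NPY_VERSION > PyArray_GetNDArrayCVersion()) {
    PyErr_Format(PyExc_RuntimeError,
                 "module compiled against ABI version 0x%x but this "
                 "version of numpy is 0x%x",
                 (int)NPY_VERSION, (int)PyArray_GetNDArrayCVersion());
    return -1;  /* built against a newer ABI than the running numpy */
}
```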

@rgommers (Member)

This is related to the desire to hide implementation details, which is discussed in http://docs.scipy.org/doc/numpy/reference/c-api.deprecations.html#background in the docs. And maybe in other places?

@rgommers (Member)

Some more digging for discussions on the Cython issue: RuntimeWarning http://thread.gmane.org/gmane.comp.python.cython.devel/13072. Still can't find the more recent discussion though.

@njsmith (Member, Author) commented Jul 6, 2017

This has come up again now that pip has started to properly support build-requires (see PEP 518). The problem is that scipy (for example) then has to decide what version of numpy to declare a build-dependency on. If they depend on plain numpy, then they'll always be built against the latest version of numpy, and you end up with scipy binaries that require the latest version of numpy, and this breaks stuff. OTOH if scipy declares a specific version of numpy to build against, like numpy == 1.8.1, then they have problems, because this doesn't work if you're trying to build on python 3.6: numpy 1.8 just flat out doesn't work on python 3.6. Instead you need numpy == 1.2.x. Or worse, maybe you're building on python 3.7, where no-one knows which version of numpy you'll need, yet scipy has to make a guess and encode it into their sdists, and if they guess wrong then things break.

There's lots more discussion here: pypa/pip#4582

Anyway, it seems like what we really want is a way for scipy to tell numpy "I'm using the 1.8 ABI" at build time, and produce a binary that uses the 1.8 ABI, even when building against a newer numpy. That breaks us out of this loop, because scipy can build-require numpy >= 1.8 and build against the latest version, but then the resulting binaries work the same regardless.

One way this might work (see the sketch after this list):

  • Before importing any numpy headers, you have to do something like #define NUMPY_ABI_VERSION 0x010014
  • If you don't define this at all, then by default you get the last version before we added this support. (So if we release this in 1.14, then the default is 1.13 compatibility, i.e. all old code keeps working but if you want new stuff you have to start doing the explicit opt-in thing).
  • We use this to pick the ABI version in import_array / import_umath, and also use it to hide things like new fields inside dtypes.
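From the consumer side, a hypothetical extension module would then look something like this (NUMPY_ABI_VERSION is the macro proposed above; the version-gating shown is an invented illustration of how the headers could use it):

```c
/* Opt in to a fixed ABI before pulling in any numpy headers. */
#define NUMPY_ABI_VERSION 0x010014
#include <numpy/arrayobject.h>

/* Inside numpy's headers, later additions could then be hidden from
 * modules that target an older ABI, e.g. (invented example): */
#if NUMPY_ABI_VERSION >= 0x010015
/* declarations and struct fields added after ABI 0x010014 */
#endif
```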

We already have most of the code we'd need to implement this – every binary that uses the C ABI already gets a magic number compiled in saying which version of the ABI it expects, so if you have a scipy-built-against-numpy-1.8 and import it on numpy 1.13, then numpy 1.13 has a little table that says "ah, this is expecting the 1.8 ABI, which contained these entries, let me export just those ones". The problem is that right now this flexibility is only exposed at import time, not at build time.

@rgommers (Member) commented Jul 6, 2017

Thanks for bringing this up again @njsmith. I've thought about it for a bit just now and have the vague feeling that some tricky issue could come up (also remembering the pain of the non-working OS X 10.5 SDK), but can't put my finger on it. Implementation of what you propose seems relatively straightforward.

@shoyer (Member) commented Jul 6, 2017

> Instead you need numpy == 1.2.x.

@njsmith I assume you mean 1.12.x?

@njsmith (Member, Author) commented Jul 6, 2017

Yes.

@ghost commented Aug 30, 2017

Out of curiosity, where is the magic number table defined?

@jdemeyer (Contributor)

Related: https://discuss.python.org/t/support-for-build-and-run-time-dependencies/1513

@rgommers (Member)

> Related: https://discuss.python.org/t/support-for-build-and-run-time-dependencies/1513

I've just read through that, and agree with most of the responses you got from Nathaniel, Thomas and Paul. In general this isn't really a major issue; it's just something to be aware of at the moment: just build your wheels against the lowest NumPy version you want to support at runtime.

That one requirement can be expressed in the current pyproject.toml and setup.py. A new option "use this ABI" would perhaps make this a little easier, but it's low priority imho given that in practice there aren't many complaints about this.
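Concretely, that advice looks something like this in pyproject.toml (the version pins below are illustrative only, not recommendations):

```toml
[build-system]
# Build against the oldest numpy that works on each Python version,
# so the resulting wheel runs on that numpy or anything newer.
# Version pins here are illustrative, not recommendations.
requires = [
    "setuptools",
    "wheel",
    "numpy==1.13.3; python_version<='3.6'",
    "numpy==1.14.5; python_version>='3.7'",
]
```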

@jdemeyer (Contributor)

> just build your wheels against the lowest NumPy version you want to support at runtime.

I don't think there is any guarantee that this will continue to work for every numpy version ever released in the future. For example, maybe some day numpy 2.0 will be released with a very different API which won't be compatible with packages built against numpy 1.x. So having some way to indicate that would still be useful.

By the way, I never claimed that it's a big issue. But I've been bitten by this (not with numpy but with another Python package with a C API).

@rgommers (Member)

> I don't think there is any guarantee that this will continue to work for every numpy version ever released in the future. For example, maybe some day numpy 2.0 will be released with a very different API which won't be compatible with packages built against numpy 1.x.

Of course, there is no such guarantee. That's a different issue though - the only way to protect against future API or ABI changes is to use install_requires <= numpy_current_released_version for your own packages, which would be the right thing today in many cases, but unfortunately no one does this.

@seberg (Member) commented Apr 25, 2019

Well, we would definitely increment the major version, so most software should (with very little risk) be able to say < 2.0, probably? Not sure that is a good habit though.

@jdemeyer (Contributor)

> Of course, there is no such guarantee.

But why not? Why can't numpy have some kind of ABI guarantee? In fact, that's already the case in practice. Why not make it official? This will not make it impossible for numpy to change its ABI, it just means that the version needs to be increased to 2.0 if that happens.

@seberg (Member) commented Apr 25, 2019

@jdemeyer I think we have that guarantee, except in some rare cases where we have a long deprecation beforehand. I suppose maybe I misunderstood the thread then.

@rgommers (Member)

> But why not? Why can't numpy have some kind of ABI guarantee? In fact, that's already the case in practice. Why not make it official? This will not make it impossible for numpy to change its ABI, it just means that the version needs to be increased to 2.0 if that happens.

We will do exactly that, and make every effort to keep ABI compatibility for 1.x. Again that has very little to do with either forward compatibility or your pypa thread though. It seems like we're talking past each other here. If we release 2.0 with an incompatible ABI, then one should build against 2.0 as the lowest supported version. That can be done today, simply by putting numpy == 2.0.0 in pyproject.toml. In whatever other scheme you come up with you'll have to make the same update.

@estan commented Jun 19, 2019

We recently ran into this with pycuda + numpy as well. We had

pycuda==2019.1
numpy==1.14.0

in our requirements.txt, but that will not work, since pycuda has numpy>=1.6 in its setup_requires, which means it will be built and linked against the latest numpy currently on PyPI (1.16.4, installed to a temp folder by pip during package building), and the numpy 1.14.0 we're pinning is not forward compatible with that, so you get a crash when trying to import pycuda.

Reported this against pycuda here: inducer/pycuda#209

A possible workaround on the pycuda side would be to pin to the exact oldest version against which it can build (so 1.6 in this case) in its setup_requires. That would make pycuda maximally compatible with any later 1.x numpy the user wants to use (provided numpy maintains backward compatibility throughout the 1.x series). At the moment it sort of requires the user to use the latest numpy (or rather, whatever was the latest when pycuda was built).

@mattip (Member) commented Jun 19, 2019

If I recall correctly, SciPy pins its build system to the oldest numpy it wishes to support, but tests against later numpy versions.

@estan commented Jun 19, 2019

@mattip Alright, that sounds reasonable. I'll see what @inducer says regarding pycuda.

@seberg (Member) commented Nov 20, 2022

Closing. This is effectively solved by https://pypi.org/project/oldest-supported-numpy/ and build requires. So we don't attempt forward compatibility, and that's that.
(There is an interesting problem of ensuring that a future API break is adhered to even when you compile against an older version, i.e. some compat headers, etc. This would be for a major NumPy 2.0 release and something we need to figure out... But probably not on this issue.)
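For reference, using that meta-package from a downstream project's pyproject.toml looks like this (a typical snippet; the setuptools and wheel entries are just the common pairing):

```toml
[build-system]
# oldest-supported-numpy resolves, per Python version and platform,
# to the oldest numpy that can be built there, so wheels built this
# way run against that numpy and everything newer.
requires = ["setuptools", "wheel", "oldest-supported-numpy"]
```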

seberg closed this as completed Nov 20, 2022