BUG: DLPack cannot export readonly buffers and raises wrong exception #20742
Comments
Could you first standardize that a `BufferError` is raised?
@seberg I have two thoughts. First, I do like it for better clarity. However, I am opposed to adding a new spec to DLPack at this point because […]
Include it for the next version? If this should raise a `BufferError` […]
This is one of the points that […]. Now, implicit […]
Well, specifying that […]. So, nailing these things down in the SPEC will go a long way to making it not painful to give my blessing – it will at the least make it clear that it isn't just ignored due to a solution being tricky. But you don't need my blessing to move this forward, so all good...
Don't you think even […]?
Yes. Now that I look back, I think we better revisit @rgommers's data-apis/array-api#191 (comment) in the linked issue. Apparently NumPy picked one of his suggestions (raise an exception) but it led to this very issue.
Sorry, I am not sure I follow here. What does it have to do with […]?
I now think that a DLPack exporter should deliver the readonly flag and let the consumer decide what to do with it, instead of shutting the door right away. Let me see if we can get this discussed a bit in the Array API meeting tomorrow.
Don't give up, Sebastian! I am sure we'll figure it out 😂
Maybe, but if you expect NumPy to raise a `BufferError` […]. In mpi4py you can rely on the […]
If NumPy thinks it is a better choice and jumps right on it, I can convince the team to follow. We picked this choice way way way before NumPy had it or Array API started, but I am sure we would like to align with NumPy if this is the ultimate decision. cc: @kmaehashi
To be clear, I did not think about whether […]. My whole point is that cupy et al. also use something that isn't a `BufferError` […]. EDIT: Or in other words: NumPy should not be the source of truth for cupy here. The spec should be the source of truth for both.
For most cases NumPy is the source of truth for us (CuPy). This case is special only because […], so yes we probably have some wiggle room here. For mpi4py, @dalcinl probably would have time to jump in later this week, but as I said in the very beginning the choice of exception type is just a short-term workaround; the best scenario is that NumPy also exports readonly arrays (see #20742 (comment), for example). Again, if NumPy (and other libraries that support the readonly flag) does this, the discussion is moot.
Interesting discussion, and yes, there's more work to do. Just a note that we've just arranged an internship to work on DLPack support for 3 months starting end of January - it's high on my wish list to work on docs, rough edges and extensions in DLPack itself. My $2c on the decision here: NumPy has zero users for this functionality yet, we released support a few days ago. So let's adjust NumPy to CuPy et al. where that makes sense, and not ask CuPy to make backwards-incompatible changes unless it means ending up with significantly improved behaviour.
All I would like is to have a clear basis/decision. Obviously NumPy can change everything easily, but if a […] Right now:

```python
torch.BoolTensor([False]).__dlpack__()
# RuntimeError: Bool type is not supported by dlpack
```

where cupy will give the better […]
To follow up: in the Array API meeting yesterday it was decided that the intern will help us tie up the loose ends -- revising the DLPack spec if needed -- so that this and several other issues can be addressed properly. Spoiler alert: we'll likely want to export a readonly tensor anyway and find a way to let the consumer decide how to deal with it. Until that happens, let's keep this issue open and not rush into any change.
Great :). I am not sure this is an intern-sized problem, but if you got a good intern who is interested in figuring this out, it hopefully can be. (I still think more of this work clearly should have been done ahead of time, and things would have been much easier and better now.) Independently though, a decision to have a `BufferError` […]
What's the cutoff date for 1.22.1?
Dunno, let's assume there will be one pretty soon (there are some bugs that need fixing, but also still some open ones, so best case 1-2 weeks I would say). But 1.22.2 is likely not far off after that anyway, so it probably mainly matters how quickly you want a fix.
Thanks, Sebastian. One more quick question: would this suggestion necessarily involve revising the DLPack spec, as you suggested earlier?
Look, I just feel that when we change it, nobody will want to change it again. So if NumPy+mpi4py agree on this, it will either standardize this effectively or give us headaches later if something else gets standardized. I do not think that standardizing anything is my decision to make, though.
Pushing this off, but if we want to change the exception, that is possible; there is still time until the release.
Irrespective of the type of exception raised, the fact that non-writeable arrays cannot be exported with `__dlpack__` […]. The dataframe interchange protocol itself is being drafted here: […] and it explicitly relies on […]
Unfortunately, NumPy cannot fix this. This requires a push from the dlpack side and I am not sure how to do that. Maybe you can also comment there? The main issue is probably dmlc/dlpack#104. I personally do think DLPack not extending its ABI is quickly becoming a critical issue for any further adoption, and it needs to be cleared up ASAP.
Minor correction: we can in principle do this in NumPy, just like JAX is happily exporting its read-only arrays. We just have decided to not do it when merging NumPy's DLPack support, because it was a sticking point and there was no consensus on doing it the same way as JAX. Fully agreed that it would be better to extend DLPack, but it's turned out to be quite hard to move forward (and not for lack of effort by @tirthasheshpatel and @mattip). So for now it looks like we've just parked the issue until there is renewed effort (or more urgent need) to push this forward. It seems like this particular issue that @ogrisel is running into will be straightforward to resolve on the Pandas side though.
@rgommers What about supporting the export if some environment variable is set? Prior practice would support my proposal. Additional mechanisms could be added, but an environment variable is a good start, and much better than nothing.
@dalcinl we can add an env variable as a stop-gap for experimentation and testing, but it is not a long-term solution IMO. So, we can provide work-arounds; we could even do the unsafe export. But on the unsafe export I would need some convincing, and input from a wider audience that they feel it is OK.
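A minimal sketch of what such an opt-in gate could look like, assuming a hypothetical variable name `NUMPY_DLPACK_EXPORT_READONLY` (NumPy defines no such variable):

```python
# Hypothetical opt-in gate for exporting readonly arrays via DLPack.
# The variable name is invented for illustration; it is not a NumPy feature.
import os

def readonly_export_allowed() -> bool:
    """Return True if the user opted in to the unsafe readonly export."""
    return os.environ.get("NUMPY_DLPACK_EXPORT_READONLY", "0") == "1"

def check_dlpack_export(writeable: bool) -> None:
    """Raise the buffer-protocol-style error unless export is permitted."""
    if not writeable and not readonly_export_allowed():
        raise BufferError("cannot export readonly array via DLPack")
```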
@seberg I'm 100% on your side regarding the need to find a proper solution. My environment variable proposal is just a way to strengthen NumPy's statement of need and provide some alternative to nothing/nichts/nada/niente/rien. With an environment variable or any other opt-in mechanism, it is ultimately the end user who decides whether to activate the feature and takes responsibility for the outcome. A little extra freedom for these poor souls cannot harm them that much 😄, right?
I'd prefer to start doing the export unconditionally over an env var, but if the env var is the one thing on the table without a long discussion now, I think I'm okay with that. One issue with env vars is that they stick around for too long. Case in point: I thought we had gotten rid of […]. Maybe we should not document the env var, so that people who come looking for it comment here with a "hey, I needed this as well"?
Contrary to […]
Not documenting the env var may be too much. NumPy could advertise it, declare it provisional, and warn that its support and/or behavior may change at any time. And then perhaps point to a polling place (a GH issue?) where people can upvote "I need this" to gain lobbying power. I believe setting the env var (and having to remember it) to get things working will be annoying enough to guarantee the votes you want. PS: Sorry, I'm a bit confused. What's the root issue here? Is it lack of agreement on how to extend DLPack? Is it lack of contributors to implement the new features? Or is it DLPack's refusal to extend and evolve the standard?
The main issue is that DLPack is a few C structs that are currently not extendable, so an ABI break is needed to make it extendable, add versioning, etc. Doing so in a way that works for both Python and C/C++ users and is not overly disruptive to the current user base is highly nontrivial. Significant work has been put into proposals, but (a) the maintainers of DLPack have quite limited bandwidth, and (b) they are reluctant to work on this now for what is at least partially "future-proofing". The need for adding alignment info is probably the most compelling argument right now; a readonly flag is nice to have but not all that compelling by itself.
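For concreteness, a rough sketch of the kind of versioned, flag-carrying struct being discussed, expressed with Python's `ctypes` for brevity; all names are hypothetical and do not match the actual DLPack headers:

```python
# Hypothetical versioned DLPack-style struct carrying a flags word.
# The real DLTensor/DLManagedTensor structs of the version under
# discussion have no version or flags fields, which is exactly the
# ABI problem described above.
import ctypes

class VersionSketch(ctypes.Structure):
    _fields_ = [("major", ctypes.c_uint32),
                ("minor", ctypes.c_uint32)]

READONLY_FLAG = 1 << 0  # hypothetical bit for the readonly property

class ManagedTensorVersionedSketch(ctypes.Structure):
    _fields_ = [
        ("version", VersionSketch),  # would allow the ABI to evolve
        ("flags", ctypes.c_uint64),  # e.g. READONLY_FLAG, alignment info
        # ... the existing tensor fields (data pointer, device, ndim,
        #     dtype, shape, strides, byte_offset) would follow here ...
    ]
```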
Sorry, I don't get this. Why do we need DLPack maintainers to spare any bandwidth (other than clicking "merge") if we can find people to contribute (like what Tirth was doing)? I am not a fan of a community infrastructure being blocked because of this. To me this seems to suggest DLPack was not the most ideal choice if it can't evolve as needed.
I think it's a bit more complicated than that. Breaking the ABI at all is a big decision, and it'd need a very carefully documented and communicated strategy. If someone wanted to do that for NumPy, with a ready-to-go PR including some docs and a version bump, I'm fairly sure we'd not hit the merge button just like that :)
I was perhaps exaggerating a bit; I didn't mean it's a simple task or anything like that. I already recognized it's community infrastructure, and naturally any change would involve reaching a consensus of the majority of community maintainers. What I meant is that if someone can lead the said effort, it should not take too much bandwidth from any DLPack maintainer. What's more important is the green checks from downstream (more is better). So I disagree that the burden is on Tianqi et al. (again, under the assumption that someone can work on this).
Pushing off the milestone; probably could also just close the issue: right now I assume DLPack will receive the update and fix this eventually. If anyone wants to look into a NumPy-side work-around, that might be a different issue. Although, we could still change it to reliably give `BufferError`. EDIT: Leaving the milestone for the possible error change only for now, do not hesitate to move it.
I would actually like MUST, but since most implementations currently don't actually give a ``BufferError``, writing it as SHOULD instead. First part of closing numpy/numpy#20742
See also data-apis/array-api#498. I think we should just change this. It is a niche feature and just an error type change. Closes numpygh-20742
* API: Specify that DLPack should use BufferError I would actually like MUST, but since most implementations currently don't actually give a ``BufferError``, writing it as SHOULD instead. First part of closing numpy/numpy#20742 * Fix duplicated word Co-authored-by: Sebastian Berg <sebastianb@nvidia.com> Co-authored-by: Athan <kgryte@gmail.com>
Describe the issue:
The root issue is that currently DLPack does not support exporting readonly arrays:

numpy/numpy/core/src/multiarray/dlpack.c, lines 122 to 126 at 7262b9c

which prevents it from being useful in situations where downstream consumers only need to read the data or wrap it in the same readonly fashion (e.g. JAX, MPI). IMHO NumPy should not make this abrupt decision for consumers, but instead just raise a warning.
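The referenced lines are C; a rough Python rendering of the guard's observable behavior (not the actual source) would be:

```python
# Python-level sketch of the guard in dlpack.c (illustrative only; the
# real implementation is C and lives at the lines cited above).
def dlpack_export(arr):
    if not arr.flags.writeable:
        # The branch at issue: it raises TypeError today, and this report
        # asks for BufferError (or for the export to proceed with a
        # readonly indication). Message paraphrases the NumPy 1.22.0 error.
        raise TypeError(
            "NumPy currently only supports dlpack for writeable arrays"
        )
    # ... build and return the DLPack capsule ...
```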
But, as a short-term solution before we converge, it's more important that a `BufferError`, instead of a `TypeError`, is raised, in order to be aligned with the Python buffer protocol. In mpi4py we do not anticipate `TypeError`, and it causes this test to fail in our CI, which is a regression: https://github.com/mpi4py/mpi4py/blob/02a6f31f6cc4a5779c6414b993a714caa2bf93c6/test/test_msgspec.py#L399-L404
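To make the mismatch concrete, a sketch of consumer-side code written against the buffer-protocol convention, where `BufferError` means "cannot export":

```python
# Sketch: a consumer that treats BufferError as "this object cannot be
# exported" -- the convention of the Python buffer protocol. NumPy 1.22.0
# raises TypeError instead, bypassing the expected handler.
import numpy as np

arr = np.arange(3)
arr.setflags(write=False)  # make the array readonly

try:
    capsule = arr.__dlpack__()
except BufferError:
    capsule = None  # expected path for "not exportable"
except TypeError as exc:
    # what NumPy 1.22.0 actually raises, surprising buffer-protocol users
    print("unexpected exception type:", exc)
```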
cc: @dalcinl @mattip
Reproduce the code example:
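A minimal reproduction, assuming the NumPy 1.22.0 behavior described above:

```python
# Minimal reproduction (NumPy 1.22.0): exporting a readonly array via
# DLPack fails with TypeError rather than BufferError.
import numpy as np

arr = np.arange(5)
arr.setflags(write=False)  # make the array readonly
arr.__dlpack__()           # raises TypeError under NumPy 1.22.0
```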
Error message:
https://github.com/mpi4py/mpi4py-testing/runs/4676901782?check_suite_focus=true
NumPy/Python version information:
NumPy 1.22.0