Question about root project? · Issue #583 · data-apis/array-api · GitHub

Question about root project? #583

Closed
NeilGirdhar opened this issue Dec 24, 2022 · 4 comments
Labels
Question General question.

Comments

@NeilGirdhar
NeilGirdhar commented Dec 24, 2022

Sorry if this is the wrong place to ask, but I wanted to understand a few things about the Array API project. Given an array that implements the Array API, how does one get the Array API module object for that array? Is there a "root Array API project" that provides a function to do that?

Similarly, where are constants and sentinels like newaxis stored? Does each Array API implementer have a copy of such sentinels? Or is there a root project that exposes them?

Finally, is there a place where the type annotations for the Array API are stored? How does one annotate a library without depending on any particular implementation of the Array API?

@rgommers rgommers added the Question General question. label Dec 24, 2022
@rgommers
Member

That is a good question. The basic answer is x.__array_namespace__(), see https://data-apis.org/array-api/2022.12/purpose_and_scope.html#how-to-adopt-this-api.
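A minimal sketch of that lookup, using only the standard library so it runs without any array package installed. `ToyArray`, `toy_namespace`, `get_namespace`, and `total` are hypothetical names made up for illustration; only the `__array_namespace__` method and its `api_version` keyword come from the standard.

```python
from types import SimpleNamespace


class ToyArray:
    """Hypothetical stand-in for an array type from a conforming library."""

    def __init__(self, data):
        self.data = list(data)

    def __array_namespace__(self, /, *, api_version=None):
        # A real library returns its own module-like namespace here.
        return toy_namespace


def _toy_sum(x):
    """Toy reduction so the namespace has at least one function."""
    return sum(x.data)


# Hypothetical namespace object; a real one exposes the full standard.
toy_namespace = SimpleNamespace(sum=_toy_sum)


def get_namespace(x):
    """Return the namespace that `x` reports for itself."""
    method = getattr(x, "__array_namespace__", None)
    if method is None:
        raise TypeError(
            f"{type(x).__name__} does not implement __array_namespace__"
        )
    return method()


def total(x):
    """Library-agnostic code: look up `xp` from the array, then call it."""
    xp = get_namespace(x)
    return xp.sum(x)
```

With this pattern, consuming code never imports a specific array library; it asks the array it was handed. For example, `total(ToyArray([1, 2, 3]))` goes through `toy_namespace.sum`.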

What was found in real-world adoption, e.g. in scikit-learn, is that there's a need for a compatibility layer which is "all functions in an existing library, made array API standard compliant where there is overlap". There is now a package for that: https://github.com/data-apis/array-api-compat. Currently that has NumPy and CuPy support. There is a concrete plan to add PyTorch to that soon, and there's room for other libraries too.

> Similarly, where are constants and sentinels like newaxis stored? Does each Array API implementer have a copy of such sentinels? Or is there a root project that exposes them?

Yes, each implementer should have a copy of all objects in the standard. There is no root project, because existing libraries have very little appetite to add any runtime dependencies, even pure Python ones (and understandably so). See also https://data-apis.org/array-api/2022.12/assumptions.html#dependencies, which explains the premise that different array libraries are not aware of each other.

> Finally, is there a place where the type annotations for the Array API are stored? How does one annotate a library without depending on any particular implementation of the Array API?

We do need that common place for a few typing protocols at least, but it doesn't exist yet. Static typing hasn't been a priority yet (other than making sure that every function/object is cleanly type-able without the need for lots of unions), but we should sort that out.
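Until such a common place exists, one possible stopgap is a locally defined structural type. The sketch below is an assumption about how that could look, not something the standard ships: `SupportsArrayNamespace`, `namespace_of`, and `ToyArray` are hypothetical names, and the namespace is typed as `Any` because no shared namespace type is available.

```python
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class SupportsArrayNamespace(Protocol):
    """Hypothetical protocol: any object exposing `__array_namespace__`."""

    def __array_namespace__(
        self, /, *, api_version: Optional[str] = None
    ) -> Any:
        ...


def namespace_of(x: SupportsArrayNamespace) -> Any:
    """Annotated against the protocol, not against any one library."""
    return x.__array_namespace__()


class ToyArray:
    """Hypothetical array type; it never imports or subclasses the protocol."""

    def __array_namespace__(self, /, *, api_version=None):
        return "toy namespace"
```

Because the protocol is structural, `ToyArray` satisfies it without depending on the module that defines it, which matches the premise that array libraries are not aware of each other. Note that `runtime_checkable` only checks that the method exists, not its signature.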

@NeilGirdhar
Author

Thanks for the detailed explanation. This is a very exciting project! All your points make perfect sense.

I've always wished for NumPy to have a more compact API (e.g., numpy/numpy#8864), and it's coming sooner than I'd imagined! What really got me excited about this is:

In [1]: import numpy.array_api as xp
In [2]: z = xp.zeros(10)
In [3]: z.
           device    ndim      T
           dtype     shape     to_device
           mT        size

compared with NumPy:


In [11]: y.
 all          astype       conj         data         fill         item         nbytes       ptp          resize       size         swapaxes     tostring
 any          base         conjugate    diagonal     flags        itemset      ndim         put          round        sort         T            trace
 argmax       byteswap     copy         dot          flat         itemsize     newbyteorder ravel        searchsorted squeeze      take         transpose
 argmin       choose       ctypes       dtype        flatten      max          nonzero      real         setfield     std          tobytes      var
 argpartition clip         cumprod      dump         getfield     mean         partition    repeat       setflags     strides      tofile       view
 argsort      compress     cumsum       dumps        imag         min          prod         reshape      shape        sum          tolist

It just feels so much less cluttered. I will probably use the Array API in my NumPy/JAX code even when I don't need to.

@rgommers
Member

That's great to hear @NeilGirdhar, thanks! And I completely agree on API compactness - I hope we can clean up NumPy to at least be less messy than it is now. The split between functions and methods is pretty arbitrary indeed.

For one other little bit of context: we've figured out by now that we need (at least) one implementation of the array API standard that is minimal - everything that's in the standard, and nothing else. That is numpy.array_api right now. Every library should then be a superset of that - we expect that for various reasons, authors of every library want/need their own bells and whistles. As long as there's that minimal implementation that you can use to develop and test against, you can then be pretty confident that your code will be portable to other libraries.

@rgommers
Member

> I will probably use the Array API in my NumPy/JAX code even when I don't need to.

If you run into any issues, please don't hesitate to ping me.
