NEP: Add zero-rank arrays historical info NEP by mattip · Pull Request #12166 · numpy/numpy · GitHub

NEP: Add zero-rank arrays historical info NEP #12166


Merged
3 commits merged into numpy:master on Oct 16, 2018

Conversation

mattip
Member
@mattip mattip commented Oct 14, 2018

Fixes part of #12164. I reworked the formatting of the document, added links to mailing list discussions where I could and removed references to changesets implementing [...] and [()] indexing that I could not find. The original wiki document refers to "Multidimensional Arrays for Python" by Travis Oliphant, draft 02-Feb-2005. Did this later become the Guide to NumPy? Some of the email discussions on sourceforge refer to a PEP that apparently is not PEP 209 since the text does not match.

@eric-wieser
Member

changesets ... I could not find

Here's 1864: 9024ff0

Here's 1866: 743d922

Here's 1871: b32744e

:Author: Alexander Belopolsky (sasha), transcribed Matt Picus <matti.picus@gmail.com>
:Status: Draft
:Type: Informational
:Created: 2018-10-14
Member

I wonder if we want to backdate this

Member

Yeah, let's put the original date here -- maybe noting the date it was transcribed, too?

Member Author

ok

~~~~~~~~~~~~~~~~~~~~~~~~

Sasha started a `Jan 2006 discussion`_ on scipy-dev
with the folowing proposal:
Member

Typo

Member Author

fixed


.. _`2006 wiki entry`: https://web.archive.org/web/20100503065506/http://projects.scipy.org:80/numpy/wiki/ZeroRankArray
.. _`history`: https://web.archive.org/web/20100503065506/http://projects.scipy.org:80/numpy/wiki/ZeroRankArray?action=history
.. _`2005 mailing list thread`: https://sourceforge.net/p/numpy/mailman/message/11299166
Member
@eric-wieser eric-wieser Oct 14, 2018

Presumably this must have a pipermail URL too?

Member Author

Strangely enough I could not find it. Seems to be lost in the transition?

@eric-wieser
Member
eric-wieser commented Oct 14, 2018

Nice archaeology recovering most of the other links!

@ahaldane
Member

Wow, very interesting. Thanks for the revival!

This helps a lot for thinking about the arrayprint code we were working on in the last year, since the 0d vs scalar distinction is very important there.

@mhvk
Contributor
mhvk commented Oct 14, 2018

Interesting, though I must admit the text leaves me still puzzled about why 0-d arrays cannot be used also in place of scalars, i.e., why we need the scalar types at all.

@eric-wieser
Member
eric-wieser commented Oct 14, 2018

@mhvk: Agreed. I think the argument about isinstance(x, float) working might touch on that, as clearly we can't make float a base class of ndarray - but we already broke isinstance(np.int_(...), int) with python 3, and I haven't seen anybody notice. The claim about _PyEval_SliceIndex is almost certainly outdated now too, and I would hope it uses __index__.
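Both claims are easy to check on a current NumPy under Python 3 (a quick sketch; the concrete scalar type behind np.int_ is platform-dependent):

```python
import operator

import numpy as np

# The default NumPy integer scalar no longer subclasses the builtin int
i = np.int_(3)
print(isinstance(i, int))            # False on Python 3

# ...but __index__ still lets NumPy integers act as indices everywhere
print(operator.index(np.int64(3)))   # 3
print(['a', 'b', 'c'][np.int64(1)])  # 'b'
```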

As an aside - the way we make the scalars be subclasses of python scalars is forbidden by the CPython docs (#11998), so if we can, I'd prefer to drop these base classes.

@mhvk
Contributor
mhvk commented Oct 14, 2018

The email https://web.archive.org/web/20100501162447/http://aspn.activestate.com:80/ASPN/Mail/Message/numpy-discussion/3028210 linked in #12164 (comment), which is all about indexing vs projection, did make it clearer why one logically could have two types. Am still not sure why we couldn't get rid of numpy scalars, though... Is there more than being immutable? [Edited to correct links]

@ahaldane
Member

That email link doesn't work for me.

Not sure how important it is, but another case that makes eliminating scalars difficult is object arrays. If the user does a = np.array([MyObj()], dtype='O'), and then a[0], if we eliminate scalars this should return a 0d array containing the object, right? This would break most code using object arrays.
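The object-array behaviour in question can be sketched concretely (MyObj is a hypothetical placeholder class):

```python
import numpy as np

class MyObj:
    pass

# A 1-d object array holding a single MyObj instance
a = np.array([MyObj()], dtype='O')

# Today, a[0] hands back the stored Python object itself,
# not a 0-d array wrapping it
print(type(a[0]) is MyObj)  # True
```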

@mhvk
Contributor
mhvk commented Oct 14, 2018

@ahaldane - yes, object arrays do stand out again. I guess one would either need to special-case object arrays (not crazy; suddenly the ndarray is really just a nested list...), or semi-expose the object, just like the 0-d arrays have methods that allow them to behave like float/int/complex (this would seem trickier).

Anyway, not specifically arguing that it should change, just noting that the text to me doesn't provide a particularly strong rationale for having both - the e-mail (now correctly linked) was clearer.

@seberg
Member
seberg commented Oct 14, 2018

Hehe, derailing discussion, I like it ;) (sorry, needed to distract myself for a few minutes). I do believe that immutability and in-place behaviour are good enough reasons! I also believe that "there should be only one obvious way" is a red herring.

Of course a future numpy could remove all scalars, but I frankly believe that scalars have a lot of uses. Often they should even have different semantics ("string" * scalar is far more reasonable than "string" * 0dcontainer). Now, I suppose you could argue that if you need scalars, just make it a python scalar, which is a point, but frankly there are also types such as datetime, so yes, for the basic types it works, but if I have a type with a unit, I would like to be able to get a scalar version of it.

Now why I think it is a red herring: I do think that most of the time if you want something 0D, a scalar is what you want. Assuming you do not argue against the fact that scalars are typically a bit nicer. Having 0D arrays as first-class citizens is not an issue, for the simple reason that they don't randomly appear. So there is one obvious way. Most of the time it is the scalar, and sometimes, when you need for example mutability, the array will be the obvious solution and a blessing.

One thing I disagree is that scalars need indexing. I believe the only reason they do need it is because 0D arrays are not first class, and I challenge anyone to give me an example where fuzzing out the distinction is helpful ;).

On the other hand, maybe I just like consistency too much :).


Indexing of Zero-Rank Arrays
----------------------------

Member

Can we add a note that all these indexing operations have been implemented?

In [2]: x = np.array(1)

In [3]: x[...]
Out[3]: array(1)

In [4]: x[np.newaxis]
Out[4]: array([1])
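For reference, the third indexing operation discussed in the NEP, x[()], is also implemented and returns a scalar rather than an array (behaviour on a current NumPy; the concrete scalar type is the platform default integer):

```python
import numpy as np

x = np.array(1)                # a zero-rank array
print(type(x[...]))            # ndarray -- ellipsis keeps a 0-d array
print(np.isscalar(x[()]))      # True -- empty-tuple indexing yields a scalar
print(x[np.newaxis].shape)     # (1,)
```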

Member

I'm not sure if we want to modify this to reflect the current state of things - I think PEPs tend to reflect the time of their writing, and not how things ended up being implemented.

Member

Maybe just change the date and then add a note at the top briefly explaining the state of things? It's just a little weird to see a NEP dated today that explains outdated behavior.

Member Author

added a note to the abstract

:Author: Alexander Belopolsky (sasha), transcribed Matt Picus <matti.picus@gmail.com>
:Status: Draft
:Type: Informational
:Created: 2018-10-14
Member

Yeah, let's put the original date here -- maybe noting the date it was transcribed, too?

@@ -0,0 +1,238 @@
=========================
NEP 16 — Zero Rank Arrays
Member

let's save NEP 16 for #10706

Member Author

ok, so this becomes NEP 27.

Member
@eric-wieser eric-wieser left a comment

Looks good, thanks!

See SVN changeset 1864 (which became git changeset `9024ff0`_) for
implementation of ``x[...]`` and ``x[()]`` returning numpy scalars.

See SVN changeset 1866 (which became git changeset `743d922`_) for
Member

A nit: commit is a more natural word for git - I think changeset is a trac / SVN term

Member Author

fixed

NEP 27 — Zero Rank Arrays
=========================

:Author: Alexander Belopolsky (sasha), transcribed Matt Picus <matti.picus@gmail.com>

@mhvk
Contributor
mhvk commented Oct 15, 2018

@seberg - thanks for the insight! I was convinced, at least for a moment, until I remembered again the great pain any subclass has to go through in ensuring that it can also provide scalars (the masked item in MaskedArray being a particularly unpleasant example). For Quantity, we end up returning 0-d arrays any time indexing would produce a scalar. I.e., the problem is that scalars share properties beyond indexing with the array (dtype, unit), and it is unpleasant to have to have a separate class.

represent scalar quantities in all case. Pros and cons of converting rank-0
arrays to scalars were summarized as follows:

- Pros:
Member

I wouldn't really suggest changing this doc, but I think this section in particular is out of date.

The main argument I've heard for scalar types is speed -- they are significantly faster than working with 0-d arrays.

The first and third "Pros" here are no longer true. With Python 3, Python uses operator.index() for coercing integers and NumPy scalars can't be relied upon to subclass Python types (e.g., isinstance(np.int64(1), int) -> False).

Member

I find it unlikely that scalars are faster, given that most of our operations start by casting them to 0d arrays. Perhaps the arithmetic has a fast path.

Member Author

so is the conclusion that we should get rid of scalars once we move to python 3?

Member

I don't agree with that conclusion - I also think that python 2.7 already uses __index__, so if there's a cut-off line here we've already crossed it.

One thing I would like to see is a merge of the scalar types and dtypes - so that isinstance(np.dtype, type) is true, and isinstance(np.float64, np.dtype) is also true. But that's blocked by #11998 right now.

Contributor

Scalars absolutely have a fast path. It is in numpy/numpy/core/src/umath/scalarmath.c.src among other places.

Contributor

@eric-wieser - Scalars are indeed fast-tracked for arithmetic (scalarmath.c.src):

In [3]: a = np.array(1.)

In [4]: %timeit a * a
1000000 loops, best of 3: 541 ns per loop

In [5]: a = np.float64(1.)

In [6]: %timeit a * a
10000000 loops, best of 3: 91.6 ns per loop

@eric-wieser
Member

Looks good to merge to me now

@shoyer
Copy link
Member
shoyer commented Oct 15, 2018 via email

@shoyer
Copy link
Member
shoyer commented Oct 15, 2018 via email

@shoyer
Copy link
Member
shoyer commented Oct 15, 2018 via email

@eric-wieser
Member

I don't think we need to get rid of scalars for that - we just need to stop producing them for anything but indexing

@seberg
Member
seberg commented Oct 15, 2018

Indeed, I am convinced that most/all of the annoyances about scalars have nothing to do with scalars, but just with the way that PyArray_Return magically creates them for no reason (or we convert scalars to arrays for dubious reasons), plus the fact that units, etc. use array-likes even though they should be dtypes.
Of course I admit there might be one problem: It may not always be quite trivial to decide whether np.add(a, b) has scalar input (not output).

@shoyer
Member
shoyer commented Oct 15, 2018

I don't think we need to get rid of scalars for that - we just need to stop producing them for anything but indexing

There are other operations that legitimately convert arrays into scalars, such as reductions. I suspect users would be confused if indexing and reductions (e.g., sum) gave results of different types.

In practice, I suspect the main objection to using 0d arrays extensively would be that users start seeing array(1) when they expect to see 1. Right now, users can mostly (but not always) get away with not understanding the difference between NumPy and builtin scalars.

@seberg
Member
seberg commented Oct 15, 2018

@shoyer I believe reductions are basically a null argument. Because arr.sum(axis=None) obviously should create a scalar, while it is odd for arr.sum(axis=(1, 2)) to create a scalar even if the result is 0D, which is why I liked that mail so much, it made this argument ;).
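The distinction drawn here can be seen directly on a current NumPy (a quick check, not part of the original comment):

```python
import numpy as np

arr = np.ones((2, 3, 4))

# Full reduction: the result comes back as a scalar
print(type(arr.sum(axis=None)))    # np.float64

# Tuple of axes: the result is always an array, even a small one
print(arr.sum(axis=(1, 2)).shape)  # (2,)
```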

@charris
Member
charris commented Oct 15, 2018

In practice, I suspect the main objection to using 0d arrays extensively would be that users start seeing array(1) when they expect to see 1.

I think the main problem is speed, in particular, ufunc call overhead. That is why scalarmath.c.src exists. Now if one could arrange to call those functions for 0-D arrays, or speed up the loops and casting...

@ahaldane
Member

I suspect the main objection to using 0d arrays extensively would be that users start seeing array(1) when they expect to see 1.

Of course we could change this, and make 0d arrays print like scalars. The arrayprint code already special-cases 0d arrays and prints them using the scalar-print path (different from the array-print code-path) which uses higher precision:

>>> str(np.array(np.pi))
'3.141592653589793'
>>> str(np.array([np.pi]))
'[3.14159265]'

@shoyer
Member
shoyer commented Oct 15, 2018

Of course we could change this, and make 0d arrays print like scalars.

Yep, this would be a reasonable compromise. If we did this and kept around scalars as array constructors (e.g., np.float64(1) becomes equivalent to np.array(1.0)), I suspect very few users would even notice.

I think the main problem is speed, in particular, ufunc call overhead. That is why scalarmath.c.src exists. Now if one could arrange to call those functions for 0-D arrays, or speed up the loops and casting...

This has always felt like a strange optimization to me. If you want maximum speed with pure Python code, you are better off using Python's built-in scalars (e.g., 2x faster for multiplication and 10x (!) faster for math.sin() vs np.sin() in my micro-benchmarks).

This doesn't leave a very big niche for NumPy scalars -- only cases where you want the exact dtype semantics of numpy or where you want to use one of the rare ufuncs without an equivalent in the standard library's math or cmath modules.
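A rough way to reproduce this kind of micro-benchmark (timings will vary by machine; the 2x/10x figures above are from shoyer's runs, not this sketch):

```python
import math
import timeit

import numpy as np

x = 0.5
# Time the builtin math.sin against the NumPy ufunc on a Python float
t_math = timeit.timeit(lambda: math.sin(x), number=100_000)
t_np = timeit.timeit(lambda: np.sin(x), number=100_000)
print(f"math.sin: {t_math:.4f}s  np.sin: {t_np:.4f}s")
```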

@seberg
Member
seberg commented Oct 15, 2018

I will just note again that I am -1 on even hoping or planning to get rid of scalars (except from an implementation point of view), I have serious trouble seeing the point. I think scalars are the "expected" thing most of the time (mutability, hashability). 0-D arrays should not be promoted more, rather they should exist as a rarely used niche that most users will not run into because they don't need it and they won't typically create them accidentally. One that is still useful for those who happen to need it.
If I do not have scalars, I cannot put 1meter or +1day into a dictionary, that seems not an unreasonable thing to want to do. Or wanting a float128 with more precision than python floats but still have "normal" semantics.

EDIT: Of course for many array "dtypes" the associated scalar could in a sense be a python integer or float, but not sure that helps much. Also, of course it is not like I am sure, but I feel the arguments for no scalars are not quite along the line of what would be the best end-point but more on what the current problems appear to be.

@charris
Member
charris commented Oct 15, 2018

This has always felt like a strange optimization to me.

IIRC, @rkern did the original implementation on account of complaints about speed.

I cannot put 1meter or +1day into a dictionary,

I believe that @teoliphant suggested adding a dictionary to ndarray at some point.

@eric-wieser eric-wieser merged commit a5e10f8 into numpy:master Oct 16, 2018
@eric-wieser
Member

Putting this in, since there seem to be no further comments on the state of the NEP contents.

@abalkin
Contributor
abalkin commented Oct 17, 2018

@eric-wieser - thanks for mentioning me in this thread. I agree with @seberg that scalars are necessary because, due to hashability, they can be used in places where 0-dim arrays cannot. This is covered in the NEP.

One way to reduce the number of types that do the same thing slightly differently would be to try to sneak in numpy scalars to the python core library under the guise of ctypes scalars. The ctypes module is showing its age and I think could use the expertise of the numpy community.

After rereading the NEP, I don't have any corrections other than maybe replacing "Sasha" with my full name in a few places. :-)

@mattip mattip deleted the nep-16 branch October 17, 2018 17:36
@mattip
Member Author
mattip commented Oct 17, 2018

Thanks @abalkin for creating this document in the first place. Since this is still a draft, I will issue a new PR to make this accepted and do s/Sasha/Alexander/

Contributor
@hameerabbasi hameerabbasi left a comment

One change that's outdated.

array(20)

Indexing of Zero-Rank Arrays
----------------------------
Contributor

I would suggest that this section be removed entirely or updated. For example, if x is either an array scalar or a rank zero array, x[...] is guaranteed to be an array and x[()] is guaranteed to be a scalar. The difference is because x[{anything here}, ...] is guaranteed to be an array. In words, if the last index is an ellipsis, the result of indexing is guaranteed to be an array.

I came across this weird behaviour when implementing the equivalent of np.where for PyData/Sparse.
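The guarantee described above is easy to verify on a current NumPy (a quick check, not part of the original comment):

```python
import numpy as np

y = np.array(2.0)          # rank-zero array
print(type(y[...]))        # ndarray -- trailing ellipsis keeps an array
print(type(y[()]))         # np.float64 -- empty tuple gives a scalar

z = np.arange(6).reshape(2, 3)
print(type(z[1, 2]))       # scalar (np.int64 on most platforms)
print(z[1, 2, ...].shape)  # () -- same element, but the ellipsis forces a 0-d array
```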

Member

I don't think we should remove sections from past documents because they no longer apply. NEPs document the state of things when they were written, not the state of master. Perhaps a ``.. note::`` could go here

@eric-wieser
Member

make this accepted and do s/Sasha/Alexander/

Worth keeping at least one mention of sasha in order to explain that sasha in the trac history is Alexander

@shoyer
Member
shoyer commented Oct 18, 2018

I will just note again that I am -1 on even hoping or planning to get rid of scalars (except from an implementation point of view), I have serious trouble seeing the point. I think scalars are the "expected" thing most of the time (mutability, hashability).

You're probably right from a backward compatibility point of view, but I don't agree here. It's not a huge burden to need to convert from arrays to built-in scalars, and users would quickly learn they need to call .item() before using 0d arrays as dictionary keys.

The way that NumPy intentionally conflates NumPy/builtin scalars (by giving them the same repr) is a recurrent source of confusion that in my experience has led to lots of bugs.
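The dictionary-key point above, concretely: 0-d arrays are unhashable, and .item() is the escape hatch (a quick sketch, not part of the original comment):

```python
import numpy as np

key = np.array(1)          # 0-d array, mutable, hence unhashable
try:
    d = {key: 'value'}
except TypeError as exc:
    print(exc)             # unhashable type: 'numpy.ndarray'

# Convert to a builtin scalar first, as users would have to learn to do
d = {key.item(): 'value'}
print(d[1])                # 'value'
```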

@seberg
Member
seberg commented Oct 18, 2018

@shoyer, yes that is the other option that may be clean. The question is whether indexing should always return an array again, which breaks 1d compatibility with lists and will be confusing as well.
Then, you say "convert to built-in scalars", well, there are no built-in float128, etc., and I fully agree that "array scalars" are a bad idea. NumPy scalars should not have sum or indexing, etc.! And since they should not have sum or indexing, a float64 array could just spit out python floats probably (ints are a bit more complicated).

Object is maybe a bit tricky, but the only real issue I see is that it might be hard to know if something is a scalar or not right now.

Btw. I disagree a bit that there might be no movement possible here. If – I guess a big if – we make progress on new dtype support we have to get it right and I do believe we will have some wiggle room. We could create a pyfloat type that is basically a float64, but returns python floats for arr.sum() or maybe arr[tuple_of_ints]. Or maybe not, since np.add(float, float) will coerce it away.... Oh well...

@hameerabbasi
Contributor

If I may, from the perspective of writing code that works with any number of dimensions, it's very nice to have 0D arrays. Many a time I've had issues with the fact that something returns one result if it's 0-D and another when it's of a higher dimension.

I have no issues with scalars so long as they behave the same as arrays in any and all ways, that is, x[...] and x.sum() should be well-defined.

@mhvk
Contributor
mhvk commented Oct 18, 2018

@hameerabbasi - and you probably would like them to be mutable too... it sounds like you want "0-D arrays", not scalars at all, like @shoyer and me. It may be that the three of us are coming from a perspective where it is really useful for arraymimic[index] to always return an arraymimic instance. But it seems clear that there are also good arguments for the contrary, where the array is seen more like a container of entities and you want to get the actual entity if you go down to that level (dtype=object being the most obvious example).

Anyway, not obvious what the path forward is here...

@seberg
Member
seberg commented Oct 18, 2018

@mhvk right, personally I think things like units are not an argument, because there is no reason I should not also have a 1meter scalar. In fact, that is more obvious, since it would be a dtype instance in a way, and the array would just hold many objects with the same dtype.

@hameerabbasi, me defending scalars has nothing to do with that issue. You can easily get consistency so that 0D behaves exactly like ND. For example ND.sum() is a scalar, but ND.sum(1) is not.
I admit there are some rough edges, like 0D.sum(None) would be scalar while 0D.sum(()) would be 0D, similar to how 0D[...] is 0D while 0D[()] is scalar. But, given that rule, 0Ds would be fully consistent. I think what is annoying people is not the scalars, but that 0D arrays do not generalize correctly, like when you used to have to do arr[np.asarray(arr > 0)] to do boolean indexing.

@hameerabbasi
Contributor

me defending scalars has nothing to do with that issue. You can easily get consistency so that 0D behaves exactly like ND.

Consistency is mostly what I care about. If we go to Python scalars we can't get this consistency.

I admit there are some rough edges, like 0D.sum(None) would be scalar while 0D.sum(()) would be 0D, similar to 0d[...] is 0d while 0D[()] is scalar. But, given that rule 0Ds would be fully consistent, I think what is annoying people is not the scalars, but that 0D arrays do not generalize correctly, like when you used to have to do arr[np.asarray(arr > 0)] to do boolean indexing.

Well, there are things you can't do, I agree. But, this thing isn't one of them. arr[np.asarray(arr > 0)] is guaranteed to be 1D, so in this case, it would either be array([arr]) or array([]), but in any case, still 1D (I haven't tested, just saying what I think should happen).

@seberg
Member
seberg commented Oct 19, 2018

@hameerabbasi sorry, tricky example. That one works now, but it used to be that if arr was 0-D, it did not work, because arr > 0 returns a scalar (for the time being). EDIT: arr[arr>0] would not work.

I agree that Py scalars will probably be inconsistent math-wise, etc. But I really don't see the point of having container methods on our scalars. If anyone has an example that is not just created by numpy converting a 0D array silently, I would be interested!
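The historical wrinkle seberg mentions can be checked on a current NumPy, where the 0-D comparison result is indeed a bool scalar, but boolean indexing with it now works (a sketch, not part of the original comment):

```python
import numpy as np

arr = np.array(1)
mask = arr > 0
print(type(mask))       # np.bool_ -- a scalar, not a 0-d array
print(arr[mask])        # [1]
print(arr[mask].shape)  # (1,) -- 0-d boolean indexing yields a 1-D result
```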

@teoliphant
Member
teoliphant commented Oct 27, 2018

For some historical perspective. 0-d arrays were not well accepted by Numeric and NumPy inherited this early on. Over time, 0-d arrays gained favor to the point where today it seems odd that we don't fully embrace them.

Also, this is an example of "user-APIs" vs "developer APIs". 0-d arrays are perfectly fine and desired for "developers" but end up creating "user-issues" that you have to carefully squash (printing, use in indexing, immutability for use as keys, ...). A developer can always get an actual scalar using .item() or [()], but a data-scientist user appreciates the convenience, messy as it is.
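The two escape hatches mentioned here, sketched on a current NumPy:

```python
import numpy as np

a = np.array(3.5)         # a zero-rank array
print(type(a.item()))     # builtin float -- .item() leaves NumPy entirely
print(type(a[()]))        # np.float64 -- [()] gives the array scalar
```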

At the same time, dtypes have several challenges one of which is that they are a Python 1.x-style type concept where every type is an instance of a single Python type. Instead dtypes should be Python types specifically.

Array scalars exist and have the same API as arrays because of these two architectural problems.

A NumPy 2.x should definitely remove array scalars. This can be done by embracing 0-d arrays and also building dtypes as actual types.

@eric-wieser
Member
eric-wieser commented Oct 27, 2018

and also building dtypes as actual types.

+1 on this. I'm not sure we necessarily need a numpy 2.0 for this change - if isinstance(np.int16, np.dtype) suddenly starts returning True, I doubt anyone will notice. The main blocker there right now is also the cause of #11998

@teoliphant
Member

Yes, we definitely need NumPy 2 for this. To do this right, it will require a breaking change that will require re-compilation of extensions and some deprecation of APIs. There are implications on the C-structure level that will change the ABI. There are implications on the API level that will be too much work to try and figure out how to force them into a 1.x series --- if someone wants to back-port some of the changes to 1.x that could be done after the 2.x release.

I don't see how what I'm thinking about is related to #11998 which is an example of Python back-tracking --- it used to support multiple-inheritance on the C-level.

@eric-wieser
Member

I don't see how what I'm thinking about is related to

If we want dtypes to be types, then we presumably want to end up with isinstance(np.float64, np.dtype). That means we need np.generic to have a metaclass (eventually np.dtype), which our multiple-inheritance currently prevents - CPython rejects attempts to add even a no-op metaclass, as it detects that our multiple inheritance of float and np.floating is incompatible.

@eric-wieser
Member

There are implications on the C-structure level that will change the ABI.

You're almost certainly right there - it seems pretty likely that PyArray_Descr and PyType_Type have incompatible layouts.

@mattip
Member Author
mattip commented Oct 27, 2018

Note this PR has been merged.

We (the BIDS team) have been giving the dtype overhaul some thought. We have a working document with a proposal to make dtypes PyTypeObjects (rather than PyObjects as today). Comments, ideas, or counter proposals are welcome; the document is open to changes, please mark your changes with your name. We will be prototyping this and then the whole idea should end up as a NEP.

Another venue for discussion is our weekly status sessions at noon Pacific time on Wednesdays, as published on the mailing list.

@shoyer
Member
shoyer commented Oct 27, 2018

The NEP is listed with "Status: Draft", which means it appears under "Open NEPs" in the NEP index.

Perhaps we should switch the status to either "Deferred" or "Final"? I think I would be fine with either -- obviously if we want to change the behavior of scalars/0-rank arrays in NumPy today we would need a new proposal. (FWIW, I agree mostly with @teoliphant.)

@mattip
Member Author
mattip commented Oct 27, 2018

I'm not sure what the process is for informational NEPs. Do they need the 1-week notice to the mailing list?

@shoyer
Member
shoyer commented Oct 27, 2018 via email
