WIP,NEP: Create draft of DTypes NEP #14422
Very happy to see a NEP that tackles this topic!
It is too much to take in fully in a short time, but the main questions I have:

- Does the proposal consider fixing the casting behaviour for NumPy's default integer types around the edges (i.e. `np.iinfo('uint64').max` and `np.iinfo('int64').min`)? Cf. a whole set of issues.
- For that issue specifically, it would be amazing if the NEP allowed a way for users to choose their desired behaviour on crossing the respective int-boundary: either cast to an object array of Python ints (for those that really need integers), or cast to float (for those who need speed more than accuracy).
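For context on the boundary behaviour in question: current NumPy has no common integer dtype for mixed `uint64`/`int64` operations and promotes to `float64`, losing precision exactly at these edges. A quick sketch (checked against current NumPy behaviour, not code from the NEP):

```python
import numpy as np

# The integer boundaries in question:
assert np.iinfo(np.uint64).max == 2**64 - 1
assert np.iinfo(np.int64).min == -2**63

# uint64 and int64 share no common integer dtype, so NumPy promotes
# to float64, silently losing precision near the 64-bit boundary.
result = np.uint64(2**63) + np.int64(1)
print(result.dtype)              # float64
print(int(result) == 2**63 + 1)  # False: the sum was rounded
```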
PS. I'm sure you have the following issue on your radar, but I thought I'd xref for completeness: #2899
For non-flexible DTypes, the second step is trivial, since they have a canonical implementation (if there is only a single instance, that one should typically be used for backward compatibility). For flexible DTypes a second pass is needed; this is either an ``adjust_dtypes`` step within UFuncs, or ``__discover_descr_from_pyobject__`` when coercing within ``np.array``. For the latter, this generally means a second pass is necessary for flexible dtypes (although it may be possible to optimize that for common cases). In this case the ``__common_instance__`` method has to be used as well.

There is currently an open question whether ``adjust_dtypes`` may require the values in some cases. This is currently *not* strictly necessary (with the exception that ``objarr.astype("S")`` will use coercion rather than casting logic, a special case that needs to remain). It could be allowed by giving ``adjust_dtypes`` the input array in certain cases. For the moment it seems preferable to avoid this; if such a discovery step is required, it will require a helper function::
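The two-pass discovery described in the quoted passage is visible in today's string coercion: each element contributes a descriptor, and the result must be a common instance large enough for all of them. A minimal illustration (not code from the NEP):

```python
import numpy as np

# Each string contributes a "<U{n}" descriptor during coercion; the
# final dtype is the common instance able to hold the longest one.
arr = np.array(["a", "abcde"])
print(arr.dtype)  # <U5
```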
There's a good chance I may not have understood everything, but I believe the issues around `np.uint64` will need to use the values (i.e. greater than `np.iinfo('uint64').max` or not) to correctly set the dtype.
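The value dependence pointed out here already exists in array coercion: the same code path picks different dtypes depending on whether the value fits the signed range. A short check against current NumPy:

```python
import numpy as np

# The chosen dtype depends on the value, not just on its Python type:
print(np.array([2**63 - 1]).dtype)  # int64  (fits the signed range)
print(np.array([2**63]).dtype)      # uint64 (only fits unsigned)
```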
Thanks for the look; I have not updated this in a bit, will do now (and want to try to make the motivation a bit clearer, especially with respect to why certain choices make sense). About tackling integer coercion issues, I do not think we particularly need this NEP to do so, since it is something that happens while creating the array. Deprecating/changing the behaviour that …
Force-pushed from 079a328 to 2a4f413.
This is a first push from HackMD directly. For now as a markdown file, simply because HackMD does not seem to support rst really... [ci-skip]
Force-pushed from a111b55 to 9243ccd.
I am working on changing the current NEP to contain fewer technical things (with respect to actual technical decisions), trying to focus on a few fairly clear design decisions and a commitment from us to push this forward, so that we can hopefully accept it while avoiding most technical discussions. The plan is to follow up with more technical NEPs later on. This does not mean that this is easy or short, unfortunately... There is currently a start here; it may change drastically in the next week or two.
Closing in favor of gh-15505, and following.
Link to the current rendered version (rendering currently broken?) (Updated with 4 commits)
This is the beginning of a draft; some points may fluctuate as I am prototyping. However, the NEP includes the core concepts/decisions:

- Do we want DType classes, such that for `dtype = np.dtype("float64")`, `type(dtype).mro() == [numpy.Float64, numpy.dtype, object]`? (We would also have a DTypeMeta, but it seems better to mostly hide that as an implementation detail, since metaclasses are just confusing. I am not actually sure `numpy.dtype` has to show up in the `mro()`, but it would be the base class.)
- Do we want an AbstractDType (hierarchy), which should allow defining some special coercions, such as to "blasable"/"inexact"? It also provides `isinstance`/`issubclass` checking (through ABC-like overriding) and a hierarchy for UFunc dispatching.
- Use `CastingImpl` objects (to begin with very limited) for defining/storing casting functions:
  - `UFuncImpl`, so that it is python executable
  - (how e.g. `float64` is cast to a string of specific length)
  - `DTypeTransferFunction` (so that we can extend in the future and allow users to implement specialized/faster versions when we are ready).
- Use dynamically created AbstractDType classes to support value based casting. Users will need some access to this, but maybe do not need to be able to create value based casting themselves. (I do not particularly like this part, but it works; I expect some of the indirection here and elsewhere may cause slight slowdowns, but I doubt they are serious and we should have some new opportunities for optimization.)
A starting branch can be found at seberg#6 and includes a big chunk of the work for DTypes (albeit in an early prototype state which will need a lot of cleanup). UFunc dispatching is up next (it may need a bit to get to a testable state).
[ci-skip]