WIP, NEP: add dtype design NEP - DtypeMeta, Dtype, and hierarchy #12660

mattip · 2019-01-04T09:21:37Z

Replaces PR #12630 after comments there.

doc/neps/nep-0029-dtype-as-type.rst

mattip · 2019-01-04T09:22:32Z

doc/neps/nep-0029-dtype-as-type.rst

+is a class object. Instantiating that class object ``a.type(3)`` produces a
+Num`py `scalar <http://www.numpy.org/devdocs/reference/arrays.scalars.html>`_.
+
+This NEP proposes a class heirarchy for dtypes. The ``np.dtype`` class will


hierarchy spelled wrong (here and below), also abstrace->abstract

doc/neps/nep-0029-dtype-as-type.rst

mattip · 2019-01-04T09:26:19Z

doc/neps/nep-0029-dtype-as-type.rst

+            # subclass of IntDescr
+            return f"dtype('{_kind_to_stem[self.kind]}{self.itemsize:d}')"
+
+        def get_format_function(self, data, **options):


@mhvk noted: Still feel this would logically be on dtype.type.format; maybe that can be the default in the upper class?

Or, perhaps better, just omit it altogether and focus on a method on a dtype that currently exists, overriding that in Color.

formatting exists on dtypes, it is based on a switch statement to get the proper formatting function in arrayprint. I am proposing to make that a method-based lookup on the dtype.

You mean the format function including the part where it determines, e.g., for integer the maximum size the string will be? I guess that is indeed logical to have on the dtype. Still, for the purpose of the NEP, it seems too complicated; if the point is to be able to override a dtype, I feel it would be better to have an example that is closer to what a dtype normally does, like interpreting a bit of memory.

doc/neps/nep-0029-dtype-as-type.rst

mattip · 2019-01-04T09:29:46Z

doc/neps/nep-0029-dtype-as-type.rst

+        itemsize = 8
+        col
10000
ors = ['red', 'green', 'blue']
+        def get_format_function(self, data, **options):
+            class Format():


@mhvk noted: Instead of formatting, I'd pick something that actually exists, perhaps the interpretation of the memory (either setting or getting).

The purpose of a categorical dtype, as far as I can tell, is to give textual meaning to the values in the ndarray. The only place that has meaning is when printing them.

Presumably, should be able to set that type with a string as well? Indeed, even accessing it should give, I would think, an ColorType scalar that is just an int with a different __str__ (heck, scalars come back to haunt us immediately!).

Overriding any of the PyArray_Descr functions or ufuncs requires C code. I was looking for an easy way to demonstrate a subclass, but it seems I only confused reviewers.

mattip · 2019-01-04T11:45:04Z

Simplified by removing the metaclass and extending Dtype.__new__. Fwiw, arrayformatnow works by method lookup on the dtype.

mhvk · 2019-01-04T15:42:11Z

doc/neps/nep-0029-dtype-as-type.rst

+                # Dtype('int8') or Dtype('S10') or record descr
+                return create_new_descr(cls, *args, **kwargs)
+
+    class GenericDtype(Dtype):


There seems to be no benefit to having GenericDtype any more. I thought the argument before that you don't necessarily want to override what int means, but want to be sure that itemsize exists, was logical, so I would move some of the __new__ above here (or stick with the metaclass, it really wasn't bad and could be used to auto-fill methods based on itemsize, etc.)

mhvk

@mattip - a few more comments, though I must admit I'm confused. Mostly, I feel the examples are too abstract; is the goal actually to be able to subclass np.dtype with the NEP work in place? How would one actually put in the cast functions?

Maybe most relevant: I think it should be closer to an abstract base class, where the presence of itemsize only gets checked for the instance, not for the class.

mhvk · 2019-01-04T15:43:23Z

doc/neps/nep-0029-dtype-as-type.rst

+        pass
+
+    class IntDtype(GenericDtype):
+        def __repr__(self):


This __repr__ could move to GenericDtype, as it would be the same for float, string (and that gives it some purpose!)

mhvk · 2019-01-04T15:53:25Z

doc/neps/nep-0029-dtype-as-type.rst

+            from  numpy.core.arrayprint import IntegerFormat
+            return IntegerFormat(data)
+
+    class UInt8Dtype(IntDtype):


Looking at this again, it seems strange that if arr.dtype is meant to be an instance, we are still creating a class here, with actual instantiation of the singleton doing nothing. It does seem one might then just as well just the class as the dtype and keep instantiation in reserve for something more interesting...

Or, more in the spirit of the PR (though with the metaclass, as the check for itemsize needs to come after initialization), could this be something like:

class UIntDtype(IntDtype): kind = 'u' def __init__(cls, itemsize): self.itemsize = itemsize self.minimum = 0 self.maximum = (1 << itemsize) - 1 uint8dtype = UIntDtype(itemsize=8)

though with the metaclass, as the check for itemsize needs to come after initialization F438

Dtypes will be immutable, so I'd expect this to go through __new__

Note with the issubclass(np.dtype, type) approach from the outdate NEP, this last line could look like:

class uint8dtype(metaclass=UIntDtype, itemsize=8): pass

mhvk · 2019-01-04T16:03:28Z

doc/neps/nep-0029-dtype-as-type.rst

+    import numpy as np
+
+    class Dtype():
+        def __new__(cls, *args, **kwargs):


Thinking more, the metaclass does in fact seem the best idea, but it should check for the presence of itemsize on any instances not on the class, i.e., it should let type do its work, and then check that itemsize exists. This, of course, is just what ABCmeta does...

mhvk · 2019-01-04T16:59:47Z

In #12630 (comment), it was noted:

Right now dtype objects already have this: PyArray_ArrFuncs->setitem converts a Python object to raw memory, and PyArray_ArrFuncs->getitem goes the other way. These are absolutely something we should have as (slot) methods on new-style dtypes

I think this goes to the crux of what I've been wondering about: how would this be done if I subclass dtype?

Note that I think it is absolutely fine for the actual work to proceed in steps, i.e., keep using ArrFuncs for now, but I think if subclassing is a goal, the NEP should address how those can make that work. Here, it would be particularly useful to have a more concrete example may be good: How would one really do your Plant, i.e., do actual setting/getting (based on enum?). But perhaps it is better to do something a bit simpler and closer to binary, e.g., implementing something that uses struct.pack/unpack (better if it is for something that is not a dtype already, which leaves the pascal string format 'p'...). Or a 1-bit format that uses np.unpackbits?

njsmith · 2019-01-04T19:49:15Z

In the long run we certainly want to support defining new dtypes in either C or Python, so it makes sense for a spec or prototype to discuss both.

But if the plan is to implement this in stages, then it makes sense for the first stage to only support subclassing in C, since that's where all the current dtypes are defined. Then stage 2 might be adding slot wrappers so that dtype methods are callable from python, and stage 3 would be adding the glue to actually define the methods in Python.

mattip · 2019-01-31T15:09:36Z

If we are going with the approach that Dtype should be a metaclass, this PR should be closed in favor of #12630

mattip · 2019-08-19T11:42:35Z

Closing, it seems starting over will be easier than resuscitating this. Hopefully the discussion will have been useful as a starting point for a replacement.

NEP: add dtype design NEP - DtypeMeta, Dtype, and hierarchy

e1badcb

mattip added the component: NEP label Jan 4, 2019

mattip commented Jan 4, 2019

View reviewed changes

doc/neps/nep-0029-dtype-as-type.rst Outdated Show resolved Hide resolved

mattip commented Jan 4, 2019

View reviewed changes

doc/neps/nep-0029-dtype-as-type.rst Outdated Show resolved Hide resolved

mattip commented Jan 4, 2019

View reviewed changes

doc/neps/nep-0029-dtype-as-type.rst Outdated Show resolved Hide resolved

mattip commented Jan 4, 2019

View reviewed changes

doc/neps/nep-0029-dtype-as-type.rst Outdated Show resolved Hide resolved

mattip commented Jan 4, 2019

View reviewed changes

doc/neps/nep-0029-dtype-as-type.rst Outdated Show resolved Hide resolved

mattip commented Jan 4, 2019

View reviewed changes

doc/neps/nep-0029-dtype-as-type.rst Outdated Show resolved Hide resolved

mattip commented Jan 4, 2019

View reviewed changes

doc/neps/nep-0029-dtype-as-type.rst Show resolved Hide resolved

mattip commented Jan 4, 2019

View reviewed changes

mattip mentioned this pull request Jan 4, 2019

WIP, NEP: add dtype design NEP #12630

Closed

mattip changed the title ~~NEP: add dtype design NEP - DtypeMeta, Dtype, and hierarchy~~ WIP, NEP: add dtype design NEP - DtypeMeta, Dtype, and hierarchy Jan 4, 2019

fixes from review and rework without metaclass

a2d839f

mattip force-pushed the dtype-nep3 branch from 94c2967 to a2d839f Compare January 4, 2019 11:44

mhvk reviewed Jan 4, 2019

View reviewed changes

h-vetinari mentioned this pull request Jan 25, 2019

BUG: incorrect equality of np.int64 and np.uint64 #12525

Closed

mattip closed this Aug 19, 2019

h-vetinari mentioned this pull request Sep 20, 2019

WIP,NEP: Create draft of DTypes NEP #14422

Closed

mattip deleted the dtype-nep3 branch June 8, 2020 06:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

WIP, NEP: add dtype design NEP - DtypeMeta, Dtype, and hierarchy #12660

WIP, NEP: add dtype design NEP - DtypeMeta, Dtype, and hierarchy #12660

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

WIP, NEP: add dtype design NEP - DtypeMeta, Dtype, and hierarchy #12660

WIP, NEP: add dtype design NEP - DtypeMeta, Dtype, and hierarchy #12660

Uh oh!

Conversation

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!