ENH: Rewrite of array-coercion to support new dtypes #16200
Conversation
Force-pushed from 9e95bc2 to 60fc68d.
Would it be possible to somehow diagram the overall view of how coercion works?
```diff
-    'PyArrayDescr_Type': (3,),
+    # Internally, PyArrayDescr_Type is a PyArray_DTypeMeta,
+    # the following also defines PyArrayDescr_TypeFull (Full appended)
+    'PyArrayDescr_Type': (3, "PyArray_DTypeMeta"),
```
Does this change the function signature? The comment about PyArrayDescr_TypeFull seems out of place: that symbol does not seem to be exported.
Sorry, this change is from a merge conflict, or accidental.
```c
#endif  /* NPY_INTERNAL_BUILD */
```
Can this be moved to a non-public header?
Hmm, this is hopefully the one/only thing still missing. Those abstract dtypes are fairly obvious; I forgot to mention them in the chat, although it's mentioned in the NEP.
Do you want to pull more than those?
@mattip, I tried a bit. The problem is that PyArray_Descr is a DTypeMeta internally (and has to be, I think), so basically this mirrors what happens in the auto-generated headers (I believe), and I do not see how to get around it.
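For intuition, the relationship is observable from Python. A minimal probe, assuming NumPy ≥ 1.20 (where the DTypeMeta machinery landed):

```python
import numpy as np

descr = np.dtype(np.float64)
print(isinstance(descr, np.dtype))        # True: descriptors are np.dtype instances
print(issubclass(type(descr), np.dtype))  # True: the descriptor's class is (a subclass
                                          # of) np.dtype, mirroring PyArray_DTypeMeta in C
```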
Force-pushed from b04278c to c5d3d5b.
@mattip, @anirudh2290 and anyone else interested. Would Monday 9am PDT be good to discuss/walk through some of the changes/code here? I can send out an invite later.
That works for me. Could you point to the relevant parts of NEPs or other design documents? Don't go to any great effort to create/edit them.
I would be interested. :)
That works for me too. Thanks @seberg !
@seberg count me in too
OK, it's set then. I have to check which link is best to use, so I will send that out later. @mattip https://numpy.org/neps/nep-0042-new-dtypes.html#coercion-to-and-from-python-objects and the section after that apply; a quick read of it or the pseudo-code tries may make sense.
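As a companion to the NEP's pseudo-code, here is a rough, illustrative Python sketch of the recursive dtype-and-shape discovery this PR implements in C. The function name and structure are illustrative only, not NumPy API:

```python
import numpy as np

def discover_dtype_and_shape(obj, max_depth=32):
    """Recursively find a common dtype and shape for a nested sequence.

    Illustrative only: the real C code asks each DType to recognize its
    "known scalars" and caches PySequence_Fast results instead of re-listing.
    """
    if max_depth == 0 or not isinstance(obj, (list, tuple)):
        # Treat as a scalar; np.result_type stands in for the per-DType
        # scalar-discovery step.
        return np.result_type(obj), ()
    if len(obj) == 0:
        return np.dtype(np.float64), (0,)  # empty input gets the default dtype
    dtypes, shapes = zip(*(discover_dtype_and_shape(item, max_depth - 1)
                           for item in obj))
    if len(set(shapes)) != 1:
        # Ragged nesting: fall back to object dtype at this level.
        return np.dtype(object), (len(obj),)
    return np.result_type(*dtypes), (len(obj),) + shapes[0]

print(discover_dtype_and_shape([[1, 2], [3.0, 4]]))
# (dtype('float64'), (2, 2))
```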
Just FYI, I am battling with a segfault found by the pandas test suite (possibly datetime related, but I am not sure yet), and on the way there found a few smaller things. The tests should run through once I push again... Once I find the pandas issue I will remove the draft marker; I do think it's fine to look at it anyway. UPDATE: OK, I think I found the pandas issue; it seemed like the tests had another problem right now, and if that is so I have to track it down tomorrow (by that time hopefully the pandas test suite is done, and a new leak-check run may have found some other smaller issues). Still need to look into simplifying the "requested descriptor" logic into a flag.
Force-pushed from 406a173 to afd99f6.
OK, marking as ready. New tests are mainly in the separate PR; I will look into whether some new benchmarks make sense, just for documentation basically. We may or may not need release notes here. There may be a reference leak left somewhere, but my partial run indicates that it was fixed with the last set of changes.
I ran the astropy test suite (pandas seemed pretty much fine); the (new) failures it found are:
The first failure makes me wonder whether the old code converted [...]. The next two "index error" ones are because a [...]. @bsipocz do you happen to know quickly whether these are problematic? Full details below:
@mhvk, I thought I could avoid pinging you, but there is an actual issue here around quantities. Quantities raise an error for [...], i.e. [...].
This change-set currently does not affect assignment. So, I guess the question is: how much of an issue do you think this is? I will try to think of work-arounds, but right now I am not sure there are any easy ones. I mean, the float dtype could recognize 0-d arrays as "known scalars", but that is even more of a band-aid to catch some of the more common cases :/.
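To make the failure mode concrete, here is a hypothetical stand-in for a unit-carrying Quantity. This class and its behavior are illustrative, not astropy's implementation:

```python
import numpy as np

class Quantity(np.ndarray):
    """Toy ndarray subclass that refuses plain-float conversion."""
    def __float__(self):
        raise TypeError("only dimensionless quantities can be converted to float")

q = np.array(1.5).view(Quantity)   # a 0-d "quantity"
print(np.asarray(q))               # fine: goes through the array protocol
float(q)                           # raises TypeError here: any coercion path that
                                   # calls __float__ on such objects will fail
```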
@seberg - I can try to have a look at this later today, but indeed Marten is the one on the astropy side whom I would ping when stuck with this.
@seberg - don't worry about pinging me about astropy failures. Often, they may be related to work-arounds that we would gladly drop! (and we do numpy version checks if needed). Just to be sure, you are wondering about the last test failure, right?
Yes, that is the one that concerns me most; the other ones seemed similar to general issues around [...].
Indeed, the first test is because we're trying to assign [...]: [...] no longer gives a warning and returns 0 instead of [...].
Ohh, the masked scalar is a float64 subclass of an array, making it weird, and running into the same issue as the last quantity one, I guess. Hmmmmm...
Could also just delete this path entirely, but thought for now to keep it around.
Unfortunately one test had to be adapted (I missed the difference between timedelta64 and datetime64), and one test which I thought failed only for one reason also failed for another reason which is not fully gone in master.
Also add some more documentation for PyArray_DiscoverDTypeAndShape_Recursive
Co-authored-by: Anirudh Subramanian <anirudh2290@apache.org>
…hen string is too short this retains (and slightly expands) the old behaviour. It is inconsistent and I think we should fix that inconsistency one way or the other (I honestly do not care *which* way).
Co-authored-by: Anirudh Subramanian <anirudh2290@apache.org>
The rational test still fails, but the xfail message was wrong or confusing.
Co-authored-by: Hameer Abbasi <einstein.edison@gmail.com>
As asked for by Hameer: the comment was outdated, so I had to change it completely instead of fixing one word.
Hmmm, the - are improved speeds; most slowdowns should specifically not come from that. A bit more impact than I thought on the casting, but 🤷, at least half of that should go away again eventually; whether or not we want to optimize the cast again is something we can discuss... EDIT: The first version had a typo, leading to no actual casting in the cast case.
astype is similar to array coercion, and is also compared in many of the tests below.
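For a rough sense of what such a comparison measures, a quick timeit sketch (not the PR's actual asv benchmarks; sizes and repeat counts are arbitrary):

```python
import timeit
import numpy as np

lst = list(range(1000))
arr = np.arange(1000, dtype=np.int64)

# Coercion: discover dtype/shape from a Python list, then copy elements.
print(timeit.timeit(lambda: np.array(lst), number=10_000))
# astype: similar element-wise cast work, but dtype and shape are already known.
print(timeit.timeit(lambda: arr.astype(np.float64), number=10_000))
```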
Force-pushed from f4a75e6 to 22ee971.
> Hmmm, the - are improved speeds; most slowdowns should specifically not come from that.

Oops, I was misinterpreting the ratio column. I think if the performance hit is only for the scalar indexing and assignment case, and it fixes inconsistent behavior, it makes sense to move ahead with this PR.
LGTM. Should we just merge this as 57 commits or squash it into a logical set of commits?
Normally I would squash down liberally; here, I dunno. Guess I will cut it down... I will probably swing to the other extreme of a few commits, but then the history is not super interesting here.
Hmmmpf, I tried to recommit it by file rather than time; that doesn't really work, there are too many deletions and slight modifications. So it would be an extremely intricate job to do it by both time and file... But even that will hardly get something compiling without problems. And then the tests will only pass with the full chunk anyway, so it's not like there are clear intermediate steps which are cleanly bisectable... So I can squash a few commits with sillier commit messages together if we want some notion of history. If we want "functional chunks", I give up; might as well do a single big commit.
Thanks @seberg, let's just put it in as-is.
This seems to have introduced #17173.
This is in a draft state and probably still needs a lot of cleanup. It is based on top of the creation of DTypeMeta classes. There are a bunch of choices, and the non-trivial ones are listed below. This is a draft/prototype; I am putting this out there more in case anyone is interested in what I am experimenting with, or has some feedback on general design thoughts! There are (probably) a lot of places that should be refactored and comments that need updating. (That said, I do see many of the basic ideas – e.g. what to ask from the DType and the use of a cache – as sound.)

It mostly tries to push off more/all logic to the DType (except special cases to support structured-void, object, and characters). It also introduces a cache... It's not really optimized much yet, and I have some open questions about how to deal with subclasses (I am tending to refuse to deal with subclasses).

Right now the scalar/python class -> DType mapping is done with a global dict. I could push this off to a subclass check on the scalar. Right now it does not matter much, since the decision is not really final. I am not sure it is the main overhead right now though...

Some notes:

- `__array_function__` overhead when dealing with a single array input as of this time, because this is not short-circuited, and thus a lot of work/overhead is incurred.
- [Caching of] `PySequence_FAST` results and `__array__` calls. This means that generally the memory overhead can be much larger, but for example `np.array(range(10))` does not have to create `list(range(10))` three times, but only once.
- `SETITEM` is nice, but not very fast when casting is involved in general. But maybe there is nothing to do about it, or casting should just get a `cast-scalar` slot to speed this up when necessary.

This is a draft; I am open to fully changing it.

Some bugs that are fixed, or potentially fixed, with this (see the runnable snippets after this list):

- `np.array([np.array("asdf", object), ], dtype="S")` will do the right thing. On the flip side, I did not (yet) implement the retry-with-string nonsense. So `np.array(["string", np.float64(12.34)])` will result in a string large enough to fit any float64. I think that is good. (Right now it depends on the order.)
- [...] `SETITEM` in my code (which float128 has at least partially). I am not sure I am willing to make it work generally; rather, I think float128 may want to error more often, and not call `__float__` unless it knows it is dealing with a Python float (or maybe a subclass of it).
- `np.array(..., dtype=np.int64)` will raise an error when converting from `np.nan`, but not when converting from `np.float64(np.nan)`...
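The bullet points above as runnable snippets (behavior as described in this PR; exact values and warnings vary across NumPy versions):

```python
import numpy as np

print(np.array([np.array("asdf", dtype=object), ], dtype="S"))
# -> array([b'asdf'], dtype='|S4'): the object scalar is converted correctly

print(np.array(["string", np.float64(12.34)]))
# -> a string dtype wide enough to hold any float64 (e.g. '<U32')

try:
    np.array(np.nan, dtype=np.int64)        # Python float NaN -> error
except ValueError as e:
    print("raised:", e)

# The inconsistency: the same value as a np.float64 scalar casts without error
# (newer NumPy versions emit a RuntimeWarning; the resulting value is undefined).
print(np.array(np.float64(np.nan), dtype=np.int64))
```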