BUG: Concatenate with empty sequences, fixes #1586 #6224

jaimefrio · 2015-08-20T07:17:23Z

This is still WIP, as I may be complicating things more than needed. And tests are still missing.

The code avoids empty sequences in concatenate being converted to the default float64 type, which causes a typically undesired behavior:

>>> np.concatenate(([], [1, 2, 3, 4])).dtype
dtype('float64')

To do so, it remembers the last dtype it got, either from an ndarray or from a non-ndarray that was converted to a non-empty ndarray, and casts non-ndarrays that convert to empty ndarrays to that dtype. Since at the beginning there is no dtype to remember, leading non-ndarrays that convert to empty ndarrays are discarded in the first pass, and converted again in a second pass.
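For readers following along, the two-pass idea can be sketched in pure Python. This is only an illustration of the description above, not the C code in this PR: the helper name `concatenate_flat_sketch` is hypothetical, it only handles the flattened (`axis=None`) case, and it assumes that leading empties deferred to the second pass are cast to whatever dtype was remembered last.

```python
import numpy as np


def concatenate_flat_sketch(seq):
    """Flattened concatenate where empty non-ndarrays do not pick the dtype.

    Hypothetical helper for illustration only; not the code in this PR.
    """
    converted = [None] * len(seq)
    last_dtype = None  # dtype of the most recent input that carried one
    deferred = []      # indices of leading empty non-ndarrays

    # First pass: convert inputs and remember dtypes as we go.
    for i, item in enumerate(seq):
        if isinstance(item, np.ndarray):
            # Real ndarrays always keep their own dtype, even when empty.
            converted[i] = item
            last_dtype = item.dtype
            continue
        arr = np.asarray(item)
        if arr.size > 0:
            converted[i] = arr
            last_dtype = arr.dtype
        elif last_dtype is not None:
            # Empty sequence: borrow the remembered dtype instead of float64.
            converted[i] = arr.astype(last_dtype)
        else:
            # Nothing remembered yet, so revisit this one later.
            deferred.append(i)

    # Second pass: leading empties get the remembered dtype, if there is one
    # (an assumption of this sketch). With no dtype at all they stay float64.
    for i in deferred:
        arr = np.asarray(seq[i])
        if last_dtype is not None:
            arr = arr.astype(last_dtype)
        converted[i] = arr

    return np.concatenate([a.ravel() for a in converted])


result = concatenate_flat_sketch(([], [1, 2, 3, 4]))
print(result, result.dtype.kind)
```

Because emptiness is decided by `arr.size`, `[[]]` takes the same path as `[]` in this sketch.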

Because we are using the size of the resulting ndarray as the criterion for emptiness, it also works with nested empty sequences, i.e. both of these produce arguably correct results:

>>> np.concatenate(([], [[1, 2, 3, 4]]), axis=None)
array([1, 2, 3, 4])
>>> np.concatenate(([[]], [[1, 2, 3, 4]], [[]]), axis=None)
array([1, 2, 3, 4])

The second case was not contemplated in the original #1586 issue, and the code can probably be simplified a little if only strictly empty sequences trigger the cast. But it is probably a good thing that [[]] behaves the same as [], right?

Thoughts on this are very welcome in order to write some meaningful tests.

charris · 2016-01-08T18:50:34Z

I'd suggest ignoring empty tuples and arrays. The only information that they carry is dimensionality, which we might want to check, but otherwise they should have no effect.

Fix issue numpy#6723. Given an exotic masked structured array, where one of the fields has a multidimensional dtype, make sure that, when accessing this field, the fill_value still makes sense. As it stands prior to this commit, the fill_value will end up being multidimensional, possibly with a shape incompatible with the mother array, which leads to broadcasting errors in methods such as .filled(). This commit uses the first element of this multidimensional fill value as the new fill value. When more than one unique value existed in fill_value, a warning is issued. Also add a test to verify that fill_value.ndim remains 0 after indexing.

Fixes GH6452

There are two types of datetime64/timedelta64 objects with generic time units:

* NaT
* unit-less timedelta64 objects

Both of these should be safely castable to any more specific dtype. However, more specific dtypes should not be safely castable to generic units. Otherwise, the result of `np.datetime64('NaT')` or `np.timedelta64(1)` is entirely useless, because it can't be used in any arithmetic operations or comparisons. This is a regression from NumPy 1.9, where these sorts of operations worked because the default casting rules with ufuncs were less strict.

DEP: Deprecated using a float index in linspace
DEP: Release notes for PR#7328

Completely rewrote the binary_repr method to use Python's built-in 'bin' function to generate binary representations quickly. Fixed the behaviour for negative numbers, where insufficient widths resulted in outputs of all zeros for the two's complement. Now, it will return the two's complement with width equal to the minimum number of bits needed to represent the complement. Closes numpygh-7168.
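A rough sketch of the behaviour described in that commit message, using `bin` and plain Python integers. The function name `binary_repr_sketch` is made up for illustration and this is not the code from the commit; in particular, computing the minimum two's-complement width with `int.bit_length` is a choice made for this sketch.

```python
def binary_repr_sketch(num, width=None):
    """Binary string for an integer, two's complement when width is given.

    Hypothetical illustration; not NumPy's implementation.
    """
    if num >= 0:
        digits = bin(num)[2:]
        return digits if width is None else digits.zfill(width)

    if width is None:
        # Without a width, negative numbers get a sign prefix.
        return '-' + bin(-num)[2:]

    # Fewest bits that can hold num in two's complement, e.g. 3 for -3 or -4.
    min_width = (-num - 1).bit_length() + 1
    out_width = max(width, min_width)

    # Masking with out_width ones yields the two's-complement bit pattern.
    return bin(num & ((1 << out_width) - 1))[2:].zfill(out_width)


print(binary_repr_sketch(-3, width=5))  # 11101
print(binary_repr_sketch(-3, width=1))  # width too small, falls back to 101
```

When the requested width is too small, the sketch widens to `min_width` rather than returning all zeros, which is the fix the message describes.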

The non-nan elements of the result of corrcoef should satisfy the inequality abs(x) <= 1 and the non-nan elements of the diagonal should be exactly one. We can't guarantee those results due to roundoff, but clipping the real and imaginary parts to the interval [-1, 1] improves things to a small degree. Closes numpy#7392.
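The clipping step mentioned there could look roughly like the following. This is a sketch under the assumption that the correlation matrix has already been computed; `clip_corrcoef_sketch` is a hypothetical name, not a NumPy function.

```python
import numpy as np


def clip_corrcoef_sketch(c):
    """Clip real and imaginary parts of c to [-1, 1].

    Hypothetical helper that works on a copy; it illustrates the clipping
    idea only and does not compute correlations.
    """
    c = np.array(c, copy=True)
    # c.real is a writable view (or c itself for real input).
    np.clip(c.real, -1, 1, out=c.real)
    if np.iscomplexobj(c):
        np.clip(c.imag, -1, 1, out=c.imag)
    return c


noisy = np.array([[1.0000000000000002, 0.5], [0.5, 1.0]])
print(clip_corrcoef_sketch(noisy).max() <= 1.0)
```

Clipping only bounds the values; as the message notes, it cannot guarantee that the diagonal is exactly one.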

Empty non-arrays no longer participate in determining the dtype of the concatenated array. Have also refactored the code to unify as much as possible the logic for flattened and non-flattened paths, unifying the error checks and adding a few more tests, both for the fixed bug and for other functionality.

jaimefrio force-pushed the concatenate_empty branch from 80aaa59 to 7d1f87a Compare August 20, 2015 07:43

charris added 00 - Bug component: numpy._core labels Sep 9, 2015

charris added the 54 - Needs decision label Jan 8, 2016

jakirkham and others added 25 commits March 15, 2016 08:22

DOC: Fix typos.

0da1984

DEP: Deprecate random_integers

84d32f7

BUG: make result of isfinite/isinf/signbit a boolean

2f7140d

may return something else than one or zero and npy_bool is unfortunately an int8 not a c99 bool

TST: Ensure dot fails correctly if array types cannot be coerced in…

3f6ede2

…to a common type.

TST: Ensure inner fails correctly if array types cannot be coerced …

774768c

…into a common type.

BUG: Clear error before constructing error message using calls to `Py…

a3a6ba0

…Object_Repr`. Also, do a better job of handling any errors raised while constructing the error message.

MAINT: Refactor cblas_innerproduct to use cblas_matrixproduct.

c39e8bc

MAINT: Refactor PyArray_InnerProduct so that it just performs a tra…

eece342

…nspose and calls `PyArray_MatrixProduct2`.

ENH: Add benchmark tests for numpy.random.randint.

d46a1b7

This adds benchmarks for randint. There is one set of benchmarks for the default dtype, 'l', that can be tracked back, and another set for the new dtypes 'bool', 'uint8', 'uint16', 'uint32', and 'uint64'.

BUG: Add more complex trig functions to glibc < 2.16 blacklist.

f434a9a

Added functions are

- cacos
- cacosf
- cacosl
- cacosh
- cacoshf
- cacoshl

Closes numpy#6063.

BUG: skip invalid path distutils warning for empty strings

dfaf5f4

empty strings are the default for the new rpath, extra_compile_args and extra_link_args sections

Fix number sequence

d72a6ea

I have found that there are two missing numbers in a sequence in the documentation. http://docs.scipy.org/doc/numpy/user/misc.html#interfacing-to-c It goes 1,2,3,5,7,8 with missing 4 and 6.

TST, ENH: make all comparisons with NaT false

6b03f0b

Now, NaT compares like NaN:

- NaT != NaT -> True
- NaT == NaT (and all other comparisons) -> False

We discussed this on the mailing list back in October: https://mail.scipy.org/pipermail/numpy-discussion/2015-October/073968.html

DOC: Clean up/fix several references to the "future" 1.10 release

3b80d51

Fixes numpygh-7010

BUG: Enforce order param for MaskedArray construction

f4b1695

Adds the 'order' parameter to the __new__ override in MaskedArray construction, enabling it to be enforced in methods like np.ma.core.array and np.ma.core.asarray. Closes numpygh-6646.

MAINT: ensureisclose returns scalar when called with two scalars

311530c

DOC, MAINT: Enforce np.ndarray arg for np.put and np.place

54d7b18

np.put and np.place do something only when the first argument is an instance of np.ndarray. These changes will cause a TypeError to be thrown in either function should that requirement not be satisfied.

MAINT: Ensure inner is raising a ValueError just as dot does in t…

9cb28e7

…he same case.

DOC: Explain the new exception behavior of np.dot when its types ca…

93e848b

…nnot be cast to a common type.

DOC: Fix markdown style inline code to restructured text style inline…

7cba2d6

… code.

DEP: Add warnings to __getitem__ and __setitem__ to point out the…

136635f

… behavior of `MaskedArray`'s masks is changing.

TEST: Ignore FutureWarning if raised from running masked array oper…

d714dfe

…ations.

DOC: Explain that MaskedArrays will try to consistently return view…

e6ee55a

… of their masks when they are also returning views of their data.

gkBCCN and others added 27 commits March 15, 2016 08:22

TST: Fix numpy#6542: Add tests for non-iterable input...

fe44de8

...to hstack, vstack, stack, hsplit, vsplit, dsplit, dstack that check that they raise exceptions.

BUG: Give bitwise_xor an identity.

4a790e1

The identity for bitwise_xor is zero.

ENH: Add identity for bitwise_and.

46442a3

Current value is 1, which only works for the low order bit. Use -1 instead. Closes numpy#7060.

TST: Add tests for the bitwise ufuncs.

8fcf793

- test values
- test identity for bitwise_or, bitwise_xor, bitwise_and

DOC: Document the changed bitwise_and identity.

57db3b5

The identity has changed from 1 to -1.
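Why -1 rather than 1 is the right identity for bitwise_and can be checked with plain Python integers, which behave like two's-complement values of unbounded width. The snippet below is only an illustration; the `IDENTITIES` table and `reduce_with_identity` helper are hypothetical names, not part of NumPy.

```python
from functools import reduce
import operator

# An identity e satisfies op(e, x) == x for every integer x.
IDENTITIES = {
    operator.and_: -1,  # every bit set, so x & -1 keeps all bits of x
    operator.or_: 0,    # no bits set, so x | 0 adds nothing
    operator.xor: 0,    # no bits set, so x ^ 0 flips nothing
}


def reduce_with_identity(op, values):
    """Reduce values with op, starting from that op's identity."""
    return reduce(op, values, IDENTITIES[op])


print(reduce_with_identity(operator.and_, [6, 3]))  # 2
print(reduce_with_identity(operator.and_, []))      # -1
print(1 & 6)  # 0, so 1 only preserves the low-order bit
```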

DOC: Update random.seed in mtrand.pyx.

dcfa9bb

I know int is between 0 and 4294967295, but I think many people that do not know that will benefit from this comment. [ci skip]

DOC: Clarify the valid range of integers passed to random.seed.

d69d347

added NumPy logo and separator

9c7c0a6

MAINT: cleanup np.average

7db8eb9

DOC: note about wheels / windows wheels for pypi

f333a30

Add note about wheels on pypi, and Windows wheels in particular. See discussion at: numpy#5479

Added label icon to Travis status

e1ba8df

Not everyone might recognize Travis status

ENH: platform-specific install hook to change init

219b23d

Give hook to allow platform-specific installs to modify the initialization of numpy. Particular use-case is to allow check for SSE2 on Windows when shipping with ATLAS wheel.

DOC: description of _distribution_init for release

f39e229

Add description of ``numpy/_distribution_init.py`` file and init hook to release notes for 1.12.0.

DOC: fix typo

0e71935

Found this while reading a docstring.

BUG: incorrect type for objects whose __len__ fails

3a84813

Fixes numpy#7393

DOC: add nanprod to the list of math routines

d2d8dd4

This was otherwise undocumented, so the nanprod.rst page wasn't being generated.

BUG: Fix decref before incref for in-place accumulate

e2684d0

closes numpygh-7402

ENH: Add generalized flip function and its tests

9eec39d

DEP: Deprecated using a float index in linspace

9654a7f

DOC: Updates to documentation from perusing it in detail.

a7b8578

[ci skip]

MAINT: Wrapped some docstrings and fixed typo

9dcaf09

ENH: Check array dimensionality in cov function.

d2cd83e

The input arrays are documented to have ndim <=2, so check for that and raise a ValueError on failure.

TST: Check that result of corrcoef are clipped.

16dbd0a

This doesn't actually test much, as we don't have any inputs where that was not already the case. But at least it is there and perhaps a fuzz test can be added at a later date.

jaimefrio force-pushed the concatenate_empty branch from 6f55325 to 943e6cf Compare March 22, 2016 22:19

jaimefrio closed this Mar 22, 2016

jaimefrio deleted the concatenate_empty branch March 22, 2016 22:31
