BUG: Fix unicode(unicode_array_0d) on python 2.7 #9201

eric-wieser · 2017-06-01T15:47:33Z

~~Related to #9139, first two commits stand alone in #9202~~

Now ~~a commit from #9332~~ ~~a single commit on top of #9784~~, standalone.

This is like #9332, but:

- Can't be affected by np.set_string_function (is this a good thing?)
- Also fixes unicode 0d arrays so that they convert to python unicode scalars

Results on python 2.7:

>>> str(np.array('test'))
'test'
>>> unicode(np.array(u'café'))
u'café'

mhvk

Looks good overall, though I think there are two mistakes, and have a suggestion on how better to deal with MaskedArray.

mhvk · 2017-06-01T17:56:06Z

numpy/core/src/multiarray/strfuncs.c

@@ -198,3 +206,35 @@ array_str(PyArrayObject *self)
    }
    return s;
 }
+
+#ifdef NPY_PY3K


this should be ifndef, no?

mhvk · 2017-06-01T17:56:22Z

numpy/core/src/multiarray/strfuncs.h

@@ -10,4 +10,9 @@ array_repr(PyArrayObject *self);
 NPY_NO_EXPORT PyObject *
 array_str(PyArrayObject *self);

+#ifdef NPY_PY3K


And here too?

Missed that in the rebase, I guess

mhvk · 2017-06-01T17:58:05Z

numpy/ma/core.py

@@ -3885,15 +3885,15 @@ def compress(self, condition, axis=None, out=None):
            _new._mask = _mask.compress(condition, axis=axis)
        return _new

-    def __str__(self):
+    def __str__(self, str_type=str):


Maybe help our future selves by adding an explicit note that str_type can be removed once we drop support of python 2

Though I'm also slightly wary of using different signatures for dunder methods. Before you know it, subclasses rely on it... Might it be an idea to make a new function _prepare_str(self) (better name surely possible...) which returns res without str applied and then use that in both __str__ and __unicode__? Or possibly just the masked print option enabled part as _fill_with_mask_for_str?
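A minimal sketch of the helper-based layout suggested above, using a hypothetical stand-in class rather than the real `MaskedArray`. The names `MaskedValue`, `_prepare_str`, and the `'--'` placeholder are illustrative assumptions. The point is only that `__str__` and `__unicode__` keep their standard signatures and share one private helper.

```python
class MaskedValue(object):
    """Hypothetical stand-in for a masked container (not numpy.ma.MaskedArray)."""

    masked_print_option = '--'  # assumed placeholder text for masked entries

    def __init__(self, values, mask):
        self.values = list(values)
        self.mask = list(mask)

    def _prepare_str(self):
        # Return the printable items without converting them to any string type,
        # so that both dunder methods below can reuse the result.
        return [self.masked_print_option if m else v
                for v, m in zip(self.values, self.mask)]

    def __str__(self):
        return '[' + ' '.join(str(x) for x in self._prepare_str()) + ']'

    def __unicode__(self):
        # Only consulted by Python 2's unicode(); on Python 3 this is just
        # an ordinary method that can be called explicitly.
        return u'[' + u' '.join(u'{}'.format(x) for x in self._prepare_str()) + u']'


m = MaskedValue([1, 2, 3], [False, True, False])
as_str = str(m)
as_unicode = m.__unicode__()
print(as_str)
```

Because neither dunder method takes an extra argument, subclasses cannot come to rely on a non-standard signature, and the helper can be dropped or inlined once Python 2 support goes away.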

eric-wieser · 2017-06-01T18:36:58Z

This recurses horribly, because some of the scalar __str__s call np.array(scalar).__str__()...
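The kind of cycle described here can be sketched with two hypothetical classes (these are not NumPy's real types): a scalar whose `__str__` wraps itself in a 0-d array, and a 0-d array whose `__str__` defers back to the scalar. The class names and the `safe_str` helper are assumptions made for illustration.

```python
class Scalar(object):
    """Hypothetical scalar whose __str__ defers to a 0-d array wrapper."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return Array0d(self).__str__()


class Array0d(object):
    """Hypothetical 0-d array whose __str__ defers back to the scalar."""

    def __init__(self, scalar):
        self.scalar = scalar

    def __str__(self):
        return self.scalar.__str__()


class FixedArray0d(Array0d):
    """Variant that breaks the cycle by formatting the raw value directly."""

    def __str__(self):
        return str(self.scalar.value)


def safe_str(obj):
    # RecursionError is a subclass of RuntimeError on Python 3.5+.
    try:
        return obj.__str__()
    except RuntimeError:
        return '<recursion>'


looping = safe_str(Scalar(1.5))
fixed = safe_str(FixedArray0d(Scalar(1.5)))
print(looping, fixed)
```

The fixed variant terminates because it never calls back into the scalar's own `__str__`.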

mhvk · 2017-06-01T18:50:31Z

How about splitting off the implementation of __unicode__ from the 0-d scalar case so that this PR remains simple?

ahaldane · 2017-06-01T18:57:10Z

I haven't read the code yet, but won't the recursion problems be fixed after #9139?

eric-wieser · 2017-06-01T19:01:56Z

Nope, because #9139 still recurses in void_str

eric-wieser · 2017-06-01T19:09:06Z

> How about splitting off the implementation of __unicode__ from the 0-d scalar case so that this PR remains simple?

The sole purpose of implementing __unicode__ at all is to deal with the scalar case. The other cases fall back to calling repr on the individual items anyway, and just work

The problem is invoking style(arr) when style==str

eric-wieser · 2017-06-01T19:22:48Z

(that was to trick the first two commits into disappearing)

eric-wieser · 2017-09-27T06:58:39Z

I've stolen a commit from #9332, and rebased this on top of it.

eric-wieser · 2017-09-27T07:00:35Z

numpy/ma/core.py

-        String representation.
-
-        """
+    def __insert_masked_print(self):


@mhvk: #9768 means this doesn't even need an extra argument any more

ahaldane · 2017-09-27T18:12:45Z

I'm a little on the fence whether str(0d) shouldn't respect np.set_string_function. 0d arrays are ndarrays, after all, so according to the set_string_function doc they should be affected.

Also, while I did remove the style argument in #8983, that was necessary to fix some problems there at the expense of back-compatibility. In #9332 I add it back in as those problems are avoided, so now we can maintain the style arg to be back-compatible.

On the other hand, removing the style argument and special-casing the str of 0d arrays is nice in two ways:

- the style argument was ugly
- makes 0d behavior more like numpy scalars, which could be desirable. But this is a bit questionable too, since the repr of a 0d array is different from a scalar.

But are these two benefits enough to justify the compatibility break of disabling the style argument?
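To make the trade-off concrete, here is a toy model of the two behaviours being weighed. It does not use NumPy: `set_string_function`, `Array`, and the module-level hook are hypothetical stand-ins that only mimic the idea of a user-installable string function, with the 0-d case special-cased to bypass it.

```python
_string_function = None  # hypothetical module-level override hook


def set_string_function(f):
    """Install (or clear, with None) a custom str function for Array."""
    global _string_function
    _string_function = f


class Array(object):
    """Toy array: `data` is a list for ndim >= 1, or a bare value for ndim == 0."""

    def __init__(self, data, ndim):
        self.data = data
        self.ndim = ndim

    def __str__(self):
        if self.ndim == 0:
            # Special case under discussion: behave like the scalar and
            # ignore any user-installed hook.
            return str(self.data)
        if _string_function is not None:
            return _string_function(self)
        return '[' + ' '.join(str(x) for x in self.data) + ']'


set_string_function(lambda a: '<custom>')
hooked_1d = str(Array([1, 2], ndim=1))
hooked_0d = str(Array(5, ndim=0))
set_string_function(None)
default_1d = str(Array([1, 2], ndim=1))
print(hooked_1d, hooked_0d, default_1d)
```

In this model the hook changes the output for the 1-d array but not for the 0-d one, which is exactly the inconsistency (or consistency with scalars, depending on the point of view) debated in this thread.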

mhvk · 2017-09-27T18:31:46Z

In my future perfect world, scalars would not exist at all, one would just have array scalars, and those behave like other arrays, so I'd prefer not to go in a direction where array scalars are less like arrays, even if it has the benefit of making them more similar to scalars. I could see going the other way, though, removing style and typesetting scalar[...]

eric-wieser · 2017-09-27T19:08:55Z

My argument for why str and unicode should behave as if they were raw scalars is that I'm categorizing str and unicode as type-coercion functions (like int, float, bool etc,) rather than with repr. In all those other functions, we let the 0d array decay to a scalar first.
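The coercion analogy can be checked on a current setup. The snippet below assumes Python 3 with a reasonably recent NumPy installed (so there is no `unicode` builtin to test), and shows a 0-d array decaying to its scalar under `int`, `float`, `bool`, and `str`, while `repr` still reports an array.

```python
import numpy as np

# 0-d arrays wrapping different kinds of values
as_int = int(np.array(3))
as_float = float(np.array(2.5))
as_bool = bool(np.array(True))
as_str = str(np.array(u'café'))

# repr is not a coercion: it still describes the array itself
as_repr = repr(np.array(3))

print(as_int, as_float, as_bool, as_str, as_repr)
```

Whether `str` belongs with the first group or with `repr` is the point of disagreement in the replies that follow.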

I'll fix the warning this evening.

mhvk · 2017-09-27T19:30:40Z

@eric-wieser - I don't think that you should compare str with float, etc., at least not in the context where it executes __str__. Indeed, the python documentation states [1] that the point is "to compute the “informal” or nicely printable string representation of an object."

[1] https://docs.python.org/3/reference/datamodel.html#object.__str__

eric-wieser · 2017-09-28T16:12:58Z

Another argument for not letting np.set_string_function interfere is that if we did, then we'd want to allow it to do so for __unicode__ too for consistency, which would be a python2-only API - something that seems a little silly to add with dropping 2.7 support on the not-so-distant horizon.

eric-wieser · 2017-11-03T06:11:31Z

Rebased on top of #9913, which makes the diff a little smaller

ahaldane · 2017-11-13T06:42:46Z

LGTM.

I understand it is a little strange that after this PR, in python2 0d unicode is hardcoded but not the 0d str. But I think it's OK to special-case the 0d unicode for various pragmatic reasons - we are already talking about python2 end of life, and unicode of 0ds hasn't been functional before now, so we should feel free to choose fixed behavior.

I'd like to merge. Good to go? (checking since you pushed without commenting)

eric-wieser · 2017-11-13T06:48:02Z

Yep, good to go.

> it's OK to special-case the 0d unicode for various pragmatic reasons

I agree - this is better than the old behaviour, and sending __unicode__ through array_str probably opens a can of worms we don't care about solving before EOL for python 2.

It's also consistent with __format__, which is currently hard-coded until we decide whether it should be customizable.

ahaldane · 2017-11-13T22:49:32Z

All right, merging. Thanks Eric!

eric-wieser added 00 - Bug component: numpy._core component: numpy.ma masked arrays labels Jun 1, 2017

eric-wieser requested a review from ahaldane June 1, 2017 15:47

eric-wieser force-pushed the ndarray-unicode branch from f17b8d3 to af8e1e5 Compare June 1, 2017 17:16

This was referenced Jun 1, 2017

ENH: remove unneeded spaces in float/bool reprs, fixes 0d str #9139

Merged

MAINT: Move ndarray.__str__ and ndarray.__repr__ to their own file #9202

Merged

eric-wieser force-pushed the ndarray-unicode branch from af8e1e5 to 0a980d9 Compare June 1, 2017 17:54

mhvk reviewed Jun 1, 2017

eric-wieser force-pushed the ndarray-unicode branch from 0a980d9 to 538bfea Compare June 1, 2017 18:18

eric-wieser changed the base branch from master to maintenance/1.13.x June 1, 2017 19:22

eric-wieser changed the base branch from maintenance/1.13.x to master June 1, 2017 19:22

eric-wieser mentioned this pull request Sep 27, 2017

Alternative to #9332 #9776

Closed

eric-wieser force-pushed the ndarray-unicode branch from 538bfea to f166959 Compare September 27, 2017 06:53

eric-wieser mentioned this pull request Sep 27, 2017

ENH: fix 0d array printing using str or formatter. #9332

Merged

eric-wieser force-pushed the ndarray-unicode branch from f166959 to 4f49897 Compare September 27, 2017 06:58

eric-wieser commented Sep 27, 2017

eric-wieser force-pushed the ndarray-unicode branch 2 times, most recently from 4450637 to a8c563a Compare September 28, 2017 15:59

eric-wieser force-pushed the ndarray-unicode branch from a8c563a to 988b158 Compare September 28, 2017 16:26

ahaldane mentioned this pull request Oct 2, 2017

np.set_printoptions(sign='legacy') edge case for 0d array #9804

Closed

eric-wieser mentioned this pull request Oct 18, 2017

ENH: Implement ndarray.__format__ for 0d arrays #9883

Merged

eric-wieser force-pushed the ndarray-unicode branch from 988b158 to 940b4a1 Compare November 3, 2017 06:10

charris added this to the 1.14.0 release milestone Nov 12, 2017

eric-wieser force-pushed the ndarray-unicode branch from 940b4a1 to 72025eb Compare November 12, 2017 19:37

BUG: str(arr0d) and unicode(arr0d) should never go through np.set_str…

df0fff4

…ing_function It's more important that scalars and 0d arrays are consistent here. Previously, unicode(arr0d) would crash on 2.7

eric-wieser force-pushed the ndarray-unicode branch from 72025eb to df0fff4 Compare November 12, 2017 19:38

ahaldane approved these changes Nov 13, 2017

ahaldane merged commit 565e8ca into numpy:master Nov 13, 2017

ahaldane mentioned this pull request May 7, 2018

np.set_printoptions doesn't affect floats #11048

Open
