ENH: Update scalar representations as per NEP 51 by seberg · Pull Request #22449 · numpy/numpy · GitHub

ENH: Update scalar representations as per NEP 51 #22449

Merged 49 commits on Jul 26, 2023
20ac540
ENH: Changed repr of np.bool_
ganesh-k13 Oct 20, 2020
9af45ba
ENH: Used raw boolean value
ganesh-k13 Oct 20, 2020
587e6fd
ENH: Refactored to use PyUnicode_FromString
ganesh-k13 Oct 20, 2020
b573496
ENH: Fixed doc-string
ganesh-k13 Oct 21, 2020
d0ecf91
ENH: Added release notes (#17592)
ganesh-k13 Oct 26, 2020
4243ee7
ENH: Fixed release notes `np.bool_` repr (#17592)
ganesh-k13 Oct 27, 2020
7431a2d
Update doc/release/upcoming_changes/17592.improvement.rst
eric-wieser Nov 1, 2020
341c245
ENH: Changed repr of ints
ganesh-k13 Nov 20, 2020
f2119c6
ENH: Changed repr of ints to use tp_name
ganesh-k13 Nov 20, 2020
834f3f9
WIP: Get to the NEP 51 state (and try to figure that out...)
seberg Sep 14, 2022
4550643
TST: Fixup tests for repr changes
seberg Sep 14, 2022
eef1cbd
TST: Some more test fixups for repr changes
seberg Sep 19, 2022
b9d0cd3
TST: STrengthen test on newer Python versions
seberg Sep 19, 2022
46785ee
Fixup tests, repr, and guard against using full repr in array printing
seberg Sep 27, 2022
a7b045b
WIP: Continue with refactoring more...
seberg Sep 29, 2022
145998a
WIP: Refactor array formatter with a new `get_formatter`
seberg Oct 12, 2022
4e16d86
ENH: Fixup records and void scalar printing
seberg Oct 14, 2022
e7ce9eb
ENH: Only print type information when helpful for MA fill value
seberg Oct 14, 2022
b74dbfd
API: Switch `ndarray.tofile` back to defaulting to `str()` usage
seberg Oct 18, 2022
ae28524
TST: Fixup tests to make windows/32bit systems and linter happy
seberg Oct 18, 2022
6316dd4
Adapt format to not block s and r format codes
seberg Oct 21, 2022
e8f0492
WIP: Fixup arrayprint for scalar values
seberg Oct 21, 2022
413f1a7
MAINT: Switch to also have str explicitly, fixup MA and forward str/r…
seberg Oct 24, 2022
4c6ff64
DOC: Update reference to pass refguide
seberg Oct 24, 2022
3c0c3d2
STY: Fix linter
seberg Oct 24, 2022
1a53d88
BUG: Fixups for last changes to make CI pass
seberg Oct 24, 2022
a01c893
DOC: First small refdoc update (very verbose update required, too lar…
seberg Oct 24, 2022
6cceecc
DOC: Remove outdated release note (eventually need new referencing NEP)
seberg Oct 24, 2022
342b786
DOC: Fixup docs; only "big" changes not all scalars
seberg Oct 24, 2022
d681bac
fix rebase error
seberg Jun 21, 2023
90a8b61
MAINT: Allow all options again (because subarrays need them) and fix …
seberg Jun 30, 2023
815c8a1
DOC: More doc fixes to pass tests
seberg Jun 30, 2023
b6f83b6
MAINT: Special case string fill-value in MA repr
seberg Jul 4, 2023
afd5e87
DOC: Add a release note for NEP 51 related changes
seberg Jul 4, 2023
ea9063f
ENH; Add legacy fallback mode for non NEP 51 printing because...
seberg Jul 6, 2023
f0302f8
Allow legacy fallback (fix float16, heh)
seberg Jul 6, 2023
768fcf0
TST,MAINT: Add tests for both new and old scalar repr and fix complex
seberg Jul 10, 2023
66341e5
DOC: Mention `np.set_printoptions` in the release note
seberg Jul 10, 2023
62f9d5a
TST: Ensure little-endian int est
seberg Jul 10, 2023
01a1d29
Simplify, removing new fmt option
mhvk Jul 14, 2023
d40922d
Minimize changes relative to main
mhvk Jul 14, 2023
7a61d6d
Remove unnecessary legacy check and longdouble quoting support
seberg Jul 18, 2023
c8df697
MAINT: Undo bad rebase (or maybe small accidental commit)
seberg Jul 18, 2023
a6645e0
Address some other review comments
seberg Jul 18, 2023
6097389
Hack strings to be correct for fill-value
seberg Jul 18, 2023
7fcf5ca
Update release note and add note to NEP that it was not fully impleme…
seberg Jul 18, 2023
19ed59e
MAINT: Tweak string check (forgot the kind, but maybe tuple is nice)
seberg Jul 18, 2023
0329f18
Also force repr for object
seberg Jul 18, 2023
51eb71e
MAINT: A few small fixups from review
seberg Jul 18, 2023

9 changes: 9 additions & 0 deletions doc/neps/nep-0051-scalar-representation.rst
@@ -232,6 +232,15 @@ found `here <https://github.com/numpy/numpy/pull/22449>`_
Implementation
==============

.. note::
This part has *not* been implemented in the
`initial PR <https://github.com/numpy/numpy/pull/22449>`_.
A similar change will be required to fix certain cases in printing and
allow fully correct printing e.g. of structured scalars which include
longdoubles.
A similar solution is also expected to be necessary in the future
to allow custom DTypes to correctly print.

The new representations can be mostly implemented on the scalar types with
the largest changes needed in the test suite.

15 changes: 15 additions & 0 deletions doc/release/upcoming_changes/22449.change.rst
@@ -0,0 +1,15 @@
Representation of NumPy scalars changed
---------------------------------------
As per :ref:`NEP 51 <NEP51>`, the scalar representation has been
updated to include the type information to avoid confusion with
Python scalars.
They are now printed as ``np.float64(3.0)`` rather than just ``3.0``.
This may disrupt workflows that store representations of numbers
(e.g. to files) making it harder to read them. They should be stored as
explicit strings, for example by using ``str()`` or ``f"{scalar!s}"``.
For the time being, affected users can use ``np.set_printoptions(legacy="1.25")``
to get the old behavior (with possibly a few exceptions).
Documentation of downstream projects may require larger updates,
if code snippets are tested. We are working on tooling for
`doctest-plus <https://github.com/scientific-python/pytest-doctestplus/issues/107>`__
to facilitate updates.
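
The behaviour described in this release note can be checked with a short script. This is a minimal sketch rather than part of the PR: it assumes NumPy is importable as `np`, and the major-version check used to decide whether `legacy="1.25"` is understood is an assumption of this example (the option is added by this PR, so earlier releases do not recognise it).

```python
import numpy as np

x = np.float64(3.0)

# str() and the "!s" conversion give the bare value with or without NEP 51.
as_str = str(x)
as_fstring = f"{x!s}"

# repr() is what this PR changes: it now carries the type information.
as_repr = repr(x)

# Assumption of this sketch: NumPy 2.x and later understand legacy="1.25".
has_legacy_125 = int(np.__version__.split(".")[0]) >= 2

if has_legacy_125:
    # Temporarily request the old scalar repr for this block only.
    with np.printoptions(legacy="1.25"):
        legacy_repr = repr(x)
else:
    # Older releases already print scalars the old way.
    legacy_repr = repr(x)

print(as_str, as_fstring, as_repr, legacy_repr)
```

Writing values to files with `str()` keeps the output stable across NumPy versions, which is why the note recommends it over relying on `repr()`.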
10 changes: 5 additions & 5 deletions doc/source/reference/arrays.classes.rst
@@ -657,9 +657,9 @@ objects as inputs and returns an iterator that returns tuples
providing each of the input sequence elements in the broadcasted
result.

>>> for val in np.broadcast([[1,0],[2,3]],[0,1]):
>>> for val in np.broadcast([[1, 0], [2, 3]], [0, 1]):
... print(val)
(1, 0)
(0, 1)
(2, 0)
(3, 1)
(np.int64(1), np.int64(0))
(np.int64(0), np.int64(1))
(np.int64(2), np.int64(0))
(np.int64(3), np.int64(1))
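
The tuples yielded by `np.broadcast` hold NumPy scalars, which is why their printed form now includes `np.int64(...)`. If plain Python tuples are wanted, one possible approach (a sketch, not something this PR adds) is to convert each element with `.item()`:

```python
import numpy as np

# Convert every broadcast element to the matching Python object.
pairs = [
    tuple(v.item() for v in val)
    for val in np.broadcast([[1, 0], [2, 3]], [0, 1])
]

print(pairs)
# [(1, 0), (0, 1), (2, 0), (3, 1)]
```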
52 changes: 26 additions & 26 deletions doc/source/reference/arrays.datetime.rst
@@ -62,32 +62,32 @@ letters, for a "Not A Time" value.
A simple ISO date:

>>> np.datetime64('2005-02-25')
numpy.datetime64('2005-02-25')
np.datetime64('2005-02-25')

From an integer and a date unit, 1 year since the UNIX epoch:

>>> np.datetime64(1, 'Y')
numpy.datetime64('1971')
np.datetime64('1971')

Using months for the unit:

>>> np.datetime64('2005-02')
numpy.datetime64('2005-02')
np.datetime64('2005-02')

Specifying just the month, but forcing a 'days' unit:

>>> np.datetime64('2005-02', 'D')
numpy.datetime64('2005-02-01')
np.datetime64('2005-02-01')

From a date and time:

>>> np.datetime64('2005-02-25T03:30')
numpy.datetime64('2005-02-25T03:30')
np.datetime64('2005-02-25T03:30')

NAT (not a time):

>>> np.datetime64('nat')
numpy.datetime64('NaT')
np.datetime64('NaT')

When creating an array of datetimes from a string, it is still possible
to automatically select the unit from the inputs, by using the
@@ -168,39 +168,39 @@ data type also accepts the string "NAT" in place of the number for a "Not A Time
.. admonition:: Example

>>> np.timedelta64(1, 'D')
numpy.timedelta64(1,'D')
np.timedelta64(1,'D')

>>> np.timedelta64(4, 'h')
numpy.timedelta64(4,'h')
np.timedelta64(4,'h')

>>> np.timedelta64('nAt')
numpy.timedelta64('NaT')
np.timedelta64('NaT')

Datetimes and Timedeltas work together to provide ways for
simple datetime calculations.

.. admonition:: Example

>>> np.datetime64('2009-01-01') - np.datetime64('2008-01-01')
numpy.timedelta64(366,'D')
np.timedelta64(366,'D')

>>> np.datetime64('2009') + np.timedelta64(20, 'D')
numpy.datetime64('2009-01-21')
np.datetime64('2009-01-21')

>>> np.datetime64('2011-06-15T00:00') + np.timedelta64(12, 'h')
numpy.datetime64('2011-06-15T12:00')
np.datetime64('2011-06-15T12:00')

>>> np.timedelta64(1,'W') / np.timedelta64(1,'D')
7.0

>>> np.timedelta64(1,'W') % np.timedelta64(10,'D')
numpy.timedelta64(7,'D')
np.timedelta64(7,'D')

>>> np.datetime64('nat') - np.datetime64('2009-01-01')
numpy.timedelta64('NaT','D')
np.timedelta64('NaT','D')

>>> np.datetime64('2009-01-01') + np.timedelta64('nat')
numpy.datetime64('NaT')
np.datetime64('NaT')

There are two Timedelta units ('Y', years and 'M', months) which are treated
specially, because how much time they represent changes depending
@@ -289,10 +289,10 @@ specified in business days to datetimes with a unit of 'D' (day).
.. admonition:: Example

>>> np.busday_offset('2011-06-23', 1)
numpy.datetime64('2011-06-24')
np.datetime64('2011-06-24')

>>> np.busday_offset('2011-06-23', 2)
numpy.datetime64('2011-06-27')
np.datetime64('2011-06-27')

When an input date falls on the weekend or a holiday,
:func:`busday_offset` first applies a rule to roll the
@@ -308,16 +308,16 @@ The rules most typically used are 'forward' and 'backward'.
ValueError: Non-business day date in busday_offset

>>> np.busday_offset('2011-06-25', 0, roll='forward')
numpy.datetime64('2011-06-27')
np.datetime64('2011-06-27')

>>> np.busday_offset('2011-06-25', 2, roll='forward')
numpy.datetime64('2011-06-29')
np.datetime64('2011-06-29')

>>> np.busday_offset('2011-06-25', 0, roll='backward')
numpy.datetime64('2011-06-24')
np.datetime64('2011-06-24')

>>> np.busday_offset('2011-06-25', 2, roll='backward')
numpy.datetime64('2011-06-28')
np.datetime64('2011-06-28')

In some cases, an appropriate use of the roll and the offset
is necessary to get a desired answer.
@@ -327,16 +327,16 @@ is necessary to get a desired answer.
The first business day on or after a date:

>>> np.busday_offset('2011-03-20', 0, roll='forward')
numpy.datetime64('2011-03-21')
np.datetime64('2011-03-21')
>>> np.busday_offset('2011-03-22', 0, roll='forward')
numpy.datetime64('2011-03-22')
np.datetime64('2011-03-22')

The first business day strictly after a date:

>>> np.busday_offset('2011-03-20', 1, roll='backward')
numpy.datetime64('2011-03-21')
np.datetime64('2011-03-21')
>>> np.busday_offset('2011-03-22', 1, roll='backward')
numpy.datetime64('2011-03-23')
np.datetime64('2011-03-23')

The function is also useful for computing some kinds of days
like holidays. In Canada and the U.S., Mother's day is on
@@ -346,7 +346,7 @@ weekmask.
.. admonition:: Example

>>> np.busday_offset('2012-05', 1, roll='forward', weekmask='Sun')
numpy.datetime64('2012-05-13')
np.datetime64('2012-05-13')

When performance is important for manipulating many business dates
with one particular choice of weekmask and holidays, there is
2 changes: 1 addition & 1 deletion doc/source/reference/maskedarray.generic.rst
@@ -117,7 +117,7 @@ There are several ways to construct a masked array.
>>> x.view(ma.MaskedArray)
masked_array(data=[(1, 1.0), (2, 2.0)],
mask=[(False, False), (False, False)],
fill_value=(999999, 1.e+20),
fill_value=(999999, 1e+20),
dtype=[('a', '<i8'), ('b', '<f8')])

* Yet another possibility is to use any of the following functions:
9 changes: 5 additions & 4 deletions doc/source/user/absolute_beginners.rst
@@ -581,10 +581,11 @@ example::

>>> for coord in list_of_coordinates:
... print(coord)
(0, 0)
(0, 1)
(0, 2)
(0, 3)
(np.int64(0), np.int64(0))
(np.int64(0), np.int64(1))
(np.int64(0), np.int64(2))
(np.int64(0), np.int64(3))


You can also use ``np.nonzero()`` to print the elements in an array that are less
than 5 with::
4 changes: 2 additions & 2 deletions doc/source/user/basics.rec.rst
@@ -24,7 +24,7 @@ a 32-bit integer named 'age', and 3. a 32-bit float named 'weight'.
If you index ``x`` at position 1 you get a structure::

>>> x[1]
('Fido', 3, 27.)
np.void(('Fido', 3, 27.0), dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])

You can access and modify individual fields of a structured array by indexing
with the field name::
@@ -515,7 +515,7 @@ a structured scalar::
>>> x = np.array([(1, 2., 3.)], dtype='i, f, f')
>>> scalar = x[0]
>>> scalar
(1, 2., 3.)
np.void((1, 2.0, 3.0), dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<f4')])
>>> type(scalar)
<class 'numpy.void'>

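The structured scalar in this hunk now prints with its full dtype. When only the field values are needed, a possible approach (a sketch, assuming NumPy is importable as `np`) is to convert the scalar to a plain tuple with `.item()`, or to read single fields by name:

```python
import numpy as np

x = np.array([(1, 2.0, 3.0)], dtype="i, f, f")
scalar = x[0]

# .item() converts the structured scalar into a tuple of Python objects.
as_tuple = scalar.item()

# Fields remain accessible by name on the scalar itself.
first_field = int(scalar["f0"])

print(type(scalar).__name__, as_tuple, first_field)
```
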
2 changes: 1 addition & 1 deletion numpy/core/_add_newdocs.py
@@ -4663,7 +4663,7 @@

>>> x[0] = (9, 10)
>>> z[0]
(9, 10)
np.record((9, 10), dtype=[('a', 'i1'), ('b', 'i1')])

Views that change the dtype size (bytes per entry) should normally be
avoided on arrays defined by slices, transposes, fortran-ordering, etc.:
10 changes: 5 additions & 5 deletions numpy/core/_add_newdocs_scalars.py
@@ -267,13 +267,13 @@ def add_newdoc_for_scalar_type(obj, fixed_aliases, doc):
Examples
--------
>>> np.void(5)
void(b'\x00\x00\x00\x00\x00')
np.void(b'\x00\x00\x00\x00\x00')
>>> np.void(b'abcd')
void(b'\x61\x62\x63\x64')
>>> np.void((5, 3.2, "eggs"), dtype="i,d,S5")
(5, 3.2, b'eggs') # looks like a tuple, but is `np.void`
np.void(b'\x61\x62\x63\x64')
>>> np.void((3.2, b'eggs'), dtype="d,S5")
np.void((3.2, b'eggs'), dtype=[('f0', '<f8'), ('f1', 'S5')])
>>> np.void(3, dtype=[('x', np.int8), ('y', np.int8)])
(3, 3) # looks like a tuple, but is `np.void`
np.void((3, 3), dtype=[('x', 'i1'), ('y', 'i1')])

""")

2 changes: 1 addition & 1 deletion numpy/core/_ufunc_config.py
@@ -379,7 +379,7 @@ class errstate:
array([nan, inf, inf])

>>> np.sqrt(-1)
nan
np.float64(nan)
>>> with np.errstate(invalid='raise'):
... np.sqrt(-1)
Traceback (most recent call last):
28 changes: 23 additions & 5 deletions numpy/core/arrayprint.py
@@ -88,12 +88,14 @@ def _make_options_dict(precision=None, threshold=None, edgeitems=None,
options['legacy'] = 113
elif legacy == '1.21':
options['legacy'] = 121
elif legacy == '1.25':
options['legacy'] = 125
elif legacy is None:
pass # OK, do nothing.
else:
warnings.warn(
"legacy printing option can currently only be '1.13', '1.21', or "
"`False`", stacklevel=3)
"legacy printing option can currently only be '1.13', '1.21', "
"'1.25', or `False`", stacklevel=3)

if threshold is not None:
# forbid the bad threshold arg suggested by stack overflow, gh-12351
@@ -288,6 +290,8 @@ def set_printoptions(precision=None, threshold=None, edgeitems=None,
_format_options['sign'] = '-'
elif _format_options['legacy'] == 121:
set_legacy_print_mode(121)
elif _format_options['legacy'] == 125:
set_legacy_print_mode(125)
elif _format_options['legacy'] == sys.maxsize:
set_legacy_print_mode(0)

@@ -321,7 +325,7 @@ def get_printoptions():
"""
opts = _format_options.copy()
opts['legacy'] = {
113: '1.13', 121: '1.21', sys.maxsize: False,
113: '1.13', 121: '1.21', 125: '1.25', sys.maxsize: False,
}[opts['legacy']]
return opts
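
The hunks in `_make_options_dict`, `set_printoptions`, and `get_printoptions` keep two views of the `legacy` option in sync: the user-facing value (`'1.13'`, `'1.21'`, `'1.25'`, or `False`) and an internal integer code. The round trip can be sketched in isolation as follows. The table and helper names (`_LEGACY_CODES`, `legacy_to_code`, `code_to_legacy`) are hypothetical and exist only for this illustration; NumPy keeps the logic inline, and returning `None` for an unknown value is a simplification made here.

```python
import sys
import warnings

# Hypothetical stand-alone table; the integer codes are the ones in the diff.
_LEGACY_CODES = {"1.13": 113, "1.21": 121, "1.25": 125, False: sys.maxsize}
_LEGACY_NAMES = {code: name for name, code in _LEGACY_CODES.items()}


def legacy_to_code(legacy):
    """Translate a user-facing ``legacy`` value into its integer code."""
    if legacy is None:
        return None  # option not given: nothing to translate
    try:
        return _LEGACY_CODES[legacy]
    except (KeyError, TypeError):
        warnings.warn(
            "legacy printing option can currently only be '1.13', '1.21', "
            "'1.25', or `False`", stacklevel=2)
        return None  # simplification of this sketch


def code_to_legacy(code):
    """Translate an integer code back into the user-facing value."""
    return _LEGACY_NAMES[code]


print(legacy_to_code("1.25"), code_to_legacy(125))
# 125 1.25
```

Using `sys.maxsize` as the code for `False` means that "no legacy mode" compares greater than every real version code, so version checks can be written as simple integer comparisons.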

@@ -395,9 +399,13 @@ def _object_format(o):
return fmt.format(o)

def repr_format(x):
if isinstance(x, (np.str_, np.bytes_)):
return repr(x.item())
return repr(x)

def str_format(x):
if isinstance(x, (np.str_, np.bytes_)):
return str(x.item())
return str(x)

def _get_formatdict(data, *, precision, floatmode, suppress, sign, legacy,
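
The `repr_format` and `str_format` changes in this hunk unwrap NumPy string and bytes scalars with `.item()` before formatting, so that array elements print as plain Python literals without a type prefix. A self-contained sketch of the two helpers with their indentation restored, assuming NumPy is importable as `np`:

```python
import numpy as np


def repr_format(x):
    # Unwrap NumPy str/bytes scalars so the plain Python repr is used.
    if isinstance(x, (np.str_, np.bytes_)):
        return repr(x.item())
    return repr(x)


def str_format(x):
    # Same unwrapping for str(), keeping both helpers symmetric.
    if isinstance(x, (np.str_, np.bytes_)):
        return str(x.item())
    return str(x)


print(repr_format(np.str_("abc")), repr_format(np.bytes_(b"abc")))
# 'abc' b'abc'
```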
@@ -1400,13 +1408,23 @@ def __call__(self, x):
return "({})".format(", ".join(str_fields))


def _void_scalar_repr(x):
def _void_scalar_to_string(x, is_repr=True):
Review comment (Contributor): Yes, the name change is fine!

"""
Implements the repr for structured-void scalars. It is called from the
scalartypes.c.src code, and is placed here because it uses the elementwise
formatters defined above.
"""
return StructuredVoidFormat.from_data(array(x), **_format_options)(x)
options = _format_options.copy()
if options.get('formatter') is None:
options['formatter'] = {}
options['formatter'].setdefault('float_kind', str)
val_repr = StructuredVoidFormat.from_data(array(x), **options)(x)
if not is_repr:
return val_repr
cls = type(x)
cls_fqn = cls.__module__.replace("numpy", "np") + "." + cls.__name__
void_dtype = np.dtype((np.void, x.dtype))
return f"{cls_fqn}({val_repr}, dtype={void_dtype!s})"


_typelessdata = [int_, float_, complex_, bool_]
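
The last lines of `_void_scalar_to_string` assemble the new repr from three parts: a shortened fully qualified class name, the formatted field values, and the dtype. The string assembly can be illustrated without NumPy. In this sketch `qualified_repr` is a hypothetical helper, and the `void` class is a stand-in created only so that its module is reported as `numpy`; neither is part of the PR.

```python
def qualified_repr(obj, value_text, dtype_text):
    cls = type(obj)
    # Shorten the package name the same way the diff does: "numpy" -> "np".
    cls_fqn = cls.__module__.replace("numpy", "np") + "." + cls.__name__
    return f"{cls_fqn}({value_text}, dtype={dtype_text})"


# Hypothetical stand-in class whose module is reported as "numpy".
void = type("void", (), {"__module__": "numpy"})

print(qualified_repr(void(), "(1, 2.0)", "[('a', '<i4'), ('b', '<f4')]"))
# np.void((1, 2.0), dtype=[('a', '<i4'), ('b', '<f4')])
```

Because the class name is taken from `type(x)`, subclasses such as `np.record` print under their own name, which matches the `np.record((9, 10), ...)` output in the `_add_newdocs.py` hunk.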
4 changes: 2 additions & 2 deletions numpy/core/fromnumeric.py
@@ -2786,7 +2786,7 @@ def max(a, axis=None, out=None, keepdims=np._NoValue, initial=np._NoValue,
>>> b = np.arange(5, dtype=float)
>>> b[2] = np.NaN
>>> np.max(b)
nan
np.float64(nan)
>>> np.max(b, where=~np.isnan(b), initial=-1)
4.0
>>> np.nanmax(b)
@@ -2930,7 +2930,7 @@ def min(a, axis=None, out=None, keepdims=np._NoValue, initial=np._NoValue,
>>> b = np.arange(5, dtype=float)
>>> b[2] = np.NaN
>>> np.min(b)
nan
np.float64(nan)
>>> np.min(b, where=~np.isnan(b), initial=10)
0.0
>>> np.nanmin(b)
4 changes: 2 additions & 2 deletions numpy/core/function_base.py
@@ -505,9 +505,9 @@ def add_newdoc(place, obj, doc, warn_on_python=True):
----------
place : str
The absolute name of the module to import from
obj : str
obj : str or None
The name of the object to add documentation to, typically a class or
function name
function name.
doc : {str, Tuple[str, str], List[Tuple[str, str]]}
If a string, the documentation to apply to `obj`
