API: Make datetime64 timezone naive by shoyer · Pull Request #6453 · numpy/numpy · GitHub
API: Make datetime64 timezone naive #6453

Merged 3 commits on Jan 17, 2016
41 changes: 41 additions & 0 deletions doc/release/1.11.0-notes.rst
@@ -7,6 +7,8 @@ This release supports Python 2.6 - 2.7 and 3.2 - 3.5.
 Highlights
 ==========

+* The datetime64 type is now timezone naive. See "datetime64 changes" below
+  for more details.

 Dropped Support
 ===============
@@ -25,6 +27,41 @@ Future Changes
 Compatibility notes
 ===================

+datetime64 changes
+~~~~~~~~~~~~~~~~~~
+
+In prior versions of NumPy the experimental datetime64 type always stored
+times in UTC. By default, creating a datetime64 object from a string or
+printing it would convert from or to local time::
+
+    # old behavior
+    >>> np.datetime64('2000-01-01T00:00:00')
+    numpy.datetime64('2000-01-01T00:00:00-0800')  # note the timezone offset -08:00
+
+A consensus of datetime64 users agreed that this behavior is undesirable
+and at odds with how datetime64 is usually used (e.g., by pandas_). For
+most use cases, a timezone naive datetime type is preferred, similar to the
+``datetime.datetime`` type in the Python standard library. Accordingly,
+datetime64 no longer assumes that input is in local time, nor does it print
+local times::
+
+    >>> np.datetime64('2000-01-01T00:00:00')
+    numpy.datetime64('2000-01-01T00:00:00')
+
+For backwards compatibility, datetime64 still parses timezone offsets, which
+it handles by converting to UTC. However, the resulting datetime is timezone
+naive::
+
+    >>> np.datetime64('2000-01-01T00:00:00-08')
+    DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
+    numpy.datetime64('2000-01-01T08:00:00')
+
+As a corollary to this change, we no longer prohibit casting between datetimes
+with date units and datetimes with time units. With timezone naive datetimes,
+the rule for casting from dates to times is no longer ambiguous.
+
+.. _pandas: http://pandas.pydata.org
+
 DeprecationWarning to error
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -149,6 +186,10 @@ if the matrix product is between a matrix and its transpose, it will use

 **Note:** Requires the transposed and non-transposed matrices to share data.

+*np.testing.assert_warns* can now be used as a context manager
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+This matches the behavior of ``assert_raises``.
+
 Changes
 =======
 Pyrex support was removed from ``numpy.distutils``. The method
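
The offset handling described in the "datetime64 changes" note (convert to UTC, then return a timezone naive value) can be illustrated with the standard library's ``datetime.datetime``, which the note names as the model. This is only a sketch of the semantics, not NumPy's parser: ``to_naive_utc`` is a hypothetical helper, and the offset is written as ``-08:00`` because ``datetime.fromisoformat`` on older Python versions expects the colon.

```python
from datetime import datetime, timezone


def to_naive_utc(text):
    """Parse an ISO 8601 string; convert any offset to UTC and drop it.

    Hypothetical helper that mimics the rule from the release note with
    the standard library. It is not NumPy's own parser.
    """
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # No offset given: keep the wall-clock value as-is (timezone naive).
        return parsed
    # Offset given: shift to UTC, then discard the timezone information.
    return parsed.astimezone(timezone.utc).replace(tzinfo=None)


print(to_naive_utc('2000-01-01T00:00:00'))        # 2000-01-01 00:00:00
print(to_naive_utc('2000-01-01T00:00:00-08:00'))  # 2000-01-01 08:00:00
```

The second call mirrors the deprecated NumPy case in the note: midnight at offset -08:00 becomes 08:00 with no timezone attached.
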
62 changes: 37 additions & 25 deletions doc/source/reference/arrays.datetime.rst
@@ -45,16 +45,10 @@ some additional SI-prefix seconds-based units.
     >>> np.datetime64('2005-02', 'D')
     numpy.datetime64('2005-02-01')

-    Using UTC "Zulu" time:
-
-    >>> np.datetime64('2005-02-25T03:30Z')
-    numpy.datetime64('2005-02-24T21:30-0600')
-
-    ISO 8601 specifies to use the local time zone
-    if none is explicitly given:
+    From a date and time:

     >>> np.datetime64('2005-02-25T03:30')
-    numpy.datetime64('2005-02-25T03:30-0600')
+    numpy.datetime64('2005-02-25T03:30')

 When creating an array of datetimes from a string, it is still possible
 to automatically select the unit from the inputs, by using the
@@ -100,23 +94,6 @@ because the moment of time is still being represented exactly.
     >>> np.datetime64('2010-03-14T15Z') == np.datetime64('2010-03-14T15:00:00.00Z')
     True

-An important exception to this rule is between datetimes with
-:ref:`date units <arrays.dtypes.dateunits>` and datetimes with
-:ref:`time units <arrays.dtypes.timeunits>`. This is because this kind
-of conversion generally requires a choice of timezone and
-particular time of day on the given date.
-
-.. admonition:: Example
-
-    >>> np.datetime64('2003-12-25', 's')
-    Traceback (most recent call last):
-        File "<stdin>", line 1, in <module>
-    TypeError: Cannot parse "2003-12-25" as unit 's' using casting rule 'same_kind'
-
-    >>> np.datetime64('2003-12-25') == np.datetime64('2003-12-25T00Z')
-    False
-
-
 Datetime and Timedelta Arithmetic
 =================================

@@ -353,6 +330,41 @@ Some examples::
     # any amount of whitespace is allowed; abbreviations are case-sensitive.
     weekmask = "MonTue Wed Thu\tFri"

+Changes with NumPy 1.11
+=======================
+
+In prior versions of NumPy, the datetime64 type always stored
+times in UTC. By default, creating a datetime64 object from a string or
+printing it would convert from or to local time::
+
+    # old behavior
+    >>> np.datetime64('2000-01-01T00:00:00')
+    numpy.datetime64('2000-01-01T00:00:00-0800')  # note the timezone offset -08:00
+
+A consensus of datetime64 users agreed that this behavior is undesirable
+and at odds with how datetime64 is usually used (e.g., by pandas_). For
+most use cases, a timezone naive datetime type is preferred, similar to the
+``datetime.datetime`` type in the Python standard library. Accordingly,
+datetime64 no longer assumes that input is in local time, nor does it print
+local times::
+
+    >>> np.datetime64('2000-01-01T00:00:00')
+    numpy.datetime64('2000-01-01T00:00:00')
+
+For backwards compatibility, datetime64 still parses timezone offsets, which
+it handles by converting to UTC. However, the resulting datetime is timezone
+naive::
+
+    >>> np.datetime64('2000-01-01T00:00:00-08')
+    DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
+    numpy.datetime64('2000-01-01T08:00:00')
+
+As a corollary to this change, we no longer prohibit casting between datetimes
+with date units and datetimes with time units. With timezone naive datetimes,
+the rule for casting from dates to times is no longer ambiguous.
+
+.. _pandas: http://pandas.pydata.org
+
 Differences Between 1.6 and 1.7 Datetimes
 =========================================

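
The documentation hunks above remove the example in which ``np.datetime64('2003-12-25', 's')`` raised a ``TypeError``. A short sketch of the behavior after the change, assuming NumPy 1.11 or later is installed and imported as ``np`` (the variable names are illustrative only):

```python
import numpy as np

# A date-only string parsed with a seconds unit is taken as midnight.
christmas_s = np.datetime64('2003-12-25', 's')

# A date-unit value and a time-unit value for the same instant compare equal.
same_moment = np.datetime64('2003-12-25') == np.datetime64('2003-12-25T00')

# The string form carries no timezone offset.
text = str(np.datetime64('2005-02-25T03:30'))

print(christmas_s, bool(same_moment), text)
```

With timezone naive values, a date maps to midnight of that date, so no choice of timezone is involved in the cast.
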
14 changes: 4 additions & 10 deletions numpy/core/arrayprint.py
@@ -708,25 +708,19 @@ def __call__(self, x):
             i = i + 'j'
         return r + i

+
 class DatetimeFormat(object):
-    def __init__(self, x, unit=None,
-                timezone=None, casting='same_kind'):
+    def __init__(self, x, unit=None, timezone=None, casting='same_kind'):
         # Get the unit from the dtype
         if unit is None:
             if x.dtype.kind == 'M':
                 unit = datetime_data(x.dtype)[0]
             else:
                 unit = 's'

-        # If timezone is default, make it 'local' or 'UTC' based on the unit
         if timezone is None:
-            # Date units -> UTC, time units -> local
-            if unit in ('Y', 'M', 'W', 'D'):
-                self.timezone = 'UTC'
-            else:
-                self.timezone = 'local'
-        else:
-            self.timezone = timezone
+            timezone = 'naive'
+        self.timezone = timezone
         self.unit = unit
         self.casting = casting

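
To make the effect of the ``DatetimeFormat`` change easier to see, the two default-resolution rules can be written as standalone functions. This is a sketch outside NumPy: ``old_default_timezone`` and ``new_default_timezone`` are hypothetical names that restate the removed and added branches of ``__init__``.

```python
DATE_UNITS = ('Y', 'M', 'W', 'D')


def old_default_timezone(unit, timezone=None):
    """Default before the change: depends on whether the unit is a date unit."""
    if timezone is None:
        # Date units -> UTC, time units -> local
        return 'UTC' if unit in DATE_UNITS else 'local'
    return timezone


def new_default_timezone(timezone=None):
    """Default after the change: always 'naive', regardless of the unit."""
    if timezone is None:
        timezone = 'naive'
    return timezone


for unit in ('D', 's'):
    print(unit, old_default_timezone(unit), new_default_timezone())
```

An explicitly passed ``timezone`` is returned unchanged by both versions; only the default differs.
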
32 changes: 16 additions & 16 deletions numpy/core/src/multiarray/datetime.c
@@ -1316,9 +1316,6 @@ datetime_metadata_divides(

 /*
  * This provides the casting rules for the DATETIME data type units.
- *
- * Notably, there is a barrier between 'date units' and 'time units'
- * for all but 'unsafe' casting.
  */
 NPY_NO_EXPORT npy_bool
 can_cast_datetime64_units(NPY_DATETIMEUNIT src_unit,
@@ -1331,31 +1328,26 @@ can_cast_datetime64_units(NPY_DATETIMEUNIT src_unit,
             return 1;

         /*
-         * Only enforce the 'date units' vs 'time units' barrier with
-         * 'same_kind' casting.
+         * Can cast between all units with 'same_kind' casting.
          */
         case NPY_SAME_KIND_CASTING:
             if (src_unit == NPY_FR_GENERIC || dst_unit == NPY_FR_GENERIC) {
                 return src_unit == NPY_FR_GENERIC;
             }
             else {
-                return (src_unit <= NPY_FR_D && dst_unit <= NPY_FR_D) ||
-                       (src_unit > NPY_FR_D && dst_unit > NPY_FR_D);
+                return 1;
             }

         /*
-         * Enforce the 'date units' vs 'time units' barrier and that
-         * casting is only allowed towards more precise units with
-         * 'safe' casting.
+         * Casting is only allowed towards more precise units with 'safe'
+         * casting.
          */
         case NPY_SAFE_CASTING:
             if (src_unit == NPY_FR_GENERIC || dst_unit == NPY_FR_GENERIC) {
                 return src_unit == NPY_FR_GENERIC;
             }
             else {
-                return (src_unit <= dst_unit) &&
-                       ((src_unit <= NPY_FR_D && dst_unit <= NPY_FR_D) ||
-                        (src_unit > NPY_FR_D && dst_unit > NPY_FR_D));
+                return (src_unit <= dst_unit);
             }

         /* Enforce equality with 'no' or 'equiv' casting */
@@ -2254,6 +2246,14 @@ convert_pydatetime_to_datetimestruct(PyObject *obj, npy_datetimestruct *out,
         PyObject *offset;
         int seconds_offset, minutes_offset;

+        /* 2016-01-14, 1.11 */
+        PyErr_Clear();
+        if (DEPRECATE(
+                "parsing timezone aware datetimes is deprecated; "
+                "this will raise an error in the future") < 0) {
+            return -1;
+        }
+
         /* The utcoffset function should return a timedelta */
         offset = PyObject_CallMethod(tmp, "utcoffset", "O", obj);
         if (offset == NULL) {
@@ -2386,7 +2386,7 @@ convert_pyobject_to_datetime(PyArray_DatetimeMetaData *meta, PyObject *obj,

         /* Parse the ISO date */
         if (parse_iso_8601_datetime(str, len, meta->base, casting,
-                                &dts, NULL, &bestunit, NULL) < 0) {
+                                &dts, &bestunit, NULL) < 0) {
             Py_DECREF(bytes);
             return -1;
         }
@@ -3500,7 +3500,7 @@ find_string_array_datetime64_type(PyArrayObject *arr,

             tmp_meta.base = -1;
             if (parse_iso_8601_datetime(tmp_buffer, maxlen, -1,
-                                NPY_UNSAFE_CASTING, &dts, NULL,
+                                NPY_UNSAFE_CASTING, &dts,
                                 &tmp_meta.base, NULL) < 0) {
                 goto fail;
             }
@@ -3509,7 +3509,7 @@ find_string_array_datetime64_type(PyArrayObject *arr,
         else {
             tmp_meta.base = -1;
             if (parse_iso_8601_datetime(data, tmp - data, -1,
-                                NPY_UNSAFE_CASTING, &dts, NULL,
+                                NPY_UNSAFE_CASTING, &dts,
                                 &tmp_meta.base, NULL) < 0) {
                 goto fail;
             }
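
The casting rule that remains in ``can_cast_datetime64_units`` after the date/time barrier is removed can be restated in Python for readability. This is a sketch, not the C implementation: ``can_cast_units`` and the ``UNITS`` list are hypothetical, the coarse-to-fine ordering stands in for the numeric order of the ``NPY_DATETIMEUNIT`` enum, and the 'unsafe' and 'no'/'equiv' branches are inferred from the comments in the diff because their code is not shown.

```python
# Units from coarsest to finest; assumed to match the enum's numeric order.
UNITS = ['Y', 'M', 'W', 'D', 'h', 'm', 's', 'ms', 'us', 'ns', 'ps', 'fs', 'as']
GENERIC = 'generic'


def can_cast_units(src, dst, casting):
    """Sketch of the datetime64 unit casting rule after this change."""
    if casting == 'unsafe':
        # Assumed: 'unsafe' casting allows every conversion.
        return True
    if casting in ('same_kind', 'safe'):
        if src == GENERIC or dst == GENERIC:
            # With generic units involved, only a generic source can be cast.
            return src == GENERIC
        if casting == 'same_kind':
            # No date/time barrier any more: any unit casts to any unit.
            return True
        # 'safe': only towards an equal or more precise unit.
        return UNITS.index(src) <= UNITS.index(dst)
    # Assumed from the trailing comment: 'no' and 'equiv' need equal units.
    return src == dst


print(can_cast_units('D', 's', 'safe'))       # True
print(can_cast_units('s', 'D', 'safe'))       # False
print(can_cast_units('s', 'D', 'same_kind'))  # True
```

Before the change, the first call would have been rejected because ``'D'`` is a date unit and ``'s'`` is a time unit.
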