-
-
Notifications
You must be signed in to change notification settings - Fork 18.7k
REF/ENH: Constructors for DatetimeArray/TimedeltaArray #23493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
f25d24c
e47c200
2cb7597
a4512b7
a5ef959
83b04fe
e8abc83
a4c8671
98dca45
56fd95e
5f92cfa
0e15536
1a015f6
5445a56
272f4b1
35195bd
bb394dc
510ae3d
49cf495
3a62633
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
… to pass them
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -119,7 +119,8 @@ def wrapper(self, other): | |
if isinstance(other, list): | ||
# FIXME: This can break for object-dtype with mixed types | ||
other = type(self)(other) | ||
elif not isinstance(other, (np.ndarray, ABCIndexClass, ABCSeries)): | ||
elif not isinstance(other, (np.ndarray, ABCIndexClass, ABCSeries, | ||
DatetimeArrayMixin)): | ||
# Following Timestamp convention, __eq__ is all-False | ||
# and __ne__ is all True, others raise TypeError. | ||
return ops.invalid_comparison(self, other, op) | ||
|
@@ -204,6 +205,7 @@ def __new__(cls, values, freq=None, tz=None, dtype=None, copy=False): | |
# e.g. DatetimeArray, DatetimeIndex | ||
tz = values.tz | ||
|
||
# TODO: what about if freq == 'infer'? | ||
if freq is None and hasattr(values, "freq"): | ||
# i.e. DatetimeArray, DatetimeIndex | ||
freq = values.freq | ||
|
@@ -219,7 +221,10 @@ def __new__(cls, values, freq=None, tz=None, dtype=None, copy=False): | |
.format(cls=cls.__name__, data=repr(values))) | ||
elif isinstance(values, ABCSeries): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would get out the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll see if there is a graceful way to do this in the next pass (if I ever manage to catch up with all these comments!) |
||
# extract nanosecond unix timestamps | ||
values = values._values.asi8 | ||
if tz is None: | ||
# TODO: Try to do this in just one place | ||
tz = values.dt.tz | ||
values = np.array(values.view('i8')) | ||
elif isinstance(values, DatetimeArrayMixin): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And you don't need to get the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
No. For the moment we are still using inheritance, so this would mess up for DatetimeIndex == DatetimeArray. When we change to composition this check will have to become |
||
# extract nanosecond unix timestamps | ||
values = values.asi8 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
""" | ||
Tests for DatetimeArray | ||
""" | ||
import pytest | ||
import numpy as np | ||
|
||
from pandas.core.arrays import DatetimeArrayMixin as DatetimeArray | ||
|
||
import pandas as pd | ||
import pandas.util.testing as tm | ||
|
||
|
||
|
||
|
||
class TestDatetimeArrayConstructors(object): | ||
|
||
def test_init_from_object_dtype(self, tz_naive_fixture): | ||
tz = tz_naive_fixture | ||
if tz is not None: | ||
pytest.xfail(reason="Casting DatetimeIndex to object-dtype raises " | ||
"for pd.Index and is incorrect for np.array; " | ||
"GH#24391") | ||
|
||
# arbitrary DatetimeIndex; this should work for any DatetimeIndex | ||
# with non-None freq | ||
dti = pd.date_range('2016-01-1', freq='MS', periods=9, tz=tz) | ||
expected = DatetimeArray(dti) | ||
|
||
# Fails because np.array(dti, dtype=object) incorrectly returns Longs | ||
result = DatetimeArray(np.array(dti, dtype=object), freq='infer') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I mentioned above, but I don't think That is what we discussed in #23212 (although, @jbrockmendel , you didn't really react there) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hopefully I've answered this enough times in this thread. I see no reason not to handle object dtype in the DatetimeArray constructor. Series and Index and DataFrame all handle object-dtype and lists; I find it counter-intuitive that we would have a small subset of classes that don't. But in the short-run, we need to handle these cases if we want to share the extant tests (without significant overhaul) (which we do) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See #23493 (comment) for my answer to a previous related comment There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Elsewhere in this thread you've said you're fine with bikeshedding this after we have a working implementation. Has this changed in the last few minutes? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suppose you are referring to #23493 (comment) ? Indeed, I was somewhat inconsistent here, but I think the other comments are quite clear in that I am at least asking why you are adding or keeping handling object dtypes? If the answer is: because I think that is the way it should be, then let's discuss that and try to get to a consensus (#23212). If the answer is: because of practical reasons for now, then let's discuss the practical reasons and see if that is OK to defer to a follow-up, or if there is an easy way to overcome the practical reason. |
||
tm.assert_equal(pd.DatetimeIndex(result), dti) | ||
|
||
# Fails because `pd.Index(dti, dtype=object) raises incorrectly | ||
result = DatetimeArray(pd.Index(dti, dtype=object), freq='infer') | ||
tm.assert_equal(pd.DatetimeIndex(result), dti) | ||
|
||
# NB: for now we re-wrap in DatetimeIndex to use assert_index_equal | ||
# once assert_datetime_array_equal is in place, this will be changed | ||
def test_init_only_freq_infer(self, tz_naive_fixture): | ||
# just pass data and freq='infer' if relevant; no other kwargs | ||
tz = tz_naive_fixture | ||
|
||
# arbitrary DatetimeIndex; this should work for any DatetimeIndex | ||
# with non-None freq | ||
dti = pd.date_range('2016-01-1', freq='MS', periods=9, tz=tz) | ||
expected = DatetimeArray(dti) | ||
assert expected.freq == dti.freq | ||
assert expected.tz == dti.tz | ||
|
||
# broken until ABCDatetimeArray and isna is fixed | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will be done in the forthcoming PR mentioned in the OP. |
||
# assert (dti == expected).all() | ||
# assert (expected == dti).all() | ||
|
||
result = DatetimeArray(expected) | ||
tm.assert_equal(pd.DatetimeIndex(result), dti) | ||
|
||
result = DatetimeArray(list(dti), freq='infer') | ||
tm.assert_equal(pd.DatetimeIndex(result), dti) | ||
|
||
result = DatetimeArray(tuple(dti), freq='infer') | ||
tm.assert_equal(pd.DatetimeIndex(result), dti) | ||
|
||
if tz is None: | ||
result = DatetimeArray(np.array(dti), freq='infer') | ||
tm.assert_equal(pd.DatetimeIndex(result), dti) | ||
|
||
result = DatetimeArray(np.array(dti).astype('M8[s]'), freq='infer') | ||
tm.assert_equal(pd.DatetimeIndex(result), dti) | ||
|
||
result = DatetimeArray(pd.Series(dti), freq='infer') | ||
tm.assert_equal(pd.DatetimeIndex(result), dti) | ||
|
||
result = DatetimeArray(pd.Series(dti, dtype=object), freq='infer') | ||
tm.assert_equal(pd.DatetimeIndex(result), dti) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then we should also get the
freq
from the values as a "cheap" inference? Or would there be cases were an inferred frequency can be different than the actual frequency?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I'm thinking, yah