8000 ENH: Support writing timestamps with timezones with to_sql by mroeschke · Pull Request #22654 · pandas-dev/pandas · GitHub
[go: up one dir, main page]

Skip to content

ENH: Support writing timestamps with timezones with to_sql #22654

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 36 commits into from
Nov 8, 2018
Merged
Changes from 1 commit
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
776240b
ENH: Write timezone columns to SQL
Sep 9, 2018
befd200
add tests and change type to Timestamp
Sep 10, 2018
e9f122f
Lint error and comment our skipif
Sep 10, 2018
969d2da
Handle DatetimeTZ block
Sep 10, 2018
cc79b90
Ensure the datetimetz data is 2D first
Sep 11, 2018
24dbaa5
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Sep 11, 2018
6e86d58
Reading timezones returns timezones in UTC
Sep 11, 2018
c7c4a7a
Add whatsnew and some touchups
Sep 12, 2018
6aa4878
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Sep 14, 2018
513bbc8
Test other dbs
Sep 14, 2018
58772e1
timestamps are actually returned as naive local for myself, sqlite
Sep 14, 2018
1a29148
localize -> tz_localize
Sep 14, 2018
96e9188
sqlite doesnt support date types
Sep 15, 2018
ded5584
type
Sep 15, 2018
d575089
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Sep 15, 2018
a7d1b3e
retest
Sep 15, 2018
305759c
read_table vs read_query sqlite difference
Sep 16, 2018
7a79531
Add note in the to_sql docs
Sep 19, 2018
24823f8
Modify whatsnew
Sep 19, 2018
7db4eaa
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Sep 19, 2018
76e46dc
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Sep 21, 2018
978a0d3
Address review
Sep 21, 2018
8025248
Fix sqlalchemy ref
Sep 21, 2018
0e89370
clarify documentation and whatsnew
Sep 26, 2018
bab5cfb
Add an api breaking entry change as well
Sep 27, 2018
de62788
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Oct 10, 2018
e940279
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Oct 24, 2018
8c754b5
Add new section in whatsnew
Oct 25, 2018
e85842f
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Oct 26, 2018
5af83f7
Fix whatsnew to reflect prior bug
Oct 26, 2018
6b3a3f1
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Nov 6, 2018
c4304ec
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Nov 7, 2018
1054fdb
handle case when column is datetimeindex
Nov 7, 2018
f21c755
Add new whatsnew entry
Nov 7, 2018
f872ff7
Merge remote-tracking branch 'upstream/master' into writing_timezone_sql
Nov 7, 2018
ef3b20f
don't check name
Nov 7, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Reading timezones returns timezones in UTC
  • Loading branch information
Matt Roeschke committed Sep 11, 2018
commit 6e86d582140dbcf82c842be719fa221860b9dd5a
12 changes: 8 additions & 4 deletions pandas/tests/io/test_sql.py
Original file line number Diff line number Diff line change
Expand Up @@ -1355,26 +1355,30 @@ def check(col):
df = sql.read_sql_table("types_test_data", self.conn)
check(df.DateColWithTz)

# @pytest.mark.skipif(flavor != 'postgresql',
# reason="postgresql only supports timezones")
def test_datetime_with_timezone_writing(self):
# GH 9086
if self.flavor != 'postgresql':
msg = "{} does not support datetime with time zone"
pytest.skip(msg.format(self.flavor))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we test assert the behaviour for a database that does not support it? Eg what happens if you write a column with timezone aware data to mysql?
From the sqlalchemy docs, it seems the flag timezone=True is simply ignored, but how are then the datetime values we send it stored? Does that change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the values send change:

In [67]: idx = pd.date_range("2012-01-01 09:00", periods=3, tz='Europe/Brussels')

In [68]: idx.values.astype('M8[us]').astype(object)
Out[68]: 
array([datetime.datetime(2012, 1, 1, 8, 0),
       datetime.datetime(2012, 1, 2, 8, 0),
       datetime.datetime(2012, 1, 3, 8, 0)], dtype=object)

In [69]: idx.to_pydatetime()
Out[69]: 
array([ datetime.datetime(2012, 1, 1, 9, 0, tzinfo=<DstTzInfo 'Europe/Brussels' CET+1:00:00 STD>),
       datetime.datetime(2012, 1, 2, 9, 0, tzinfo=<DstTzInfo 'Europe/Brussels' CET+1:00:00 STD>),
       datetime.datetime(2012, 1, 3, 9, 0, tzinfo=<DstTzInfo 'Europe/Brussels' CET+1:00:00 STD>)], dtype=object)

but not sure if that changes how the values are then stored in the mysql database (since the utc / naive time is still the same), but so this might be worth testing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. From earlier tests databases that don't support timestamp with timezone read back data as naive (don't know if this is local naive or converted to UTC first and then returned naive though). Will add some tests.


df = DataFrame({'A': date_range(
'2013-01-01 09:00:00', periods=3, tz='US/Pacific'
)})
df.to_sql('test_datetime_tz', self.conn)

expected = df.copy()
expected['A'] = expected['A'].dt.tz_convert('UTC')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that I remember correctly: when reading, and we get datetime objects with an offset (if there is a timestamp with timezone), we always convert to utc because we cannot really know what is the timezone? (we get a simple 'fixed offset' ?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, so we should expect this round trip test to load US/Pacific and return UTC

# with read_table -> type information from schema used
result = sql.read_sql_table('test_datetime_tz', self.conn)
result = result.drop('index', axis=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can put index=False in the to_sql call, then you don't need to drop the index here I think

tm.assert_frame_equal(result, df)
tm.assert_frame_equal(result, expected)

# with read_sql -> no type information -> sqlite has no native
result = sql.read_sql_query(
'SELECT * FROM test_datetime_tz', self.conn
)
result = result.drop('index', axis=1)
tm.assert_frame_equal(result, df)
tm.assert_frame_equal(result, expected)

def test_date_parsing(self):
# No Parsing
Expand Down
0