Closed
Labels
Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), Timezones (Timezone data dtype)
Description
Concatenating DataFrames where one has an all-NaT column and the other has a TZ-aware column raises a TypeError as of 0.17.1:
In [1]: import pandas as pd
In [2]: df1 = pd.DataFrame([[pd.NaT], [pd.NaT]])
In [3]: df1
Out[3]:
    0
0 NaT
1 NaT
In [4]: df2 = pd.DataFrame([[pd.Timestamp('2015/01/01', tz='UTC')], [pd.Timestamp('2016/01/01', tz='UTC')]])
In [5]: df2
Out[5]:
                          0
0 2015-01-01 00:00:00+00:00
1 2016-01-01 00:00:00+00:00
In [6]: pd.concat([df1, df2])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-f61a1ab4009e> in <module>()
----> 1 pd.concat([df1, df2])
/.../env/local/lib/python2.7/site-packages/pandas/tools/merge.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, copy)
811 verify_integrity=verify_integrity,
812 copy=copy)
--> 813 return op.get_result()
814
815
/.../env/local/lib/python2.7/site-packages/pandas/tools/merge.py in get_result(self)
993
994 new_data = concatenate_block_managers(
--> 995 mgrs_indexers, self.new_axes, concat_axis=self.axis, copy=self.copy)
996 if not self.copy:
997 new_data._consolidate_inplace()
/.../env/local/lib/python2.7/site-packages/pandas/core/internals.py in concatenate_block_managers(mgrs_indexers, axes, concat_axis, copy)
4454 copy=copy),
4455 placement=placement)
-> 4456 for placement, join_units in concat_plan]
4457
4458 return BlockManager(blocks, axes)
/.../env/local/lib/python2.7/site-packages/pandas/core/internals.py in concatenate_join_units(join_units, concat_axis, copy)
4551 to_concat = [ju.get_reindexed_values(empty_dtype=empty_dtype,
4552 upcasted_na=upcasted_na)
-> 4553 for ju in join_units]
4554
4555 if len(to_concat) == 1:
/.../env/local/lib/python2.7/site-packages/pandas/core/internals.py in get_reindexed_values(self, empty_dtype, upcasted_na)
4799
4800 if self.is_null and not getattr(self.block,'is_categorical',None):
-> 4801 missing_arr = np.empty(self.shape, dtype=empty_dtype)
4802 if np.prod(self.shape):
4803 # NumPy 1.6 workaround: this statement gets strange if all
TypeError: data type not understood
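The last frame of the traceback fails inside `np.empty(self.shape, dtype=empty_dtype)`. One plausible reading, which is an assumption and is not confirmed anywhere in this report, is that `empty_dtype` had resolved to pandas' timezone-aware dtype, which NumPy cannot interpret. The sketch below shows that failure mode in isolation. It assumes a recent pandas in which `pd.DatetimeTZDtype` is available at the top level, and the helper name `try_empty` is hypothetical.

```python
import numpy as np
import pandas as pd


def try_empty(shape, dtype):
    """Return None if NumPy can allocate with `dtype`, else the TypeError message."""
    try:
        np.empty(shape, dtype=dtype)
    except TypeError as exc:
        return str(exc)
    return None


# A plain NumPy datetime dtype allocates without complaint.
naive_error = try_empty((2, 1), np.dtype("M8[ns]"))

# Assumption: the concat code path handed NumPy a pandas-only dtype like this one.
tz_error = try_empty((2, 1), pd.DatetimeTZDtype(tz="UTC"))

print("naive dtype error:", naive_error)
print("tz-aware dtype error:", tz_error)
```

The exact wording of the message depends on the NumPy version, so the sketch only reports it rather than matching on it.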
Possibly related to #11693, #11705 and the #11456 family. However, this doesn't appear to be caused by the TZ-aware vs. TZ-naive problems referenced there.
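Until the concat path handles this case, a possible workaround is to make the all-NaT column timezone-aware before concatenating, so that both inputs already share a dtype. This is a sketch only: it is not part of the original report, it has not been verified against 0.17.1, and the variable name `df1_utc` is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame([[pd.NaT], [pd.NaT]])
df2 = pd.DataFrame([[pd.Timestamp("2015/01/01", tz="UTC")],
                    [pd.Timestamp("2016/01/01", tz="UTC")]])

# Hypothetical workaround: convert the all-NaT column to a UTC-aware
# datetime column up front, so concat sees matching TZ-aware dtypes.
df1_utc = df1.copy()
df1_utc[0] = pd.to_datetime(df1_utc[0], utc=True)

result = pd.concat([df1_utc, df2], ignore_index=True)
print(result)
print(result[0].dtype)
```

The trade-off is that the caller has to know the target timezone in advance; here it is taken to be UTC because that is what `df2` uses.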
Versions:
In [4]: pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-77-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.17.1
nose: 1.3.6
pip: 6.0.8
setuptools: 12.0.5
Cython: 0.22
numpy: 1.9.2
scipy: 0.15.1
statsmodels: 0.6.1.post1
IPython: 3.1.0
sphinx: 1.3.1
patsy: 0.2.1
dateutil: 2.4.2
pytz: 2015.4
blosc: 1.2.8
bottleneck: 1.0.0
tables: 3.2.0
numexpr: 2.4.3
matplotlib: None
openpyxl: None
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None