Description
Many of the extension array tests are skipped for Categorical because the reconstruction of the expected result does not preserve the categories (the "metadata" of the dtype, so to speak).
For example:
pandas/pandas/tests/extension/category/test_categorical.py
Lines 70 to 72 in 78fee04

    @pytest.mark.skip(reason="Unobserved categories preseved in concat.")
    def test_align(self, data, na_value):
        pass

This test is skipped because, in the actual base test, the expected result is constructed from a list:
pandas/pandas/tests/extension/base/reshaping.py
Lines 47 to 50 in 78fee04

    r1, r2 = pd.Series(a).align(pd.Series(b, index=[1, 2, 3]))
    # Assumes that the ctor can take a list of scalars of the type
    e1 = pd.Series(type(data)(list(a) + [na_value]))

(This still uses type(data)(..), but that is being replaced with data._constructor_from_sequence(..) in #20746.)
This is a recurring pattern, which suggests we should find a general solution for it.
So do we want a canonical way in the extension array interface to construct an ExtensionArray that has a specific dtype?
One possible solution is to add a dtype keyword to _constructor_from_sequence.
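
To make the proposal concrete, here is a minimal pure-Python sketch of the idea. It does not use pandas: ToyCategorical and ToyCategoricalDtype are hypothetical stand-ins invented for illustration, and the dtype keyword on _constructor_from_sequence is the proposed addition, not an existing signature. It only shows why building the expected result from a list loses unobserved categories, and how passing the dtype through would keep them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyCategoricalDtype:
    """Hypothetical dtype whose 'metadata' is the tuple of categories."""

    categories: Tuple[str, ...]


class ToyCategorical:
    """Hypothetical stand-in for an extension array with a parametrized dtype."""

    def __init__(self, values, dtype):
        self._values = list(values)
        self.dtype = dtype

    @classmethod
    def _constructor_from_sequence(cls, scalars, dtype=None):
        # `dtype` is the proposed keyword (an assumption of this sketch).
        scalars = list(scalars)
        observed = {s for s in scalars if s is not None}
        if dtype is None:
            # No dtype given: categories can only be inferred from the
            # observed, non-missing values, so unobserved ones are lost.
            dtype = ToyCategoricalDtype(tuple(sorted(observed)))
        else:
            unknown = observed - set(dtype.categories)
            if unknown:
                raise ValueError(f"values not in categories: {sorted(unknown)}")
        return cls(scalars, dtype)

    def __iter__(self):
        return iter(self._values)


# "c" is an unobserved category: it is part of the dtype but not of the values.
data = ToyCategorical(["a", "b"], ToyCategoricalDtype(("a", "b", "c")))
na_value = None

# Mirrors the pattern in the base test: rebuild the expected result from a list.
inferred = ToyCategorical._constructor_from_sequence(list(data) + [na_value])

# Same reconstruction, but passing the original dtype through.
explicit = ToyCategorical._constructor_from_sequence(
    list(data) + [na_value], dtype=data.dtype
)

print(inferred.dtype == data.dtype)  # False: category "c" was dropped
print(explicit.dtype == data.dtype)  # True: dtype metadata preserved
```

With a keyword like this, the base tests could build expected results via data._constructor_from_sequence(..., dtype=data.dtype), and the Categorical-specific skips caused only by lost categories might no longer be needed.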