8000 BUG: divmod return type by TomAugspurger · Pull Request #22932 · pandas-dev/pandas · GitHub
[go: up one dir, main page]

Skip to content

BUG: divmod return type #22932

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Oct 8, 2018
Merged
Prev Previous commit
Next Next commit
Merge remote-tracking branch 'upstream/master' into ea-divmod
  • Loading branch information
TomAugspurger committed Oct 4, 2018
commit cc2bfc8b991f4d8cf46a993bf4205cc80656384e
26 changes: 15 additions & 11 deletions pandas/core/arrays/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -781,20 +781,24 @@ def convert_values(param):
# a TypeError should be raised
res = [op(a, b) for (a, b) in zip(lvalues, rvalues)]

if coerce_to_dtype:
if op.__name__ in {'divmod', 'rdivmod'}:
def _maybe_convert(arr):
if coerce_to_dtype:
# https://github.com/pandas-dev/pandas/issues/22850
# We catch all regular exceptions here, and fall back
# to an ndarray.
try:
a, b = zip(*res)
res = (self._from_sequence(a),
self._from_sequence(b))
except TypeError:
pass
res = self._from_sequnce(arr)
except Exception:
res = np.asarray(arr)
else:
try:
res = self._from_sequence(res)
except TypeError:
pass
res = np.asarray(arr)
return res

if op.__name__ in {'divmod', 'rdivmod'}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel can't we use dispatch_to_extension_op here to avoid duplication of code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked something similar a few days ago. If Tom says it isn't feasible, I believe him.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are inside the extension array here, so it would also be strange to use (which doesn't prevent that both could share a helper function, if that would be appropriate).
But here we need to construct the divmod correctly, while dispatch_to_extension_op should assume this is already done correctly by the EA

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which doesn't prevent that both could share a helper function,

Right. This is possible, if people want it. I'll push up a commit with some kind of do_extension_op that both of these call to so people can take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now that I take a look it's not so straightforward. The two are similar but just slightly different in enough places that they wouldn't benefit from sharing code really.

  1. The unboxing of values. dispatch_to_extension_op knows that at least one of the two is a Series[EA]. _binop knows that self is an EA.
  2. The op: dispatch_to_extension_op dispatches, _binop is defining it in a list comprehension
  3. The re-boxing: _binop has the whole maybe re-constructing _from_seqence that the dispatch_to_extension_op doesn't have to worry about at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They do not do "conceptually exactly the same thing". Paraphrasing myself from above:

The dispatch function calls the EA to perform an operation, the above code is the EA doing the operation.

Why would those two different things necessarily need to live in the same place / code path?

Of course, we could still move the whole EA._create_method to ops.py (which would indeed be similar as functions like add_flex_arithmetic_methods in ops.py that is used in series.py to add methods to Series). But this is then not related to the change in this PR, and should be left for another issue/PR to discuss (personally I don't think that would be an improvement).

I would see if its is possible to integrate these rather than adding a bunch of new code.

Well, and both Tom and me who have looked into the code, say: we don't think it is possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomAugspurger I saw your comment. I also @jorisvandenbossche comments. I have not looked at this in detail, nor do I have time to. My point is that this instantly creates technical debt no matter how you slice it.

It may require some reorganization to integrate this, and I appreciate that. So happy to defer this, maybe @jbrockmendel has more insight.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be doing a sparse-de-duplication PR following #22880, can take a fresh look at this then. In the interim, I wouldn't let this issue hold up this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that this instantly creates technical debt no matter how you slice it.

It really doesn't. They're doing two different things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is fixing a bug, not doing a refactor of how the ops on EA's are implemented** . If somebody want to look into that, it should be done in a separate PR anyway. So merging.

** and I fully acknowledge that sometimes, to properly fix a bug, you also need to refactor otherwise you just keep adding hacks. However, I don't think that is the case here, see all the comments above.

a, b = zip(*res)
res = _maybe_convert(a), _maybe_convert(b)
else:
res = _maybe_convert(res)
return res

op_name = ops._get_op_name(op, True)
Expand Down
4 changes: 4 additions & 0 deletions pandas/tests/extension/decimal/array.py
Original file line number Diff line number Diff line change
Expand Up @@ -143,5 +143,9 @@ def to_decimal(values, context=None):
return DecimalArray([decimal.Decimal(x) for x in values], context=context)


def make_data():
return [decimal.Decimal(random.random()) for _ in range(100)]


DecimalArray._add_arithmetic_ops()
DecimalArray._add_comparison_ops()
6 changes: 1 addition & 5 deletions pandas/tests/extension/decimal/test_decimal.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,11 +8,7 @@

from pandas.tests.extension import base

from .array import DecimalDtype, DecimalArray, to_decimal


def make_data():
return [decimal.Decimal(random.random()) for _ in range(100)]
from .array import DecimalDtype, DecimalArray, make_data


@pytest.fixture
Expand Down
You are viewing a condensed version of this merge commit. You can view the full changes here.
0