Rebase, WIP: implement `__rop__` logic for scalar operators by charris · Pull Request #8126 · numpy/numpy · GitHub

Rebase, WIP: implement __rop__ logic for scalar operators #8126


Closed
wants to merge 3 commits into from

charris
Member
@charris charris commented Oct 7, 2016

Rebase of #7459.

Re. #7449.

This is a work in progress, both because I'd like to get some feedback on the approach and because there are currently two test failures that I don't yet understand.

This is the second half of the fix for numpygh-1296 (trac 698).
None + np.longdouble(3) would cause either a RuntimeError or
an interpreter crash.  The swapped case (np.longdouble(3) + None)
was fixed in 2008 (376d483).

When writing binary operators in C, they must treat both arguments
equivalently because they will be called for both __op__ and
__rop__.  This bug fix is necessary because the longdouble and
clongdouble operators did not maintain this symmetry.
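
The symmetry requirement can be pictured with a pure-Python analogue. This is only a sketch: `Scalar` and its helpers are hypothetical names, not NumPy's C implementation. The point is that one shared routine receives the operands in either order and returns `NotImplemented` for anything it does not recognise, so `None + x` and `x + None` fail the same clean way.

```python
class Scalar:
    """Hypothetical stand-in for a NumPy scalar type (illustration only)."""

    def __init__(self, value):
        self.value = float(value)

    @staticmethod
    def _unwrap(obj):
        # Convert an operand to a float, or return None for an unknown type.
        if isinstance(obj, Scalar):
            return obj.value
        if isinstance(obj, (int, float)):
            return float(obj)
        return None

    @classmethod
    def _binary_add(cls, left, right):
        # Plays the role of a C binary-operator slot: the Scalar may be on
        # either side, so both operands go through the same conversion.
        a = cls._unwrap(left)
        b = cls._unwrap(right)
        if a is None or b is None:
            # Let Python try the other operand, then raise TypeError.
            return NotImplemented
        return cls(a + b)

    def __add__(self, other):
        return self._binary_add(self, other)

    def __radd__(self, other):
        return self._binary_add(other, self)
```

With this shape, `Scalar(3) + None` and `None + Scalar(3)` both end in an ordinary `TypeError` raised by the interpreter, rather than one direction behaving differently from the other.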

Previously it took 10x longer to do np.type1(1) op np.type2(2) when
type2 could not be safely cast to type1 than the equivalent operation
with type1 == type2 or when type2 could be safely upcast to type1. This
was due to falling back to calling the equivalent ufunc without trying
to defer the call to the scalar operator of type2 (and safely upcasting
type1).
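
As a rough illustration of the deferral described above, here is a toy model in plain Python. The type names are strings, the safe-cast table is a simplified assumption covering only four types, and `choose_path` is a hypothetical helper; the actual logic lives in NumPy's C scalar math code, and NumPy's real casting rules are available through `np.can_cast`.

```python
# Toy safe-cast table (an assumption for illustration): (source, target)
# pairs where the source type converts to the target without losing data.
SAFE_CASTS = {
    ("float32", "float64"),
    ("float32", "complex64"),
    ("float32", "complex128"),
    ("float64", "complex128"),
    ("complex64", "complex128"),
}


def can_cast_safely(src, dst):
    """Return True if src converts to dst safely in this toy model."""
    return src == dst or (src, dst) in SAFE_CASTS


def choose_path(type1, type2):
    """Describe which code path handles `type1 op type2` in this toy model."""
    if can_cast_safely(type2, type1):
        # type2 fits into type1, so type1's scalar operator handles the call.
        return f"{type1} scalar operator"
    if can_cast_safely(type1, type2):
        # Defer to type2's scalar operator, upcasting type1, instead of
        # dropping straight to the slower ufunc.
        return f"{type2} scalar operator"
    # Neither direction is safe, so the ufunc fallback is still used.
    return "ufunc fallback"
```

In this model `choose_path("float32", "float64")` and `choose_path("float64", "float32")` both land on the float64 scalar operator, while `choose_path("complex64", "float64")` still reaches the ufunc fallback.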

so previously:

In [2]: a = np.float32(4)

In [3]: b = np.float64(4)

In [4]: timeit a * a
10000000 loops, best of 3: 69.1 ns per loop

In [5]: timeit b * b
10000000 loops, best of 3: 69.5 ns per loop

In [6]: timeit a * b
1000000 loops, best of 3: 1.29 µs per loop

In [7]: timeit b * a
10000000 loops, best of 3: 116 ns per loop

and with these changes:

In [2]: a = np.float32(4)

In [3]: b = np.float64(4)

In [4]: timeit a * a
10000000 loops, best of 3: 74 ns per loop

In [5]: timeit b * b
10000000 loops, best of 3: 73.7 ns per loop

In [6]: timeit a * b
10000000 loops, best of 3: 181 ns per loop

In [7]: timeit b * a
10000000 loops, best of 3: 125 ns per loop

Operations on mixed type scalars that result in a scalar of a new type
still use the ufunc fallback and hit the speed penalty e.g. F op d -> D.

In [2]: a = np.complex64(1+2j)

In [3]: b = np.double(4)

In [4]: timeit a * b
1000000 loops, best of 3: 1.22 µs per loop

When evaluating np.type1(2) op np.type2(4), don't fall back to calling
the op ufunc if the output type is neither np.type1 nor np.type2.
Defer the op call to that of the correct output type.

This speeds up things like F * d -> D by about 5x and with the previous
changes to the safely castable case, causes the scalar power operators
to always return a floating point type when raising integer types to
negative powers.
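
The idea of deferring to the output type can be sketched with the same kind of toy model. Everything here is hypothetical and simplified: the type names are strings, the cast table covers four types only, and `result_type` is a stand-in for the promotion rules that NumPy exposes through `np.result_type`.

```python
# Toy safe-cast table (an assumption for illustration only).
SAFE_CASTS = {
    ("float32", "float64"),
    ("float32", "complex64"),
    ("float32", "complex128"),
    ("float64", "complex128"),
    ("complex64", "complex128"),
}

# Candidate output types, ordered from smallest to largest.
CANDIDATES = ["float32", "float64", "complex64", "complex128"]


def can_cast_safely(src, dst):
    """Return True if src converts to dst safely in this toy model."""
    return src == dst or (src, dst) in SAFE_CASTS


def result_type(type1, type2):
    """Smallest candidate type that both inputs cast to safely."""
    for candidate in CANDIDATES:
        if can_cast_safely(type1, candidate) and can_cast_safely(type2, candidate):
            return candidate
    raise TypeError(f"no common type for {type1} and {type2}")


def choose_operator(type1, type2):
    """Pick the scalar operator of the output type, even if it is a third type."""
    return f"{result_type(type1, type2)} scalar operator"
```

Here `choose_operator("complex64", "float64")` selects the complex128 operator, which is the `F op d -> D` case from the commit message: the output type is neither input type, yet no ufunc fallback is needed in the model.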

Fixes numpygh-7449.
Contributor
homu commented Oct 18, 2016

☔ The latest upstream changes (presumably 0a02bb6) made this pull request unmergeable. Please resolve the merge conflicts.

@charris charris modified the milestones: 1.13.0 release, 1.12.0 release Nov 4, 2016
Member Author
charris commented Nov 4, 2016

Pushing this off to 1.13.0. Not sure how much applies.

@charris charris modified the milestones: 1.14.0 release, 1.13.0 release May 5, 2017
Member Author
charris commented May 5, 2017

@ewmoore ISTR that you have run out of spare time ;)

I think we do need to take a look at the __rops__ for scalars, but this problem also needs to be reviewed in light of the new __array_ufunc__ functionality. In any case, I'm pushing this off to 1.14.

@charris charris changed the title Rebase, WIP: implement __rop__ logic for scalar operators Rebase, WIP: implement __rop__ logic for scalar operators May 9, 2017
Member Author
charris commented Nov 14, 2017

Pushing this off again, don't have time to work on this before the release.

@charris charris modified the milestones: 1.14.0 release, 1.15.0 release Nov 14, 2017
Member Author
charris commented May 25, 2018

Pushing off to 1.16.

@charris charris modified the milestones: 1.16.0 release, 1.17.0 release Nov 17, 2018
@charris charris closed this May 15, 2019
@charris charris deleted the scalar_math branch May 15, 2019 01:51
3 participants