8000 Don't divide by zero in Line2D.segment_hits. by anntzer · Pull Request #17282 · matplotlib/matplotlib · GitHub
[go: up one dir, main page]

Skip to content

Don't divide by zero in Line2D.segment_hits. #17282

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 1, 2020

Conversation

anntzer
Copy link
Contributor
@anntzer anntzer commented Apr 30, 2020

Instead of changing the numpy errstate, replace a / b <= 1 by a <= b and use np.hypot() for sqrt(x**2+y**2) (which avoids overflows from
x**2).

Otherwise a call to segment_hits should "always" be protected by
manipulating the numpy errstate.

PR Summary

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@anntzer anntzer added this to the v3.3.0 milestone Apr 30, 2020
Instead of changing the numpy errstate, replace `a / b <= 1` by `a <=
b` and use np.hypot() for `sqrt(x**2+y**2)` (which avoids overflows from
`x**2`).

Otherwise a call to segment_hits should "always" be protected by
manipulating the numpy errstate.
@anntzer anntzer force-pushed the segment_hits_overflow branch from 3ebdb53 to fdf07f5 Compare May 1, 2020 08:08

# Note that there is a little area near one side of each point
# which will be near neither segment, and another which will
# be near both, depending on the angle of the lines. The
# following radius test eliminates these ambiguities.
point_hits = (cx - x) ** 2 + (cy - y) ** 2 <= radius ** 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an estimate how many elements this typically works on?

np.hypot scales worse than the squares, likely because taking the root is expensive.

grafik

import numpy as np
import matplotlib.pyplot as plt

ns = [10, 100, 200, 400,600, 1000]
v1 = []
v2 = []

for n in ns:
    a = np.linspace(0, 1, n)
    b = 2*a
    a0 = 0.1
    b0 = 0.2
    r = 1.2
    o1 = %timeit -o (a0 - a)**2 + (b0 - b)**2 < r**2
    o2 = %timeit -o np.hypot(a0 - a, b0 - b) < r
    v1.append(o1)
    v2.append(o2)
    
plt.plot(ns, [v.average *1e6 for v in v1], 'o--', label='squares')
plt.plot(ns, [v.average *1e6 for v in v2], 'o--', label='np.hypot')
plt.legend()
plt.ylabel('duration [µs]')
plt.xlabel('array size')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is indeed slower, but on my machine the scaling is not so bad. Moreover skipping the call to errstate() is also quite helpful, as contextmanagers are slow (relatively speaking):

import numpy as np 
import matplotlib.pyplot as plt 
 
ns = [10, 100, 200, 400, 600, 1000] 
v1 = [] 
v2 = [] 
v3 = [] 
 
for n in ns: 
    a = np.linspace(0, 1, n) 
    b = 2*a 
    a0 = 0.1 
    b0 = 0.2 
    r = 1.2 
    o1 = %timeit -o (a0 - a)**2 + (b0 - b)**2 < r**2 
    o2 = %timeit -o with np.errstate(all="ignore"): (a0 - a)**2 + (b0 - b)**2 < r**2 
    o3 = %timeit -o np.hypot(a0 - a, b0 - b) < r 
    v1.append(o1) 
    v2.append(o2) 
    v3.append(o3) 
 
plt.plot(ns, [v.average *1e6 for v in v1], 'o--', label='squares') 
plt.plot(ns, [v.average *1e6 for v in v2], 'o--', label='squares+errstate') 
plt.plot(ns, [v.average *1e6 for v in v3], 'o--', label='np.hypot') 
plt.legend() 
plt.ylabel('duration [µs]') 
plt.xlabel('array size')

test

(I know there's two calls to hypot vs a single errstate(), so I guess the real threshold where hypot becomes slower would be around 500 points.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's assume we rarely deal with polygons with thousands of segments, so timing is not too different.

B7F5

# Note that there is a little area near one side of each point
# which will be near neither segment, and another which will
# be near both, depending on the angle of the lines. The
# following radius test eliminates these ambiguities.
point_hits = (cx - x) ** 2 + (cy - y) ** 2 <= radius ** 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's assume we rarely deal with polygons with thousands of segments, so timing is not too different.

@timhoffm timhoffm merged commit 480ea3f into matplotlib:master May 1, 2020
@anntzer anntzer deleted the segment_hits_overflow branch May 1, 2020 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0