10000 CLN: use float64_t consistently instead of double, double_t by jbrockmendel · Pull Request #23583 · pandas-dev/pandas · GitHub
[go: up one dir, main page]

Skip to content

CLN: use float64_t consistently instead of double, double_t #23583

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Nov 11, 2018
Merged
Changes from 1 commit
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
remove unliked comment
  • Loading branch information
jbrockmendel committed Nov 10, 2018
commit 23344b2823bf3c45f5ba88af50b1d8308d7d4f64
1 change: 0 additions & 1 deletion pandas/_libs/algos.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -195,7 +195,6 @@ def groupsort_indexer(ndarray[int64_t] index, Py_ssize_t ngroups):
return result, counts


# TODO: redundant with groupby.kth_smallest_c
@cython.boundscheck(False)
@cython.wraparound(False)
def kth_smallest(numeric[:] a, Py_ssize_t k) -> numeric:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think will have a big perf slowdown

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its never used in cython and nogil isnt allowed for def functions. There is a with nogil block just below this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls test i once changed this (and tried to remove) and was all negative

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indistinguishable:

master:

In [3]: arr = np.arange(10000, dtype=np.int64)

In [4]: %timeit pd._libs.algos.kth_smallest(arr, 4)
The slowest run took 165.65 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.42 µs per loop

In [5]: %timeit pd._libs.algos.kth_smallest(arr, 4)
The slowest run took 4.39 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 10.3 µs per loop

In [6]: %timeit pd._libs.algos.kth_smallest(arr, 4)
The slowest run took 4.21 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.2 µs per loop

PR:

In [3]: arr = np.arange(10000, dtype=np.int64)

In [4]: %timeit pd._libs.algos.kth_smallest(arr, 4)
The slowest run took 12.06 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.71 µs per loop

In [5]: %timeit pd._libs.algos.kth_smallest(arr, 4)
The slowest run took 9.48 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.6 µs per loop

In [6]: %timeit pd._libs.algos.kth_smallest(arr, 4)
The slowest run took 6.23 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.95 µs per loop

Similar for other dtypes

Expand Down
0