8000 ENH: expose `bit_generator` and random C-API to cython by mattip · Pull Request #15463 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: expose bit_generator and random C-API to cython #15463

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 16, 2020
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
MAINT: fixes from review, add _bit_generato_bit_generator.pxd
  • Loading branch information
mattip committed Feb 2, 2020
commit f66326a1e11e7aa0c828a17abb3ab794d73e776e
21 changes: 9 additions & 12 deletions numpy/random/_examples/cython/extending_distributions.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ def uint10_uniforms(Py_ssize_t n):
return randoms

# cython example 3
def uniforms_ex(Py_ssize_t n, dtype=np.float64):
def uniforms_ex(bit_generator, Py_ssize_t n, dtype=np.float64):
"""
Create an array of `n` uniformly distributed doubles via a "fill" function.

Expand All @@ -84,6 +84,7 @@ def uniforms_ex(Py_ssize_t n, dtype=np.float64):

Parameters
----------
bit_generator: BitGenerator instance
n: int
Output vector length
dtype: {str, dtype}, optional
Expand All @@ -95,25 +96,21 @@ def uniforms_ex(Py_ssize_t n, dtype=np.float64):
cdef const char *capsule_name = "BitGenerator"
cdef np.ndarray randoms

typedict = {'f': np.float32, 'd': np.float64, 'float64': np.float64,
'float32': np.float32, np.float32: np.float32,
np.float64: np.float64}
typ = typedict.get(dtype, None)
if not typ:
_dtype = np.dtype(dtype)
if _dtype.type not in (np.float32, np.float64):
raise TypeError('Unsupported dtype "%r"' % dtype)
x = PCG64()
capsule = x.capsule
capsule = bit_generator.capsule
# Optional check that the capsule if from a BitGenerator
if not PyCapsule_IsValid(capsule, capsule_name):
raise ValueError("Invalid pointer to anon_func_state")
# Cast the pointer
rng = <bitgen_t *> PyCapsule_GetPointer(capsule, capsule_name)
randoms = np.empty(n, dtype=dtype)
if typ is np.float32:
with x.lock:
randoms = np.empty(n, dtype=_dtype)
if _dtype is np.float32:
Copy link
Member
@eric-wieser eric-wieser Feb 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if _dtype is np.float32:
if _dtype.type is np.float32:

or

Suggested change
if _dtype is np.float32:
if _dtype == np.float32:

As written this is always false. The former is more consistent with line 100.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got be careful here, the latter is correct. The former would have to also check _dtype.isnative. Right now dtype=">f8" (on little endian machines) is probably broken!

Copy link
Member
@seberg seberg Feb 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I think I actually like the dtype.type is np.float32 and dtype.isnative spelling I think.

EDIT: I guess here the isnative check needs to be further up anyway though, maybe just create an issue for it as a separate thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is meant as a test-run for something to replace the key = np.dtype(dtype).name in _generator.pyx and mtrand.pyx so it would be good to get it right here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think dtype == np.float32 is probably simplest, but make sure to update line 100 too.

Copy link
Member
@seberg seberg Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True dtype == ... is shortest. May be worth to time it in case the type + isnative happens to be much faster (right now).
I timed it, and it:

In [2]: %timeit dt == np.float64                                                                                        
85 ns ± 5.03 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [3]: %timeit dt.type is np.float64                                                                                   
57.3 ns ± 0.108 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %timeit dt.type is np.float64 and dt.isnative                                                                   
81.6 ns ± 0.697 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

so since the == includes the isnative check there is no speed difference between the spellings , so probably should use the == one. The only thing we could consider with respect to fixing the speed issue of dtype.name check would be to add a fast-path for the very common dtype default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using the == form

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still seems to be the == form here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixing

with bit_generator.lock:
random_standard_uniform_fill_f(rng, n, <float*>np.PyArray_DATA(randoms))
else:
with x.lock:
with bit_generator.lock:
random_standard_uniform_fill(rng, n, <double*>np.PyArray_DATA(randoms))
return randoms

2 changes: 1 addition & 1 deletion numpy/random/setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ def generate_libraries(ext, build_dir):
PCG64_DEFS = []
# One can force emulated 128-bit arithmetic if one wants.
#PCG64_DEFS += [('PCG_FORCE_EMULATED_128BIT_MATH', '1')]
depends = ['__init__.pxd', 'c_distributions.pxd']
depends = ['__init__.pxd', 'c_distributions.pxd', '_bit_generator.pxd']
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_bit_generator.pxd existed in 1.18, but was not copied into the wheel. The use of depends on line 137 should fix that.


# npyrandom - a library like npymath
npyrandom_sources = [
Expand Down
6 changes: 5 additions & 1 deletion numpy/random/tests/test_extending.py
Original file line number Diff line number Diff line change
Expand Up @@ -64,10 +64,14 @@ def test_cython(tmp_path):
so2 = line.strip()
assert so1 is not None
assert so2 is not None
# import the so's without adding the directory to sys.path
from importlib.machinery import ExtensionFileLoader
extending = ExtensionFileLoader('extending', so1).load_module()
extending_distributions = ExtensionFileLoader('extending_distributions', so2).load_module()
values = extending_distributions.uniforms_ex(10, 'd')

# actually test the cython c-extension
from numpy.random import PCG64
values = extending_distributions.uniforms_ex(PCG64(0), 10, 'd')
assert values.shape == (10,)
assert values.dtype == np.float64

Expand Down
0