8000 ENH: expose `bit_generator` and random C-API to cython by mattip · Pull Request #15463 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: expose bit_generator and random C-API to cython #15463

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 16, 2020
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 11 additions & 2 deletions doc/source/reference/random/extending.rst
Original file line number Diff line number Diff line change
Expand Up @@ -47,9 +47,18 @@ struct.
.. literalinclude:: ../../../../numpy/random/_examples/cython/extending_distributions.pyx
:language: cython
:start-after: example 2
:end-before: example 3

See :ref:`extending_cython_example` for a complete working example including a
minimal setup and cython files.
Cython can be used to directly access the functions in
``numpy/random/c_distributions.pxd``. This requires linking with the
``npyrandom`` library located in ``numpy/random/lib``.

.. literalinclude:: ../../../../numpy/random/_examples/cython/extending_distributions.pyx
:language: cython
:start-after: example 3

See :ref:`extending_cython_example` for the complete listings of these examples
and a minimal ``setup.py`` to build the c-extension modules.

CFFI
====
Expand Down
44 changes: 43 additions & 1 deletion numpy/random/_examples/cython/extending_distributions.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@ from cpython.pycapsule cimport PyCapsule_IsValid, PyCapsule_GetPointer
from libc.stdint cimport uint16_t, uint64_t
from numpy.random cimport bitgen_t
from numpy.random import PCG64
from numpy.random.c_distributions cimport (random_standard_uniform_fill,
random_standard_uniform_fill_f)


@cython.boundscheck(False)
Expand Down Expand Up @@ -40,7 +42,7 @@ def uniforms(Py_ssize_t n):
randoms = np.asarray(random_values)

return randoms

# cython example 2
@cython.boundscheck(False)
@cython.wraparound(False)
Expand Down Expand Up @@ -72,3 +74,43 @@ def uint10_uniforms(Py_ssize_t n):
randoms = np.asarray(random_values)
return randoms

# cython example 3
def uniforms_ex(bit_generator, Py_ssize_t n, dtype=np.float64):
"""
Create an array of `n` uniformly distributed doubles via a "fill" function.

A 'real' distribution would want to process the values into
some non-uniform distribution

Parameters
----------
bit_generator: BitGenerator instance
n: int
Output vector length
dtype: {str, dtype}, optional
Desired dtype, either 'd' (or 'float64') or 'f' (or 'float32'). The
default dtype value is 'd'
"""
cdef Py_ssize_t i
cdef bitgen_t *rng
cdef const char *capsule_name = "BitGenerator"
cdef np.ndarray randoms

_dtype = np.dtype(dtype)
if _dtype.type not in (np.float32, np.float64):
raise TypeError('Unsupported dtype "%r"' % dtype)
capsule = bit_generator.capsule
# Optional check that the capsule if from a BitGenerator
if not PyCapsule_IsValid(capsule, capsule_name):
raise ValueError("Invalid pointer to anon_func_state")
# Cast the pointer
rng = <bitgen_t *> PyCapsule_GetPointer(capsule, capsule_name)
randoms = np.empty(n, dtype=_dtype)
if _dtype is np.float32:
Copy link
Member
@eric-wieser eric-wieser Feb 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if _dtype is np.float32:
if _dtype.type is np.float32:

or

Suggested change
if _dtype is np.float32:
if _dtype == np.float32:

As written this is always false. The former is more consistent with line 100.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got be careful here, the latter is correct. The former would have to also check _dtype.isnative. Right now dtype=">f8" (on little endian machines) is probably broken!

Copy link
Member
@seberg seberg Feb 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I think I actually like the dtype.type is np.float32 and dtype.isnative spelling I think.

EDIT: I guess here the isnative check needs to be further up anyway though, maybe just create an issue for it as a separate thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is meant as a test-run for something to replace the key = np.dtype(dtype).name in _generator.pyx and mtrand.pyx so it would be good to get it right here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think dtype == np.float32 is probably simplest, but make sure to update line 100 too.

Copy link
Member
@seberg seberg Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True dtype == ... is shortest. May be worth to time it in case the type + isnative happens to be much faster (right now).
I timed it, and it:

In [2]: %timeit dt == np.float64                                                                                        
85 ns ± 5.03 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [3]: %timeit dt.type is np.float64                                                                                   
57.3 ns ± 0.108 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %timeit dt.type is np.float64 and dt.isnative                                                                   
81.6 ns ± 0.697 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

so since the == includes the isnative check there is no speed difference between the spellings , so probably should use the == one. The only thing we could consider with respect to fixing the speed issue of dtype.name check would be to add a fast-path for the very common dtype default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using the == form

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still seems to be the == form here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixing

with bit_generator.lock:
random_standard_uniform_fill_f(rng, n, <float*>np.PyArray_DATA(randoms))
else:
with bit_generator.lock:
random_standard_uniform_fill(rng, n, <double*>np.PyArray_DATA(randoms))
return randoms

8 changes: 7 additions & 1 deletion numpy/random/_examples/cython/setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,11 @@
from os.path import join, dirname

path = dirname(__file__)
src_dir = join(dirname(path), '..', 'src')
defs = [('NPY_NO_DEPRECATED_API', 0)]
inc_path = np.get_include()
# not so nice. We need the random/lib library from numpy
lib_path = join(np.get_include(), '..', '..', 'random', 'lib')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we invent a better way to spell this? I am not sure, and if it is np.get_include(component=None) with the only other component being "random" or similar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the lib path, not the include path. Should I add np.get_libpath('npymath'/'npyrandom')?


extending = Extension("extending",
sources=[join(path, 'extending.pyx')],
Expand All @@ -24,7 +28,9 @@
)
distributions = Extension("extending_distributions",
sources=[join(path, 'extending_distributions.pyx')],
include_dirs=[np.get_include()],
include_dirs=[inc_path],
library_dirs=[lib_path],
libraries=['npyrandom'],
define_macros=defs,
)

Expand Down
110 changes: 1 addition & 109 deletions numpy/random/_generator.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ import numpy as np
cimport numpy as np
from numpy.core.multiarray import normalize_axis_index

from .c_distributions cimport *
from libc cimport string
from libc.stdint cimport (uint8_t, uint16_t, uint32_t, uint64_t,
int32_t, int64_t, INT64_MAX, SIZE_MAX)
Expand All @@ -27,117 +28,8 @@ from ._common cimport (POISSON_LAM_MAX, CONS_POSITIVE, CONS_NONE,
check_array_constraint, check_constraint, disc, discrete_broadcast_iii,
)


cdef extern from "numpy/random/distributions.h":

struct s_binomial_t:
int has_binomial
double psave
int64_t nsave
double r
double q
double fm
int64_t m
double p1
double xm
double xl
double xr
double c
double laml
double lamr
double p2
double p3
double p4

ctypedef s_binomial_t binomial_t

double random_standard_uniform(bitgen_t *bitgen_state) nogil
void random_standard_uniform_fill(bitgen_t* bitgen_state, np.npy_intp cnt, double *out) nogil
double random_standard_exponential(bitgen_t *bitgen_state) nogil
double random_standard_exponential_f(bitgen_t *bitgen_state) nogil
void random_standard_exponential_fill(bitgen_t *bitgen_state, np.npy_intp cnt, double *out) nogil
void random_standard_exponential_fill_f(bitgen_t *bitgen_state, np.npy_intp cnt, double *out) nogil
void random_standard_exponential_inv_fill(bitgen_t *bitgen_state, np.npy_intp cnt, double *out) nogil
void random_standard_exponential_inv_fill_f(bitgen_t *bitgen_state, np.npy_intp cnt, double *out) nogil
double random_standard_normal(bitgen_t* bitgen_state) nogil
void random_standard_normal_fill(bitgen_t *bitgen_state, np.npy_intp count, double *out) nogil
void random_standard_normal_fill_f(bitgen_t *bitgen_state, np.npy_intp count, float *out) nogil
double random_standard_gamma(bitgen_t *bitgen_state, double shape) nogil

float random_standard_uniform_f(bitgen_t *bitgen_state) nogil
void random_standard_uniform_fill_f(bitgen_t* bitgen_state, np.npy_intp cnt, float *out) nogil
float random_standard_normal_f(bitgen_t* bitgen_state) nogil
float random_standard_gamma_f(bitgen_t *bitgen_state, float shape) nogil

int64_t random_positive_int64(bitgen_t *bitgen_state) nogil
int32_t random_positive_int32(bitgen_t *bitgen_state) nogil
int64_t random_positive_int(bitgen_t *bitgen_state) nogil
uint64_t random_uint(bitgen_t *bitgen_state) nogil

double random_normal(bitgen_t *bitgen_state, double loc, double scale) nogil

double random_gamma(bitgen_t *bitgen_state, double shape, double scale) nogil
float random_gamma_f(bitgen_t *bitgen_state, float shape, float scale) nogil

double random_exponential(bitgen_t *bitgen_state, double scale) nogil
double random_uniform(bitgen_t *bitgen_state, double lower, double range) nogil
double random_beta(bitgen_t *bitgen_state, double a, double b) nogil
double random_chisquare(bitgen_t *bitgen_state, double df) nogil
double random_f(bitgen_t *bitgen_state, double dfnum, double dfden) nogil
double random_standard_cauchy(bitgen_t *bitgen_state) nogil
double random_pareto(bitgen_t *bitgen_state, double a) nogil
double random_weibull(bitgen_t *bitgen_state, double a) nogil
double random_power(bitgen_t *bitgen_state, double a) nogil
double random_laplace(bitgen_t *bitgen_state, double loc, double scale) nogil
double random_gumbel(bitgen_t *bitgen_state, double loc, double scale) nogil
double random_logistic(bitgen_t *bitgen_state, double loc, double scale) nogil
double random_lognormal(bitgen_t *bitgen_state, double mean, double sigma) nogil
double random_rayleigh(bitgen_t *bitgen_state, double mode) nogil
double random_standard_t(bitgen_t *bitgen_state, double df) nogil
double random_noncentral_chisquare(bitgen_t *bitgen_state, double df,
AF5E double nonc) nogil
double random_noncentral_f(bitgen_t *bitgen_state, double dfnum,
double dfden, double nonc) nogil
double random_wald(bitgen_t *bitgen_state, double mean, double scale) nogil
double random_vonmises(bitgen_t *bitgen_state, double mu, double kappa) nogil
double random_triangular(bitgen_t *bitgen_state, double left, double mode,
double right) nogil

int64_t random_poisson(bitgen_t *bitgen_state, double lam) nogil
int64_t random_negative_binomial(bitgen_t *bitgen_state, double n, double p) nogil
int64_t random_binomial(bitgen_t *bitgen_state, double p, int64_t n, binomial_t *binomial) nogil
int64_t random_logseries(bitgen_t *bitgen_state, double p) nogil
int64_t random_geometric_search(bitgen_t *bitgen_state, double p) nogil
int64_t random_geometric_inversion(bitgen_t *bitgen_state, double p) nogil
int64_t random_geometric(bitgen_t *bitgen_state, double p) nogil
int64_t random_zipf(bitgen_t *bitgen_state, double a) nogil
int64_t random_hypergeometric(bitgen_t *bitgen_state, int64_t good, int64_t bad,
int64_t sample) nogil

uint64_t random_interval(bitgen_t *bitgen_state, uint64_t max) nogil

# Generate random uint64 numbers in closed interval [off, off + rng].
uint64_t random_bounded_uint64(bitgen_t *bitgen_state,
uint64_t off, uint64_t rng,
uint64_t mask, bint use_masked) nogil

void random_multinomial(bitgen_t *bitgen_state, int64_t n, int64_t *mnix,
double *pix, np.npy_intp d, binomial_t *binomial) nogil

int random_multivariate_hypergeometric_count(bitgen_t *bitgen_state,
int64_t total,
size_t num_colors, int64_t *colors,
int64_t nsample,
size_t num_variates, int64_t *variates) nogil
void random_multivariate_hypergeometric_marginals(bitgen_t *bitgen_state,
int64_t total,
size_t num_colors, int64_t *colors,
int64_t nsample,
size_t num_variates, int64_t *variates) nogil

np.import_array()


cdef int64_t _safe_sum_nonneg_int64(size_t num_colors, int64_t *colors):
"""
Sum the values in the array `colors`.
Expand Down
114 changes: 114 additions & 0 deletions numpy/random/c_distributions.pxd
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
#!python
#cython: wraparound=False, nonecheck=False, boundscheck=False, cdivision=True, language_level=3
from numpy cimport npy_intp

from libc.stdint cimport (uint64_t, int32_t, int64_t)
from numpy.random cimport bitgen_t

cdef extern from "numpy/random/distributions.h":

struct s_binomial_t:
int has_binomial
double psave
int64_t nsave
double r
double q
double fm
int64_t m
double p1
double xm
double xl
double xr
double c
double laml
double lamr
double p2
double p3
double p4

ctypedef s_binomial_t binomial_t

double random_standard_uniform(bitgen_t *bitgen_state) nogil
void random_standard_uniform_fill(bitgen_t* bitgen_state, npy_intp cnt, double *out) nogil
double random_standard_exponential(bitgen_t *bitgen_state) nogil
double random_standard_exponential_f(bitgen_t *bitgen_state) nogil
void random_standard_exponential_fill(bitgen_t *bitgen_state, npy_intp cnt, double *out) nogil
void random_standard_exponential_fill_f(bitgen_t *bitgen_state, npy_intp cnt, double *out) nogil
void random_standard_exponential_inv_fill(bitgen_t *bitgen_state, npy_intp cnt, double *out) nogil
void random_standard_exponential_inv_fill_f(bitgen_t *bitgen_state, npy_intp cnt, double *out) nogil
double random_standard_normal(bitgen_t* bitgen_state) nogil
void random_standard_normal_fill(bitgen_t *bitgen_state, npy_intp count, double *out) nogil
void random_standard_normal_fill_f(bitgen_t *bitgen_state, npy_intp count, float *out) nogil
double random_standard_gamma(bitgen_t *bitgen_state, double shape) nogil

float random_standard_uniform_f(bitgen_t *bitgen_state) nogil
void random_standard_uniform_fill_f(bitgen_t* bitgen_state, npy_intp cnt, float *out) nogil
float random_standard_normal_f(bitgen_t* bitgen_state) nogil
float random_standard_gamma_f(bitgen_t *bitgen_state, float shape) nogil

int64_t random_positive_int64(bitgen_t *bitgen_state) nogil
int32_t random_positive_int32(bitgen_t *bitgen_state) nogil
int64_t random_positive_int(bitgen_t *bitgen_state) D1FD nogil
uint64_t random_uint(bitgen_t *bitgen_state) nogil

double random_normal(bitgen_t *bitgen_state, double loc, double scale) nogil

double random_gamma(bitgen_t *bitgen_state, double shape, double scale) nogil
float random_gamma_f(bitgen_t *bitgen_state, float shape, float scale) nogil

double random_exponential(bitgen_t *bitgen_state, double scale) nogil
double random_uniform(bitgen_t *bitgen_state, double lower, double range) nogil
double random_beta(bitgen_t *bitgen_state, double a, double b) nogil
double random_chisquare(bitgen_t *bitgen_state, double df) nogil
double random_f(bitgen_t *bitgen_state, double dfnum, double dfden) nogil
double random_standard_cauchy(bitgen_t *bitgen_state) nogil
double random_pareto(bitgen_t *bitgen_state, double a) nogil
double random_weibull(bitgen_t *bitgen_state, double a) nogil
double random_power(bitgen_t *bitgen_state, double a) nogil
double random_laplace(bitgen_t *bitgen_state, double loc, double scale) nogil
double random_gumbel(bitgen_t *bitgen_state, double loc, double scale) nogil
double random_logistic(bitgen_t *bitgen_state, double loc, double scale) nogil
double random_lognormal(bitgen_t *bitgen_state, double mean, double sigma) nogil
double random_rayleigh(bitgen_t *bitgen_state, double mode) nogil
double random_standard_t(bitgen_t *bitgen_state, double df) nogil
double random_noncentral_chisquare(bitgen_t *bitgen_state, double df,
double nonc) nogil
double random_noncentral_f(bitgen_t *bitgen_state, double dfnum,
double dfden, double nonc) nogil
double random_wald(bitgen_t *bitgen_state, double mean, double scale) nogil
double random_vonmises(bitgen_t *bitgen_state, double mu, double kappa) nogil
double random_triangular(bitgen_t *bitgen_state, double left, double mode,
double right) nogil

int64_t random_poisson(bitgen_t *bitgen_state, double lam) nogil
int64_t random_negative_binomial(bitgen_t *bitgen_state, double n, double p) nogil
int64_t random_binomial(bitgen_t *bitgen_state, double p, int64_t n, binomial_t *binomial) nogil
int64_t random_logseries(bitgen_t *bitgen_state, double p) nogil
int64_t random_geometric_search(bitgen_t *bitgen_state, double p) nogil
int64_t random_geometric_inversion(bitgen_t *bitgen_state, double p) nogil
int64_t random_geometric(bitgen_t *bitgen_state, double p) nogil
int64_t random_zipf(bitgen_t *bitgen_state, double a) nogil
int64_t random_hypergeometric(bitgen_t *bitgen_state, int64_t good, int64_t bad,
int64_t sample) nogil

uint64_t random_interval(bitgen_t *bitgen_state, uint64_t max) nogil

# Generate random uint64 numbers in closed interval [off, off + rng].
uint64_t random_bounded_uint64(bitgen_t *bitgen_state,
uint64_t off, uint64_t rng,
uint64_t mask, bint use_masked) nogil

void random_multinomial(bitgen_t *bitgen_state, int64_t n, int64_t *mnix,
double *pix, npy_intp d, binomial_t *binomial) nogil

int random_multivariate_hypergeometric_count(bitgen_t *bitgen_state,
int64_t total,
size_t num_colors, int64_t *colors,
int64_t nsample,
size_t num_variates, int64_t *variates) nogil
void random_multivariate_hypergeometric_marginals(bitgen_t *bitgen_state,
int64_t total,
size_t num_colors, int64_t *colors,
int64_t nsample,
size_t num_variates, int64_t *variates) nogil

Loading
0