PERF: Improve performance of special attribute lookups by eendebakpt · Pull Request #21423 · numpy/numpy

PERF: Improve performance of special attribute lookups #21423

Merged
merged 4 commits into from
May 3, 2022

@eendebakpt (Contributor) commented May 1, 2022

This PR reduces the overhead of ufunc_generic_fastcall, which is used by many numpy methods. For many input arguments a check is performed for the existence of the __array_ufunc__ attribute. The check involves a call to tp_getattro of the PyTypeObject, which requires the attribute name as a PyUnicode object.

This PR avoids constructing that PyUnicode object on each invocation of the attribute lookup.
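
As a rough Python analogy of the idea (not the C code changed in this PR), the sketch below contrasts building the attribute name on every call with building it once and reusing it. All names here (`_ARRAY_UFUNC_NAME`, `has_attr_rebuilding_name`, `has_attr_cached_name`, `OptsOut`, `Plain`) are hypothetical and only serve the illustration.

```python
import sys

# Built once at import time and reused afterwards; this stands in for a
# name object that is created a single time on the C side (assumption:
# this only mirrors the idea, not numpy's actual implementation).
_ARRAY_UFUNC_NAME = sys.intern("__array_ufunc__")


def has_attr_rebuilding_name(obj):
    """Create a fresh name string on every call before the lookup."""
    name = "".join(("__array_", "ufunc__"))  # new str object each time
    return hasattr(type(obj), name)


def has_attr_cached_name(obj):
    """Reuse the name object that was created once."""
    return hasattr(type(obj), _ARRAY_UFUNC_NAME)


class OptsOut:
    # Hypothetical class that defines the special attribute.
    __array_ufunc__ = None


class Plain:
    # Hypothetical class without the special attribute.
    pass


for instance in (OptsOut(), Plain()):
    # Both variants give the same answer; only the per-call work differs.
    print(type(instance).__name__,
          has_attr_rebuilding_name(instance),
          has_attr_cached_name(instance))
```

The two functions are behaviourally identical, which is the point: the change is meant to remove per-call work without altering the result of the lookup.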

Benchmark

import pyperf

runner = pyperf.Runner()

# Shared setup: a float64 scalar input keeps the computation itself small
setup = "import numpy as np; v = np.float64(1.1)"

runner.timeit(name="np.sqrt", stmt="np.sqrt(v)", setup=setup)
runner.timeit(name="np.cos", stmt="np.cos(v)", setup=setup)

Results

np.sqrt: Mean +- std dev: [base] 887 ns +- 37 ns -> [patch] 807 ns +- 49 ns: 1.10x faster
np.cos: Mean +- std dev: [base] 914 ns +- 46 ns -> [patch] 829 ns +- 61 ns: 1.10x faster

Geometric mean: 1.10x faster
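
For reference, the reported speedups can be recomputed from the mean timings above. This is a small sketch using only the standard library; the `timings` dictionary is just the four numbers copied from the pyperf output, and the standard deviations are ignored.

```python
from statistics import geometric_mean

# (base, patch) mean timings in ns, copied from the results above.
timings = {"np.sqrt": (887, 807), "np.cos": (914, 829)}

# Speedup is the ratio of the base time to the patched time.
speedups = {name: base / patch for name, (base, patch) in timings.items()}
overall = geometric_mean(list(speedups.values()))

for name, speedup in speedups.items():
    print(f"{name}: {speedup:.2f}x faster")
print(f"Geometric mean: {overall:.2f}x faster")
```

Both ratios and their geometric mean round to 1.10, matching the summary line.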
Results of numpy benchmarks

Benchmark:
python runtests.py --bench-compare main bench_ufunc

The results show some tests with improved and some with decreased performance, which looks like instability on my system.
Most changes are in bench_ufunc_strides.BinaryInt; on a re-run those values changed, and the increases and decreases do not look systematic.

Full results

endebakpt@woelmuis:~/numpy$ python runtests.py --bench-compare main "bench_ufunc"
· Creating environments
· Discovering benchmarks
·· Uninstalling from virtualenv-py3.8-Cython
·· Building ee8c683a <performance_cache_unicode_array_ufunc> for virtualenv-py3.8-Cython..................................................
·· Installing ee8c683a <performance_cache_unicode_array_ufunc> into virtualenv-py3.8-Cython.
· Running 68 total benchmarks (2 commits * 1 environments * 34 benchmarks)
[  0.00%] · For numpy commit fd646bd6 <main> (round 1/2):
[  0.00%] ·· Building for virtualenv-py3.8-Cython.....................................................
[  0.00%] ·· Benchmarking virtualenv-py3.8-Cython
[  0.00%] ··· Importing benchmark suite produced output:
[  0.00%] ···· NumPy CPU features: SSE SSE2 SSE3 SSSE3* SSE41* POPCNT* SSE42* AVX* F16C* FMA3* AVX2* AVX512F? AVX512CD? AVX512_KNL? AVX512_KNM? AVX512_SKX? AVX512_CLX? AVX512_CNL? AVX512_ICL?
[  0.74%] ··· Running (bench_ufunc.ArgParsing.time_add_arg_parsing--).........................
[ 19.12%] ··· Running (bench_ufunc_strides.AVX_UFunc_log.time_log--).....
[ 22.79%] ··· Running (bench_ufunc_strides.BinaryInt.time_ufunc--).
[ 23.53%] ··· Running (bench_ufunc_strides.LogisticRegression.time_train--)..
[ 25.00%] ··· Running (bench_ufunc_strides.Unary.time_ufunc--).
[ 25.00%] · For numpy commit ee8c683a <performance_cache_unicode_array_ufunc> (round 1/2):
[ 25.00%] ·· Building for virtualenv-py3.8-Cython..
[ 25.00%] ·· Benchmarking virtualenv-py3.8-Cython
[ 25.74%] ··· Running (bench_ufunc.ArgParsing.time_add_arg_parsing--).........................
[ 44.12%] ··· Running (bench_ufunc_strides.AVX_UFunc_log.time_log--).....
[ 47.79%] ··· Running (bench_ufunc_strides.BinaryInt.time_ufunc--).
[ 48.53%] ··· Running (bench_ufunc_strides.LogisticRegression.time_train--)..
[ 50.00%] ··· Running (bench_ufunc_strides.Unary.time_ufunc--).
[ 50.00%] · For numpy commit ee8c683a <performance_cache_unicode_array_ufunc> (round 2/2):
[ 50.00%] ·· Benchmarking virtualenv-py3.8-Cython
[ 50.74%] ··· bench_ufunc.ArgParsing.time_add_arg_parsing                                                                                                                                  ok
[ 50.74%] ··· =============================================================== ==========
                                         arg_kwarg                                      
              --------------------------------------------------------------- ----------
                                   (array(1.), array(2.))                      710±60ns 
                             (array(1.), array(2.), array(3.))                 598±10ns 
                           (array(1.), array(2.), out=array(3.))               681±20ns 
                          (array(1.), array(2.), out=(array(3.),))             660±30ns 
               (array(1.), array(2.), out=array(3.), subok=True, where=True)   692±20ns 
                             (array(1.), array(2.), subok=True)                788±60ns 
                       (array(1.), array(2.), subok=True, where=True)          823±60ns 
                 (array(1.), array(2.), array(3.), subok=True, where=True)     695±30ns 
              =============================================================== ==========

[ 51.47%] ··· bench_ufunc.ArgParsingReduce.time_add_reduce_arg_parsing                                                                                                                     ok
[ 51.47%] ··· ====================================================== =============
                                    arg_kwarg                                     
              ------------------------------------------------------ -------------
                                (array([0., 1.]))                     1.34±0.01μs 
                               (array([0., 1.]), 0)                   1.36±0.02μs 
                            (array([0., 1.]), axis=0)                 1.42±0.02μs 
                            (array([0., 1.]), 0, None)                1.36±0.02μs 
                      (array([0., 1.]), axis=0, dtype=None)           1.44±0.01μs 
                      (array([0., 1.]), 0, None, array(0.))             1.23±0μs  
               (array([0., 1.]), axis=0, dtype=None, out=array(0.))   1.29±0.01μs 
                         (array([0., 1.]), out=array(0.))             1.28±0.03μs 
              ====================================================== =============

[ 52.21%] ··· bench_ufunc.Broadcast.time_broadcast                                                                                                                                10.5±0.04ms
[ 52.94%] ··· bench_ufunc.Custom.time_and_bool                                                                                                                                    1.68±0.02μs
[ 53.68%] ··· bench_ufunc.Custom.time_nonzero                                                                                                                                      13.1±0.1μs
[ 54.41%] ··· bench_ufunc.Custom.time_not_bool                                                                                                                                    1.46±0.03μs
[ 55.15%] ··· bench_ufunc.Custom.time_or_bool                                                                                                                                     1.61±0.02μs
[ 55.88%] ··· bench_ufunc.CustomArrayFloorDivideInt.time_floor_divide_int                                                                                                                  ok
[ 55.88%] ··· ============== ============= ============= =============
              --                                size                  
              -------------- -----------------------------------------
                  dtype           100          10000        1000000   
              ============== ============= ============= =============
                numpy.int8    1.11±0.01μs    73.7±0.3μs   7.87±0.03ms 
               numpy.int16    1.11±0.02μs    75.1±0.3μs   8.29±0.05ms 
               numpy.int32    1.07±0.03μs    94.0±0.6μs   10.3±0.05ms 
               numpy.int64    1.64±0.02μs     161±1μs      17.6±0.2ms 
               numpy.uint8    1.00±0.03μs    38.7±0.1μs   3.76±0.04ms 
               numpy.uint16   1.02±0.03μs   38.8±0.06μs   3.79±0.01ms 
               numpy.uint32   1.03±0.03μs    38.7±0.1μs   3.83±0.04ms 
               numpy.uint64   1.44±0.03μs    82.3±0.3μs   8.27±0.06ms 
              ============== ============= ============= =============

[ 56.62%] ··· bench_ufunc.CustomInplace.time_char_or                                                                                                                               46.4±0.2μs
[ 57.35%] ··· bench_ufunc.CustomInplace.time_char_or_temp                                                                                                                          60.0±0.5μs
[ 58.09%] ··· bench_ufunc.CustomInplace.time_double_add                                                                                                                            56.6±0.1μs
[ 58.82%] ··· bench_ufunc.CustomInplace.time_double_add_temp                                                                                                                       75.1±0.1μs
[ 59.56%] ··· bench_ufunc.CustomInplace.time_float_add                                                                                                                             57.5±0.3μs
[ 60.29%] ··· bench_ufunc.CustomInplace.time_float_add_temp                                                                                                                        76.1±0.3μs
[ 61.03%] ··· bench_ufunc.CustomInplace.time_int_or                                                                                                                                56.5±0.2μs
[ 61.76%] ··· bench_ufunc.CustomInplace.time_int_or_temp                                                                                                                           71.6±0.6μs
[ 62.50%] ··· bench_ufunc.CustomScalar.time_add_scalar2                                                                                                                                    ok
[ 62.50%] ··· =============== ============
                   dtype                  
              --------------- ------------
               numpy.float32   6.17±0.2μs 
               numpy.float64   12.8±0.1μs 
              =============== ============

[ 63.24%] ··· bench_ufunc.CustomScalar.time_divide_scalar2                                                                                                                                 ok
[ 63.24%] ··· =============== =============
                   dtype                   
              --------------- -------------
               numpy.float32   12.4±0.06μs 
               numpy.float64    26.4±0.2μs 
              =============== =============

[ 63.97%] ··· bench_ufunc.CustomScalar.time_divide_scalar2_inplace                                                                                                                         ok
[ 63.97%] ··· =============== ============
                   dtype                  
              --------------- ------------
               numpy.float32   12.7±0.2μs 
               numpy.float64   26.4±0.3μs 
              =============== ============

[ 64.71%] ··· bench_ufunc.CustomScalar.time_less_than_scalar2                                                                                                                              ok
[ 64.71%] ··· =============== =============
                   dtype                   
              --------------- -------------
               numpy.float32   4.05±0.08μs 
               numpy.float64    7.19±0.1μs 
              =============== =============

[ 65.44%] ··· bench_ufunc.CustomScalarFloorDivideInt.time_floor_divide_int                                                                                                                 ok
[ 65.44%] ··· ============== ============= ============= ============= =============
              --                                     divisors                       
              -------------- -------------------------------------------------------
                  dtype            8             -8            43           -43     
              ============== ============= ============= ============= =============
                numpy.int8    3.03±0.08μs    3.16±0.2μs   3.01±0.09μs    3.08±0.2μs 
               numpy.int16    3.32±0.07μs   3.23±0.08μs    3.36±0.2μs    3.34±0.1μs 
               numpy.int32    5.60±0.02μs    5.51±0.1μs   5.61±0.05μs   5.44±0.06μs 
               numpy.int64    14.9±0.03μs    15.1±0.1μs    14.9±0.1μs   14.8±0.06μs 
               numpy.uint8     3.24±0.1μs       n/a       3.24±0.07μs       n/a     
               numpy.uint16    3.63±0.1μs       n/a       3.70±0.08μs       n/a     
               numpy.uint32    5.40±0.2μs       n/a        5.43±0.1μs       n/a     
               numpy.uint64    12.1±0.1μs       n/a        12.1±0.2μs       n/a     
              ============== ============= ============= ============= =============

[ 65.44%] ···· For parameters: <class 'numpy.uint8'>, -8
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint8'>, -43
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint16'>, -8
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint16'>, -43
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint32'>, -8
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint32'>, -43
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint64'>, -8
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint64'>, -43
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')

[ 66.18%] ··· bench_ufunc.Scalar.time_add_scalar                                                                                                                                     575±50ns
[ 66.91%] ··· bench_ufunc.Scalar.time_add_scalar_conv                                                                                                                                803±20ns
[ 67.65%] ··· bench_ufunc.Scalar.time_add_scalar_conv_complex                                                                                                                        825±30ns
[ 68.38%] ··· bench_ufunc.UFunc.time_ufunc_types                                                                                                                                           ok
[ 68.38%] ··· =============== =============
                   ufunc                   
              --------------- -------------
                    abs          808±4μs   
                  absolute       813±4μs   
                    add          385±1μs   
                   arccos      6.26±0.04ms 
                  arccosh      5.93±0.02ms 
                   arcsin      6.32±0.06ms 
                  arcsinh      5.82±0.05ms 
                   arctan      3.47±0.01ms 
                  arctan2        1.84±0ms  
                  arctanh      3.73±0.02ms 
                bitwise_and     32.8±0.3μs 
                bitwise_not    21.2±0.04μs 
                 bitwise_or     32.6±0.1μs 
                bitwise_xor     32.6±0.1μs 
                    cbrt       1.85±0.01ms 
                    ceil         208±1μs   
                    conj        189±0.6μs  
                 conjugate       190±1μs   
                  copysign      177±0.7μs  
                    cos        6.95±0.03ms 
                    cosh       6.12±0.05ms 
                  deg2rad        240±1μs   
                  degrees       240±0.9μs  
                   divide        681±4μs   
                   divmod        1.01±0ms  
                   equal         301±3μs   
                    exp        4.90±0.05ms 
                    exp2       4.70±0.03ms 
                   expm1       9.31±0.04ms 
                    fabs        237±0.8μs  
                float_power    10.3±0.04ms 
                   floor        209±0.4μs  
                floor_divide     887±2μs   
                    fmax        440±0.8μs  
                    fmin        434±0.5μs  
                    fmod         591±2μs   
                   frexp         342±2μs   
                    gcd         215±0.5μs  
                  greater        307±1μs   
               greater_equal     301±1μs   
                 heaviside       387±2μs   
                   hypot       1.04±0.01ms 
                   invert       21.5±0.3μs 
                  isfinite      169±0.7μs  
                   isinf        170±0.5μs  
                   isnan        146±0.3μs  
                   isnat         336±2ns   
                    lcm         336±0.9μs  
                   ldexp         213±1μs   
                 left_shift     98.8±0.3μs 
                    less        296±0.7μs  
                 less_equal      292±2μs   
                    log        3.12±0.01ms 
                   log10       3.35±0.02ms 
                   log1p       3.32±0.02ms 
                    log2       3.20±0.03ms 
                 logaddexp       348±10μs  
                 logaddexp2      330±1μs   
                logical_and      317±5μs   
                logical_not      200±2μs   
                 logical_or     262±0.8μs  
                logical_xor      379±6μs   
                   matmul       22.7±0.3ms 
                  maximum        416±2μs   
                  minimum        397±1μs   
                    mod          697±2μs   
                    modf         435±2μs   
                  multiply       397±1μs   
                  negative       224±1μs   
                 nextafter       399±2μs   
                 not_equal       311±2μs   
                  positive       231±1μs   
                   power       10.8±0.02ms 
                  rad2deg        240±1μs   
                  radians       240±0.7μs  
                 reciprocal      724±1μs   
                 remainder       697±3μs   
                right_shift     102±0.4μs  
                    rint         417±4μs   
                    sign         249±3μs   
                  signbit       92.1±0.4μs 
                    sin        6.70±0.05ms 
                    sinh       6.64±0.04ms 
                  spacing        432±3μs   
                    sqrt       1.52±0.01ms 
                   square        244±3μs   
                  subtract       379±2μs   
                    tan        8.30±0.04ms 
                    tanh       5.79±0.02ms 
                true_divide      702±40μs  
                   trunc        204±0.7μs  
              =============== =============

[ 69.12%] ··· bench_ufunc_strides.AVX_UFunc_log.time_log                                                                                                                                   ok
[ 69.12%] ··· ======== ============= ============
              --                 dtype           
              -------- --------------------------
               stride        f            d      
              ======== ============= ============
                 1       22.1±0.8μs   59.9±0.4μs 
                 2      28.8±0.09μs   60.4±0.4μs 
                 4       29.9±0.2μs   60.9±0.2μs 
              ======== ============= ============

[ 69.85%] ··· bench_ufunc_strides.AVX_cmplx_arithmetic.time_ufunc                                                                                                                          ok
[ 69.85%] ··· ========== ============= ============= ============= ============= ============= ============
              --                                           stride / dtype                                  
              ---------- ----------------------------------------------------------------------------------
                bfunc        1 / F         1 / D         2 / F         2 / D         4 / F        4 / D    
              ========== ============= ============= ============= ============= ============= ============
                 add      10.8±0.05μs    16.4±0.3μs   13.4±0.08μs   23.8±0.02μs   20.6±0.04μs   38.2±0.1μs 
               subtract   10.8±0.05μs   16.4±0.04μs   13.4±0.06μs   23.9±0.08μs   20.7±0.04μs   38.5±0.2μs 
               multiply   13.2±0.06μs    16.6±0.1μs    14.7±0.1μs   23.9±0.04μs    20.8±0.1μs   38.7±0.1μs 
                divide     42.9±0.2μs    48.4±0.1μs    43.1±0.2μs    48.5±0.2μs    44.1±0.7μs   49.4±0.2μs 
              ========== ============= ============= ============= ============= ============= ============

[ 70.59%] ··· bench_ufunc_strides.AVX_cmplx_funcs.time_ufunc                                                                                                                               ok
[ 70.59%] ··· ============ ============= ============ ============= ============= ============= ============
              --                                             stride / dtype                                 
              ------------ ---------------------------------------------------------------------------------
                 bfunc         1 / F        1 / D         2 / F         2 / D         4 / F        4 / D    
              ============ ============= ============ ============= ============= ============= ============
               reciprocal    62.8±0.3μs   72.1±0.4μs    63.3±0.3μs    72.1±0.2μs    62.9±0.4μs   72.7±0.3μs 
                absolute     50.6±0.2μs   50.7±0.1μs    50.9±0.1μs    52.1±0.9μs    51.2±0.3μs    54.2±1μs  
                 square     12.9±0.03μs   13.9±0.6μs   13.3±0.09μs    16.8±0.3μs   15.2±0.09μs   22.6±0.1μs 
               conjugate    8.29±0.03μs   11.2±0.2μs    9.52±0.1μs   14.7±0.05μs   12.2±0.07μs   21.9±0.2μs 
              ============ ============= ============ ============= ============= ============= ============

[ 71.32%] ··· bench_ufunc_strides.AVX_ldexp.time_ufunc                                                                                                                                     ok
[ 71.32%] ··· ======= ============ ============ ============
              --                      stride                
              ------- --------------------------------------
               dtype       1            2            4      
              ======= ============ ============ ============
                 f     57.8±0.7μs   58.1±0.1μs   58.2±0.3μs 
                 d     59.6±0.2μs   59.8±0.2μs   60.3±0.5μs 
              ======= ============ ============ ============

[ 72.06%] ··· bench_ufunc_strides.Binary.time_ufunc                                                                                                                                        ok
[ 72.06%] ··· ========= ============ ============ ============= ============ ============ =========== =========== ==========
              --                                                              stride_out / dtype                            
              ----------------------------------- --------------------------------------------------------------------------
                ufunc    stride_in0   stride_in1      1 / f        1 / d        2 / f        2 / d       4 / f      4 / d   
              ========= ============ ============ ============= ============ ============ =========== =========== ==========
               maximum       1            1         38.0±0.3μs   74.5±0.8μs    85.8±5μs    111±0.9μs    126±3μs    225±8μs  
               maximum       1            2        75.6±0.04μs    95.3±1μs     111±8μs      138±1μs     129±4μs    277±10μs 
               maximum       1            4         79.4±0.4μs    162±5μs     115±0.3μs     220±8μs     146±2μs    356±20μs 
               maximum       2            1         75.9±0.3μs   95.4±0.9μs    105±8μs      141±1μs    130±0.9μs   267±10μs 
               maximum       2            2         129±0.08μs    125±4μs     146±0.3μs     174±2μs    149±0.9μs   296±9μs  
               maximum       2            4         130±0.4μs     214±3μs     150±0.7μs     263±2μs     163±2μs    412±10μs 
               maximum       4            1         79.5±0.3μs    158±2μs      113±1μs      215±8μs     144±2μs    389±20μs 
               maximum       4            2         130±0.4μs     201±7μs      151±2μs      260±10μs    171±2μs    419±6μs  
               maximum       4            4          137±2μs      307±7μs      163±2μs      385±4μs     199±3μs    595±30μs 
               minimum       1            1         37.8±0.1μs   73.9±0.3μs    86.2±5μs    110±0.7μs    126±3μs    217±5μs  
               minimum       1            2         75.7±0.1μs   95.9±0.9μs    103±10μs     139±1μs     128±4μs    260±8μs  
               minimum       1            4         79.3±0.3μs    160±2μs     112±0.5μs     214±1μs     144±2μs    354±6μs  
               minimum       2            1         72.7±0.6μs    95.9±1μs    98.9±10μs     141±1μs     131±2μs    272±20μs 
               minimum       2            2         113±0.6μs     124±1μs      150±2μs      173±7μs    154±0.9μs   291±6μs  
               minimum       2            4          117±2μs      208±3μs      157±4μs      264±4μs     175±1μs    419±7μs  
               minimum       4            1         79.0±0.5μs    158±2μs      112±1μs      215±2μs     149±4μs    386±20μs 
               minimum       4            2         116±0.3μs     204±4μs     153±0.6μs     262±3μs     166±2μs    425±30μs 
               minimum       4            4          138±6μs      304±3μs      170±2μs      386±2μs     199±7μs    566±10μs 
                 fmax        1            1         38.0±0.2μs   74.5±0.1μs   60.2±0.2μs    110±2μs     94.2±1μs   215±6μs  
                 fmax        1            2         75.9±0.4μs    95.5±1μs     98.4±1μs     136±2μs     112±4μs    261±7μs  
                 fmax        1            4         79.5±0.5μs    160±3μs      111±2μs      222±4μs     133±2μs    349±4μs  
                 fmax        2            1        75.6±0.04μs    95.5±2μs    93.8±0.2μs    138±1μs     112±4μs    271±10μs 
                 fmax        2            2         120±0.1μs     124±2μs     142±0.5μs     171±2μs    144±0.6μs   316±20μs 
                 fmax        2            4          127±3μs      205±5μs      155±3μs      265±5μs    160±0.8μs   416±5μs  
                 fmax        4            1         79.2±0.3μs    158±3μs     107±0.7μs     216±7μs     139±2μs    382±20μs 
                 fmax        4            2         123±0.9μs     204±6μs     146±0.7μs     260±2μs     156±2μs    420±4μs  
                 fmax        4            4          129±2μs      303±3μs      163±2μs      384±4μs     194±3μs    558±10μs 
                 fmin        1            1         37.8±0.2μs    74.0±1μs     62.5±2μs    110±0.6μs    93.3±2μs   212±8μs  
                 fmin        1            2         75.5±0.1μs    96.1±1μs    93.9±0.3μs    139±2μs    108±0.6μs   265±5μs  
                 fmin        1            4         79.6±0.3μs    163±5μs     110±0.6μs     226±7μs     137±3μs    354±4μs  
                 fmin        2            1         72.1±0.2μs    97.5±2μs     96.0±2μs     139±1μs    108±0.4μs   268±20μs 
                 fmin        2            2         128±0.2μs     122±1μs      138±1μs      171±2μs     144±1μs    299±10μs 
                 fmin        2            4          133±4μs      198±10μs     151±3μs      269±10μs    164±2μs    422±4μs  
                 fmin        4            1         77.9±0.3μs    161±2μs      110±1μs      223±8μs     131±2μs    380±7μs  
                 fmin        4            2         130±0.3μs     206±6μs      144±1μs      261±5μs     166±4μs    417±2μs  
                 fmin        4            4          148±5μs      301±3μs      157±2μs      389±8μs     199±6μs    549±9μs  
              ========= ============ ============ ============= ============ ============ =========== =========== ==========

[ 72.79%] ··· bench_ufunc_strides.BinaryInt.time_ufunc                                                                                                                                     ok
[ 72.79%] ··· ========= ============ ============ ============ ======= =============
                ufunc    stride_in0   stride_in1   stride_out   dtype               
              --------- ------------ ------------ ------------ ------- -------------
               maximum       1            1            1          b      8.65±0.2μs 
               maximum       1            1            1          B      8.42±0.2μs 
               maximum       1            1            1          h     17.6±0.07μs 
               maximum       1            1            1          H     17.8±0.09μs 
               maximum       1            1            1          i      33.8±0.2μs 
               maximum       1            1            1          I      33.7±0.1μs 
               maximum       1            1            1          l      65.9±0.2μs 
               maximum       1            1            1          L      66.9±0.5μs 
               maximum       1            1            1          q      66.5±0.4μs 
               maximum       1            1            1          Q      67.2±0.5μs 
               maximum       1            1            2          b       72.1±1μs  
               maximum       1            1            2          B      73.3±0.9μs 
               maximum       1            1            2          h      71.6±0.1μs 
               maximum       1            1            2          H       73.4±2μs  
               maximum       1            1            2          i      72.1±0.7μs 
               maximum       1            1            2          I      72.0±0.3μs 
               maximum       1            1            2          l       108±5μs   
               maximum       1            1            2          L       105±1μs   
               maximum       1            1            2          q       103±2μs   
               maximum       1            1            2          Q       110±6μs   
               maximum       1            1            4          b      75.6±0.7μs 
               maximum       1            1            4          B      74.8±0.3μs 
               maximum       1            1            4          h      71.9±0.2μs 
               maximum       1            1            4          H      72.7±0.6μs 
               maximum       1            1            4          i       87.4±1μs  
               maximum       1            1            4          I       87.6±1μs  
               maximum       1            1            4          l       202±4μs   
               maximum       1            1            4          L       198±7μs   
               maximum       1            1            4          q       194±5μs   
               maximum       1            1            4          Q       200±5μs   
               maximum       1            2            1          b      71.0±0.6μs 
               maximum       1            2            1          B       72.4±2μs  
               maximum       1            2            1          h      71.7±0.4μs 
               maximum       1            2            1          H      71.6±0.3μs 
               maximum       1            2            1          i      72.4±0.3μs 
               maximum       1            2            1          I      72.0±0.5μs 
               maximum       1            2            1          l      92.9±0.7μs 
               maximum       1            2            1          L       93.2±1μs  
               maximum       1            2            1          q       94.2±2μs  
               maximum       1            2            1          Q       95.4±1μs  
               maximum       1            2            2          b      71.0±0.4μs 
               maximum       1            2            2          B       74.4±1μs  
               maximum       1            2            2          h       74.5±3μs  
               maximum       1            2            2          H      71.8±0.4μs 
               maximum       1            2            2          i      74.1±0.2μs 
               maximum       1            2            2          I       80.4±6μs  
               maximum       1            2            2          l       128±3μs   
               maximum       1            2            2          L       132±3μs   
               maximum       1            2            2          q      130±0.6μs  
               maximum       1            2            2          Q       132±3μs   
               maximum       1            2            4          b      75.1±0.6μs 
               maximum       1            2            4          B      75.4±0.4μs 
               maximum       1            2            4          h      72.3±0.3μs 
               maximum       1            2            4          H      71.9±0.2μs 
               maximum       1            2            4          i       93.2±1μs  
               maximum       1            2            4          I      92.5±0.5μs 
               maximum       1            2            4          l       240±6μs   
               maximum       1            2            4          L       240±8μs   
               maximum       1            2            4          q       243±4μs   
               maximum       1            2            4          Q       251±10μs  
               maximum       1            4            1          b      70.1±0.4μs 
               maximum       1            4            1          B       73.0±3μs  
               maximum       1            4            1          h      72.0±0.3μs 
               maximum       1            4            1          H      71.9±0.3μs 
               maximum       1            4            1          i      83.0±0.2μs 
               maximum       1            4            1          I       83.1±3μs  
               maximum       1            4            1          l       155±5μs   
               maximum       1            4            1          L       155±6μs   
               maximum       1            4            1          q       155±2μs   
               maximum       1            4            1          Q       154±5μs   
               maximum       1            4            2          b      70.9±0.4μs 
               maximum       1            4            2          B      70.9±0.3μs 
               maximum       1            4            2          h      72.0±0.3μs 
               maximum       1            4            2          H      72.1±0.2μs 
               maximum       1            4            2          i      91.5±0.9μs 
               maximum       1            4            2          I       94.8±4μs  
               maximum       1            4            2          l       209±2μs   
               maximum       1            4            2          L       219±5μs   
               maximum       1            4            2          q       205±4μs   
               maximum       1            4            2          Q       207±9μs   
               maximum       1            4            4          b       76.5±1μs  
               maximum       1            4            4          B      77.4±0.8μs 
               maximum       1            4            4          h      73.0±0.3μs 
               maximum       1            4            4          H       77.2±5μs  
               maximum       1            4            4          i       114±2μs   
               maximum       1            4            4          I       115±3μs   
               maximum       1            4            4          l       331±20μs  
               maximum       1            4            4          L       374±30μs  
               maximum       1            4            4          q       329±3μs   
               maximum       1            4            4          Q       332±10μs  
               maximum       2            1            1          b      69.9±0.3μs 
               maximum       2            1            1          B      75.8±0.4μs 
               maximum       2            1            1          h       77.7±2μs  
               maximum       2            1            1          H      71.9±0.5μs 
               maximum       2            1            1          i       79.1±7μs  
               maximum       2            1            1          I       78.4±7μs  
               maximum       2            1            1          l       94.8±1μs  
               maximum       2            1            1          L       99.6±4μs  
               maximum       2            1            1          q      93.4±0.6μs 
               maximum       2            1            1          Q       95.6±1μs  
               maximum       2            1            2          b       73.2±2μs  
               maximum       2            1            2          B       73.1±2μs  
               maximum       2            1            2          h      71.4±0.2μs 
               maximum       2            1            2          H      71.4±0.1μs 
               maximum       2            1            2          i      72.8±0.2μs 
               maximum       2            1            2          I      73.0±0.5μs 
               maximum       2            1            2          l       126±2μs   
               maximum       2            1            2          L       128±2μs   
               maximum       2            1            2          q       132±4μs   
               maximum       2            1            2          Q       127±2μs   
               maximum       2            1            4          b      75.2±0.3μs 
               maximum       2            1            4          B      74.9±0.3μs 
               maximum       2            1            4          h      72.7±0.8μs 
               maximum       2            1            4          H      72.1±0.6μs 
               maximum       2            1            4          i      93.9±0.8μs 
               maximum       2            1            4          I       102±6μs   
               maximum       2            1            4          l       238±10μs  
               maximum       2            1            4          L       259±6μs   
               maximum       2            1            4          q       244±20μs  
               maximum       2            1            4          Q       250±20μs  
               maximum       2            2            1          b      70.0±0.3μs 
               maximum       2            2            1          B       73.8±3μs  
               maximum       2            2            1          h      71.9±0.3μs 
               maximum       2            2            1          H       75.6±4μs  
               maximum       2            2            1          i       77.1±6μs  
               maximum       2            2            1          I       81.2±8μs  
               maximum       2            2            1          l       120±4μs   
               maximum       2            2            1          L       122±3μs   
               maximum       2            2            1          q       125±5μs   
               maximum       2            2            1          Q      120±0.7μs  
               maximum       2            2            2          b      71.2±0.6μs 
               maximum       2            2            2          B      71.2±0.4μs 
               maximum       2            2            2          h      71.6±0.2μs 
               maximum       2            2            2          H      71.3±0.2μs 
               maximum       2            2            2          i       77.9±8μs  
               maximum       2            2            2          I       77.8±1μs  
               maximum       2            2            2          l       161±4μs   
               maximum       2            2            2          L       157±2μs   
               maximum       2            2            2          q       156±2μs   
               maximum       2            2            2          Q       159±4μs   
               maximum       2            2            4          b      74.7±0.4μs 
               maximum       2            2            4          B      75.5±0.4μs 
               maximum       2            2            4          h      72.1±0.3μs 
               maximum       2            2            4          H      72.5±0.5μs 
               maximum       2            2            4          i      101±0.6μs  
               maximum       2            2            4          I      103±0.8μs  
               maximum       2            2            4          l       303±10μs  
               maximum       2            2            4          L       304±30μs  
               maximum       2            2            4          q       299±10μs  
               maximum       2            2            4          Q       277±8μs   
               maximum       2            4            1          b      70.9±0.4μs 
               maximum       2            4            1          B       72.9±4μs  
               maximum       2            4            1          h      72.1±0.3μs 
               maximum       2            4            1          H      72.2±0.3μs 
               maximum       2            4            1          i       88.8±2μs  
               maximum       2            4            1          I       92.5±6μs  
               maximum       2            4            1          l       191±2μs   
               maximum       2            4            1          L       200±8μs   
               maximum       2            4            1          q       194±5μs   
               maximum       2            4            1          Q       195±3μs   
               maximum       2            4            2          b      71.1±0.3μs 
               maximum       2            4            2          B       74.2±3μs  
               maximum       2            4            2          h      73.1±0.4μs 
               maximum       2            4            2          H      72.7±0.4μs 
               maximum       2            4            2          i       103±3μs   
               maximum       2            4            2          I       107±1μs   
               maximum       2            4            2          l       252±4μs   
               maximum       2            4            2          L       258±10μs  
               maximum       2            4            2          q       251±3μs   
               maximum       2            4            2          Q       252±6μs   
               maximum       2            4            4          b      76.0±0.4μs 
               maximum       2            4            4          B       77.6±1μs  
               maximum       2            4            4          h      73.8±0.5μs 
               maximum       2            4            4          H      73.9±0.5μs 
               maximum       2            4            4          i       130±2μs   
               maximum       2            4            4          I       132±3μs   
               maximum       2            4            4          l       398±4μs   
               maximum       2            4            4          L       405±5μs   
               maximum       2            4            4          q       406±20μs  
               maximum       2            4            4          Q       403±5μs   
               maximum       4            1            1          b      70.4±0.3μs 
               maximum       4            1            1          B      70.9±0.4μs 
               maximum       4            1            1          h      71.8±0.3μs 
               maximum       4            1            1          H      72.2±0.2μs 
               maximum       4            1            1          i      84.8±0.5μs 
               maximum       4            1            1          I      85.5±0.9μs 
               maximum       4            1            1          l       156±4μs   
               maximum       4            1            1          L       152±5μs   
               maximum       4            1            1          q       152±4μs   
               maximum       4            1            1          Q       154±4μs   
               maximum       4            1            2          b      71.5±0.3μs 
               maximum       4            1            2          B      70.8±0.3μs 
               maximum       4            1            2          h      72.1±0.2μs 
               maximum       4            1            2          H      72.1±0.3μs 
               maximum       4            1            2          i       96.3±7μs  
               maximum       4            1            2          I       94.6±2μs  
               maximum       4            1            2          l       213±8μs   
               maximum       4            1            2          L       220±10μs  
               maximum       4            1            2          q       222±20μs  
               maximum       4            1            2          Q       213±6μs   
               maximum       4            1            4          b       76.4±1μs  
               maximum       4            1            4          B       77.2±2μs  
               maximum       4            1            4          h       75.9±3μs  
               maximum       4            1            4          H       75.9±3μs  
               maximum       4            1            4          i       122±2μs   
               maximum       4            1            4          I       119±4μs   
               maximum       4            1            4          l       371±20μs  
               maximum       4            1            4          L       365±10μs  
               maximum       4            1            4          q       384±30μs  
               maximum       4            1            4          Q       380±20μs  
               maximum       4            2            1          b      72.1±0.9μs 
               maximum       4            2            1          B       71.8±2μs  
               maximum       4            2            1          h       79.9±4μs  
               maximum       4            2            1          H       78.9±5μs  
               maximum       4            2            1          i       97.7±4μs  
               maximum       4            2            1          I       92.2±3μs  
               maximum       4            2            1          l       204±5μs   
               maximum       4            2            1          L       190±3μs   
               maximum       4            2            1          q       206±3μs   
               maximum       4            2            1          Q       204±10μs  
               maximum       4            2            2          b       74.3±3μs  
               maximum       4            2            2          B       72.9±2μs  
               maximum       4            2            2          h       80.4±4μs  
               maximum       4            2            2          H       75.1±1μs  
               maximum       4            2            2          i       99.3±3μs  
               maximum       4            2            2          I       102±4μs   
               maximum       4            2            2          l       255±10μs  
               maximum       4            2            2          L       262±8μs   
               maximum       4            2            2          q       277±10μs  
               maximum       4            2            2          Q       263±6μs   
               maximum       4            2            4          b       78.7±3μs  
               maximum       4            2            4          B       78.6±3μs  
               maximum       4            2            4          h       75.4±2μs  
               maximum       4            2            4          H       78.8±5μs  
               maximum       4            2            4          i       140±3μs   
               maximum       4            2            4          I       129±2μs   
               maximum       4            2            4          l       423±20μs  
               maximum       4            2            4          L       442±20μs  
               maximum       4            2            4          q       439±10μs  
               maximum       4            2            4          Q       457±20μs  
               maximum       4            4            1          b       72.9±3μs  
               maximum       4            4            1          B       79.1±2μs  
               maximum       4            4            1          h      73.6±0.6μs 
               maximum       4            4            1          H       77.6±4μs  
               maximum       4            4            1          i       111±7μs   
               maximum       4            4            1          I       108±4μs   
               maximum       4            4            1          l       310±10μs  
               maximum       4            4            1          L       295±9μs   
               maximum       4            4            1          q       297±3μs   
               maximum       4            4            1          Q       302±9μs   
               maximum       4            4            2          b      71.0±0.3μs 
               maximum       4            4            2          B       77.2±6μs  
               maximum       4            4            2          h       80.7±7μs  
               maximum       4            4            2          H       87.4±7μs  
               maximum       4            4            2          i       118±4μs   
               maximum       4            4            2          I       116±2μs   
               maximum       4            4            2          l       374±10μs  
               maximum       4            4            2          L       384±20μs  
               maximum       4            4            2          q       389±20μs  
               maximum       4            4            2          Q       389±20μs  
               maximum       4            4            4          b       79.4±3μs  
               maximum       4            4            4          B       77.2±2μs  
               maximum       4            4            4          h      76.8±0.8μs 
               maximum       4            4            4          H      76.2±0.7μs 
               maximum       4            4            4          i       161±4μs   
               maximum       4            4            4          I       161±2μs   
               maximum       4            4            4          l       564±10μs  
               maximum       4            4            4          L       627±40μs  
               maximum       4            4            4          q       616±50μs  
               maximum       4            4            4          Q       565±50μs  
               minimum       1            1            1          b     8.43±0.08μs 
               minimum       1            1            1          B      8.45±0.2μs 
               minimum       1            1            1          h     17.7±0.06μs 
               minimum       1            1            1          H      17.7±0.1μs 
               minimum       1            1            1          i      33.7±0.2μs 
               minimum       1            1            1          I     33.7±0.08μs 
               minimum       1            1            1          l      66.1±0.5μs 
               minimum       1            1            1          L      66.4±0.6μs 
               minimum       1            1            1          q      66.3±0.4μs 
               minimum       1            1            1          Q      66.4±0.3μs 
               minimum       1            1            2          b      71.0±0.3μs 
               minimum       1            1            2          B       79.8±3μs  
               minimum       1            1            2          h      71.3±0.3μs 
               minimum       1            1            2          H       81.6±4μs  
               minimum       1            1            2          i      72.3±0.6μs 
               minimum       1            1            2          I       84.5±7μs  
               minimum       1            1            2          l      102±0.2μs  
               minimum       1            1            2          L      102±0.9μs  
               minimum       1            1            2          q       110±2μs   
               minimum       1            1            2          Q      102±0.6μs  
               minimum       1            1            4          b      75.0±0.5μs 
               minimum       1            1            4          B      79.3±0.3μs 
               minimum       1            1            4          h       74.2±3μs  
               minimum       1            1            4          H      78.0±0.5μs 
               minimum       1            1            4          i       86.8±2μs  
               minimum       1            1            4          I       88.3±1μs  
               minimum       1            1            4          l       194±6μs   
               minimum       1            1            4          L       195±4μs   
               minimum       1            1            4          q       197±5μs   
               minimum       1            1            4          Q       199±4μs   
               minimum       1            2            1          b       73.3±2μs  
               minimum       1            2            1          B      77.4±0.5μs 
               minimum       1            2            1          h       74.4±3μs  
               minimum       1            2            1          H      77.5±0.2μs 
               minimum       1            2            1          i      83.0±0.7μs 
               minimum       1            2            1          I       78.9±6μs  
               minimum       1            2            1          l       95.3±2μs  
               minimum       1            2            1          L      99.8±0.9μs 
               minimum       1            2            1          q       93.0±5μs  
               minimum       1            2            1          Q       106±6μs   
               minimum       1            2            2          b       72.8±1μs  
               minimum       1            2            2          B      79.4±0.3μs 
               minimum       1            2            2          h      72.3±0.6μs 
               minimum       1            2            2          H       82.0±5μs  
               minimum       1            2            2          i       79.7±6μs  
               minimum       1            2            2          I      79.6±0.3μs 
               minimum       1            2            2          l       128±5μs   
               minimum       1            2            2          L       129±5μs   
               minimum       1            2            2          q       126±2μs   
               minimum       1            2            2          Q       127±3μs   
               minimum       1            2            4          b       77.8±2μs  
               minimum       1            2            4          B      79.2±0.2μs 
               minimum       1            2            4          h      71.9±0.4μs 
               minimum       1            2            4          H      78.2±0.3μs 
               minimum       1            2            4          i      92.7±0.3μs 
               minimum       1            2            4          I      94.2±0.7μs 
               minimum       1            2            4          l       244±6μs   
               minimum       1            2            4          L       243±7μs   
               minimum       1            2            4          q       245±10μs  
               minimum       1            2            4          Q       238±6μs   
               minimum       1            4            1          b      70.7±0.3μs 
               minimum       1            4            1          B      77.9±0.2μs 
               minimum       1            4            1          h       72.3±2μs  
               minimum       1            4            1          H      78.3±0.4μs 
               minimum       1            4            1          i      83.4±0.2μs 
               minimum       1            4            1          I      92.1±0.6μs 
               minimum       1            4            1          l       155±3μs   
               minimum       1            4            1          L       149±3μs   
               minimum       1            4            1          q       154±6μs   
               minimum       1            4            1          Q       151±5μs   
               minimum       1            4            2          b       75.9±3μs  
               minimum       1            4            2          B      85.6±0.3μs 
               minimum       1            4            2          h      72.2±0.3μs 
               minimum       1            4            2          H      78.1±0.4μs 
               minimum       1            4            2          i      89.7±0.3μs 
               minimum       1            4            2          I       97.0±1μs  
               minimum       1            4            2          l       206±7μs   
               minimum       1            4            2          L       208±2μs   
               minimum       1            4            2          q       211±4μs   
               minimum       1            4            2          Q       212±10μs  
               minimum       1            4            4          b       75.4±2μs  
               minimum       1            4            4          B       80.2±2μs  
               minimum       1            4            4          h      72.5±0.5μs 
               minimum       1            4            4          H      78.5±0.2μs 
               minimum       1            4            4          i       113±2μs   
               minimum       1            4            4          I       118±2μs   
               minimum       1            4            4          l       350±30μs  
               minimum       1            4            4          L       336±7μs   
               minimum       1            4            4          q       335±30μs  
               minimum       1            4            4          Q       370±30μs  
               minimum       2            1            1          b      70.8±0.3μs 
               minimum       2            1            1          B      78.0±0.4μs 
               minimum       2            1            1          h      71.1±0.1μs 
               minimum       2            1            1          H      77.5±0.4μs 
               minimum       2            1            1          i      72.4±0.3μs 
               minimum       2            1            1          I      78.7±0.4μs 
               minimum       2            1            1          l       95.1±1μs  
               minimum       2            1            1          L       102±2μs   
               minimum       2            1            1          q       94.1±1μs  
               minimum       2            1            1          Q      101±0.8μs  
               minimum       2            1            2          b      70.7±0.1μs 
               minimum       2            1            2          B      78.4±0.5μs 
               minimum       2            1            2          h      71.8±0.5μs 
               minimum       2            1            2          H      77.6±0.2μs 
               minimum       2            1            2          i       81.0±7μs  
               minimum       2            1            2          I       78.8±1μs  
               minimum       2            1            2          l       130±5μs   
               minimum       2            1            2          L       138±3μs   
               minimum       2            1            2          q       132±4μs   
               minimum       2            1            2          Q       127±2μs   
               minimum       2            1            4          b      75.0±0.3μs 
               minimum       2            1            4          B       79.7±1μs  
               minimum       2            1            4          h      72.3±0.2μs 
               minimum       2            1            4          H      78.6±0.5μs 
               minimum       2            1            4          i      94.2±0.7μs 
               minimum       2            1            4          I       104±8μs   
               minimum       2            1            4          l       246±10μs  
               minimum       2            1            4          L       255±6μs   
               minimum       2            1            4          q       249±7μs   
               minimum       2            1            4          Q       241±8μs   
               minimum       2            2            1          b       71.2±2μs  
               minimum       2            2            1          B      78.5±0.3μs 
               minimum       2            2            1          h       75.8±4μs  
               minimum       2            2            1          H       83.0±6μs  
               minimum       2            2            1          i       80.6±7μs  
               minimum       2            2            1          I       98.1±2μs  
               minimum       2            2            1          l       125±9μs   
               minimum       2            2            1          L       127±5μs   
               minimum       2            2            1          q       115±2μs   
               minimum       2            2            1          Q       128±5μs   
               minimum       2            2            2          b      71.4±0.3μs 
               minimum       2            2            2          B      79.2±0.4μs 
               minimum       2            2            2          h      71.5±0.2μs 
               minimum       2            2            2          H      77.8±0.4μs 
               minimum       2            2            2          i       75.7±1μs  
               minimum       2            2            2          I       80.3±1μs  
               minimum       2            2            2          l       163±4μs   
               minimum       2            2            2          L       158±2μs   
               minimum       2            2            2          q      159±0.9μs  
               minimum       2            2            2          Q       162±4μs   
               minimum       2            2            4          b       76.0±1μs  
               minimum       2            2            4          B      79.6±0.2μs 
               minimum       2            2            4          h       78.2±5μs  
               minimum       2            2            4          H      79.6±0.4μs 
               minimum       2            2            4          i       102±1μs   
               minimum       2            2            4          I       103±2μs   
               minimum       2            2            4          l       302±10μs  
               minimum       2            2            4          L       298±10μs  
               minimum       2            2            4          q       295±10μs  
               minimum       2            2            4          Q       303±3μs   
               minimum       2            4            1          b      70.7±0.6μs 
               minimum       2            4            1          B      78.8±0.3μs 
               minimum       2            4            1          h      72.7±0.2μs 
               minimum       2            4            1          H      78.6±0.8μs 
               minimum       2            4            1          i       87.6±4μs  
               minimum       2            4            1          I       102±6μs   
               minimum       2            4            1          l       196±3μs   
               minimum       2            4            1          L       194±5μs   
               minimum       2            4            1          q       197±9μs   
               minimum       2            4            1          Q       194±4μs   
               minimum       2            4            2          b      71.4±0.4μs 
               minimum       2            4            2          B       80.2±5μs  
               minimum       2            4            2          h       77.7±5μs  
               minimum       2            4            2          H       85.7±7μs  
               minimum       2            4            2          i      94.0±0.8μs 
               minimum       2            4            2          I       102±2μs   
               minimum       2            4            2          l       250±2μs   
               minimum       2            4            2          L       247±5μs   
               minimum       2            4            2          q       253±6μs   
               minimum       2            4            2          Q       256±10μs  
               minimum       2            4            4          b       76.0±1μs  
               minimum       2            4            4          B       83.7±4μs  
               minimum       2            4            4          h       74.1±1μs  
               minimum       2            4            4          H       79.7±1μs  
               minimum       2            4            4          i       129±3μs   
               minimum       2            4            4          I       131±3μs   
               minimum       2            4            4          l       405±5μs   
               minimum       2            4            4          L       400±8μs   
               minimum       2            4            4          q       408±10μs  
               minimum       2            4            4          Q       411±7μs   
               minimum       4            1            1          b      70.8±0.6μs 
               minimum       4            1            1          B      79.0±0.5μs 
               minimum       4            1            1          h      72.0±0.3μs 
               minimum       4            1            1          H       84.8±7μs  
               minimum       4            1            1          i       89.0±4μs  
               minimum       4            1            1          I       97.0±5μs  
               minimum       4            1            1          l       160±9μs   
               minimum       4            1            1          L       164±5μs   
               minimum       4            1            1          q       153±3μs   
               minimum       4            1            1          Q       160±2μs   
               minimum       4            1            2          b      71.5±0.5μs 
               minimum       4            1            2          B       84.4±5μs  
               minimum       4            1            2          h      72.2±0.3μs 
               minimum       4            1            2          H      78.3±0.3μs 
               minimum       4            1            2          i       92.3±3μs  
               minimum       4            1            2          I       98.9±1μs  
               minimum       4            1            2          l       207±4μs   
               minimum       4            1            2          L       209±7μs   
               minimum       4            1            2          q       217±7μs   
               minimum       4            1            2          Q       224±8μs   
               minimum       4            1            4          b       78.2±3μs  
               minimum       4            1            4          B       79.5±4μs  
               minimum       4            1            4          h       80.8±6μs  
               minimum       4            1            4          H      79.2±0.8μs 
               minimum       4            1            4          i       117±3μs   
               minimum       4            1            4          I       121±3μs   
               minimum       4            1            4          l       405±30μs  
               minimum       4            1            4          L       368±20μs  
               minimum       4            1            4          q       413±20μs  
               minimum       4            1            4          Q       365±10μs  
               minimum       4            2            1          b       76.1±4μs  
               minimum       4            2            1          B       84.6±5μs  
               minimum       4            2            1          h       79.5±6μs  
               minimum       4            2            1          H       84.9±7μs  
               minimum       4            2            1          i      89.0±0.4μs 
               minimum       4            2            1          I       98.6±2μs  
               minimum       4            2            1          l       196±10μs  
               minimum       4            2            1          L       198±3μs   
               minimum       4            2            1          q       195±8μs   
               minimum       4            2            1          Q       207±9μs   
               minimum       4            2            2          b      72.0±0.5μs 
               minimum       4            2            2          B      79.6±0.4μs 
               minimum       4            2            2          h      73.4±0.5μs 
               minimum       4            2            2          H       88.6±7μs  
               minimum       4            2            2          i      97.7±0.7μs 
               minimum       4            2            2          I       103±1μs   
               minimum       4            2            2          l       261±10μs  
               minimum       4            2            2          L       265±10μs  
               minimum       4            2            2          q       258±6μs   
               minimum       4            2            2          Q       249±9μs   
               minimum       4            2            4          b       79.2±2μs  
               minimum       4            2            4          B      79.7±0.6μs 
               minimum       4            2            4          h       80.9±7μs  
               minimum       4            2            4          H      79.3±0.4μs 
               minimum       4            2            4          i       127±2μs   
               minimum       4            2            4          I       131±2μs   
               minimum       4            2            4          l       424±30μs  
               minimum       4            2            4          L       404±30μs  
               minimum       4            2            4          q       400±6μs   
               minimum       4            2            4          Q       406±3μs   
               minimum       4            4            1          b      70.3±0.3μs 
               minimum       4            4            1          B      78.1±0.2μs 
               minimum       4            4            1          h      72.2±0.1μs 
               minimum       4            4            1          H      78.4±0.5μs 
               minimum       4            4            1          i       104±1μs   
               minimum       4            4            1          I       110±3μs   
               minimum       4            4            1          l       313±10μs  
               minimum       4            4            1          L       298±9μs   
               minimum       4            4            1          q       308±10μs  
               minimum       4            4            1          Q       289±10μs  
               minimum       4            4            2          b       76.5±5μs  
               minimum       4            4            2          B       85.5±6μs  
               minimum       4            4            2          h      73.6±0.2μs 
               minimum       4            4            2          H      78.9±0.3μs 
               minimum       4            4            2          i       118±4μs   
               minimum       4            4            2          I       133±9μs   
               minimum       4            4            2          l       367±10μs  
               minimum       4            4            2          L       369±20μs  
               minimum       4            4            2          q       368±10μs  
               minimum       4            4            2          Q       388±20μs  
               minimum       4            4            4          b      76.5±0.4μs 
               minimum       4            4            4          B      78.5±0.2μs 
               minimum       4            4            4          h      75.8±0.4μs 
               minimum       4            4            4          H      80.6±0.5μs 
               minimum       4            4            4          i       166±3μs   
               minimum       4            4            4          I       167±5μs   
               minimum       4            4            4          l       546±20μs  
               minimum       4            4            4          L       577±30μs  
               minimum       4            4            4          q       554±40μs  
               minimum       4            4            4          Q       564±30μs  
              ========= ============ ============ ============ ======= =============

[ 73.53%] ··· bench_ufunc_strides.LogisticRegression.time_train                                                                                                                            ok
[ 73.53%] ··· =============== ============
                   dtype                  
              --------------- ------------
               numpy.float32   2.67±0.01s 
               numpy.float64   4.42±0.02s 
              =============== ============

[ 74.26%] ··· bench_ufunc_strides.Mandelbrot.time_mandel                                                                                                                           12.7±0.02s
[ 75.00%] ··· bench_ufunc_strides.Unary.time_ufunc                                                                                                                                         ok
[ 75.00%] ··· ========================= =========== ============= ============= ============= ============= ============= =============
              --                                                                     stride_out / dtype                                
              ------------------------------------- -----------------------------------------------------------------------------------
                        ufunc            stride_in      1 / f         1 / d         2 / f         2 / d         4 / f         4 / d    
              ========================= =========== ============= ============= ============= ============= ============= =============
                  <ufunc 'absolute'>         1        25.4±0.1μs    49.1±0.4μs    44.5±0.3μs     81.5±1μs     74.4±0.9μs     164±4μs   
                  <ufunc 'absolute'>         2        37.0±0.2μs    68.0±0.9μs     58.7±1μs     102±0.7μs     83.1±0.6μs     202±4μs   
                  <ufunc 'absolute'>         4        54.2±0.7μs     114±1μs      71.6±0.6μs     165±6μs       103±2μs       284±6μs   
                   <ufunc 'arccos'>          1         989±7μs     1.53±0.06ms     990±8μs     1.53±0.03ms   1.02±0.01ms   1.65±0.06ms 
                   <ufunc 'arccos'>          2         985±5μs     1.55±0.02ms     995±10μs    1.53±0.01ms     999±20μs    1.57±0.07ms 
                   <ufunc 'arccos'>          4         981±5μs     1.58±0.02ms     989±4μs     1.58±0.03ms   1.04±0.03ms   1.59±0.01ms 
                  <ufunc 'arccosh'>          1       2.05±0.01ms   2.33±0.02ms   2.06±0.02ms   2.37±0.03ms   2.08±0.02ms   2.43±0.06ms 
                  <ufunc 'arccosh'>          2       2.07±0.01ms   2.35±0.02ms   2.07±0.02ms   2.42±0.03ms   2.09±0.05ms   2.40±0.08ms 
                  <ufunc 'arccosh'>          4       2.09±0.02ms   2.37±0.02ms   2.08±0.03ms   2.42±0.02ms   2.09±0.03ms    2.36±0.1ms 
                   <ufunc 'arcsin'>          1         840±7μs     1.48±0.01ms     839±10μs    1.50±0.04ms     860±10μs    1.65±0.08ms 
                   <ufunc 'arcsin'>          2         846±6μs     1.49±0.01ms     843±10μs    1.50±0.01ms     847±10μs    1.55±0.06ms 
                   <ufunc 'arcsin'>          4         853±20μs    1.54±0.03ms     842±4μs     1.54±0.01ms     847±10μs    1.55±0.06ms 
                  <ufunc 'arcsinh'>          1       2.31±0.01ms   2.82±0.03ms   2.30±0.02ms   2.85±0.04ms   2.38±0.04ms    3.02±0.1ms 
                  <ufunc 'arcsinh'>          2       2.30±0.01ms   2.83±0.02ms   2.32±0.02ms   2.84±0.02ms   2.35±0.03ms    2.86±0.1ms 
                  <ufunc 'arcsinh'>          4       2.29±0.02ms   2.82±0.02ms   2.31±0.02ms   2.84±0.02ms   2.34±0.03ms   2.84±0.02ms 
                   <ufunc 'arctan'>          1         1.09±0ms    1.98±0.01ms   1.09±0.01ms   2.01±0.03ms   1.13±0.02ms   2.13±0.07ms 
                   <ufunc 'arctan'>          2       1.09±0.01ms   1.98±0.01ms   1.10±0.01ms   1.99±0.01ms   1.10±0.01ms   2.02±0.07ms 
                   <ufunc 'arctan'>          4       1.09±0.01ms   2.01±0.02ms     1.12±0ms    2.02±0.01ms   1.10±0.02ms   2.05±0.01ms 
                  <ufunc 'arctanh'>          1       2.30±0.01ms   2.52±0.02ms   2.29±0.02ms   2.56±0.05ms   2.37±0.05ms    2.74±0.1ms 
                  <ufunc 'arctanh'>          2       2.31±0.06ms   2.56±0.02ms   2.32±0.01ms   2.54±0.03ms   2.31±0.02ms    2.57±0.1ms 
                  <ufunc 'arctanh'>          4       2.30±0.01ms   2.57±0.03ms   2.31±0.02ms   2.61±0.02ms   2.33±0.04ms   2.56±0.04ms 
                    <ufunc 'cbrt'>           1       1.97±0.01ms   2.19±0.01ms   1.99±0.01ms   2.25±0.05ms   2.04±0.04ms   2.38±0.08ms 
                    <ufunc 'cbrt'>           2       1.97±0.01ms   2.23±0.03ms   1.98±0.01ms   2.22±0.02ms   1.98±0.01ms   2.27±0.09ms 
                    <ufunc 'cbrt'>           4       1.97±0.01ms   2.24±0.02ms   1.98±0.01ms   2.25±0.02ms   2.01±0.04ms   2.28±0.03ms 
                    <ufunc 'ceil'>           1        25.4±0.2μs    49.4±0.2μs    45.1±0.3μs     81.0±2μs      75.0±1μs      163±2μs   
                    <ufunc 'ceil'>           2        37.2±0.9μs    68.4±0.3μs     57.1±1μs      101±3μs       82.8±1μs      193±6μs   
                    <ufunc 'ceil'>           4        55.1±0.7μs     113±3μs      70.9±0.7μs     161±4μs       104±3μs       285±10μs  
               <ufunc 'conjugate'> (0)       1        45.6±0.3μs    53.9±0.4μs     50.1±2μs      83.2±1μs      75.4±1μs      166±4μs   
               <ufunc 'conjugate'> (0)       2         48.8±1μs     69.4±0.6μs    53.9±0.7μs     101±1μs       83.2±1μs      193±8μs   
               <ufunc 'conjugate'> (0)       4        56.9±0.3μs     114±5μs      70.0±0.9μs     163±3μs       103±2μs       284±2μs   
                    <ufunc 'cos'>            1         147±1μs       832±8μs       210±5μs       846±20μs      209±5μs       911±40μs  
                    <ufunc 'cos'>            2         217±3μs       838±4μs       276±2μs       840±10μs     274±0.7μs      873±30μs  
                    <ufunc 'cos'>            4         227±3μs       852±6μs       286±1μs       875±10μs      286±5μs       872±7μs   
                    <ufunc 'cosh'>           1       1.31±0.01ms   1.33±0.01ms     1.31±0ms    1.34±0.03ms   1.36±0.03ms   1.45±0.05ms 
                    <ufunc 'cosh'>           2         1.31±0ms    1.34±0.01ms   1.32±0.01ms   1.34±0.02ms   1.31±0.01ms   1.37±0.05ms 
                    <ufunc 'cosh'>           4       1.31±0.01ms   1.36±0.01ms   1.31±0.01ms     1.36±0ms    1.31±0.03ms   1.37±0.01ms 
                  <ufunc 'deg2rad'>          1         176±1μs       176±1μs      176±0.9μs      201±3μs       205±4μs       342±10μs  
                  <ufunc 'deg2rad'>          2        176±0.9μs      177±2μs       176±2μs      203±0.9μs      201±2μs       345±8μs   
                  <ufunc 'deg2rad'>          4        177±0.3μs      189±2μs       176±1μs       241±3μs       208±4μs       404±10μs  
                  <ufunc 'degrees'>          1         176±1μs       176±2μs       176±2μs       201±3μs       205±3μs       340±8μs   
                  <ufunc 'degrees'>          2         176±1μs      179±0.6μs      178±2μs       204±1μs       201±2μs       345±6μs   
                  <ufunc 'degrees'>          4        177±0.6μs      188±1μs       176±2μs       236±5μs       209±5μs       395±10μs  
                    <ufunc 'exp'>            1         154±3μs       608±5μs       4.45±0ms      614±10μs    4.64±0.09ms     660±20μs  
                    <ufunc 'exp'>            2         222±6μs       605±5μs     5.03±0.05ms     616±6μs     5.03±0.01ms     648±20μs  
                    <ufunc 'exp'>            4         222±1μs       621±6μs       5.01±0ms      622±3μs      5.06±0.1ms     642±10μs  
                    <ufunc 'exp2'>           1         329±3μs       450±3μs       331±2μs       456±9μs       344±6μs       491±20μs  
                    <ufunc 'exp2'>           2         333±1μs       455±6μs       337±3μs       461±5μs       335±4μs       477±20μs  
                    <ufunc 'exp2'>           4        342±0.9μs      483±5μs       348±2μs       496±3μs       348±8μs       528±2μs   
                   <ufunc 'expm1'>           1       1.10±0.01ms   1.06±0.01ms   1.09±0.01ms   1.07±0.02ms   1.13±0.02ms   1.16±0.05ms 
                   <ufunc 'expm1'>           2       1.09±0.01ms   1.07±0.01ms   1.10±0.01ms   1.07±0.01ms   1.09±0.01ms   1.09±0.04ms 
                   <ufunc 'expm1'>           4       1.12±0.01ms   1.08±0.01ms   1.10±0.05ms   1.09±0.02ms   1.10±0.02ms   1.10±0.01ms 
                    <ufunc 'fabs'>           1         177±2μs       176±2μs       178±1μs       200±3μs       205±3μs       342±10μs  
                    <ufunc 'fabs'>           2         181±4μs      177±0.8μs      176±2μs       201±1μs       200±2μs       345±9μs   
                    <ufunc 'fabs'>           4         178±1μs       189±1μs       177±1μs       237±3μs       210±4μs       408±6μs   
                   <ufunc 'floor'>           1        25.5±0.1μs    49.4±0.2μs    45.4±0.5μs    82.6±0.8μs    75.2±0.7μs     162±6μs   
                   <ufunc 'floor'>           2        36.7±0.1μs    68.0±0.4μs     57.8±1μs     101±0.9μs     84.1±0.8μs     199±5μs   
                   <ufunc 'floor'>           4        54.2±0.3μs     112±5μs       71.6±1μs      165±6μs       102±2μs       288±20μs  
                    <ufunc 'log'>            1         210±7μs       590±4μs     4.51±0.01ms     601±9μs     4.71±0.08ms     642±30μs  
                    <ufunc 'log'>            2         279±5μs       595±6μs     5.10±0.06ms     593±4μs     5.09±0.02ms     606±30μs  
                    <ufunc 'log'>            4         291±4μs       603±6μs     5.11±0.01ms     607±4μs      5.14±0.1ms     626±9μs   
                   <ufunc 'log10'>           1         786±4μs       1.02±0ms      795±7μs     1.03±0.02ms     820±20μs    1.11±0.04ms 
                   <ufunc 'log10'>           2         787±3μs     1.04±0.01ms     801±5μs     1.03±0.02ms     789±5μs     1.05±0.04ms 
                   <ufunc 'log10'>           4         787±5μs     1.08±0.03ms     795±6μs     1.04±0.01ms     797±20μs    1.05±0.02ms 
                   <ufunc 'log1p'>           1       1.16±0.01ms   1.16±0.01ms   1.15±0.01ms   1.18±0.02ms   1.20±0.03ms   1.27±0.05ms 
                   <ufunc 'log1p'>           2       1.17±0.01ms   1.18±0.01ms   1.16±0.01ms   1.18±0.01ms   1.16±0.01ms   1.20±0.05ms 
                   <ufunc 'log1p'>           4       1.17±0.01ms   1.19±0.01ms   1.18±0.01ms   1.24±0.02ms   1.19±0.03ms   1.22±0.02ms 
                    <ufunc 'log2'>           1         380±2μs       816±3μs       377±1μs       848±20μs      392±8μs       885±30μs  
                    <ufunc 'log2'>           2         377±2μs       816±7μs       380±4μs       819±6μs       383±1μs       895±20μs  
                    <ufunc 'log2'>           4         385±7μs       834±5μs       378±1μs       835±6μs       393±9μs       897±70μs  
                <ufunc 'logical_not'>        1        102±0.7μs      128±7μs      142±0.8μs      146±3μs       150±2μs       225±6μs   
                <ufunc 'logical_not'>        2        102±0.6μs      132±1μs      143±0.9μs      158±1μs       146±2μs       252±7μs   
                <ufunc 'logical_not'>        4        109±0.6μs      174±6μs      152±0.9μs      227±10μs      161±1μs       353±10μs  
                  <ufunc 'negative'>         1        26.1±0.8μs    49.4±0.5μs    53.0±0.4μs     83.9±2μs      76.1±1μs      166±5μs   
                  <ufunc 'negative'>         2         54.1±1μs      72.0±1μs     57.6±0.7μs     101±2μs      83.7±0.7μs     200±2μs   
                  <ufunc 'negative'>         4        61.6±0.5μs     116±2μs      73.1±0.3μs     161±4μs       102±3μs       313±20μs  
                  <ufunc 'positive'>         1        46.9±0.8μs    54.2±0.6μs    48.3±0.5μs     82.5±1μs     75.9±0.9μs     167±4μs   
                  <ufunc 'positive'>         2        48.7±0.2μs    70.1±0.7μs    54.8±0.9μs     101±1μs       83.6±1μs      199±7μs   
                  <ufunc 'positive'>         4        57.0±0.5μs     115±3μs      69.9±0.6μs     165±5μs       103±2μs       308±10μs  
                  <ufunc 'rad2deg'>          1        176±0.8μs      177±1μs       182±5μs       202±3μs       205±4μs       341±10μs  
                  <ufunc 'rad2deg'>          2        177±0.7μs      176±1μs       178±2μs       204±4μs       201±3μs       343±8μs   
                  <ufunc 'rad2deg'>          4        178±0.9μs      197±6μs       178±1μs       236±4μs       207±5μs       411±7μs   
                  <ufunc 'radians'>          1         177±1μs       177±2μs      176±0.4μs      202±3μs       204±3μs       341±10μs  
                  <ufunc 'radians'>          2         176±1μs       179±1μs       177±2μs       203±3μs       200±2μs       344±10μs  
                  <ufunc 'radians'>          4         177±1μs       189±2μs       177±1μs       236±4μs       206±5μs       409±10μs  
                 <ufunc 'reciprocal'>        1        53.7±0.3μs     206±2μs       53.4±1μs      209±4μs      75.8±0.9μs     225±9μs   
                 <ufunc 'reciprocal'>        2        53.6±0.2μs     206±2μs       64.0±2μs      207±1μs      83.4±0.7μs     225±9μs   
                 <ufunc 'reciprocal'>        4        54.9±0.3μs     212±2μs      72.6±0.3μs     218±6μs       103±2μs       301±10μs  
                    <ufunc 'rint'>           1        25.7±0.2μs    50.2±0.4μs    45.6±0.4μs     81.9±1μs      75.1±2μs      164±4μs   
                    <ufunc 'rint'>           2        36.8±0.4μs    68.8±0.8μs     58.2±1μs      101±2μs       82.7±1μs      195±5μs   
                    <ufunc 'rint'>           4        54.3±0.5μs     113±2μs       71.4±1μs      161±7μs       102±1μs       292±9μs   
                    <ufunc 'sign'>           1        69.7±0.3μs    68.8±0.7μs    69.5±0.5μs     84.8±2μs      79.9±2μs      168±3μs   
                    <ufunc 'sign'>           2         71.7±3μs     78.4±0.5μs     70.7±1μs      108±2μs      86.6±0.9μs     200±6μs   
                    <ufunc 'sign'>           4        73.3±0.2μs     124±4μs       80.7±2μs      170±3μs       103±2μs       299±10μs  
                    <ufunc 'sin'>            1         142±1μs       973±9μs       210±5μs       991±20μs      212±7μs     1.06±0.04ms 
                    <ufunc 'sin'>            2         218±9μs       970±6μs       268±2μs       982±6μs      265±0.6μs      993±40μs  
                    <ufunc 'sin'>            4         227±7μs       982±9μs       291±8μs     1.03±0.03ms     284±10μs      1.02±0ms  
                    <ufunc 'sinh'>           1       1.97±0.02ms   2.00±0.02ms   1.96±0.01ms   2.05±0.03ms   2.04±0.03ms   2.16±0.07ms 
                    <ufunc 'sinh'>           2       1.96±0.01ms   2.01±0.02ms   1.97±0.02ms   2.01±0.01ms    2.13±0.3ms   2.05±0.08ms 
                    <ufunc 'sinh'>           4       1.97±0.01ms   2.02±0.01ms   1.97±0.02ms   2.00±0.01ms   2.00±0.03ms   2.03±0.01ms 
                    <ufunc 'sqrt'>           1        53.1±0.1μs    205±0.7μs      54.2±1μs      207±5μs       75.8±1μs      224±7μs   
                    <ufunc 'sqrt'>           2        52.4±0.2μs    205±0.9μs      60.7±2μs      207±4μs      83.2±0.3μs     221±6μs   
                    <ufunc 'sqrt'>           4        54.4±0.1μs     209±1μs       71.8±1μs      217±2μs       102±2μs       281±6μs   
                   <ufunc 'square'>          1         25.4±9μs     49.2±0.3μs    45.0±0.3μs     81.8±1μs      74.3±1μs      164±3μs   
                   <ufunc 'square'>          2        36.6±0.2μs    67.1±0.3μs    56.0±0.9μs    102±0.8μs     82.8±0.9μs     195±8μs   
                   <ufunc 'square'>          4        54.2±0.4μs     113±4μs      70.9±0.8μs     167±4μs       100±2μs       308±10μs  
                    <ufunc 'tan'>            1       1.46±0.01ms   2.31±0.01ms   1.45±0.02ms   2.31±0.04ms   1.50±0.03ms   2.47±0.09ms 
                    <ufunc 'tan'>            2       1.45±0.01ms   2.33±0.02ms   1.45±0.02ms   2.31±0.02ms   1.46±0.01ms   2.36±0.09ms 
                    <ufunc 'tan'>            4       1.48±0.01ms   2.34±0.03ms   1.50±0.02ms   2.34±0.02ms   1.47±0.03ms   2.38±0.04ms 
                    <ufunc 'tanh'>           1         453±2μs     1.64±0.01ms     488±3μs     1.67±0.03ms     507±10μs    1.80±0.07ms 
                    <ufunc 'tanh'>           2         509±1μs     1.72±0.01ms     558±4μs       1.73±0ms      555±3μs     1.77±0.08ms 
                    <ufunc 'tanh'>           4         526±3μs     1.75±0.01ms     557±6μs     1.77±0.02ms     559±10μs    1.80±0.03ms 
                   <ufunc 'trunc'>           1       25.5±0.05μs    49.6±0.1μs    45.5±0.7μs     81.7±1μs      74.8±1μs      162±3μs   
                   <ufunc 'trunc'>           2        36.8±0.2μs    69.4±0.5μs     61.5±2μs      103±2μs       83.8±1μs      199±2μs   
                   <ufunc 'trunc'>           4        55.2±0.8μs     118±5μs       72.9±2μs      165±3μs       104±2μs       312±30μs  
               <ufunc 'conjugate'> (1)       1        45.5±0.2μs    54.1±0.5μs     50.1±2μs      82.6±1μs     75.9±0.4μs     163±3μs   
               <ufunc 'conjugate'> (1)       2        48.5±0.2μs    69.3±0.4μs     53.9±1μs      100±1μs      82.3±0.2μs     198±8μs   
               <ufunc 'conjugate'> (1)       4        57.5±0.9μs     119±5μs      69.6±0.3μs     171±4μs       101±2μs       304±10μs  
                 <ufunc '_ones_like'>        1        35.2±0.2μs    38.4±0.3μs    38.7±0.3μs     65.2±1μs      66.1±1μs      138±5μs   
                 <ufunc '_ones_like'>        2        36.1±0.3μs    38.0±0.2μs    38.7±0.3μs    63.7±0.4μs    63.8±0.3μs     135±3μs   
                 <ufunc '_ones_like'>        4        35.4±0.3μs    38.2±0.2μs    38.9±0.4μs    64.6±0.7μs     64.8±1μs      134±3μs   
              ========================= =========== ============= ============= ============= ============= ============= =============

[ 75.00%] · For numpy commit fd646bd6 <main> (round 2/2):
[ 75.00%] ·· Building for virtualenv-py3.8-Cython..
[ 75.00%] ·· Benchmarking virtualenv-py3.8-Cython
[ 75.74%] ··· bench_ufunc.ArgParsing.time_add_arg_parsing                                                                                                                                  ok
[ 75.74%] ··· =============================================================== =========
                                         arg_kwarg                                     
              --------------------------------------------------------------- ---------
                                   (array(1.), array(2.))                      680±4ns 
                             (array(1.), array(2.), array(3.))                 591±8ns 
                           (array(1.), array(2.), out=array(3.))               670±2ns 
                          (array(1.), array(2.), out=(array(3.),))             657±4ns 
               (array(1.), array(2.), out=array(3.), subok=True, where=True)   685±5ns 
                             (array(1.), array(2.), subok=True)                746±6ns 
                       (array(1.), array(2.), subok=True, where=True)          758±4ns 
                 (array(1.), array(2.), array(3.), subok=True, where=True)     679±9ns 
              =============================================================== =========

[ 76.47%] ··· bench_ufunc.ArgParsingReduce.time_add_reduce_arg_parsing                                                                                                                     ok
[ 76.47%] ··· ====================================================== =============
                                    arg_kwarg                                     
              ------------------------------------------------------ -------------
                                (array([0., 1.]))                     1.37±0.02μs 
                               (array([0., 1.]), 0)                   1.39±0.03μs 
                            (array([0., 1.]), axis=0)                 1.45±0.03μs 
                            (array([0., 1.]), 0, None)                1.43±0.06μs 
                      (array([0., 1.]), axis=0, dtype=None)           1.51±0.08μs 
                      (array([0., 1.]), 0, None, array(0.))           1.22±0.02μs 
               (array([0., 1.]), axis=0, dtype=None, out=array(0.))   1.32±0.04μs 
                         (array([0., 1.]), out=array(0.))             1.27±0.04μs 
              ====================================================== =============

[ 77.21%] ··· bench_ufunc.Broadcast.time_broadcast                                                                                                                                10.5±0.07ms
[ 77.94%] ··· bench_ufunc.Custom.time_and_bool                                                                                                                                    1.70±0.02μs
[ 78.68%] ··· bench_ufunc.Custom.time_nonzero                                                                                                                                     13.1±0.06μs
[ 79.41%] ··· bench_ufunc.Custom.time_not_bool                                                                                                                                    1.49±0.03μs
[ 80.15%] ··· bench_ufunc.Custom.time_or_bool                                                                                                                                     1.64±0.03μs
[ 80.88%] ··· bench_ufunc.CustomArrayFloorDivideInt.time_floor_divide_int                                                                                                                  ok
[ 80.88%] ··· ============== ============= ============ =============
              --                               size                  
              -------------- ----------------------------------------
                  dtype           100         10000        1000000   
              ============== ============= ============ =============
                numpy.int8    1.07±0.01μs   73.9±0.5μs   7.82±0.04ms 
               numpy.int16    1.11±0.01μs   76.1±0.3μs    8.24±0.1ms 
               numpy.int32    1.07±0.01μs   93.8±0.4μs    10.4±0.1ms 
               numpy.int64    1.63±0.02μs   161±0.8μs    17.4±0.04ms 
               numpy.uint8    1.00±0.01μs   38.7±0.1μs   3.79±0.03ms 
               numpy.uint16   1.02±0.01μs   38.9±0.2μs   3.84±0.04ms 
               numpy.uint32   1.03±0.03μs   38.9±0.2μs   3.80±0.03ms 
               numpy.uint64   1.44±0.01μs   82.4±0.3μs    8.20±0.2ms 
              ============== ============= ============ =============

[ 81.62%] ··· bench_ufunc.CustomInplace.time_char_or                                                                                                                               46.2±0.2μs
[ 82.35%] ··· bench_ufunc.CustomInplace.time_char_or_temp                                                                                                                          59.7±0.5μs
[ 83.09%] ··· bench_ufunc.CustomInplace.time_double_add                                                                                                                            56.7±0.1μs
[ 83.82%] ··· bench_ufunc.CustomInplace.time_double_add_temp                                                                                                                       75.7±0.1μs
[ 84.56%] ··· bench_ufunc.CustomInplace.time_float_add                                                                                                                             57.7±0.1μs
[ 85.29%] ··· bench_ufunc.CustomInplace.time_float_add_temp                                                                                                                        76.6±0.1μs
[ 86.03%] ··· bench_ufunc.CustomInplace.time_int_or                                                                                                                                56.0±0.1μs
[ 86.76%] ··· bench_ufunc.CustomInplace.time_int_or_temp                                                                                                                           71.6±0.4μs
[ 87.50%] ··· bench_ufunc.CustomScalar.time_add_scalar2                                                                                                                                    ok
[ 87.50%] ··· =============== =============
                   dtype                   
              --------------- -------------
               numpy.float32   6.15±0.09μs 
               numpy.float64   12.5±0.06μs 
              =============== =============

[ 88.24%] ··· bench_ufunc.CustomScalar.time_divide_scalar2                                                                                                                                 ok
[ 88.24%] ··· =============== =============
                   dtype                   
              --------------- -------------
               numpy.float32   12.5±0.06μs 
               numpy.float64    26.3±0.5μs 
              =============== =============

[ 88.97%] ··· bench_ufunc.CustomScalar.time_divide_scalar2_inplace                                                                                                                         ok
[ 88.97%] ··· =============== =============
                   dtype                   
              --------------- -------------
               numpy.float32   12.7±0.09μs 
               numpy.float64    26.1±0.1μs 
              =============== =============

[ 89.71%] ··· bench_ufunc.CustomScalar.time_less_than_scalar2                                                                                                                              ok
[ 89.71%] ··· =============== =============
                   dtype                   
              --------------- -------------
               numpy.float32   4.08±0.03μs 
               numpy.float64    7.06±0.1μs 
              =============== =============

[ 90.44%] ··· bench_ufunc.CustomScalarFloorDivideInt.time_floor_divide_int                                                                                                                 ok
[ 90.44%] ··· ============== ============= ============= ============= =============
              --                                     divisors                       
              -------------- -------------------------------------------------------
                  dtype            8             -8            43           -43     
              ============== ============= ============= ============= =============
                numpy.int8    3.03±0.07μs   3.02±0.07μs   3.09±0.02μs   3.02±0.06μs 
               numpy.int16    3.34±0.09μs   3.32±0.09μs   3.34±0.03μs   3.31±0.09μs 
               numpy.int32     5.58±0.1μs   5.58±0.07μs    5.81±0.2μs    5.60±0.1μs 
               numpy.int64     14.9±0.1μs    15.0±0.1μs   14.9±0.07μs   14.9±0.08μs 
               numpy.uint8    3.34±0.02μs       n/a        3.32±0.2μs       n/a     
               numpy.uint16    3.79±0.2μs       n/a        3.77±0.1μs       n/a     
               numpy.uint32    5.49±0.1μs       n/a       5.53±0.05μs       n/a     
               numpy.uint64    12.0±0.1μs       n/a        12.1±0.2μs       n/a     
              ============== ============= ============= ============= =============

[ 90.44%] ···· For parameters: <class 'numpy.uint8'>, -8
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint8'>, -43
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint16'>, -8
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint16'>, -43
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint32'>, -8
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint32'>, -43
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint64'>, -8
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')
               
               For parameters: <class 'numpy.uint64'>, -43
               asv: skipped: NotImplementedError('Skipping test for negative divisor with unsigned type')

[ 91.18%] ··· bench_ufunc.Scalar.time_add_scalar                                                                                                                                     547±10ns
[ 91.91%] ··· bench_ufunc.Scalar.time_add_scalar_conv                                                                                                                                807±30ns
[ 92.65%] ··· bench_ufunc.Scalar.time_add_scalar_conv_complex                                                                                                                        831±40ns
[ 93.38%] ··· bench_ufunc.UFunc.time_ufunc_types                                                                                                                                           ok
[ 93.38%] ··· =============== =============
                   ufunc                   
              --------------- -------------
                    abs          806±4μs   
                  absolute       809±2μs   
                    add         386±0.5μs  
                   arccos      6.23±0.02ms 
                  arccosh      5.94±0.02ms 
                   arcsin      6.31±0.01ms 
                  arcsinh      5.78±0.07ms 
                   arctan      3.46±0.01ms 
                  arctan2      1.84±0.01ms 
                  arctanh      3.71±0.01ms 
                bitwise_and     32.7±0.2μs 
                bitwise_not    21.2±0.04μs 
                 bitwise_or     32.7±0.1μs 
                bitwise_xor     32.5±0.2μs 
                    cbrt       1.90±0.04ms 
                    ceil         213±5μs   
                    conj         191±3μs   
                 conjugate      190±0.9μs  
                  copysign      179±0.8μs  
                    cos        6.89±0.04ms 
                    cosh       6.03±0.03ms 
                  deg2rad       239±0.3μs  
                  degrees       240±0.4μs  
                   divide        679±4μs   
                   divmod        1.01±0ms  
                   equal         309±3μs   
                    exp        4.90±0.01ms 
                    exp2       4.71±0.02ms 
                   expm1       9.26±0.04ms 
                    fabs         239±2μs   
                float_power    10.3±0.07ms 
                   floor        210±0.9μs  
                floor_divide     890±3μs   
                    fmax         439±1μs   
                    fmin         430±1μs   
                    fmod         590±2μs   
                   frexp         341±1μs   
                    gcd         215±0.6μs  
                  greater       302±0.9μs  
               greater_equal     301±2μs   
                 heaviside      394±0.9μs  
                   hypot         1.04±0ms  
                   invert       21.3±0.1μs 
                  isfinite       171±1μs   
                   isinf         171±1μs   
                   isnan        146±0.7μs  
                   isnat         352±5ns   
                    lcm          338±2μs   
                   ldexp        213±0.5μs  
                 left_shift     98.7±0.6μs 
                    less        295±0.9μs  
                 less_equal      294±2μs   
                    log        3.13±0.01ms 
                   log10       3.40±0.04ms 
                   log1p       3.34±0.01ms 
                    log2       3.16±0.01ms 
                 logaddexp       350±1μs   
                 logaddexp2      338±2μs   
                logical_and      312±2μs   
                logical_not     192±0.8μs  
                 logical_or      261±1μs   
                logical_xor      371±1μs   
                   matmul      22.6±0.05ms 
                  maximum        415±2μs   
                  minimum       396±0.3μs  
                    mod          693±2μs   
                    modf         433±2μs   
                  multiply      394±0.4μs  
                  negative       222±1μs   
                 nextafter       399±2μs   
                 not_equal       311±2μs   
                  positive       230±3μs   
                   power       10.7±0.03ms 
                  rad2deg       240±0.7μs  
                  radians        240±1μs   
                 reciprocal     724±0.8μs  
                 remainder       694±4μs   
                right_shift     101±0.2μs  
                    rint         414±2μs   
                    sign        246±0.9μs  
                  signbit        91.7±2μs  
                    sin        6.69±0.03ms 
                    sinh       6.59±0.02ms 
                  spacing        428±2μs   
                    sqrt       1.52±0.01ms 
                   square        244±2μs   
                  subtract      380±0.7μs  
                    tan        8.32±0.09ms 
                    tanh       5.77±0.01ms 
                true_divide      679±2μs   
                   trunc         207±1μs   
              =============== =============

[ 94.12%] ··· bench_ufunc_strides.AVX_UFunc_log.time_log                                                                                                                                   ok
[ 94.12%] ··· ======== ============ ============
              --                 dtype          
              -------- -------------------------
               stride       f            d      
              ======== ============ ============
                 1      22.1±0.8μs   60.1±0.6μs 
                 2      29.3±0.4μs   60.5±0.2μs 
                 4      30.5±0.5μs   60.6±0.3μs 
              ======== ============ ============

[ 94.85%] ··· bench_ufunc_strides.AVX_cmplx_arithmetic.time_ufunc                                                                                                                          ok
[ 94.85%] ··· ========== ============= ============= ============= ============= ============= =============
              --                                            stride / dtype                                  
              ---------- -----------------------------------------------------------------------------------
                bfunc        1 / F         1 / D         2 / F         2 / D         4 / F         4 / D    
              ========== ============= ============= ============= ============= ============= =============
                 add      10.8±0.07μs    16.3±0.1μs   13.5±0.09μs   23.9±0.09μs   20.7±0.08μs   38.0±0.06μs 
               subtract   10.9±0.02μs   16.2±0.07μs   13.6±0.08μs   23.8±0.06μs   20.6±0.05μs    38.2±0.1μs 
               multiply   13.3±0.06μs    16.6±0.1μs    14.6±0.2μs    24.1±0.1μs   20.8±0.08μs    39.0±0.2μs 
                divide    42.5±0.07μs    48.5±0.2μs    43.6±0.4μs    48.4±0.3μs    42.9±0.2μs    49.0±0.2μs 
              ========== ============= ============= ============= ============= ============= =============

[ 95.59%] ··· bench_ufunc_strides.AVX_cmplx_funcs.time_ufunc                                                                                                                               ok
[ 95.59%] ··· ============ ============= ============ ============= ============ ============ =============
              --                                            stride / dtype                                 
              ------------ --------------------------------------------------------------------------------
                 bfunc         1 / F        1 / D         2 / F        2 / D        4 / F         4 / D    
              ============ ============= ============ ============= ============ ============ =============
               reciprocal    62.9±0.4μs   72.0±0.2μs    63.3±0.2μs   72.1±0.5μs   63.3±0.3μs    72.6±0.3μs 
                absolute     50.8±0.2μs   51.4±0.3μs    50.9±0.1μs   51.1±0.2μs   51.2±0.2μs    52.0±0.1μs 
                 square      13.0±0.1μs   14.6±0.9μs   13.3±0.09μs   16.9±0.4μs   15.5±0.3μs   22.6±0.08μs 
               conjugate    8.34±0.04μs   11.1±0.2μs    9.49±0.2μs   14.7±0.1μs   12.2±0.1μs   21.6±0.08μs 
              ============ ============= ============ ============= ============ ============ =============

[ 96.32%] ··· bench_ufunc_strides.AVX_ldexp.time_ufunc                                                                                                                                     ok
[ 96.32%] ··· ======= ============ ============ ============
              --                      stride                
              ------- --------------------------------------
               dtype       1            2            4      
              ======= ============ ============ ============
                 f     57.8±0.8μs   57.8±0.5μs   57.7±0.3μs 
                 d     59.8±0.1μs   60.2±0.2μs   60.0±0.2μs 
              ======= ============ ============ ============

[ 97.06%] ··· bench_ufunc_strides.Binary.time_ufunc                                                                                                                                        ok
[ 97.06%] ··· ========= ============ ============ ============= ============ ============ =========== =========== ==========
              --                                                              stride_out / dtype                            
              ----------------------------------- --------------------------------------------------------------------------
                ufunc    stride_in0   stride_in1      1 / f        1 / d        2 / f        2 / d       4 / f      4 / d   
              ========= ============ ============ ============= ============ ============ =========== =========== ==========
               maximum       1            1         37.9±0.2μs   74.3±0.4μs    85.5±5μs    110±0.8μs    126±3μs    220±3μs  
               maximum       1            2         75.8±0.4μs    97.0±1μs     110±8μs      141±1μs     129±4μs    260±10μs 
               maximum       1            4          80.3±3μs     162±3μs      120±3μs     216±0.6μs    145±3μs    353±3μs  
               maximum       2            1         75.9±0.2μs   95.5±0.6μs    105±9μs     140±0.9μs    130±1μs    267±20μs 
               maximum       2            2         129±0.2μs     124±2μs     146±0.3μs     178±2μs     149±1μs    312±10μs 
               maximum       2            4          131±2μs      201±4μs      151±3μs      276±6μs     166±3μs    434±7μs  
               maximum       4            1         80.3±0.4μs    165±7μs     114±0.8μs     216±5μs     144±4μs    400±40μs 
               maximum       4            2         131±0.7μs     205±2μs      156±4μs      269±10μs    171±6μs    421±3μs  
               maximum       4            4          139±4μs      309±20μs     166±4μs      388±9μs     198±4μs    594±50μs 
               minimum       1            1         38.1±0.2μs   74.3±0.3μs    85.5±5μs     114±2μs     127±4μs    220±7μs  
               minimum       1            2        75.7±0.09μs    97.3±1μs     102±10μs     142±4μs     132±5μs    266±6μs  
               minimum       1            4         79.6±0.3μs    159±4μs     112±0.6μs     220±4μs     148±4μs    389±30μs 
               minimum       2            1         74.0±0.2μs   95.9±0.6μs    105±9μs      141±2μs    130±0.3μs   270±10μs 
               minimum       2            2         113±0.1μs     126±2μs     150±0.5μs     173±2μs    152±0.4μs   292±20μs 
               minimum       2            4          118±3μs      207±3μs      158±3μs      276±10μs    170±4μs    453±40μs 
               minimum       4            1         79.4±0.8μs    162±3μs     113±0.9μs     215±2μs     144±2μs    384±10μs 
               minimum       4            2          116±4μs      208±3μs      155±1μs      277±9μs     170±5μs    421±4μs  
               minimum       4            4          127±2μs      300±3μs      165±2μs      391±4μs     203±5μs    569±20μs 
                 fmax        1            1         38.0±0.2μs   74.5±0.8μs    62.8±2μs     112±2μs     94.5±2μs   212±7μs  
                 fmax        1            2         75.7±0.1μs   95.9±0.5μs   97.4±0.2μs    138±1μs    109±0.5μs   266±8μs  
                 fmax        1            4         79.8±0.3μs    164±3μs      113±2μs      225±3μs     138±4μs    349±20μs 
                 fmax        2            1         75.7±0.2μs    95.6±1μs    93.7±0.3μs    138±2μs    108±0.3μs   261±10μs 
                 fmax        2            2         120±0.04μs    123±3μs      144±1μs      173±4μs     145±4μs    317±20μs 
                 fmax        2            4          123±2μs      204±7μs     146±0.5μs     272±9μs     163±5μs    474±40μs 
                 fmax        4            1         79.4±0.3μs    161±3μs      111±2μs      216±4μs     132±1μs    419±20μs 
                 fmax        4            2         123±0.2μs     205±7μs      147±2μs      262±5μs     159±3μs    448±30μs 
                 fmax        4            4          130±1μs      307±20μs     160±3μs      387±6μs     195±5μs    551±10μs 
                 fmin        1            1         37.9±0.2μs   73.5±0.5μs    62.5±2μs    111±0.8μs    93.3±2μs   214±4μs  
                 fmin        1            2        75.7±0.03μs    97.3±2μs    94.2±0.3μs    141±2μs    110±0.8μs   263±6μs  
                 fmin        1            4         80.5±0.5μs    162±2μs     110±0.3μs     216±4μs     132±3μs    351±9μs  
                 fmin        2            1          73.8±1μs    95.8±0.8μs   93.7±0.2μs   140±0.8μs   109±0.9μs   264±10μs 
                 fmin        2            2         128±0.2μs     124±2μs     139±0.7μs     174±2μs     142±2μs    288±5μs  
                 fmin        2            4          134±4μs      203±5μs      145±1μs      268±6μs    166±0.7μs   459±30μs 
                 fmin        4            1          78.1±1μs     162±2μs      113±4μs      215±5μs     131±2μs    383±20μs 
                 fmin        4            2          131±2μs      204±7μs      146±3μs      287±20μs    172±10μs   474±40μs 
                 fmin        4            4          139±3μs      329±10μs     164±9μs      409±20μs    212±20μs   607±40μs 
              ========= ============ ============ ============= ============ ============ =========== =========== ==========

[ 97.79%] ··· bench_ufunc_strides.BinaryInt.time_ufunc                                                                                                                                     ok
[ 97.79%] ··· ========= ============ ============ ============ ======= =============
                ufunc    stride_in0   stride_in1   stride_out   dtype               
              --------- ------------ ------------ ------------ ------- -------------
               maximum       1            1            1          b     8.70±0.04μs 
               maximum       1            1            1          B      9.16±0.3μs 
               maximum       1            1            1          h     17.7±0.08μs 
               maximum       1            1            1          H     17.9±0.07μs 
               maximum       1            1            1          i      33.9±0.2μs 
               maximum       1            1            1          I      33.8±0.1μs 
               maximum       1            1            1          l      65.9±0.5μs 
               maximum       1            1            1          L      66.4±0.3μs 
               maximum       1            1            1          q      66.6±0.2μs 
               maximum       1            1            1          Q      66.7±0.5μs 
               maximum       1            1            2          b       72.7±2μs  
               maximum       1            1            2          B       71.4±1μs  
               maximum       1            1            2          h      71.8±0.6μs 
               maximum       1            1            2          H      71.8±0.2μs 
               maximum       1            1            2          i      72.8±0.9μs 
               maximum       1            1            2          I      72.3±0.2μs 
               maximum       1            1            2          l       111±1μs   
               maximum       1            1            2          L       108±3μs   
               maximum       1            1            2          q       109±5μs   
               maximum       1            1            2          Q      102±0.9μs  
               maximum       1            1            4          b      75.8±0.3μs 
               maximum       1            1            4          B      74.8±0.3μs 
               maximum       1            1            4          h      71.8±0.3μs 
               maximum       1            1            4          H       75.3±2μs  
               maximum       1            1            4          i       93.5±6μs  
               maximum       1            1            4          I      88.3±0.9μs 
               maximum       1            1            4          l       204±1μs   
               maximum       1            1            4          L       202±2μs   
               maximum       1            1            4          q       200±3μs   
               maximum       1            1            4          Q       198±4μs   
               maximum       1            2            1          b      71.0±0.9μs 
               maximum       1            2            1          B      70.4±0.3μs 
               maximum       1            2            1          h      72.1±0.3μs 
               maximum       1            2            1          H      72.2±0.4μs 
               maximum       1            2            1          i      72.8±0.3μs 
               maximum       1            2            1          I      72.8±0.4μs 
               maximum       1            2            1          l      96.9±0.9μs 
               maximum       1            2            1          L      94.0±0.6μs 
               maximum       1            2            1          q      93.9±0.2μs 
               maximum       1            2            1          Q      95.3±0.7μs 
               maximum       1            2            2          b      71.5±0.3μs 
               maximum       1            2            2          B      71.1±0.2μs 
               maximum       1            2            2          h      71.7±0.3μs 
               maximum       1            2            2          H      71.6±0.5μs 
               maximum       1            2            2          i      73.5±0.3μs 
               maximum       1            2            2          I      73.3±0.3μs 
               maximum       1            2            2          l      127±0.4μs  
               maximum       1            2            2          L       134±6μs   
               maximum       1            2            2          q       126±2μs   
               maximum       1            2            2          Q      126±0.7μs  
               maximum       1            2            4          b      75.6±0.3μs 
               maximum       1            2            4          B      75.2±0.3μs 
               maximum       1            2            4          h      72.1±0.2μs 
               maximum       1            2            4          H      72.3±0.3μs 
               maximum       1            2            4          i       97.2±5μs  
               maximum       1            2            4          I      92.6±0.6μs 
               maximum       1            2            4          l       249±7μs   
               maximum       1            2            4          L       252±5μs   
               maximum       1            2            4          q       247±7μs   
               maximum       1            2            4          Q       248±7μs   
               maximum       1            4            1          b      70.4±0.5μs 
               maximum       1            4            1          B      70.4±0.2μs 
               maximum       1            4            1          h      72.3±0.8μs 
               maximum       1            4            1          H       75.7±4μs  
               maximum       1            4            1          i       89.2±4μs  
               maximum       1            4            1          I       84.1±2μs  
               maximum       1            4            1          l       154±3μs   
               maximum       1            4            1          L       152±3μs   
               maximum       1            4            1          q       158±10μs  
               maximum       1            4            1          Q       156±4μs   
               maximum       1            4            2          b      71.6±0.7μs 
               maximum       1            4            2          B       73.2±4μs  
               maximum       1            4            2          h       72.6±4μs  
               maximum       1            4            2          H      72.2±0.6μs 
               maximum       1            4            2          i      90.7±0.7μs 
               maximum       1            4            2          I      90.2±0.9μs 
               maximum       1            4            2          l       206±1μs   
               maximum       1            4            2          L       207±10μs  
               maximum       1            4            2          q       209±6μs   
               maximum       1            4            2          Q       207±10μs  
               maximum       1            4            4          b      75.4±0.4μs 
               maximum       1            4            4          B      75.7±0.5μs 
               maximum       1            4            4          h       77.4±5μs  
               maximum       1            4            4          H      73.3±0.4μs 
               maximum       1            4            4          i       114±2μs   
               maximum       1            4            4          I       119±1μs   
               maximum       1            4            4          l       335±20μs  
               maximum       1            4            4          L       333±20μs  
               maximum       1            4            4          q       359±20μs  
               maximum       1            4            4          Q       385±9μs   
               maximum       2            1            1          b       73.3±3μs  
               maximum       2            1            1          B       73.4±3μs  
               maximum       2            1            1          h      71.5±0.3μs 
               maximum       2            1            1          H      71.4±0.1μs 
               maximum       2            1            1          i      72.4±0.3μs 
               maximum       2            1            1          I      72.2±0.3μs 
               maximum       2            1            1          l      94.2±0.5μs 
               maximum       2            1            1          L      95.3±0.8μs 
               maximum       2            1            1          q       101±4μs   
               maximum       2            1            1          Q       96.0±2μs  
               maximum       2            1            2          b      70.6±0.4μs 
               maximum       2            1            2          B      71.0±0.4μs 
               maximum       2            1            2          h       74.9±4μs  
               maximum       2            1            2          H      71.6±0.2μs 
               maximum       2            1            2          i       80.4±7μs  
               maximum       2            1            2          I      73.4±0.7μs 
               maximum       2            1            2          l       131±3μs   
               maximum       2            1            2          L       128±3μs   
               maximum       2            1            2          q       127±2μs   
               maximum       2            1            2          Q       130±2μs   
               maximum       2            1            4          b      75.8±0.5μs 
               maximum       2            1            4          B      75.7±0.5μs 
               maximum       2            1            4          h      72.7±0.4μs 
               maximum       2            1            4          H      72.5±0.3μs 
               maximum       2            1            4          i       93.0±1μs  
               maximum       2            1            4          I      94.7±0.8μs 
               maximum       2            1            4          l       248±6μs   
               maximum       2            1            4          L       240±8μs   
               maximum       2            1            4          q       247±9μs   
               maximum       2            1            4          Q       243±6μs   
               maximum       2            2            1          b      70.6±0.4μs 
               maximum       2            2            1          B      70.8±0.1μs 
               maximum       2            2            1          h      71.6±0.3μs 
               maximum       2            2            1          H      79.7±0.7μs 
               maximum       2            2            1          i      73.4±0.3μs 
               maximum       2            2            1          I       81.4±8μs  
               maximum       2            2            1          l      118±0.9μs  
               maximum       2            2            1          L       126±7μs   
               maximum       2            2            1          q       122±2μs   
               maximum       2            2            1          Q       120±3μs   
               maximum       2            2            2          b      71.3±0.5μs 
               maximum       2            2            2          B      71.5±0.5μs 
               maximum       2            2            2          h       76.2±4μs  
               maximum       2            2            2          H       71.8±4μs  
               maximum       2            2            2          i      75.5±0.4μs 
               maximum       2            2            2          I      75.9±0.6μs 
               maximum       2            2            2          l       155±2μs   
               maximum       2            2            2          L       157±5μs   
               maximum       2            2            2          q       160±6μs   
               maximum       2            2            2          Q       160±2μs   
               maximum       2            2            4          b      74.7±0.1μs 
               maximum       2            2            4          B       75.9±1μs  
               maximum       2            2            4          h      71.8±0.5μs 
               maximum       2            2            4          H      72.8±0.6μs 
               maximum       2            2            4          i       101±1μs   
               maximum       2            2            4          I       102±1μs   
               maximum       2            2            4          l       272±2μs   
               maximum       2            2            4          L       292±20μs  
               maximum       2            2            4          q       293±20μs  
               maximum       2            2            4          Q       296±20μs  
               maximum       2            4            1          b       75.7±4μs  
               maximum       2            4            1          B      71.1±0.3μs 
               maximum       2            4            1          h      72.2±0.3μs 
               maximum       2            4            1          H      72.4±0.6μs 
               maximum       2            4            1          i       87.7±1μs  
               maximum       2            4            1          I       86.8±1μs  
               maximum       2            4            1          l       198±10μs  
               maximum       2            4            1          L       196±4μs   
               maximum       2            4            1          q       197±3μs   
               maximum       2            4            1          Q       204±6μs   
               maximum       2            4            2          b      71.5±0.2μs 
               maximum       2            4            2          B      71.5±0.6μs 
               maximum       2            4            2          h      73.0±0.4μs 
               maximum       2            4            2          H      72.1±0.3μs 
               maximum       2            4            2          i       95.9±1μs  
               maximum       2            4            2          I       95.1±1μs  
               maximum       2            4            2          l       259±9μs   
               maximum       2            4            2          L       258±9μs   
               maximum       2            4            2          q       256±20μs  
               maximum       2            4            2          Q       249±4μs   
               maximum       2            4            4          b      76.0±0.2μs 
               maximum       2            4            4          B      76.3±0.4μs 
               maximum       2            4            4          h      73.6±0.5μs 
               maximum       2            4            4          H      73.8±0.7μs 
               maximum       2            4            4          i       128±2μs   
               maximum       2            4            4          I      127±0.7μs  
               maximum       2            4            4          l       412±30μs  
               maximum       2            4            4          L       467±20μs  
               maximum       2            4            4          q       465±4μs   
               maximum       2            4            4          Q       431±40μs  
               maximum       4            1            1          b       74.4±4μs  
               maximum       4            1            1          B      71.2±0.1μs 
               maximum       4            1            1          h      72.0±0.2μs 
               maximum       4            1            1          H      71.9±0.2μs 
               maximum       4            1            1          i      85.1±0.4μs 
               maximum       4            1            1          I      85.4±0.6μs 
               maximum       4            1            1          l       151±10μs  
               maximum       4            1            1          L       154±2μs   
               maximum       4            1            1          q       150±6μs   
               maximum       4            1            1          Q       151±3μs   
               maximum       4            1            2          b      71.7±0.5μs 
               maximum       4            1            2          B      71.4±0.2μs 
               maximum       4            1            2          h      72.6±0.5μs 
               maximum       4            1            2          H      73.2±0.6μs 
               maximum       4            1            2          i       94.5±2μs  
               maximum       4            1            2          I      91.7±0.5μs 
               maximum       4            1            2          l       206±7μs   
               maximum       4            1            2          L       208±5μs   
               maximum       4            1            2          q       205±9μs   
               maximum       4            1            2          Q       211±7μs   
               maximum       4            1            4          b       78.1±3μs  
               maximum       4            1            4          B      76.3±0.8μs 
               maximum       4            1            4          h       73.5±1μs  
               maximum       4            1            4          H      72.9±0.6μs 
               maximum       4            1            4          i       119±3μs   
               maximum       4            1            4          I       117±3μs   
               maximum       4            1            4          l       362±7μs   
               maximum       4            1            4          L       356±20μs  
               maximum       4            1            4          q       363±10μs  
               maximum       4            1            4          Q       354±10μs  
               maximum       4            2            1          b      71.1±0.3μs 
               maximum       4            2            1          B       71.2±1μs  
               maximum       4            2            1          h      72.6±0.3μs 
               maximum       4            2            1          H      72.1±0.2μs 
               maximum       4            2            1          i      88.8±0.8μs 
               maximum       4            2            1          I       93.6±5μs  
               maximum       4            2            1          l       188±4μs   
               maximum       4            2            1          L       194±7μs   
               maximum       4            2            1          q       192±4μs   
               maximum       4            2            1          Q       196±5μs   
               maximum       4            2            2          b       76.5±5μs  
               maximum       4            2            2          B      71.4±0.3μs 
               maximum       4            2            2          h      73.2±0.9μs 
               maximum       4            2            2          H      84.6±0.5μs 
               maximum       4            2            2          i       108±4μs   
               maximum       4            2            2          I       101±3μs   
               maximum       4            2            2          l       241±4μs   
               maximum       4            2            2          L       249±10μs  
               maximum       4            2            2          q       247±9μs   
               maximum       4            2            2          Q       252±10μs  
               maximum       4            2            4          b      76.2±0.4μs 
               maximum       4            2            4          B      75.9±0.3μs 
               maximum       4            2            4          h      74.0±0.2μs 
               maximum       4            2            4          H      74.4±0.3μs 
               maximum       4            2            4          i       126±3μs   
               maximum       4            2            4          I       128±3μs   
               maximum       4            2            4          l       402±5μs   
               maximum       4            2            4          L       403±7μs   
               maximum       4            2            4          q       405±5μs   
               maximum       4            2            4          Q       420±30μs  
               maximum       4            4            1          b       71.8±1μs  
               maximum       4            4            1          B      71.0±0.3μs 
               maximum       4            4            1          h      72.9±0.4μs 
               maximum       4            4            1          H      73.2±0.4μs 
               maximum       4            4            1          i       110±2μs   
               maximum       4            4            1          I       107±3μs   
               maximum       4            4            1          l       288±10μs  
               maximum       4            4            1          L       296±20μs  
               maximum       4            4            1          q       296±8μs   
               maximum       4            4            1          Q       303±10μs  
               maximum       4            4            2          b      71.6±0.1μs 
               maximum       4            4            2          B      71.5±0.2μs 
               maximum       4            4            2          h      73.7±0.7μs 
               maximum       4            4            2          H      73.5±0.7μs 
               maximum       4            4            2          i       123±2μs   
               maximum       4            4            2          I       118±2μs   
               maximum       4            4            2          l       366±6μs   
               maximum       4            4            2          L       365±7μs   
               maximum       4            4            2          q       373±7μs   
               maximum       4            4            2          Q       380±30μs  
               maximum       4            4            4          b       79.7±3μs  
               maximum       4            4            4          B       81.8±2μs  
               maximum       4            4            4          h       84.9±9μs  
               maximum       4            4            4          H      76.1±0.7μs 
               maximum       4            4            4          i       162±4μs   
               maximum       4            4            4          I      160±0.7μs  
               maximum       4            4            4          l       544±10μs  
               maximum       4            4            4          L       617±10μs  
               maximum       4            4            4          q       550±10μs  
               maximum       4            4            4          Q       617±4μs   
               minimum       1            1            1          b      8.68±0.2μs 
               minimum       1            1            1          B     8.70±0.09μs 
               minimum       1            1            1          h      17.9±0.1μs 
               minimum       1            1            1          H      17.8±0.1μs 
               minimum       1            1            1          i      33.8±0.2μs 
               minimum       1            1            1          I      33.8±0.2μs 
               minimum       1            1            1          l      66.4±0.9μs 
               minimum       1            1            1          L      66.5±0.6μs 
               minimum       1            1            1          q      66.6±0.6μs 
               minimum       1            1            1          Q      66.8±0.6μs 
               minimum       1            1            2          b       72.3±2μs  
               minimum       1            1            2          B       80.6±2μs  
               minimum       1            1            2          h      71.6±0.2μs 
               minimum       1            1            2          H      77.7±0.3μs 
               minimum       1            1            2          i      71.8±0.3μs 
               minimum       1            1            2          I      78.1±0.3μs 
               minimum       1            1            2          l      102±0.4μs  
               minimum       1            1            2          L       104±2μs   
               minimum       1            1            2          q       104±1μs   
               minimum       1            1            2          Q       104±1μs   
               minimum       1            1            4          b      74.8±0.1μs 
               minimum       1            1            4          B      79.1±0.3μs 
               minimum       1            1            4          h       72.2±2μs  
               minimum       1            1            4          H      78.0±0.3μs 
               minimum       1            1            4          i       92.6±6μs  
               minimum       1            1            4          I       95.8±7μs  
               minimum       1            1            4          l       196±5μs   
               minimum       1            1            4          L       201±7μs   
               minimum       1            1            4          q       196±5μs   
               minimum       1            1            4          Q       199±6μs   
               minimum       1            2            1          b      70.4±0.3μs 
               minimum       1            2            1          B       80.9±3μs  
               minimum       1            2            1          h       71.6±3μs  
               minimum       1            2            1          H      77.6±0.4μs 
               minimum       1            2            1          i      72.2±0.3μs 
               minimum       1            2            1          I      78.3±0.4μs 
               minimum       1            2            1          l      93.4±0.4μs 
               minimum       1            2            1          L      102±0.8μs  
               minimum       1            2            1          q       102±5μs   
               minimum       1            2            1          Q      100±0.8μs  
               minimum       1            2            2          b      71.3±0.2μs 
               minimum       1            2            2          B      79.7±0.3μs 
               minimum       1            2            2          h      71.7±0.2μs 
               minimum       1            2            2          H      78.3±0.4μs 
               minimum       1            2            2          i      73.7±0.4μs 
               minimum       1            2            2          I      78.9±0.5μs 
               minimum       1            2            2          l       131±4μs   
               minimum       1            2            2          L       127±3μs   
               minimum       1            2            2          q       125±2μs   
               minimum       1            2            2          Q       132±6μs   
               minimum       1            2            4          b      75.7±0.3μs 
               minimum       1            2            4          B      79.1±0.3μs 
               minimum       1            2            4          h      72.0±0.4μs 
               minimum       1            2            4          H      78.0±0.2μs 
               minimum       1            2            4          i      92.7±0.3μs 
               minimum       1            2            4          I      94.0±0.8μs 
               minimum       1            2            4          l       246±10μs  
               minimum       1            2            4          L       237±9μs   
               minimum       1            2            4          q       244±5μs   
               minimum       1            2            4          Q       242±10μs  
               minimum       1            4            1          b       73.7±3μs  
               minimum       1            4            1          B      78.4±0.3μs 
               minimum       1            4            1          h      72.0±0.4μs 
               minimum       1            4            1          H      78.1±0.4μs 
               minimum       1            4            1          i       90.8±1μs  
               minimum       1            4            1          I       93.0±4μs  
               minimum       1            4            1          l       153±3μs   
               minimum       1            4            1          L       151±3μs   
               minimum       1            4            1          q       156±3μs   
               minimum       1            4            1          Q       154±2μs   
               minimum       1            4            2          b      71.7±0.4μs 
               minimum       1            4            2          B      79.0±0.4μs 
               minimum       1            4            2          h       81.1±1μs  
               minimum       1            4            2          H      79.0±0.6μs 
               minimum       1            4            2          i      91.0±0.5μs 
               minimum       1            4            2          I      99.2±0.8μs 
               minimum       1            4            2          l       206±2μs   
               minimum       1            4            2          L       208±8μs   
               minimum       1            4            2          q       210±7μs   
               minimum       1            4            2          Q       212±2μs   
               minimum       1            4            4          b      75.6±0.5μs 
               minimum       1            4            4          B       83.0±4μs  
               minimum       1            4            4          h      73.4±0.4μs 
               minimum       1            4            4          H      79.2±0.4μs 
               minimum       1            4            4          i      113±0.3μs  
               minimum       1            4            4          I       118±1μs   
               minimum       1            4            4          l       337±20μs  
               minimum       1            4            4          L       382±20μs  
               minimum       1            4            4          q       366±30μs  
               minimum       1            4            4          Q       382±20μs  
               minimum       2            1            1          b      71.1±0.4μs 
               minimum       2            1            1          B      78.9±0.6μs 
               minimum       2            1            1          h      71.6±0.5μs 
               minimum       2            1            1          H      77.9±0.5μs 
               minimum       2            1            1          i      72.5±0.4μs 
               minimum       2            1            1          I      79.1±0.2μs 
               minimum       2            1            1          l       99.5±2μs  
               minimum       2            1            1          L       102±1μs   
               minimum       2            1            1          q       98.1±1μs  
               minimum       2            1            1          Q       104±2μs   
               minimum       2            1            2          b       73.5±2μs  
               minimum       2            1            2          B       81.8±3μs  
               minimum       2            1            2          h       75.2±3μs  
               minimum       2            1            2          H       82.5±5μs  
               minimum       2            1            2          i      73.8±0.6μs 
               minimum       2            1            2          I      79.0±0.6μs 
               minimum       2            1            2          l       128±1μs   
               minimum       2            1            2          L       128±1μs   
               minimum       2            1            2          q       127±2μs   
               minimum       2            1            2          Q      131±0.5μs  
               minimum       2            1            4          b      75.9±0.8μs 
               minimum       2            1            4          B       84.5±1μs  
               minimum       2            1            4          h       76.9±4μs  
               minimum       2            1            4          H       88.5±5μs  
               minimum       2            1            4          i       95.2±1μs  
               minimum       2            1            4          I      95.5±0.6μs 
               minimum       2            1            4          l       249±10μs  
               minimum       2            1            4          L       244±5μs   
               minimum       2            1            4          q       248±10μs  
               minimum       2            1            4          Q       252±4μs   
               minimum       2            2            1          b      70.8±0.1μs 
               minimum       2            2            1          B      78.1±0.4μs 
               minimum       2            2            1          h      71.4±0.3μs 
               minimum       2            2            1          H       83.7±5μs  
               minimum       2            2            1          i      73.4±0.3μs 
               minimum       2            2            1          I      79.0±0.3μs 
               minimum       2            2            1          l       121±3μs   
               minimum       2            2            1          L      119±0.9μs  
               minimum       2            2            1          q       120±5μs   
               minimum       2            2            1          Q       120±2μs   
               minimum       2            2            2          b       74.2±3μs  
               minimum       2            2            2          B      79.0±0.2μs 
               minimum       2            2            2          h       75.5±4μs  
               minimum       2            2            2          H      77.6±0.5μs 
               minimum       2            2            2          i      76.7±0.8μs 
               minimum       2            2            2          I      80.6±0.9μs 
               minimum       2            2            2          l       161±5μs   
               minimum       2            2            2          L       161±5μs   
               minimum       2            2            2          q       162±9μs   
               minimum       2            2            2          Q       162±9μs   
               minimum       2            2            4          b      75.0±0.3μs 
               minimum       2            2            4          B      79.3±0.6μs 
               minimum       2            2            4          h      72.4±0.5μs 
               minimum       2            2            4          H       80.2±3μs  
               minimum       2            2            4          i       104±5μs   
               minimum       2            2            4          I      103±0.7μs  
               minimum       2            2            4          l       268±6μs   
               minimum       2            2            4          L       272±10μs  
               minimum       2            2            4          q       285±10μs  
               minimum       2            2            4          Q       271±2μs   
               minimum       2            4            1          b       74.7±4μs  
               minimum       2            4            1          B      78.7±0.3μs 
               minimum       2            4            1          h      72.3±0.4μs 
               minimum       2            4            1          H      78.3±0.4μs 
               minimum       2            4            1          i      86.6±0.4μs 
               minimum       2            4            1          I       96.7±2μs  
               minimum       2            4            1          l       194±2μs   
               minimum       2            4            1          L       197±4μs   
               minimum       2            4            1          q       199±6μs   
               minimum       2            4            1          Q       196±3μs   
               minimum       2            4            2          b       72.5±3μs  
               minimum       2            4            2          B      79.4±0.5μs 
               minimum       2            4            2          h      72.4±0.5μs 
               minimum       2            4            2          H       92.2±5μs  
               minimum       2            4            2          i      93.9±0.6μs 
               minimum       2            4            2          I       107±5μs   
               minimum       2            4            2          l       249±9μs   
               minimum       2            4            2          L       268±5μs   
               minimum       2            4            2          q       254±8μs   
               minimum       2            4            2          Q       265±2μs   
               minimum       2            4            4          b       77.7±1μs  
               minimum       2            4            4          B      79.5±0.6μs 
               minimum       2            4            4          h       80.1±6μs  
               minimum       2            4            4          H      79.7±0.4μs 
               minimum       2            4            4          i       127±2μs   
               minimum       2            4            4          I       129±4μs   
               minimum       2            4            4          l       409±30μs  
               minimum       2            4            4          L       398±4μs   
               minimum       2            4            4          q       399±20μs  
               minimum       2            4            4          Q       410±30μs  
               minimum       4            1            1          b       76.8±4μs  
               minimum       4            1            1          B      79.1±0.4μs 
               minimum       4            1            1          h       77.5±5μs  
               minimum       4            1            1          H      78.1±0.3μs 
               minimum       4            1            1          i      85.3±0.3μs 
               minimum       4            1            1          I       92.6±1μs  
               minimum       4            1            1          l       154±3μs   
               minimum       4            1            1          L       154±3μs   
               minimum       4            1            1          q       152±2μs   
               minimum       4            1            1          Q       153±4μs   
               minimum       4            1            2          b       75.9±5μs  
               minimum       4            1            2          B      79.6±0.3μs 
               minimum       4            1            2          h      72.2±0.2μs 
               minimum       4            1            2          H      78.3±0.2μs 
               minimum       4            1            2          i      91.3±0.6μs 
               minimum       4            1            2          I       98.3±1μs  
               minimum       4            1            2          l       207±5μs   
               minimum       4            1            2          L       206±5μs   
               minimum       4            1            2          q       205±2μs   
               minimum       4            1            2          Q       212±7μs   
               minimum       4            1            4          b      75.0±0.3μs 
               minimum       4            1            4          B      79.0±0.3μs 
               minimum       4            1            4          h      73.1±0.7μs 
               minimum       4            1            4          H      79.1±0.9μs 
               minimum       4            1            4          i       117±3μs   
               minimum       4            1            4          I       124±4μs   
               minimum       4            1            4          l       373±20μs  
               minimum       4            1            4          L       364±20μs  
               minimum       4            1            4          q       390±10μs  
               minimum       4            1            4          Q       358±20μs  
               minimum       4            2            1          b      71.3±0.3μs 
               minimum       4            2            1          B      78.8±0.3μs 
               minimum       4            2            1          h      83.7±0.2μs 
               minimum       4            2            1          H       85.2±7μs  
               minimum       4            2            1          i      98.7±0.7μs 
               minimum       4            2            1          I       98.2±4μs  
               minimum       4            2            1          l       191±2μs   
               minimum       4            2            1          L       190±7μs   
               minimum       4            2            1          q       195±5μs   
               minimum       4            2            1          Q       203±8μs   
               minimum       4            2            2          b      72.4±0.5μs 
               minimum       4            2            2          B       85.5±5μs  
               minimum       4            2            2          h      73.4±0.9μs 
               minimum       4            2            2          H       79.8±2μs  
               minimum       4            2            2          i       97.3±1μs  
               minimum       4            2            2          I       104±2μs   
               minimum       4            2            2          l       258±10μs  
               minimum       4            2            2          L       264±8μs   
               minimum       4            2            2          q       248±10μs  
               minimum       4            2            2          Q       265±7μs   
               minimum       4            2            4          b      76.1±0.6μs 
               minimum       4            2            4          B      79.5±0.6μs 
               minimum       4            2            4          h      73.4±0.2μs 
               minimum       4            2            4          H      79.5±0.6μs 
               minimum       4            2            4          i       130±2μs   
               minimum       4            2            4          I       129±1μs   
               minimum       4            2            4          l       412±20μs  
               minimum       4            2            4          L       434±40μs  
               minimum       4            2            4          q       413±30μs  
               minimum       4            2            4          Q       419±30μs  
               minimum       4            4            1          b      71.2±0.4μs 
               minimum       4            4            1          B      78.7±0.1μs 
               minimum       4            4            1          h      72.7±0.3μs 
               minimum       4            4            1          H      78.4±0.3μs 
               minimum       4            4            1          i      103±0.9μs  
               minimum       4            4            1          I       111±3μs   
               minimum       4            4            1          l       299±9μs   
               minimum       4            4            1          L       302±10μs  
               minimum       4            4            1          q       296±6μs   
               minimum       4            4            1          Q       292±10μs  
               minimum       4            4            2          b      71.4±0.2μs 
               minimum       4            4            2          B      79.0±0.3μs 
               minimum       4            4            2          h      72.9±0.3μs 
               minimum       4            4            2          H      79.3±0.6μs 
               minimum       4            4            2          i       115±2μs   
               minimum       4            4            2          I       120±2μs   
               minimum       4            4            2          l       364±3μs   
               minimum       4            4            2          L       371±20μs  
               minimum       4            4            2          q       370±20μs  
               minimum       4            4            2          Q       409±20μs  
               minimum       4            4            4          b      76.6±0.5μs 
               minimum       4            4            4          B      79.3±0.2μs 
               minimum       4            4            4          h      75.8±0.4μs 
               minimum       4            4            4          H      81.4±0.4μs 
               minimum       4            4            4          i       164±3μs   
               minimum       4            4            4          I       162±5μs   
               minimum       4            4            4          l       547±10μs  
               minimum       4            4            4          L       547±20μs  
               minimum       4            4            4          q       547±10μs  
               minimum       4            4            4          Q       540±6μs   
              ========= ============ ============ ============ ======= =============

[ 98.53%] ··· bench_ufunc_strides.LogisticRegression.time_train                                                                                                                            ok
[ 98.53%] ··· =============== ============
                   dtype                  
              --------------- ------------
               numpy.float32   2.65±0.01s 
               numpy.float64    4.41±0s   
              =============== ============

[ 99.26%] ··· bench_ufunc_strides.Mandelbrot.time_mandel                                                                                                                           12.7±0.02s
[100.00%] ··· bench_ufunc_strides.Unary.time_ufunc                                                                                                                                         ok
[100.00%] ··· ========================= =========== ============= ============= ============= ============= ============= =============
              --                                                                     stride_out / dtype                                
              ------------------------------------- -----------------------------------------------------------------------------------
                        ufunc            stride_in      1 / f         1 / d         2 / f         2 / d         4 / f         4 / d    
              ========================= =========== ============= ============= ============= ============= ============= =============
                  <ufunc 'absolute'>         1        25.9±0.4μs    49.7±0.4μs    44.8±0.3μs     81.7±2μs      74.9±1μs      163±4μs   
                  <ufunc 'absolute'>         2         37.9±1μs     68.0±0.6μs    58.3±0.8μs     100±2μs      83.1±0.7μs     204±5μs   
                  <ufunc 'absolute'>         4        54.1±0.3μs     116±3μs       70.8±1μs      169±6μs       102±2μs       287±10μs  
                   <ufunc 'arccos'>          1         988±3μs     1.52±0.01ms     983±5μs     1.54±0.04ms   1.03±0.02ms   1.66±0.07ms 
                   <ufunc 'arccos'>          2         988±8μs     1.52±0.02ms     992±10μs    1.53±0.01ms     988±10μs    1.55±0.07ms 
                   <ufunc 'arccos'>          4       1.00±0.01ms   1.57±0.02ms     993±8μs     1.57±0.02ms     994±20μs    1.59±0.01ms 
                  <ufunc 'arccosh'>          1       2.06±0.01ms   2.35±0.05ms   2.06±0.01ms   2.37±0.03ms   2.08±0.01ms   2.42±0.06ms 
                  <ufunc 'arccosh'>          2       2.08±0.01ms   2.35±0.03ms   2.06±0.02ms   2.36±0.05ms   2.08±0.01ms   2.38±0.04ms 
                  <ufunc 'arccosh'>          4       2.07±0.02ms   2.36±0.02ms   2.07±0.02ms   2.42±0.05ms   2.07±0.01ms    2.36±0.1ms 
                   <ufunc 'arcsin'>          1         836±4μs     1.48±0.01ms     836±5μs     1.50±0.03ms     866±10μs    1.62±0.05ms 
                   <ufunc 'arcsin'>          2         841±9μs     1.50±0.01ms     842±10μs    1.50±0.01ms     840±8μs     1.52±0.08ms 
                   <ufunc 'arcsin'>          4         840±2μs     1.52±0.01ms     847±7μs     1.59±0.05ms     870±20μs    1.66±0.06ms 
                  <ufunc 'arcsinh'>          1       2.31±0.01ms   2.80±0.02ms   2.32±0.03ms   2.81±0.06ms   2.37±0.04ms    3.02±0.1ms 
                  <ufunc 'arcsinh'>          2       2.30±0.01ms   2.82±0.02ms   2.31±0.01ms   2.82±0.02ms   2.31±0.01ms    2.83±0.1ms 
                  <ufunc 'arcsinh'>          4       2.29±0.01ms   2.83±0.03ms   2.33±0.03ms   2.84±0.02ms   2.32±0.05ms   2.84±0.02ms 
                   <ufunc 'arctan'>          1         1.09±0ms    1.98±0.01ms   1.09±0.01ms   1.98±0.04ms   1.13±0.02ms   2.13±0.06ms 
                   <ufunc 'arctan'>          2         1.09±0ms    1.98±0.01ms   1.08±0.01ms   1.99±0.01ms   1.09±0.01ms   2.06±0.06ms 
                   <ufunc 'arctan'>          4       1.11±0.04ms   2.00±0.01ms   1.12±0.02ms   2.02±0.03ms   1.13±0.02ms   2.05±0.02ms 
                  <ufunc 'arctanh'>          1       2.30±0.01ms   2.51±0.02ms   2.30±0.01ms   2.55±0.05ms   2.38±0.05ms    2.76±0.1ms 
                  <ufunc 'arctanh'>          2       2.30±0.01ms   2.56±0.02ms   2.31±0.02ms   2.53±0.06ms   2.30±0.01ms    2.55±0.1ms 
                  <ufunc 'arctanh'>          4       2.30±0.01ms   2.53±0.04ms   2.30±0.02ms   2.64±0.06ms   2.32±0.04ms   2.58±0.02ms 
                    <ufunc 'cbrt'>           1       1.98±0.01ms   2.21±0.01ms     1.97±0ms    2.22±0.04ms   2.04±0.05ms    2.39±0.1ms 
                    <ufunc 'cbrt'>           2       1.97±0.01ms   2.22±0.02ms   1.99±0.01ms   2.25±0.04ms   1.97±0.01ms   2.26±0.09ms 
                    <ufunc 'cbrt'>           4       1.98±0.01ms   2.22±0.02ms   2.00±0.02ms   2.28±0.04ms   2.02±0.03ms   2.25±0.02ms 
                    <ufunc 'ceil'>           1        25.8±0.2μs    49.5±0.1μs    45.2±0.3μs     81.8±1μs      74.6±1μs      161±4μs   
                    <ufunc 'ceil'>           2        37.0±0.6μs    69.4±0.8μs     57.8±1μs      101±3μs       84.9±3μs      204±6μs   
                    <ufunc 'ceil'>           4        55.3±0.9μs     118±3μs      71.3±0.8μs     168±3μs       104±2μs       293±10μs  
               <ufunc 'conjugate'> (0)       1        45.4±0.4μs    54.5±0.4μs    48.4±0.3μs     83.6±1μs     75.6±0.8μs     166±2μs   
               <ufunc 'conjugate'> (0)       2        48.6±0.2μs    69.6±0.9μs    54.5±0.8μs     103±2μs       82.4±1μs      197±7μs   
               <ufunc 'conjugate'> (0)       4        56.8±0.3μs     118±1μs      69.5±0.6μs     165±4μs       104±1μs       317±20μs  
                    <ufunc 'cos'>            1        147±0.6μs      841±4μs       211±5μs       842±20μs      209±5μs       915±40μs  
                    <ufunc 'cos'>            2         218±2μs       840±2μs       274±2μs       841±10μs      274±2μs       864±40μs  
                    <ufunc 'cos'>            4         225±2μs       858±4μs       286±2μs       868±9μs       288±5μs       874±6μs   
                    <ufunc 'cosh'>           1         1.31±0ms    1.35±0.01ms   1.31±0.01ms   1.37±0.02ms   1.37±0.03ms   1.46±0.06ms 
                    <ufunc 'cosh'>           2       1.31±0.02ms   1.35±0.01ms   1.33±0.01ms   1.35±0.02ms   1.33±0.02ms   1.38±0.05ms 
                    <ufunc 'cosh'>           4       1.31±0.01ms   1.36±0.01ms   1.31±0.01ms   1.36±0.01ms   1.32±0.03ms     1.36±0ms  
                  <ufunc 'deg2rad'>          1         176±1μs       176±2μs      176±0.3μs      202±3μs       204±3μs       340±8μs   
                  <ufunc 'deg2rad'>          2         175±1μs       179±2μs       178±2μs       203±2μs       201±2μs       345±8μs   
                  <ufunc 'deg2rad'>          4         178±2μs       190±2μs       177±1μs       248±3μs       207±4μs       408±20μs  
                  <ufunc 'degrees'>          1         176±2μs       176±1μs       175±1μs       200±4μs       204±3μs       341±10μs  
                  <ufunc 'degrees'>          2        176±0.8μs      177±2μs       177±1μs       203±2μs       200±1μs       344±8μs   
                  <ufunc 'degrees'>          4        177±0.6μs      188±6μs      178±0.6μs      235±3μs       207±5μs       410±8μs   
                    <ufunc 'exp'>            1         154±3μs       608±6μs       4.49±0ms      610±10μs    4.67±0.09ms     658±20μs  
                    <ufunc 'exp'>            2         214±2μs       610±5μs     5.05±0.08ms     617±8μs     5.06±0.01ms     632±20μs  
                    <ufunc 'exp'>            4         223±2μs       638±20μs    5.07±0.01ms     628±9μs      5.08±0.1ms     645±20μs  
                    <ufunc 'exp2'>           1         330±1μs       450±2μs       329±2μs       455±9μs       341±7μs       491±20μs  
                    <ufunc 'exp2'>           2         335±2μs       457±3μs       333±4μs       460±7μs       337±3μs       476±20μs  
                    <ufunc 'exp2'>           4         342±2μs       482±3μs       345±4μs       515±10μs      352±6μs       535±20μs  
                   <ufunc 'expm1'>           1       1.09±0.01ms   1.06±0.01ms     1.09±0ms    1.07±0.02ms   1.13±0.02ms   1.16±0.05ms 
                   <ufunc 'expm1'>           2       1.09±0.01ms   1.06±0.01ms   1.10±0.01ms   1.07±0.01ms     1.10±0ms    1.11±0.04ms 
                   <ufunc 'expm1'>           4       1.09±0.01ms   1.07±0.01ms     1.10±0ms    1.10±0.02ms   1.11±0.02ms   1.10±0.02ms 
                    <ufunc 'fabs'>           1         176±1μs       176±1μs       177±1μs       203±2μs       204±3μs       341±10μs  
                    <ufunc 'fabs'>           2        176±0.8μs      178±1μs       177±1μs       202±2μs       201±2μs       341±10μs  
                    <ufunc 'fabs'>           4        176±0.9μs      189±2μs       177±2μs       245±8μs       212±4μs       400±10μs  
                   <ufunc 'floor'>           1        25.8±0.3μs    50.4±0.6μs    45.0±0.3μs     83.1±1μs      74.9±1μs      166±6μs   
                   <ufunc 'floor'>           2        36.9±0.3μs     71.1±2μs      58.1±1μs      107±2μs      83.6±0.7μs     198±3μs   
                   <ufunc 'floor'>           4        54.3±0.3μs     115±2μs      72.9±0.8μs     164±1μs       102±2μs       290±20μs  
                    <ufunc 'log'>            1         211±7μs       592±5μs     4.52±0.01ms     594±10μs     4.71±0.1ms     642±30μs  
                    <ufunc 'log'>            2         274±2μs       597±8μs     5.09±0.07ms     601±20μs    5.09±0.01ms     608±30μs  
                    <ufunc 'log'>            4         289±3μs       607±4μs     5.11±0.02ms     639±30μs     5.13±0.1ms     676±40μs  
                   <ufunc 'log10'>           1         788±2μs     1.03±0.01ms     793±4μs     1.03±0.02ms     821±8μs     1.12±0.05ms 
                   <ufunc 'log10'>           2         792±7μs     1.02±0.01ms     791±5μs     1.03±0.01ms     788±7μs     1.04±0.05ms 
                   <ufunc 'log10'>           4         788±4μs     1.06±0.02ms     791±6μs     1.06±0.03ms     795±20μs    1.05±0.01ms 
                   <ufunc 'log1p'>           1       1.16±0.01ms   1.17±0.01ms   1.15±0.01ms   1.19±0.03ms   1.20±0.03ms   1.27±0.05ms 
                   <ufunc 'log1p'>           2       1.15±0.01ms   1.17±0.01ms   1.19±0.09ms     1.17±0ms    1.16±0.01ms   1.19±0.05ms 
                   <ufunc 'log1p'>           4       1.16±0.01ms   1.18±0.02ms   1.16±0.01ms   1.19±0.01ms   1.18±0.03ms   1.21±0.01ms 
                    <ufunc 'log2'>           1         381±2μs       819±4μs       383±3μs       853±10μs      397±8μs       886±30μs  
                    <ufunc 'log2'>           2         379±2μs       821±6μs       383±4μs       887±20μs      381±3μs       840±40μs  
                    <ufunc 'log2'>           4         392±10μs      895±40μs      382±3μs       842±10μs      396±5μs       849±10μs  
                <ufunc 'logical_not'>        1        101±0.3μs     125±0.6μs      147±5μs       159±10μs      160±5μs       226±5μs   
                <ufunc 'logical_not'>        2        102±0.5μs     133±0.7μs     144±0.8μs      168±8μs       147±1μs       247±10μs  
                <ufunc 'logical_not'>        4        110±0.7μs      172±2μs       152±1μs       225±4μs       160±2μs       358±9μs   
                  <ufunc 'negative'>         1        25.6±0.4μs    49.6±0.4μs    53.6±0.3μs     85.3±2μs      77.0±2μs      166±3μs   
                  <ufunc 'negative'>         2        53.9±0.3μs     72.2±1μs     57.6±0.9μs     103±2μs      83.2±0.6μs     202±5μs   
                  <ufunc 'negative'>         4        61.5±0.2μs     116±3μs       72.1±1μs      167±3μs       104±2μs       306±8μs   
                  <ufunc 'positive'>         1        45.7±0.4μs    54.6±0.5μs    48.8±0.4μs     82.8±2μs      76.8±2μs      163±3μs   
                  <ufunc 'positive'>         2        48.9±0.3μs    69.7±0.5μs    54.0±0.8μs     100±1μs       83.8±1μs      195±4μs   
                  <ufunc 'positive'>         4        57.0±0.5μs     116±2μs      69.7±0.3μs     161±5μs       103±2μs       288±6μs   
                  <ufunc 'rad2deg'>          1         178±1μs       177±1μs       176±1μs       202±3μs       206±3μs       347±20μs  
                  <ufunc 'rad2deg'>          2        176±0.7μs     178±0.9μs      177±2μs       203±2μs       199±1μs       344±6μs   
                  <ufunc 'rad2deg'>          4         177±1μs       187±2μs       177±2μs       233±6μs       211±3μs       395±20μs  
                  <ufunc 'radians'>          1        176±0.7μs     176±0.6μs      177±1μs       201±3μs       204±2μs       339±7μs   
                  <ufunc 'radians'>          2        176±0.4μs      181±3μs       181±2μs       208±4μs       202±2μs       350±10μs  
                  <ufunc 'radians'>          4        181±0.7μs      194±1μs       180±2μs       239±7μs       213±3μs       401±4μs   
                 <ufunc 'reciprocal'>        1        54.7±0.5μs    210±0.9μs     54.3±0.6μs     210±4μs       76.1±1μs      225±8μs   
                 <ufunc 'reciprocal'>        2         54.5±1μs      209±3μs       63.2±2μs      210±2μs      85.6±0.7μs     227±7μs   
                 <ufunc 'reciprocal'>        4        55.3±0.7μs     214±2μs       73.3±1μs      219±3μs       104±2μs       307±7μs   
                    <ufunc 'rint'>           1        26.0±0.2μs    50.4±0.6μs    45.6±0.3μs    83.0±0.9μs     75.2±1μs      168±5μs   
                    <ufunc 'rint'>           2        37.2±0.5μs     70.6±2μs      58.8±2μs      105±4μs      84.3±0.6μs     199±4μs   
                    <ufunc 'rint'>           4        55.0±0.5μs     116±1μs      73.0±0.6μs     169±3μs       103±3μs       298±10μs  
                    <ufunc 'sign'>           1        70.2±0.6μs    70.6±0.8μs    71.8±0.7μs     87.8±2μs      79.6±1μs      169±4μs   
                    <ufunc 'sign'>           2        70.5±0.4μs     79.1±1μs      71.8±1μs      104±1μs       87.8±2μs      205±7μs   
                    <ufunc 'sign'>           4        74.5±0.9μs     124±3μs       80.3±1μs      177±9μs       105±3μs       312±10μs  
                    <ufunc 'sin'>            1         146±7μs       987±10μs      211±6μs     1.01±0.02ms     209±5μs     1.06±0.03ms 
                    <ufunc 'sin'>            2         211±2μs       991±8μs       269±2μs       996±10μs      265±3μs     1.02±0.04ms 
                    <ufunc 'sin'>            4         220±1μs     1.01±0.02ms     282±2μs     1.02±0.03ms     287±10μs    1.05±0.04ms 
                    <ufunc 'sinh'>           1       1.98±0.03ms   2.03±0.02ms   2.00±0.04ms   2.06±0.03ms   2.04±0.03ms   2.17±0.08ms 
                    <ufunc 'sinh'>           2       1.98±0.01ms   2.02±0.02ms   2.00±0.03ms   2.03±0.03ms   1.97±0.03ms   2.08±0.07ms 
                    <ufunc 'sinh'>           4       1.97±0.01ms   2.02±0.02ms   1.97±0.02ms   2.04±0.03ms   1.98±0.04ms   2.04±0.01ms 
                    <ufunc 'sqrt'>           1        53.2±0.3μs     205±1μs      52.7±0.3μs     211±3μs       75.7±1μs      226±7μs   
                    <ufunc 'sqrt'>           2        52.5±0.2μs     207±1μs       62.5±2μs      209±2μs       84.4±1μs      223±6μs   
                    <ufunc 'sqrt'>           4        54.7±0.4μs     211±1μs       72.2±1μs      216±2μs       104±2μs       284±5μs   
                   <ufunc 'square'>          1         25.9±9μs     49.2±0.4μs     45.3±4μs      82.0±1μs      74.6±1μs      165±4μs   
                   <ufunc 'square'>          2        36.9±0.3μs    68.5±0.6μs     56.5±1μs      100±1μs       83.3±1μs      199±7μs   
                   <ufunc 'square'>          4        53.9±0.1μs     111±1μs      70.5±0.8μs     165±6μs       104±1μs       289±10μs  
                    <ufunc 'tan'>            1       1.44±0.01ms   2.30±0.01ms   1.45±0.01ms   2.30±0.06ms   1.51±0.04ms   2.46±0.09ms 
                    <ufunc 'tan'>            2       1.45±0.01ms   2.31±0.01ms   1.45±0.02ms   2.40±0.06ms   1.46±0.01ms   2.35±0.09ms 
                    <ufunc 'tan'>            4       1.47±0.01ms   2.32±0.01ms   1.47±0.01ms   2.33±0.01ms   1.51±0.03ms   2.35±0.02ms 
                    <ufunc 'tanh'>           1         457±8μs     1.64±0.02ms     493±10μs    1.68±0.03ms     508±10μs    1.82±0.07ms 
                    <ufunc 'tanh'>           2         519±7μs     1.72±0.01ms     557±6μs     1.73±0.01ms     556±4μs     1.78±0.08ms 
                    <ufunc 'tanh'>           4         530±6μs     1.75±0.01ms     558±5μs     1.77±0.01ms     559±10μs    1.81±0.01ms 
                   <ufunc 'trunc'>           1        25.7±0.2μs     50.6±1μs     45.1±0.2μs     83.3±2μs      75.4±1μs      161±3μs   
                   <ufunc 'trunc'>           2        36.8±0.3μs     69.7±2μs      57.4±1μs      100±1μs      84.5±0.3μs     195±6μs   
                   <ufunc 'trunc'>           4        53.9±0.1μs     113±4μs       72.4±1μs      162±1μs       103±2μs       290±4μs   
               <ufunc 'conjugate'> (1)       1        47.2±0.9μs    54.3±0.6μs     50.1±2μs      82.5±2μs      75.6±1μs      162±3μs   
               <ufunc 'conjugate'> (1)       2        49.0±0.2μs     70.3±1μs      54.6±1μs      102±1μs       84.1±1μs      198±7μs   
               <ufunc 'conjugate'> (1)       4        57.7±0.7μs     114±7μs       70.1±1μs      165±3μs       102±3μs       296±20μs  
                 <ufunc '_ones_like'>        1        35.3±0.2μs   37.8±0.09μs    38.1±0.2μs    64.8±0.9μs     66.0±1μs      135±5μs   
                 <ufunc '_ones_like'>        2        35.5±0.9μs    38.2±0.2μs    39.2±0.6μs    64.2±0.6μs    64.0±0.4μs     135±4μs   
                 <ufunc '_ones_like'>        4        35.9±0.4μs    38.2±0.2μs    39.0±0.3μs    64.7±0.4μs    64.9±0.9μs     136±3μs   
              ========================= =========== ============= ============= ============= ============= ============= =============

       before           after         ratio
     [fd646bd6]       [ee8c683a]
     <main>           <performance_cache_unicode_array_ufunc>
+      79.0±0.3μs         98.1±2μs     1.24  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 1, 'I')
+      72.2±0.3μs       83.0±0.7μs     1.15  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 2, 1, 'i')
+        95.1±1μs          107±1μs     1.12  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 2, 'I')
+         271±2μs          303±3μs     1.12  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 2, 4, 'Q')
+      71.0±0.3μs         79.1±2μs     1.11  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 4, 1, 'B')
+         126±3μs          140±3μs     1.11  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'i')
+         120±2μs          133±9μs     1.11  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 4, 2, 'I')
+      72.6±0.3μs         79.9±4μs     1.10  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'h')
+      73.2±0.9μs         80.4±4μs     1.10  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'h')
+      73.3±0.3μs         80.4±6μs     1.10  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 1, 2, 2, 'I')
+      72.1±0.2μs         78.9±5μs     1.10  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'H')
+      71.5±0.3μs         77.7±2μs     1.09  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 1, 'h')
+         128±1μs          138±3μs     1.09  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 2, 'L')
+      79.0±0.4μs       85.6±0.3μs     1.08  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'B')
+         405±5μs         439±10μs     1.08  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'q')
+         192±4μs          206±3μs     1.07  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 1, 'q')
+      71.3±0.3μs         76.1±4μs     1.07  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'b')
+      74.4±0.3μs         78.8±5μs     1.06  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 4, 'H')
+       146±0.5μs          155±3μs     1.06  bench_ufunc_strides.Binary.time_ufunc('fmax', 2, 4, 2, 'f')
-     1.43±0.06μs      1.36±0.02μs     0.95  bench_ufunc.ArgParsingReduce.time_add_reduce_arg_parsing((array([0., 1.]), 0, None))
-        74.4±4μs       70.4±0.3μs     0.95  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 1, 1, 'b')
-         107±2μs        101±0.9μs     0.94  bench_ufunc_strides.Unary.time_ufunc(<ufunc 'floor'>, 2, 2, 'd')
-        887±20μs          819±6μs     0.92  bench_ufunc_strides.Unary.time_ufunc(<ufunc 'log2'>, 2, 2, 'd')
-        90.8±1μs       83.4±0.2μs     0.92  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 1, 'i')
-        80.4±7μs       72.8±0.2μs     0.91  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 1, 2, 'i')
-        607±40μs          549±9μs     0.90  bench_ufunc_strides.Binary.time_ufunc('fmin', 4, 4, 4, 'd')
-      98.7±0.7μs       89.0±0.4μs     0.90  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 4, 2, 1, 'i')
-        81.1±1μs       72.2±0.3μs     0.89  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 1, 4, 2, 'h')
-        88.5±5μs       78.6±0.5μs     0.89  bench_ufunc_strides.BinaryInt.time_ufunc('minimum', 2, 1, 4, 'H')
-      84.6±0.5μs         75.1±1μs     0.89  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 4, 2, 2, 'H')
-        467±20μs          405±5μs     0.87  bench_ufunc_strides.BinaryInt.time_ufunc('maximum', 2, 4, 4, 'L')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

@eendebakpt eendebakpt marked this pull request as draft May 1, 2022 21:08
Cache PyUniCode object with value __array_ufunc__ for attribute lookup
@eendebakpt eendebakpt force-pushed the performance_cache_unicode_array_ufunc branch from 8dc6a24 to ee8c683 Compare May 1, 2022 21:31
@eendebakpt eendebakpt changed the title PERF: Improve performance ufunc_generic_fastcall PERF: Improve performance ufunc_generic_fastcall for scalar input May 1, 2022
@eendebakpt eendebakpt marked this pull request as ready for review May 1, 2022 22:29
@seberg
Member
seberg commented May 2, 2022

Oh, I had not realized that this branch kicks in for our scalars. I think this makes sense in any case, although I am curious about cleaning the code up a bit more. Would you be up for that?

The main point is that I am pretty sure there are two things to note here:

  1. maybe_get_attr has probably no meaningful advantage over PyObject_GetAttr when we already have the strings (at least not anymore). The reason is twofold:
    • The comment reads incorrectly. The function suppresses exceptions; it only avoids them if an object has no attributes at all, and no meaningful object like that should exist.
    • The char * version will be cached on the unicode object.
      I suppose our own version might avoid a tiny bit of extra indirections, etc. but I got the feeling that it is probably not meaningful.
  2. This is internal API, that is used only in about 10 places. It would be awesome to just change it everywhere to avoid duplicated code.

There are two ways to go about it: either introduce a static in each function to cache the interned string, or add it here and/or here as necessary. Either seems fine to me.

I don't want to overstretch you though. So if you like, I could also try to do the larger refactor and you can review it instead?

@eendebakpt
Contributor Author

@seberg I can take a shot at refactoring. There is an issue though with both PyObject_GetAttr and the current maybe_get_attr implementation. For the numpy scalars there is no attribute __array_ufunc__, which means that PyObject_GetAttr generates an exception that has to be cleared later (PyObject_HasAttr combines that). Generating and clearing the exception is quite expensive.

A screenshot of profiling results from calculating np.sqrt(np.float64(1.1)):

[image: sqrt-profile]

About 37% of the time in ufunc_generic_fastcall is spent in PyUFuncOverride_GetNonDefaultArrayUfunc. About 10% is in PyUnicode_InternFromString (this is resolved by this PR), and about 22% is in generating a formatted exception for the missing attribute __array_ufunc__. The functions PyObject_GetAttr and PyObject_HasAttr both suffer from this as well.
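
The cost of a failed lookup can be illustrated at the Python level. The following is only a rough sketch under stated assumptions: `WithAttr`, `WithoutAttr` and `lookup_on_type` are hypothetical stand-ins (not NumPy code), and the timings it prints say nothing precise about the C path profiled above.

```python
import timeit


class WithAttr:
    """Hypothetical stand-in for a type that defines the special attribute."""
    __array_ufunc__ = "handler"


class WithoutAttr:
    """Hypothetical stand-in for a type that lacks it (like the NumPy scalars)."""


def lookup_on_type(obj, name="__array_ufunc__"):
    """Return the attribute found on type(obj), or None if it is missing.

    A failed getattr raises AttributeError, which is caught and discarded
    here. That is the "generate, then clear" pattern described above.
    """
    try:
        return getattr(type(obj), name)
    except AttributeError:
        return None


present, missing = WithAttr(), WithoutAttr()
t_present = timeit.timeit(lambda: lookup_on_type(present), number=200_000)
t_missing = timeit.timeit(lambda: lookup_on_type(missing), number=200_000)
print(f"attribute present: {t_present:.3f} s, attribute missing: {t_missing:.3f} s")
```

On most systems the missing case should come out slower, since an exception object has to be created and thrown away on every call, but the exact ratio depends on the Python version.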

There are some options to eliminate the exception generation:

  1. Add a fast return for the numpy scalar types (e.g. np.float64) just as there is for np.ndarray. See https://github.com/numpy/numpy/blob/main/numpy/core/src/common/ufunc_override.c#L30

  2. a. Implement a specialized version of PyObject_GetAttr without the exception generation in numpy
    b. The same, but try to implement it upstream in cpython

  3. Set __array_ufunc__ to None on numpy scalars as getting an existing attribute is much faster than finding out an attribute does not exist. This does require numpy being able to handle the case where __array_ufunc__ is None

Since I am new to numpy, I cannot see the implications of these options (and perhaps there are more), so I welcome your thoughts on this.

@eendebakpt
Contributor Author

@seberg The attribute lookup is slow because numpy looks for the attribute on the type and not on the instance:

/*
 * Lookup a special method, following the python approach of looking up
 * on the type object, rather than on the instance itself.
 *
 * Assumes that the special method is a numpy-specific one, so does not look
 * at builtin types, nor does it look at a base ndarray.
 *
 * In future, could be made more like _Py_LookupSpecial
 */
static NPY_INLINE PyObject *
PyArray_LookupSpecial(PyObject *obj, char const *name)
...
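
For readers less familiar with the C side, the behaviour that comment describes can be sketched in Python. This is an illustration only: `lookup_special` and `_BASIC_TYPES` are hypothetical names, and the set of skipped builtin types is an assumption rather than the exact list NumPy uses.

```python
# Hypothetical stand-in for the builtin types that the C helper skips.
_BASIC_TYPES = frozenset(
    {int, float, complex, str, bytes, bool, list, tuple, dict, set, type(None)}
)


def lookup_special(obj, name):
    """Look up `name` on type(obj), never on the instance itself."""
    tp = type(obj)
    if tp in _BASIC_TYPES:
        # Builtins cannot define NumPy-specific special methods.
        return None
    return getattr(tp, name, None)


class Plain:
    pass


class Custom:
    def __array_ufunc__(self, *args, **kwargs):
        return NotImplemented


p = Plain()
# An attribute set on the instance is deliberately ignored by the lookup.
p.__array_ufunc__ = lambda *args, **kwargs: None

print(lookup_special(p, "__array_ufunc__"))         # instance attribute: not found
print(lookup_special(Custom(), "__array_ufunc__"))  # class attribute: found
```

This mirrors how Python itself treats special methods such as `__add__`: they are resolved on the type, so patching an instance has no effect.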

The following benchmark shows what happens:

import numpy as np
x=np.float64(1.1)

%timeit hasattr(x, '__array_ufunc__')
%timeit hasattr(x, 'all')
%timeit hasattr(np.float64, 'all')
%timeit hasattr(np.float64, '__array_ufunc__') 

Result:

53.9 ns ± 0.65 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
67.2 ns ± 0.488 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
93.3 ns ± 0.251 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
268 ns ± 0.569 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

The difference in execution times is because in the first 2 tests we follow this path:

https://github.com/python/cpython/blob/364ed9409269fb321dc4eafdea677c09a4bc0d8d/Objects/object.c#L952

and in the last two we follow

  1. https://github.com/python/cpython/blob/364ed9409269fb321dc4eafdea677c09a4bc0d8d/Objects/object.c#L961
  2. https://github.com/python/cpython/blob/364ed9409269fb321dc4eafdea677c09a4bc0d8d/Objects/typeobject.c#L4411
  3. https://github.com/python/cpython/blob/364ed9409269fb321dc4eafdea677c09a4bc0d8d/Objects/typeobject.c#L3902

Not sure why the attribute lookup for types should be slower, I will look into this.

@seberg
Member
seberg commented May 2, 2022

About the earlier options:

  1. A fast return may work. It does need to check the super class, so it is not nearly as fast as an "exact" check, though. But since we likely got an arbitrary object that may be OK at that point.
  2. Not sure a special GetAttr is possible? Since getattr is a method?
  3. None already has the meaning of "ufuncs not supported"; it should work to set it to the same value as ndarray.__array_ufunc__ or another singleton though.

Looking up on the type is correct, I am surprised that looking up on the instance is so much faster! Squinting at the Python code, it might be to do with metaclass handling, but these don't even have a metaclass (well besides type itself).
I guess there might be some way to improve that (in Python or NumPy), but I need to look at it for longer.
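
To illustrate why looking up on the type is the right behavior, here is a rough Python analogue of what the comment on PyArray_LookupSpecial describes. It is a sketch, not the real implementation: the helper name `lookup_special`, the tuple of skipped builtin types, and the two demo classes are all assumptions made for this example, and the actual C function differs in detail.

```python
# Hypothetical Python analogue; the real PyArray_LookupSpecial is written in C.
# The exact set of builtin types that gets skipped is an assumption.
_BUILTIN_TYPES = (
    bool, int, float, complex, str, bytes,
    list, tuple, dict, set, frozenset, type(None),
)


def lookup_special(obj, name):
    """Return attribute `name` found via type(obj), or None if it is absent."""
    tp = type(obj)
    if tp in _BUILTIN_TYPES:
        # Builtin types are assumed never to define NumPy-specific special methods.
        return None
    return getattr(tp, name, None)


class OnType:
    __array_ufunc__ = "type-level"  # marker value, defined on the class


class OnInstanceOnly:
    def __init__(self):
        # Set on the instance only, so a type-based lookup does not see it.
        self.__array_ufunc__ = "instance-level"


print(lookup_special(OnType(), "__array_ufunc__"))
print(lookup_special(OnInstanceOnly(), "__array_ufunc__"))
print(lookup_special(1.5, "__array_ufunc__"))
```

An attribute set only on the instance is visible to a plain hasattr on the object but is ignored by the type-based lookup, which matches how Python itself resolves special methods.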


But regardless of all those other things, storing the interned string somewhere and working with it exclusively still seems like a good start that is independent of the other things you found?
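
The idea of creating the attribute name once and reusing it for every lookup can be illustrated in Python with sys.intern. This is only an analogy: the PR works with PyUnicode objects on the C side, and the names `ARRAY_UFUNC_NAME` and `has_array_ufunc` are made up for this sketch.

```python
import sys

# Created once at import time and reused for every lookup (hypothetical name).
ARRAY_UFUNC_NAME = sys.intern("__array_ufunc__")


def has_array_ufunc(obj):
    """Check type(obj) using the cached, interned attribute name."""
    return hasattr(type(obj), ARRAY_UFUNC_NAME)


# A name assembled at runtime is an equal string; interning it returns the
# object that was cached above rather than a second copy.
dynamic_name = "".join(["__array_", "ufunc__"])
same_object = sys.intern(dynamic_name) is ARRAY_UFUNC_NAME
print(same_object)
print(has_array_ufunc(1.0))
```

In C the saving comes from not building a new string object on every call; the Python version only shows that interning yields a single shared object.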

@eendebakpt
Contributor Author

The third option works:

import numpy as np
from numpy import sin, cos, sqrt
import time

class myfloat(np.float64):
    pass
myfloat.__array_ufunc__ = np.ndarray.__array_ufunc__


def test(my_class):
    for jj in range(1000):
        v=my_class(np.random.rand())    
        for kk in range(2000):
            w=sqrt(v) + cos(v) + sin(v)

t0=time.perf_counter()
test(np.float64)
dt1=time.perf_counter()-t0
print(dt1)

t0=time.perf_counter()
test(myfloat)
dt2=time.perf_counter()-t0
print(dt2)

print(f'factor {dt1/dt2:.2f} faster')

This results in a 5% speedup (and perhaps more when added on the C side). I will first complete refactoring the unicode string conversion and then continue on the attribute lookup.

@seberg
Member
seberg commented May 3, 2022

Yeah, but I had not double checked that it will not have any effect on binary operators in weird subclassing situations. Not that this is likely to matter in practice :/. Unfortunately, defining __array_ufunc__ would tell NumPy that x + myobj does not need to call myobj.__radd__. So we would need to introduce a new singleton to get that right (or say it is OK with a release note, maybe).
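
The concern can be pictured with a simplified pure-Python model of how a binary operator decides what to do with the other operand, based on the documented meaning of __array_ufunc__. Everything here is an assumption for illustration: the function name `binop_strategy`, the three labels, and the demo classes do not exist in NumPy, and the real C logic handles more cases (subclasses, in-place operators, __array_priority__).

```python
_MISSING = object()


def binop_strategy(other):
    """Classify how an ndarray binary operator would treat `other` (simplified)."""
    attr = getattr(type(other), "__array_ufunc__", _MISSING)
    if attr is None:
        # Explicit opt-out: return NotImplemented so Python tries other.__radd__.
        return "defer"
    if attr is _MISSING:
        # No __array_ufunc__ at all: older priority-based rules apply (not modelled).
        return "legacy"
    # __array_ufunc__ is defined: the operation goes through the ufunc machinery.
    return "ufunc"


class OptsOut:
    __array_ufunc__ = None

    def __radd__(self, other):
        return "OptsOut.__radd__"


class PlainScalarLike:
    def __radd__(self, other):
        return "PlainScalarLike.__radd__"


class Patched(PlainScalarLike):
    # Stand-in for copying ndarray.__array_ufunc__ onto a subclass.
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return NotImplemented


for cls in (OptsOut, PlainScalarLike, Patched):
    print(cls.__name__, "->", binop_strategy(cls()))
```

In this model, adding __array_ufunc__ to a class moves it from the "legacy" branch to the "ufunc" branch, which is why its __radd__ would no longer be consulted first.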

The alternative would be to use PyArray_CheckAnyScalarExact which may well be fast enough in practice.

Squinting at it, if it is important to not slow down unknown/arbitrary objects in some of these places, we could even add a bloom filter (a bloom filter is like a mini set that can only tell you that something may be included). But I doubt that is worthwhile, more of a fun thought ;).
It might for example make sense for "May be a Python builtin object", only then we need to check them all.
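
For the curious, a tiny Bloom filter over type objects could look like the following. This is a toy sketch of the general technique, not a proposal for NumPy: the class name `TinyBloom`, the 64-bit size, the two derived bit positions, and the list of registered types are all arbitrary choices made here.

```python
class TinyBloom:
    """Fixed-size Bloom filter over Python type objects (illustrative only)."""

    def __init__(self, nbits=64):
        self.nbits = nbits
        self.bits = 0

    def _positions(self, tp):
        h = hash(tp)
        # Derive two bit positions from a single hash value.
        return (h % self.nbits, (h // self.nbits) % self.nbits)

    def add(self, tp):
        for pos in self._positions(tp):
            self.bits |= 1 << pos

    def may_contain(self, tp):
        # True can be a false positive; False is always exact.
        return all((self.bits >> pos) & 1 for pos in self._positions(tp))


builtin_filter = TinyBloom()
for tp in (int, float, complex, str, bytes, list, tuple, dict):
    builtin_filter.add(tp)


def may_be_builtin(obj):
    """False: definitely not a registered builtin. True: do the exact checks."""
    return builtin_filter.may_contain(type(obj))


print(may_be_builtin(3.14))
```

A registered type always passes the filter, so the exact (slower) checks are only skipped for objects the filter rules out.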

Let's focus on the other things first! Interesting CPython issue, so it is the error formatting mainly/only? I guess CPython would have to add a fast-path for type there to avoid the error...

@seberg
Member
seberg commented May 3, 2022

Ah, so you know what I was looking at. The binary operator logic in question is the logic following here:

attr = PyArray_LookupSpecial(other, "__array_ufunc__");

…ance

Refactor code so the unicode names are interned
@seberg seberg changed the title PERF: Improve performance ufunc_generic_fastcall for scalar input PERF: Improve performance of special attribute lookups May 3, 2022
Frankly, we should not use a header anymore there, it should be a C
file now, but that is better for another PR/day.
@seberg
Member
seberg commented May 3, 2022

Going to put this in once CI is happy, thanks @eendebakpt! We should have a few benchmarks that notice this just fine. At least one of the array-coercion benchmarks, which does np.array(range([1])), gets faster because converting arbitrary objects will be faster (also np.array(object()) is much faster).

Further, the benchmarks for __array_function__ should notice this as well. Some of the benchmarks I added for scalars here will too. So overall, I think we are covered well enough, especially since this arguably makes the code cleaner anyway.

@seberg
Member
seberg commented May 3, 2022

@eendebakpt since you managed to run the benchmarks: if you like, it would be cool to follow up with a small ufunc benchmark to test ufunc call overhead on NumPy scalars (and maybe also 0-D or just very small arrays).
Since most of our benchmarks aim at larger arrays, that is something that is missing a bit right now.

But, I am happy with these changes, so will put them in, thanks.

@seberg seberg merged commit b222eb6 into numpy:main May 3, 2022
@eendebakpt eendebakpt deleted the performance_cache_unicode_array_ufunc branch May 3, 2022 12:30