ENH: add Wright's generalized Bessel function by lorentzenchr · Pull Request #11313 · scipy/scipy · GitHub

ENH: add Wright's generalized Bessel function #11313


Merged
merged 52 commits into scipy:master on Feb 6, 2021

Conversation

lorentzenchr
Contributor
@lorentzenchr lorentzenchr commented Jan 3, 2020

Reference issue

This function is one crucial step towards #11291. See also mailing list thread https://mail.python.org/pipermail/scipy-dev/2020-January/023918.html.

What does this implement/fix?

This PR adds Wright's generalized Bessel function as scipy.special.wright_bessel, see https://dlmf.nist.gov/10.46.E1.

Additional information.

This is a pure Python implementation that aims for correctness and not for speed. I did not find any public implementation as a reference.
As it is an entire function, the series representation is always valid. I tried asymptotic expansions for large argument z, without any luck.

Edit:
So far, only non-negative values of rho=a, beta=b and z=x are implemented. There are 5 different approaches, depending on the ranges of the arguments:

  1. Taylor series expansion in x=0 [1], for x <= 1 (see the sketch after this list).
    Involves gamma functions in each term.
  2. Taylor series expansion in x=0 [2], for large a.
  3. Taylor series in a=0, for tiny a and not too large x.
  4. Asymptotic expansion for large x [3, 4].
    Suitable for large x while still small a and b.
  5. Integral representation [5], in principle for all arguments.
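A minimal sketch of approach 1 (not the PR's actual code; the truncation kmax=50 and the use of scipy.special.gamma/rgamma are my own choices):

import numpy as np
from scipy.special import gamma, rgamma


def wright_bessel_taylor(a, b, x, kmax=50):
    # Truncated Taylor series around x = 0:
    #   Phi(a, b; x) = sum_{k>=0} x**k / (k! * Gamma(a*k + b)),
    # intended for small x (the PR uses it for x <= 1).
    k = np.arange(kmax)
    # rgamma = 1/Gamma stays finite where Gamma(a*k + b) would overflow.
    return np.sum(x**k * rgamma(a * k + b) / gamma(k + 1))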

References:

@rgommers rgommers added enhancement A new feature or improvement scipy.special labels Jan 3, 2020
@person142
Member

I will note that a pure Python implementation is basically a nonstarter in special. Probably best to discuss this on the mailing list before proceeding further here.

@rlucas7
Member
rlucas7 commented Jan 4, 2020

@lorentzenchr this will need unit tests. Given that this calls scipy.special.iv() it might help to take a look at the file here:
https://github.com/scipy/scipy/blob/fdf4cd83e0a80a083da3790453a12862b3460793/scipy/special/tests/test_basic.py
and follow a similar format; in fact, it probably makes sense to add your test(s) to that file.

You'll want to consider typical-case inputs as well as NaN and inf inputs to the function in your tests (and code).

@rlucas7
Member
rlucas7 commented Jan 4, 2020

I will note that a pure Python implementation is basically a nonstarter in special. Probably best to discuss this on the mailing list before proceeding further here.

I agree.

@lorentzenchr
Contributor Author

@rlucas7 I'm happy to add tests once it is settled that this PR is not for nothing.

@person142 @rlucas7 Why is a pure Python implementation a nonstarter?

My plan:

  • Start discussion on scipy dev mailing list
  • Make pure python implementation sound and ready
  • Cythonize (maybe someone can assist me here, i.e. it needs to call scipy.special.rgamma in a loop)

@person142
Member
person142 commented Jan 4, 2020

Why is a pure Python implementation a nonstarter?

Main reasons are:

  • It's too slow

  • Lack of typing forces too much boilerplate, e.g. I'm seeing things like this in the
    implementation:

    b = 1. * b  # make it at least a float
    
  • Conditional logic is very painful to express when operating on arrays; writing scalar kernels is much clearer

  • With all the ufunc machinery we have, it's not significantly harder to write it in C/Cython/Fortran (I might even claim it's easier).

@lorentzenchr
Contributor Author

@person142 Thank you very much for your explanations!
I have one good reason against a scalar kernel: speed. The k-th term of the series expansion contains rgamma(a*k + b), which is independent of z. I expect the inputs a and b to be scalar, but z may be an array, so one can reuse the values of rgamma for all elements of z. As rgamma is the most expensive part of each term, this gives a great speedup for array input z.
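A minimal sketch of that reuse (a hypothetical helper, not the PR's implementation):

import numpy as np
from scipy.special import gamma, rgamma


def wright_bessel_series(a, b, x, kmax=50):
    # The series coefficients depend only on the scalar parameters a and b,
    # so the expensive rgamma calls happen once per index k ...
    k = np.arange(kmax)
    coef = rgamma(a * k + b) / gamma(k + 1)   # shape (kmax,)
    # ... and are reused for every element of the array x via broadcasting.
    x = np.asarray(x, dtype=float)
    return np.sum(coef * x[..., None]**k, axis=-1)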

@person142
Member

I expect the input a and b to be scalar, but z may be an array.

So this is where we might want to keep the end goal in mind: if we add this as a public function in special then it's going to have to be a ufunc (for API consistency reasons), so a and b could also be arrays.

On the other hand, if we say "this is just for getting at the Tweedie distributions, where a and b will be scalars", then we can think about instead making this a private function in special (we do this pretty frequently already), which frees us from the constraints of the public special API.

I was trying to get at something similar in my response on the mailing list:

  • public function in special: lots of extra work
  • private function in special: much more flexibility, hopefully leading to less work.

@lorentzenchr
Contributor Author

🤔 Right now, I like the private special function.

@lorentzenchr
Contributor Author

@person142 Meanwhile, I think this could start as a private function with a limited range of arguments, e.g. all greater than or equal to 0, but still in Cython so that it would be easier to make it public later.

Currently, I'm working on a stable form for 0<z<=1. Large z seems to be challenging.


@rlucas7
Member
rlucas7 commented Jan 11, 2020 via email

@lorentzenchr
Contributor Author

Are you working towards support for complex values? If so, I'd recommend focusing on real values only. The Tweedie distribution is real-valued and I haven't seen any references to complex-valued applications.

Exactly, for the moment all values are real, and even non-negative: (a, b, x) >= 0. That makes it a lot easier.

@WarrenWeckesser
Member

For what it's worth, here are some functions that use mpmath. wright_bessel relies on mpmath.nsum to compute an approximation of the infinite series, so it inherits whatever limitations mpmath.nsum has. The function wright_bessel_rho1 is an alternative that uses the connection between the function and the (modified) Bessel functions when rho is 1.

import mpmath


def _wright_bessel_term(k, z, rho, beta):
    numer = mpmath.power(z, k)
    denom = mpmath.gamma(k*rho + beta)*mpmath.factorial(k)
    return numer / denom


def wright_bessel(z, rho, beta):
    """
    Wright's generalized Bessel function.

    Also known as the Bessel-Maitland function.

    Parameter convention corresponds to

        https://dlmf.nist.gov/10.46#E1

    but here the `z` parameter is given first.
    """
    if z == 0:
        return 1 / mpmath.gamma(beta)
    return mpmath.nsum(lambda k: _wright_bessel_term(k, z, rho, beta),
                       [0, mpmath.inf])


def wright_bessel_rho1(x, beta):
    """
    Wright's generalized Bessel function.

    Also known as the Bessel-Maitland function.

    Parameter convention corresponds to

        https://dlmf.nist.gov/10.46#E1

    but here the `x` parameter is given first, and rho is fixed at 1.
    `x` is assumed to be real.
    """
    beta = mpmath.mpf(beta)
    nu = beta - 1
    if x > 0:
        r = mpmath.sqrt(x)
        w = mpmath.besseli(nu, 2*r) / mpmath.power(r, nu)
    elif x < 0:
        r = mpmath.sqrt(-x)
        w = mpmath.besselj(nu, 2*r) / mpmath.power(r, nu)
    else:
        # x == 0
        w = 1 / mpmath.gamma(beta)
    return w

For example,

In [123]: import mpmath

In [124]: mpmath.mp.dps = 40

In [125]: z = 0.9

In [126]: wright_bessel(z, 1.0, 0.4)
Out[126]: mpf('1.834786964889907297648075016830869147931931')

In [127]: wright_bessel_rho1(z, 0.4)
Out[127]: mpf('1.834786964889907297648075016830869147931931')

In [128]: z = -19.0

In [129]: wright_bessel(z, 1.0, 0.4)
Out[129]: mpf('-0.5596387413411752048288086838416299778273557')

In [130]: wright_bessel_rho1(z, 0.4)
Out[130]: mpf('-0.5596387413411752048288086838416299778273213')

@lorentzenchr
Contributor Author

@WarrenWeckesser Thank you for the mpmath examples. When I played around with my own mpmath implementation, I noticed that nsum checks for convergence only every 10 terms. Therefore, I most often set steps=[100].
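For instance, reusing _wright_bessel_term from above (steps=[100] is my reading of the nsum option: sum a single chunk of 100 terms instead of checking convergence after every 10):

import mpmath

mpmath.mp.dps = 40
# One 100-term chunk instead of nsum's default 10-term increments.
w = mpmath.nsum(lambda k: _wright_bessel_term(k, 0.9, 1.0, 0.4),
                [0, mpmath.inf], steps=[100])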

@person142
Member

Large z seems to be challenging.

@lorentzenchr yeah, an asymptotic series will almost surely be needed. Feel free to ping me about implementation details if it's not something you're familiar with. Other than that, let me know when this is ready for more review. I'm going to add the "needs work" label so that it stays off my review queue until then.

@person142 person142 added the needs-work Items that are pending response from the author label Jan 13, 2020
@lorentzenchr
Contributor Author

@person142 Thanks for your guidance.
For my use case, it suffices to implement non-negative arguments only wright_bessel(a, b, x). Going for this I'm making progress. There are the typical 3 main domains for this functions:

  • small x, where one can use the Taylor series (so far implemented for x <= 1)
  • intermediate x, where an integral representation exists, see paper, but be aware of misprints. (I have a working implementation, not pushed yet.)
  • large x, where an asymptotic series makes sense, see Wright 1935 and Paris 2017. (I have a working implementation, not pushed yet.)

I have 4 questions right now:

  1. We intend to add this function as an internal function. Which precision should I aim for? Only the Taylor series gets to around 1e-15.
  2. How to best choose the limits of the domains?
  3. Is it ok to use scipy.integrate.quad for the intermediate x-domain? Possibly in Cython?
  4. How to write tests (except for special, known arguments), e.g. against values from a mpmath implementation? This function has 3 arguments, so a data grid gets large very quickly.

Would I better address such questions on the mailing list?

@person142
Member

@lorentzenchr

Which precision should I aim for?

Ideally relative error < 1e-13, but we sometimes go as low as 1e-7. After that we tend to intentionally start returning nans. Around the zeros of a function (where the condition number is high) it's ok to lose relative accuracy. It might be sensible here to restrict the parameters to some box where accurate computation is possible and raise an exception on the stats side if those limits are exceeded.

How to best choose the limits of the domains?

That's always a tough question. Sometimes you can prove things about rates of convergence, and that informs what domain you can use. See e.g.

https://github.com/scipy/scipy/blob/master/scipy/special/_hypergeometric.pxd#L65

for an example of that. Sometimes you compute across a wide range of parameters using all methods, plot where the accuracy is high for each method, and use that to figure out your regions. See e.g.

https://github.com/scipy/scipy/blob/master/scipy/special/_precompute/struve_convergence.py

for an example of that. Another technique is computing with multiple methods at runtime and keeping running error estimates, and then returning the result that has the lowest error estimate. Obviously this one really slows down the computation, so it should be more of a last resort.

Is it ok to use scipy.integrate.quad for the intermediate x-domain

Using quad is generally a bad idea because adaptive quadrature is fairly slow. Ideally you pick e.g. a Gaussian quadrature method and precompute the points/weights that will give you the desired precision. (Maybe using a paneling technique where you split the domain into multiple pieces and use Gaussian quadrature on each piece; a sketch follows below.) Picking the right quadrature method can be an entire task unto itself.
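A minimal sketch of the panelled fixed-order idea (generic integrand f; the panel count and order are placeholders, not tuned values from this PR):

import numpy as np


def panel_gauss_legendre(f, a, b, n_panels=4, order=20):
    # Fixed-order Gauss-Legendre rule applied to n_panels equal
    # subintervals of [a, b]. The nodes/weights could be precomputed
    # and hard-coded so that nothing adaptive happens at runtime.
    nodes, weights = np.polynomial.legendre.leggauss(order)  # on [-1, 1]
    edges = np.linspace(a, b, n_panels + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Affine map of the reference nodes onto the panel [lo, hi].
        x = 0.5 * (hi - lo) * nodes + 0.5 * (hi + lo)
        total += 0.5 * (hi - lo) * np.sum(weights * f(x))
    return total

# Example: panel_gauss_legendre(np.cos, 0.0, np.pi / 2) is ~1.0.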

This function has 3 arguments, so a data grid gets large very quickly.

Some tricks here:

  • Write tests specifically for the boundaries between methods. Things often go wrong in those regions.
  • Precompute the mpmath values and store the results in a data file (we have a whole setup for this). Since mpmath is often the limiting factor in the test, this lets you make the grid bigger.
  • For when the grid really is just too big: switch to random sampling (sketched after this list).
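A sketch of the random-sampling variant (the sample size, parameter ranges and rtol are illustrative, not the PR's test setup):

import mpmath
import numpy as np
from numpy.testing import assert_allclose
from scipy.special import wright_bessel


def _mp_wright_bessel(a, b, x):
    # mpmath reference value, same series as the snippet further above.
    with mpmath.workdps(50):
        return float(mpmath.nsum(
            lambda k: mpmath.power(x, k)
            / (mpmath.gamma(a*k + b) * mpmath.factorial(k)),
            [0, mpmath.inf]))


def test_wright_bessel_random_sample():
    # Random (a, b, x) triples instead of a full 3-d grid, whose size
    # would grow cubically with the number of points per parameter.
    rng = np.random.default_rng(1234)
    for a, b, x in zip(rng.uniform(0, 5, 50),
                       rng.uniform(0, 5, 50),
                       rng.uniform(0, 1, 50)):
        assert_allclose(wright_bessel(a, b, x), _mp_wright_bessel(a, b, x),
                        rtol=1e-10)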

- add 5th order in a
- use polevl
- use Horner's scheme for polynomials in x
- precompute constants C[..] numerically
- better chosen order parameter
- better range of validity
- added tests for new range limits
- test more data points
- print failing data points in _precompute/wright_bessel_data.py
- fine tuning for choice of eps in integral representation
- higher bound for use of asymptotic expansion, meaning the integral representation is used more often
@lorentzenchr
Contributor Author
lorentzenchr commented Oct 15, 2020

@person142 I addressed all of your comments, extended the tests a lot and made improvements for extreme arguments. What remains open, probably, is returning np.nan more often for very difficult arguments.

To get the automatic Cython interface, as you proposed, the function name now has no leading _. As a consequence, it is now a public special function.

The failing CI check is about overly long lines in special/__init__.py (of which there were already several).
Side question: Where to put it? A separate paragraph or under Bessel functions?

@person142
Member

Where to put it? A separate paragraph or under Bessel functions?

I'd put it in the Bessel functions section.

@lorentzenchr
Contributor Author

Where to put it? A separate paragraph or under Bessel functions?

I'd put it in the Bessel functions section.

There it is at the moment.

@lorentzenchr
Contributor Author

@WarrenWeckesser @person142 May I kindly ping you?

Member
@person142 person142 left a comment

Probably the best thing to do is merge and see what happens in the wild.

I'll leave it open for a few days in case someone else wants to comment; ping me again in a few days if I haven't merged it by then. (Looks like a lint error to fix in the meantime.)

@lorentzenchr
Contributor Author

@person142 Gentle ping.

@person142
Member
person142 commented Feb 5, 2021

The lint failure is real, I think. Note that it's running a stricter linting check on just your diff (we added that so that we can converge towards cleaner code without big refactoring PRs).

@lorentzenchr
Contributor Author

All green again.

@person142 person142 merged commit 8fceb48 into scipy:master Feb 6, 2021
@person142
Member

Awesome, in it goes. Thanks for sticking with it @lorentzenchr!

@lorentzenchr
Contributor Author

@person142 Hitting the merge button for a +5000 line diff takes some courage/trust :smirk: I hope this function will help in a few places in the ecosystem around scipy.

@rgommers
Member
rgommers commented Feb 6, 2021

Awesome! @lorentzenchr I added an entry in https://github.com/scipy/scipy/wiki/Release-note-entries-for-SciPy-1.7.0#scipyspecial-improvements, if there's more to say please feel free to edit.

@tylerjereddy tylerjereddy added this to the 1.7.0 milestone Feb 6, 2021
@WarrenWeckesser
Member
WarrenWeckesser commented Mar 22, 2021

Would anyone be interested in adding an enlightening example or two to the wright_bessel docstring? We would like to ensure that the number of functions without examples in their docstrings goes down over time (see #7168).


@lorentzenchr lorentzenchr deleted the wright_function branch March 24, 2021 16:13
@lorentzenchr
Contributor Author

@WarrenWeckesser I opened #13738 and hope that this is what you had in mind by "enlightening". :smirk:

@lorentzenchr
Contributor Author

In case someone is interested, I wrote a blog post about this function with a focus on the integral representation and the magic with the free parameter: https://lorentzen.ch/index.php/2024/06/17/a-tweedie-trilogy-part-iii-from-wrights-generalized-bessel-function-to-tweedies-compound-poisson-distribution/
