8000 BUG: Increase accuracy of log1p for small inputs by mfkasim1 · Pull Request #22611 · numpy/numpy · GitHub

BUG: Increase accuracy of log1p for small inputs #22611


Open
wants to merge 7 commits into
base: main

Conversation

mfkasim1
Contributor

Fixes #22609 by increasing the accuracy of log1p for small inputs. With this PR, log1p justifies its existence for complex numbers as well.

import numpy as np
print(np.log1p(1e-18 + 1e-18j))  # 1e-18 + 1e-18j, as expected
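For comparison, here is a quick sketch (mine, not part of the PR) of why an accurate complex log1p matters: the naive np.log(1 + z) loses the tiny real part entirely, because adding 1 rounds away the 1e-18 contribution before the log is taken.

```python
import numpy as np

z = 1e-18 + 1e-18j

# Naive approach: 1 + 1e-18 rounds to exactly 1.0 in double precision,
# so the real part of the result collapses to (essentially) zero.
naive = np.log(1 + z)

# For |z| << 1, log1p(z) = z - z**2/2 + ... is approximately z itself.
expected = z

print(naive)     # real part is lost
print(expected)  # (1e-18+1e-18j)
```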

@seberg
Member
seberg commented Nov 17, 2022

Thanks for looking into a fix so quickly! Before I (or someone else dives in), maybe you have some quick tips to help. Do you have a reference to quickly make sure this is a good implementation? Does it have a speed tradeoff?

Contributor
@rossbar rossbar left a comment

Also FWIW the test_log1p_complex tests all pass on main, so they're not actually testing the desired behavior. I think this is expected given the precision involved.

@WarrenWeckesser
Member
WarrenWeckesser commented Nov 17, 2022

Thanks @mfkasim1. This fixes the one case reported in gh-22609, but it breaks several other cases that are correctly handled by the existing code. The new problems are created by overflow or underflow in the expression xx * (2 + xx) + yy * yy. For example, try these inputs:

         z                expected value of log1p(z)
    -1 + 1e-18j        -41.44653167389282 + 1.5707963267948966j        
    -1 + 1e250j        575.6462732485114 + 1.5707963267948966j
    1e250 + 1j         575.6462732485114 + 1e-250j
    1e250 + 1e250j     575.9928468387914 + 0.7853981633974483j  
    1e-250 + 1e250j    575.6462732485114 + 1.5707963267948966j
    1e250 + 2e-250j    575.6462732485114 + 0.0j  
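The overflow Warren describes is easy to reproduce in a few lines (an illustration, not numpy's actual code path): the intermediate xx * (2 + xx) + yy * yy blows up for large inputs even though log1p(z) itself is perfectly representable.

```python
import numpy as np

# The PR's real-part formula used the intermediate xx*(2 + xx) + yy*yy
# (with xx = z.real, yy = z.imag).  For inputs like 1e250 this
# intermediate overflows to inf, even though the true result is finite.
xx, yy = 1e250, 1e250
intermediate = xx * (2 + xx) + yy * yy
print(intermediate)   # inf -- overflowed

# Scaling first (as npy_hypot does) keeps everything in range:
accurate = np.log(np.hypot(xx + 1, yy))
print(accurate)       # ~575.99284..., the value in the table above
```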

A correct implementation will require scaling the inputs appropriately, similar to what is done for npy_hypot:

NPY_INPLACE double npy_hypot(double x, double y)
{
#ifndef NPY_BLOCK_HYPOT
    return hypot(x, y);
#else
    double yx;
    if (npy_isinf(x) || npy_isinf(y)) {
        return NPY_INFINITY;
    }
    if (npy_isnan(x) || npy_isnan(y)) {
        return NPY_NAN;
    }
    x = npy_fabs(x);
    y = npy_fabs(y);
    if (x < y) {
        double temp = x;
        x = y;
        y = temp;
    }
    if (x == 0.) {
        return 0.;
    }
    else {
        yx = y/x;
        return x*npy_sqrt(1.+yx*yx);
    }
#endif
}
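The same scaling trick can be sketched in Python (my illustration of the idea in the C code above, with the hypothetical name scaled_hypot): divide the smaller magnitude by the larger so the squared term never leaves the representable range.

```python
import math

def scaled_hypot(x, y):
    # Sketch of the npy_hypot fallback: divide the smaller magnitude by
    # the larger so (y/x)**2 <= 1 and cannot overflow.
    if math.isinf(x) or math.isinf(y):
        return math.inf
    if math.isnan(x) or math.isnan(y):
        return math.nan
    x, y = abs(x), abs(y)
    if x < y:
        x, y = y, x
    if x == 0.0:
        return 0.0
    r = y / x
    return x * math.sqrt(1.0 + r * r)

# The naive formula overflows where the scaled one does not:
naive_sq = 1e250 * 1e250 + 1e250 * 1e250   # overflows to inf
print(naive_sq)                  # inf
print(scaled_hypot(1e250, 1e250))  # ≈ 1.414e+250
```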

@mfkasim1
Contributor Author

@rossbar Thanks for the comment. I've added the decimal argument in the test to make sure the result is checked to many decimal places.

def test_log1p_complex(self):
    assert_almost_equal(ncu.log1p(0.2 + 0.3j), ncu.log(1.2 + 0.3j))
    assert_almost_equal(ncu.log1p(1e-19 + 1e-18j), 1e-19 + 1e-18j,
                        decimal=23)
Member

I suspect that assert_almost_equal_nulp is the more convenient way of checking things here.

@WarrenWeckesser if you want to dig into whether the implementation looks right, that would be great. I still feel things would be much simpler if we can "steal" an implementation from elsewhere (one with a clearly BSD-compatible license).


Just to centralise ideas (pytorch/pytorch#89214 (comment)), I agree. Now, it's not really clear where this function is implemented for complex numbers...

Contributor Author

Done

@mfkasim1
Contributor Author

I found a good implementation from scipy: https://github.com/scipy/scipy/blob/5b8490a2b80e73aec59af9ac93bca2ebd178fee1/scipy/special/_cunity.pxd#L25-L66
The idea is similar: handle the input specially if it is small. However, they also added a conditional: if xx ** 2 + yy ** 2 + 2 * xx is close to zero, they use double-double precision.

@WarrenWeckesser
Member
WarrenWeckesser commented Nov 19, 2022

@mfkasim1, I'm testing a reformulation of the calculation that seems to work well and shouldn't require higher precision for special cases.

Update: The reformulation below doesn't work as well as I originally thought. If I check the relative errors of the real and imaginary parts separately, I see that there are cases where the real parts can have errors that are large. For some inputs, the expression np.log1p(x) + 0.5*np.log1p((y/(x + 1))**2) in the code below will suffer loss of precision. A similar phenomenon is noted in the comment of the SciPy code that @mfkasim1 linked to.


The idea is to use this reformulation of $(x+1)^2 + y^2$:

$$ (x+1)^2 + y^2 = (x+1)^2 \left(1 + \left(\frac{y}{x + 1}\right)^2\right) $$

so

$$ \begin{split} \log\left((x+1)^2 + y^2\right)^{1/2} & = \frac{1}{2} \log\left((x+1)^2 + y^2\right) \\ & = \frac{1}{2} \log\left((x+1)^2 \left(1 + \left(\frac{y}{x+1}\right)^2\right)\right) \\ & = \log(x + 1) + \frac{1}{2}\log\left(1 + \left(\frac{y}{x+1}\right)^2\right) \\ & = \textrm{log1p}(x) + \frac{1}{2}\textrm{log1p}\left(\left(\frac{y}{x + 1}\right)^2\right) \end{split} $$

In the code, this formulation would be used when $x$ is small (e.g. $|x| < 1/2$) and when $|y| < x + 1$ (so $(y/(x+1))^2$ does not cause problems with overflow).

Here's an implementation in Python that I've been using to test this:

def log1p(z):
    x, y = z.real, z.imag
    # The imaginary part is just the phase of 1 + z.
    theta = np.arctan2(y, x + 1)
    if abs(x) < 0.5 and abs(y) < x + 1:
        # Reformulated real part; (y/(x + 1))**2 cannot overflow here.
        lnr = np.log1p(x) + 0.5*np.log1p((y/(x + 1))**2)
    else:
        lnr = np.log(np.hypot(x + 1, y))
    return lnr + 1j*theta

So far, quite a bit of fuzz testing shows that it works well. I use mpmath with high precision to compute the "true" values for tests. E.g.:

In [469]: from mpmath import mp

In [470]: mp.dps = 125

In [471]: z = -2e-17 + 2e-8j

Compute the expected value of log1p(z) with mp.log1p. Here is the 125 digit result

In [472]: mp.log1p(z)
Out[472]: mpc(real='0.00000000000000017999999999999997473817585095895590238446612672721306825082019861479410189297518927535665999267474842690586183589671360222800567',
imag='0.000000019999999999999998151784549935903144445373327991386300560662953457748900829013377241693442896709281068311236443103345569367960809')

And this is the expected double precision result:

In [473]: complex(mp.log1p(z))
Out[473]: (1.7999999999999997e-16+1.9999999999999997e-08j)

For this z, the function log1p(z) defined above gets the expected result:

In [474]: log1p(z)
Out[474]: (1.7999999999999997e-16+1.9999999999999997e-08j)

Fuzz testing shows that the relative error is generally less than 5e-16.
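Such a fuzz test can be sketched without mpmath by using cmath.log(1 + z) as the reference for moderate inputs (where it is itself accurate enough); this is my self-contained illustration, not Warren's actual harness, and log1p_reformulated is just a renamed copy of the function above.

```python
import cmath
import random

import numpy as np

def log1p_reformulated(z):
    # Warren's reformulation from the comment above.
    x, y = z.real, z.imag
    theta = np.arctan2(y, x + 1)
    if abs(x) < 0.5 and abs(y) < x + 1:
        lnr = np.log1p(x) + 0.5 * np.log1p((y / (x + 1))**2)
    else:
        lnr = np.log(np.hypot(x + 1, y))
    return lnr + 1j * theta

rng = random.Random(12345)
worst = 0.0
for _ in range(10_000):
    z = complex(rng.uniform(-0.4, 0.4), rng.uniform(-0.4, 0.4))
    if abs(z) < 0.05:
        continue  # near 0, log(1 + z) itself is too inaccurate to be a reference
    ref = cmath.log(1 + z)
    got = log1p_reformulated(z)
    worst = max(worst, abs(got - ref) / abs(ref))
print(worst)  # small overall relative error expected

# A tiny input, where log(1 + z) would fail but this function works:
print(log1p_reformulated(1e-19 + 1e-18j))  # ≈ 1e-19 + 1e-18j
```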

I'm still checking this, so don't update your code--it could definitely use another set of eyes. If anyone finds a case where this doesn't work, add a comment here.


@mfkasim1
Contributor Author

@WarrenWeckesser Thanks for testing the reformulation! Could you give us an example where the formulation above produces a large error?

@seberg
Member
seberg commented Nov 22, 2022

CircleCI is doing some weird thing here; I hope it will clear out with the next push. This PR shouldn't affect documentation, so just ignore it if not.

@mfkasim1
Contributor Author

It seems the best option we have now is to follow scipy's implementation. However, I don't see numpy depending on cephes's double-double code (required to handle the special case of cancelling terms) or any other cephes files. Would a dependency on cephes be of interest to numpy? Otherwise, should we just sacrifice accuracy in that very special case (i.e. similar to computing 1 + 1e-30 - 1 == 0)?

@WarrenWeckesser
Member
WarrenWeckesser commented Nov 25, 2022

@mfkasim1, for the function log1p that I showed above, any point in the complex plane on or sufficiently close to the circle $|z + 1| = 1$ (i.e. $(x + 1)^2 + y^2 = 1$, or $x^2 + 2x + y^2 = 0$, which is the same condition as in the SciPy code) will suffer loss of precision in the real part of the result. For example, z = -0.1 + 0.43588989435j is close to that circle:

In [311]: z = -0.1 + 0.43588989435j

In [312]: abs(z + 1)
Out[312]: 0.9999999999982271

The function log1p that I defined earlier gives

In [313]: log1p(z)
Out[313]: (-1.772956781387336e-12+0.4510268117926018j)

The true value (as computed with mpmath) is

In [314]: from mpmath import mp

In [315]: mp.dps = 100

In [316]: complex(mp.log1p(z))
Out[316]: (-1.7729356247190155e-12+0.4510268117926018j)

That shows that the relative error in the real part computed by log1p(z) is on the order of 1e-5.

However, if we look at the closeness of the complex numbers (rather than the relative error of just the real part), we get a relative error of 4.7e-17:

In [319]: w = log1p(z)

In [320]: w_true = complex(mp.log1p(z))

In [321]: abs(w - w_true)/abs(w_true)
Out[321]: 4.6907784121193136e-17

It seems the best option we have now is to follow scipy's implementation. However, I don't see numpy depending on cephes's double-double code (required to handle the special case of cancelling terms)

Even if there was interest in having a double-double library to use as a last resort for maintaining precision in special cases, I don't think the SciPy double-double code (in its current state) is a good source. Some of the transcendental functions in that code are buggy (see scipy/scipy#16227).

@seberg
Member
seberg commented Dec 1, 2022

As much as I like getting things fully right, @WarrenWeckesser, do you have a recommendation for the best improvement that we can easily get right now?
It sounds like we have options that we can be confident are better, at little or no cost otherwise, and considering that cephes falls back to higher-precision operations, it sounds like a fully precise option likely doesn't even exist.

@seberg
Member
seberg commented Feb 17, 2023

Ping once more since I remembered this as better than nothing. I think we should do something (maybe just copy whatever pytorch did).

I am happy to base this on gut feeling and confidence that the new method is better rather than ideal in this case. But I do trust your gut feeling much more than mine @WarrenWeckesser :).

@WarrenWeckesser
Member

@mfkasim1 and @seberg, sorry for not getting back to this for so long. I'm running more fuzz tests this week to compare this PR, my suggested formulation, and the version that is used in the Julia implementation at https://github.com/JuliaLang/julia/blob/c43e5a10be27b7f93b5368875aa1d2596b4d4947/base/complex.jl#L745. I'll report back in a day or so with the results.

@WarrenWeckesser
Member
WarrenWeckesser commented Aug 3, 2023

FYI: I ended up exploring a few rabbit holes:

I'm still working on summarizing and coming up with a recommendation for NumPy.

@WarrenWeckesser
Member
WarrenWeckesser commented Aug 7, 2023

After lots of experimentation with several versions of the complex log1p, here's the version that I recommend, written as a Python function that accepts scalars:

def log1p_theorem4(z):
    z = complex(z)
    u = z + 1
    if u == 1:
        return z
    elif u - 1 == z:
        return np.log(u)
    else:
        return np.log(u)*(z/(u - 1))

(The final version should probably have a check for either z.real or z.imag being nan, and return complex(nan, nan) in that case.)

It is based on the formula for log1p(x) (real argument) given in Theorem 4 of Goldberg's paper "What every computer scientist should know about floating-point arithmetic". The only tweak that I've made is the extra check for u - 1 == z, where log(u) is returned. Computing the complex division z/(u - 1) in that case can only introduce an additional (small) error.
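As a quick sanity check (mine, not from the thread's test suite), the function can be exercised against the difficult inputs from Warren's earlier table; the function is repeated here so the snippet is self-contained.

```python
import numpy as np

def log1p_theorem4(z):
    # Goldberg "Theorem 4"-style complex log1p, as proposed above.
    z = complex(z)
    u = z + 1
    if u == 1:
        return z                       # 1 + z rounded to exactly 1: answer is z
    elif u - 1 == z:
        return np.log(u)               # no rounding occurred in z + 1
    else:
        return np.log(u) * (z / (u - 1))  # correct for the rounding in z + 1

print(log1p_theorem4(1e-18 + 1e-18j))  # ≈ 1e-18 + 1e-18j
print(log1p_theorem4(1e250 + 1e250j))  # ≈ 575.9928468387914 + 0.7853981633974483j
print(log1p_theorem4(-1 + 1e250j))     # ≈ 575.6462732485114 + 1.5707963267948966j
```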

I fuzz-tested with large samples clustered around sets such as: the circle |z + 1| = 1; specific points on that circle including 0+0j, -2+0j, -1+1j, -1-1j; the point -1+0j; and a few other scattered points. I also checked on various rays emanating from -1+0j through and beyond the circle |z + 1| = 1. I computed the relative error of the complex result w = log1p_theorem4(z) as abs(w - w_true)/abs(w_true), where w_true is computed to full precision using mpmath, and I also looked at the relative error of the real part, abs(w.real - w_true.real)/abs(w_true.real). In all cases, the relative error of the complex result was less than 6e-16.

The worst-case relative error of the real part of the result depends on the relative error of the real part of the underlying clog(z) function provided by the platform/compiler. On my Linux machine, this relative error was also very small, generally not exceeding 8e-16. That is because the complex log function clog(z) provided by the standard library is apparently very accurate, even on the troublesome unit circle.

On my Mac (m2), using clang, the relative error of the real part of the result could be quite high. That's because the relative error of the real part of clang's clog(z) on the unit circle can be high. For example,

In [36]: from mpmath import mp

In [37]: mp.dps = 200

In [38]: za = -0.0026626976230049726-0.07292671175487339j

This point is on the circle |z + 1| = 1:

In [39]: abs(za + 1)
Out[39]: 1.0

In [40]: w_true = complex(mp.log1p(za))

In [41]: w = log1p_theorem4(za)

In [42]: print(w_true)
(1.5951249881265712e-22-0.07299150803396787j)

In [43]: print(w)
(4.336808689942018e-19-0.07299150803396787j)

The only thing correct about the real part is the sign. The values are so small, however, that the overall relative error of the complex result is also very small:

In [44]: abs(w - w_true)/abs(w_true)
Out[44]: np.float64(5.93933963240823e-18)

The only way I could get the relative error of the real part reliably small on the Mac was to use an approach similar to that of SciPy: when the input z is close to the circle |z + 1| = 1, compute the real part of the result using double-double numbers. (This version is now implemented in ufunclab as log1p_doubledouble.) Using double-double can be an effective way to maintain precision when you really need it, but it is not clear to me that it is important for us to ensure that the real part has very low relative error; just ensuring that the overall relative error of the complex result is small seems good enough. Does that sound reasonable? I don't think we have an explicit policy about the acceptable relative errors for our functions.
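For readers unfamiliar with the double-double idea mentioned here, a minimal sketch of its core building block (Knuth's error-free TwoSum transformation; my illustration, not the ufunclab code): a value is represented as an unevaluated pair hi + lo of doubles, which preserves bits that plain double arithmetic would discard.

```python
def two_sum(a, b):
    # Error-free transformation: returns (s, e) with s = fl(a + b) and
    # a + b == s + e exactly (Knuth's branch-free TwoSum).
    s = a + b
    v = s - a
    e = (a - (s - v)) + (b - v)
    return s, e

# The pair (hi, lo) represents hi + lo to roughly twice double precision.
hi, lo = two_sum(1.0, 1e-30)
print(hi, lo)              # 1.0 1e-30 -- the tiny term is preserved

# Plain double arithmetic discards it (the 1 + 1e-30 - 1 == 0 example above):
print((1.0 + 1e-30) - 1.0)  # 0.0
```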

@seberg
Member
seberg commented Aug 7, 2023

I would say go for it. Anything seems like an improvement here and you covered a lot of bases it seems (enough so to fix a few bugs elsewhere!).
One thing I am just now wondering is whether the system cmath library may have an implementation available that we could use (at least sometimes)?

@WarrenWeckesser
Member

One thing I am just now wondering is whether the system cmath library may have an implementation available that we could use (at least sometimes)?

I've only seen the function for real arguments (e.g. C99 has log1pf/log1p/log1pl).

@WarrenWeckesser
Member

Well, here's another interesting wrinkle: I expected the version based on Theorem 4 to be faster than the double-double implementation. But in fact, when I measure the performance of the two functions implemented in ufunclab, the double-double version is often faster!

That needs more investigation, but for now, I think the version based on Theorem 4 is very good. We could implement that while I spend more time than is reasonable on figuring out if a follow-up based on the double-double implementation is worthwhile. 😄

@mfkasim1, are you interested in updating this pull request to implement what I called log1p_theorem4(z) above? FWIW, my C version is here. If not, I can start a new PR with the proposed function.

@mfkasim1
Contributor Author

@WarrenWeckesser please go ahead with the new PR. I'm currently on holiday.

@lezcano
lezcano commented Aug 11, 2023

Rather interesting that, after so much coming and going, the final formula is simply to compute log(1+z) with a "minor" tweak :)

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Aug 16, 2023
This PR improves the implementation of `torch.log1p` for complex inputs as mentioned in issue #107022. The new implementation is based on the insights provided in numpy/numpy#22611 (comment).

Pull Request resolved: #107100
Approved by: https://github.com/lezcano
Successfully merging this pull request may close these issues.

BUG: Inaccurate log1p for small complex input
5 participants