math.ldexp gives incorrect results on Windows · Issue #132876 · python/cpython · GitHub

math.ldexp gives incorrect results on Windows #132876


Closed
tkreuzer opened this issue Apr 24, 2025 · 39 comments
Labels: extension-modules (C modules in the Modules dir) · OS-windows · type-bug (An unexpected behavior, bug, or error)

Comments

@tkreuzer
tkreuzer commented Apr 24, 2025

Bug report

Bug description:

>>> import math
>>> math.ldexp(6993274598585239, -1126)
5e-324
>>>

The correct result would be 1e-323. This is obviously a bug in the Windows ldexp implementation (it works fine on Linux), but it would be good if it could be fixed on the CPython side with a workaround.
math.ldexp is used by mpmath to round from multiprecision to float, leading to incorrect rounding on Windows.
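A quick way to confirm that 1e-323 is the correctly rounded value is exact rational arithmetic with the stdlib (a sketch, independent of the platform's ldexp):

```python
from fractions import Fraction
import math

# Exact value of 6993274598585239 * 2**-1126, as a rational number.
exact = Fraction(6993274598585239) / Fraction(2) ** 1126

# Express it in units of the smallest positive denormal, 2**-1074.
tiny = Fraction(math.nextafter(0.0, 1.0))  # exactly 2**-1074
ulps = exact / tiny
print(float(ulps))  # about 1.55, so the nearest float is 2 ulps, i.e. 1e-323
```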

CPython versions tested on:

3.13

Operating systems tested on:

Windows

Linked PRs

@tkreuzer tkreuzer added the type-bug An unexpected behavior, bug, or error label Apr 24, 2025
@skirpichev skirpichev added the extension-modules C modules in the Modules dir label Apr 24, 2025
@skirpichev
Contributor

For context: mpmath/mpmath#942

@mmmp-oo
mmmp-oo commented Apr 24, 2025

Hi! 👋

This issue appears to be caused by platform-specific differences in floating-point handling, particularly on Windows. For consistency across platforms, a workaround is to use arbitrary-precision libraries such as mpmath:

from mpmath import mp
mp.dps = 50
x = mp.mpf(6993274598585239)
y = -1126
print(mp.ldexp(x, y))

Of course, since math.ldexp is a native function, a permanent solution would need to be implemented within CPython itself.

Hope this helps! 😊

@skirpichev
Contributor

appears to be caused by platform-specific differences

Just a bug in the platform libc. @tkreuzer, could you please describe your system (OS, hardware)?

a workaround is to use arbitrary-precision libraries such as mpmath

It's not a real workaround, because the OP wants native CPython floats, not mpmath's numbers. In fact, math.ldexp() is used in mpmath to provide the __float__ dunder for the mpf type.

It's possible to use gmpy2:

>>> import gmpy2
>>> gmpy2.set_context(gmpy2.ieee(64))
>>> gmpy2.mpfr(6993274598585239)/gmpy2.mpfr(2)**1126
mpfr('0.0')
>>> gmpy2.set_context(gmpy2.ieee(128))
>>> gmpy2.mpfr(6993274598585239)/gmpy2.mpfr(2)**1126
mpfr('7.67194470417997883016474037150491012e-324',113)
>>> float(_)
1e-323

Though, it's not an option for mpmath itself.

@tkreuzer
Author

Just a bug in the platform libc. @tkreuzer, could you please describe your system (OS, hardware)?

Windows 10 on an Intel i7-1065G7.

I implemented a workaround myself now.

import math
import struct


def ldexp_manual(x_float, exp):
    """
    Compute x_float * 2**exp for a floating-point number, handling denormals and rounding.

    Args:
        x_float (float): The floating-point number to scale.
        exp (int): The integer exponent of 2 by which to scale.

    Returns:
        float: The result of x_float * 2**exp, respecting IEEE 754 rules.
    """
    # Handle special cases: zero, infinity, or NaN
    if x_float == 0.0 or not math.isfinite(x_float):
        return x_float

    # Get the 64-bit IEEE 754 representation
    bits = struct.unpack('Q', struct.pack('d', x_float))[0]

    # Extract components
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & 0xFFFFFFFFFFFFF

    if exponent == 0:
        # Denormal number
        if mantissa == 0:
            return x_float  # Zero
        # Normalize it
        while mantissa < (1 << 52):
            mantissa <<= 1
            exponent -= 1
        exponent += 1
    else:
        # Normal number, add implicit leading 1
        mantissa |= (1 << 52)

    # Adjust exponent with exp
    new_exponent = exponent + exp

    if new_exponent > 2046:
        # Overflow to infinity
        return math.copysign(math.inf, x_float)
    elif new_exponent <= 0:
        # Denormal or underflow
        if new_exponent < -52:
            # Underflow to zero
            return 0.0

        # Calculate how much we need to shift the mantissa
        mantissa_shift = 1 - new_exponent

        # Get the highest bit that would be shifted off
        round_bit = (mantissa >> (mantissa_shift - 1)) & 1

        # Adjust mantissa for denormal
        mantissa >>= mantissa_shift
        mantissa += round_bit
        new_exponent = 0

    # Reconstruct the float
    bits = (sign << 63) | (new_exponent << 52) | (mantissa & 0xFFFFFFFFFFFFF)
    return float(struct.unpack('d', struct.pack('Q', bits))[0])


def mpf_to_float(mpf_value):
    """
    Convert an mpmath mpf value to the nearest Python float.
    We use ldexp_manual, because ldexp is broken on Windows.
    """
    sign = mpf_value._mpf_[0]
    mantissa = mpf_value._mpf_[1]
    exponent = mpf_value._mpf_[2]

    result = ldexp_manual(mantissa, exponent)
    if sign == 1:
        result = -result
    return result

@skirpichev
Contributor
skirpichev commented Apr 24, 2025

On mpmath's master:

$ python -m timeit -s 'import mpmath;x=mpmath.mpf(1)' 'float(x)'
200000 loops, best of 5: 1.22 usec per loop

With your patch (I doubt it's correct):

$ python -m timeit -s 'import mpmath;x=mpmath.mpf(1)' 'float(x)'
50000 loops, best of 5: 9.09 usec per loop

@tkreuzer
Author

On mpmath's master:

...

It's slower, I get it. It wouldn't be an issue if it were implemented in C 🙂

Alternatively you can probably use ldexp for anything that is not a denormal (would need to test that). Then you can simplify the code a bit. And you can replace the shift loop with something like

        # Normalize it
        shift_amount = 53 - mantissa.bit_length()  # need mantissa >= 1 << 52
        mantissa <<= shift_amount
        exponent = 1 - shift_amount
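The two normalization strategies can be checked against each other for denormal mantissas (a quick sanity check; note the shift must bring bit_length() up to 53, i.e. shift_amount = 53 - mantissa.bit_length(), to satisfy mantissa >= 1 << 52):

```python
# Sanity check: the shift loop vs. bit_length() arithmetic for
# denormal mantissas (1 <= mantissa < 2**52), as in ldexp_manual.
def normalize_loop(mantissa):
    exponent = 0
    while mantissa < (1 << 52):
        mantissa <<= 1
        exponent -= 1
    return mantissa, exponent + 1

def normalize_bit_length(mantissa):
    shift_amount = 53 - mantissa.bit_length()
    return mantissa << shift_amount, 1 - shift_amount

for m in (1, 2, 3, 1 << 51, (1 << 52) - 1):
    assert normalize_loop(m) == normalize_bit_length(m)
```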

@skirpichev
Contributor
skirpichev commented Apr 25, 2025

Windows 10 on an Intel i7-1065G7.

python -VV output might be helpful.

From quoted issue: Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] on win32

I would like to see if someone else can reproduce this on the same platform.

It's slower, I get it. Would not be an issue, if it was implemented in C

This can be mitigated by coding in C, but that's not an option for mpmath.

It's also incorrect. Besides raising OverflowError in an appropriate place, you probably should remove the if sign == 1 branch in mpf_to_float().

FYI: mpmath/mpmath#945

Alternatively you can probably use ldexp for anything that is not a denormal (would need to test that).

You meant "for anything that doesn't produce a denormal"? Because 6993274598585239.0 is a normal float. I doubt it's cheap.

CC @tim-one

skirpichev added a commit to skirpichev/mpmath that referenced this issue Apr 25, 2025
@tim-one
Member
tim-one commented Apr 25, 2025

The Windows ldexp doesn't round - it chops (truncates: "round to 0"). Whether that's "a bug" is arguable, but Microsoft hasn't budged on this at all.

To make this easier to follow, here's the simplest "failing" case:

>>> import math
>>> tiny = math.nextafter(0, 1)
>>> tiny # smallest non-zero positive float
5e-324
>>> _.hex()
'0x0.0000000000001p-1022'
>>> math.frexp(tiny)
(0.5, -1073) # easier than counting by hand
>>> math.ldexp(0.5, -1073) # ldexp is frexp's inverse
5e-324
>>> math.ldexp(0.75, -1073) # doesn't round up
5e-324
>>> math.ldexp(0.8, -1073) # still doesn't
5e-324
>>> math.ldexp(0.9, -1073) # and still not
5e-324
>>> math.ldexp(0.9999, -1073) # nope!
5e-324
>>> math.ldexp(1.0, -1073) # finally - but not via rounding
1e-323
>>> _.hex()
'0x0.0000000000002p-1022'

So long as the result isn't subnormal, no problem. The original intent of ldexp was to give a fast way to change the exponent without changing the mantissa bits at all. Subnormals don't play nice with that view, though.

Python's math module generally intends to reproduce the platform's C library results for functions of the same name.

There's no clear win to be had here. It's impossible for ldexp to both round and match C programmers' expectations of how ldexp works on Windows.

If you want the same semantics as multiplying by a power of 2, multiply by a power of 2. The days when ldexp was faster than float multiplication are now ancient history.

Not that it's trivial. In the above, e.g., 2.0**1073 isn't representable as an IEEE double. In bad cases it can take two stages: first use ldexp to shift the input to the smallest normal binade. No information is lost. Then multiply by the appropriate negative power of 2 to finish the job. Let the hardware take care of rounding. There's no need to pick apart - or do arithmetic on - the bits yourself.
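With the issue's numbers, the two-stage approach might look like this (a sketch; the constant -1021 is the frexp() exponent of the smallest normal, 2**-1022):

```python
import math

x, n = 6993274598585239.0, -1126

# Stage 1: use ldexp to shift x down to the smallest normal binade.
# No mantissa bits are shifted off, so even a truncating ldexp is exact.
e = math.frexp(x)[1]                 # frexp mantissa lies in [0.5, 1.0)
x_shifted = math.ldexp(x, -1021 - e)

# Stage 2: finish with a single multiply by the remaining power of 2;
# the hardware rounds the subnormal result correctly.
result = x_shifted * math.ldexp(1.0, n + e + 1021)
print(result)  # 1e-323
```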

@skirpichev
Contributor

The Windows ldexp doesn't round - it chops (truncates: "round to 0"). Whether that's "a bug" is arguable

Ah, that explains things for me. Does it happen on all M$ systems, or is it limited (e.g. to win32)? (Disclaimer: I last had a Windows box nearby 10+ years ago.)

Sorry, but it seems to be a bug. Standard says: "If a range error due to underflow occurs, the correct result (after rounding) is returned." Rounding mode could be ignored only if no range error occurs.

Python's math module generally intends to reproduce the platform's C library results for functions of the same name.

Are you arguing we should get rid of CPython's versions of tgamma()/lgamma() and switch to libm's versions? ;-)

So, sometimes we have workarounds for buggy libm functions. This might be the case.

@tim-one
Member
tim-one commented Apr 25, 2025

Does it happens on all M$ systems or rather limited (e.g. to win32)?

I'm not a Windows expert. It's something I've learned from the systems I personally use. It was true under DOS from the start, and is still so under my current 64-bit Windows 10 Pro.

Standard says: "If a range error due to underflow occurs, the correct result (after rounding) is returned." Rounding mode could be ignored only if no range error occur

The great thing about standards is that there are so many of them 😉. You need to identify the exact standard you have in mind. The accepted answer here:

https://stackoverflow.com/questions/32150888/should-ldexp-round-correctly

lets Microsoft off the hook because their C implementation may not define __STDC_IEC_559__ as 1. Well, they didn't know for sure whether it did, but the accepted answer here says __STDC_IEC_559__ is set to 0 on Windows by both gcc and clang (but to 1 on Linux)

https://softwarerecs.stackexchange.com/questions/78793/is-there-any-c-compiler-which-defines-both-stdc-and-stdc-iec-559-to-1

Which isn't the final word either. Hence it's "arguable", but it's not an argument I want to engage in.

Are you argue to get rid of CPython's versions of tgamma()/lgamma() and switch to libm's versions? ;-)

If platform libms have since improved to be better than horrid on those functions, yes, it would be best if we did lose ours. You already recently discovered that, on the platform you used, its gamma is better than ours. We're not doing users on that platform much good by hiding their platform's better implementation. But since we took them over, we have our own backward compatibility worries now.

Microsoft has decades of backward compatibility to worry about with ldexp(), which is why I expect - but don't know - they never change it.

So, sometimes we have workarounds for buggy libm functions. This might be the case.

It might be. But if we do take it over on Windows, our docs need to be clear that Python's ldexp differs in this respect from the native Windows version. The docs should have done so for the gamma functions when we took them over.

It's a fight I don't want to take part in. But I'd be +1 on leaving ldexp() alone, and introducing a new, say, ldexpr() ("ldexp rounded") function that promises to round on all platforms. It would be the same as the platform ldexp on most popular platforms.

When I care, I multiply by a negative power of 2. It's cheap to compute 2**-n via ldexp(1.0, -n). But n has to be <= 1074, else that underflows to 0.

>>> math.ldexp(0.75, -1073) # truncates
5e-324
>>> 0.75 * math.ldexp(1, -1073) # rounds
1e-323

So that's the core "trick" to implementing a rounded variant of ldexp on Windows without tedious, slow, and error-prone bit fiddling. frexp() can be used to get the input's exponent. There's no need to look at (let alone change) the mantissa bits.

>>> math.ldexp(6993274598585239, -1126)
5e-324
>>> 6993274598585239 * math.ldexp(1, -1074) * math.ldexp(1., -1126 + 1074)
1e-323

@tim-one
Member
tim-one commented Apr 25, 2025

Here's a very lightly tested workaround for Windows, which defers to the platform function unless rounding may be needed. Rounding (if any is needed) is done by the multiply at the end.

Details

from math import frexp, isfinite, ldexp, nextafter

BITS = 53 # number of mantissa bits
TINY_EXP = frexp(nextafter(0.0, 1.0))[1] # exp of smallest denorm
SMALLEST_NORM_EXP = TINY_EXP + BITS - 1

def ldexpr(x, n):
    if not isfinite(x) or not x:
        # NaNs, infs, and zeros are unchanged.
        return x
    # Non-zero finite.
    if n >= 0:
        # Left shifts may overflow, but don't round.
        return ldexp(x, n)
    # Right shift of non-zero finite.
    original_exp = frexp(x)[1]
    target_exp = original_exp + n
    if target_exp >= SMALLEST_NORM_EXP:
        # No bits are shifted off - no rounding involved.
        return ldexp(x, n)
    # Shifting into denorm-land - trailing bits may be lost.
    if original_exp > SMALLEST_NORM_EXP:
        # Shift down to the smallest normal binade. No bits lost.
        x = ldexp(x, SMALLEST_NORM_EXP - original_exp)
        assert frexp(x)[1] == SMALLEST_NORM_EXP
        n += original_exp - SMALLEST_NORM_EXP
    # Multiplying by 2**n finishes the job, and the HW will round as
    # appropriate. Note: if n < -BITS, all of x is shifted to be <
    # 1/2 ULP of TINY, so should be thrown away. If n is so very
    # negative that ldexp underflows to 0, that's fine; no need to
    # check in advance.
    return x * ldexp(1.0, n)

Note that it's important to do no more than one multiply, lest subtle "double rounding" bugs be introduced. The intent is that no information of any kind is lost before the single multiply.
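A concrete illustration of the double-rounding hazard (this behaves the same on any IEEE-754 platform with nearest/even rounding, since only multiplies are involved):

```python
import math

x = 0.5000001  # just over half an ulp of the smallest denorm when scaled

# Single multiply: rounds once, up to the smallest denorm.
single = x * math.ldexp(1.0, -1074)

# Two multiplies: the first rounds x * 2**-1073 down to one denorm ulp,
# the second then hits an exact halfway case and rounds to even: 0.0.
double = (x * math.ldexp(1.0, -1073)) * 0.5

print(single, double)  # 5e-324 0.0
```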

@skirpichev
Contributor

You need to identify the exact standard you have in mind.

Sorry, of course I meant the C standard.

It says that "Whether the functions honor the rounding direction mode is implementation-defined, unless explicitly specified otherwise." But IIUIC, that means the default rounding mode is in play, i.e. rounding to nearest. Of course, if the implementation doesn't define __STDC_IEC_559__, anything can happen.

__STDC_IEC_559__ is set to 0 on Windows by both gcc and clang (but to 1 on Linux)

What about MSVC? @tkreuzer?

But if we do take it over on Windows, our docs need to be clear that Python's ldexp differs in this respect from the native Windows version.

Well, in the same sense our gamma() and a few other functions differ from the platform versions.

I would say: if MSVC defines __STDC_IEC_559__, it seems to be a platform bug, which we should fix. If not, this should probably be closed as "wontfix".

Here's a very lightly tested workaround for Windows, which defers to the platform function unless rounding may be needed.

Thanks, seems correct and much faster, of course. Though, not without cost:

$ python -m timeit -s 'import mpmath;x=mpmath.mpf(1)' 'float(x)' # i>=0
200000 loops, best of 5: 1.47 usec per loop
$ python -m timeit -s 'import mpmath;x=mpmath.mpf(1e-300)' 'float(x)' # no rounding
100000 loops, best of 5: 2.99 usec per loop
$ python -m timeit -s 'import mpmath;x=mpmath.exp(-744)' 'float(x)' # OP case
50000 loops, best of 5: 4.77 usec per loop

On the master:

$ python -m timeit -s 'import mpmath;x=mpmath.mpf(1)' 'float(x)' # i>=0
200000 loops, best of 5: 1.15 usec per loop
$ python -m timeit -s 'import mpmath;x=mpmath.mpf(1e-300)' 'float(x)' # no rounding
200000 loops, best of 5: 1.14 usec per loop
$ python -m timeit -s 'import mpmath;x=mpmath.exp(-744)' 'float(x)' # OP case
200000 loops, best of 5: 1.2 usec per loop

I think C-coded version of that will run almost as fast as current ldexp(), but it's not acceptable as pure-Python workaround in the mpmath.

@chris-eibl
Member

MSVC does not even define __STDC_IEC_559__. Tested by placing

#if defined(__STDC_IEC_559__)
#error __STDC_IEC_559__ is defined
#endif

#if __STDC_IEC_559__
#error __STDC_IEC_559__
#endif

into mathmodule.c to rule out any compiler options / defines that might influence it.

@skirpichev
Contributor

Hmm, thanks, Chris.

On another hand, we have in tree C99+ fix for M$:

cpython/Modules/mathmodule.c

Lines 2385 to 2393 in 8a4d4f3

#ifdef _MSC_VER
/* Windows (e.g. Windows 10 with MSC v.1916) loose sign
for zero result. But C99+ says: "if y is nonzero, the result
has the same sign as x".
*/
if (r == 0.0 && y != 0.0) {
r = copysign(r, x);
}
#endif

The C99+ "says this" in Annex F.

@tim-one
Member
tim-one commented Apr 26, 2025

if the implementation doesn't define __STDC_IEC_559__, anything can happen

I also (in addition to @chris-eibl) confirmed that MS's most recent compiler knows nothing about __STDC_IEC_559__

'__STDC_IEC_559__': undeclared identifier

It's not really surprising, though - it's Microsoft 😉.

that the default rounding mode is in play, i.e. rounding to nearest

It's deeper than just that, though. The workaround I gave doesn't comply with __STDC_IEC_559__ either: to do so, it would have to respect the current rounding mode, not just nearest/even.

Now the multiplication at the end would do so in almost all real-life cases. But, e.g., under to-plus-infinity rounding, this would give a wrong result for ldexpr(1.0, -1000000). That underflows to 0 under my ldexpr() under all rounding modes on Windows, but under +inf rounding it should round to the smallest denorm instead.

There is a bug here on MS's side, but I've never seen them acknowledge it. Their own docs have never said anything about this, and certainly imply ldexp(x, n) acts the same as the mathematical x * 2**n.

I expect (but don't know) they'd call it a "doc bug".

@tim-one
Member
tim-one commented Apr 26, 2025

For mpmath, I'd suggest calling the platform ldexp() in all cases. Then, on Windows, if the result r is such that -SMALLEST_NORM <= r <= SMALLEST_NORM, or, if you prefer, abs(r) <= SMALLEST_NORM, do it again with something like my ldexpr(). If r is a NaN or an inf, it won't pass those tests, so you don't have to worry about them.

However, unfortunately, +-0 do pass those tests, and those are probably common outputs. So add another

if not x:  # the _input_, not the result `r`
    return x

test to spare the expense.

Then the workaround knows up-front that it's getting a non-zero finite, and can skip testing for NaNs, infs, and zeros.

Why can't you just return a 0 result? Because it's not always right. It's possible for x*2**n to yield a result < ulp(TINY), but > ulp(TINY)/2, in which case it should round up to TINY. Like here:

>>> ldexp(0.5000001, -1074) # input a little over ulp(TINY)/2
0.0
>>> ldexpr(0.5000001, -1074) # which my workaround realizes
5e-324
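Put together, the suggestion might be sketched like this (ldexpr here is a condensed version of the workaround from the earlier comment; the names and structure are illustrative, not a definitive implementation):

```python
import math

BITS = 53                                            # mantissa bits in a double
TINY_EXP = math.frexp(math.nextafter(0.0, 1.0))[1]   # -1073
SMALLEST_NORM_EXP = TINY_EXP + BITS - 1              # -1021
SMALLEST_NORM = math.ldexp(1.0, SMALLEST_NORM_EXP - 1)  # 2**-1022

def ldexpr(x, n):
    # Rounding fallback for non-zero finite x: shift exactly to the
    # smallest normal binade, then let one multiply round the result.
    e = math.frexp(x)[1]
    if n >= 0 or e + n >= SMALLEST_NORM_EXP:
        return math.ldexp(x, n)              # no bits shifted off
    if e > SMALLEST_NORM_EXP:
        x = math.ldexp(x, SMALLEST_NORM_EXP - e)   # exact shift
        n += e - SMALLEST_NORM_EXP
    return x * math.ldexp(1.0, n)            # hardware rounds once

def ldexp_fixed(x, n):
    # Call the platform ldexp(); redo only when the result landed in
    # the subnormal range, where a truncating ldexp() may be wrong.
    if not x or not math.isfinite(x):
        return x                              # zeros, infs, NaNs unchanged
    r = math.ldexp(x, n)
    if -SMALLEST_NORM <= r <= SMALLEST_NORM:
        r = ldexpr(x, n)
    return r
```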

@skirpichev
Contributor

The workaround I gave doesn't comply with STDC_IEC_559 either: to do so, it would have to respect the current rounding mode, not just nearest/even.

I understand and it's fine. So far, CPython has no interface to change the current rounding mode - thus we probably can assume default rounding.

off-topic on rounding mode support

Some such interface might be interesting: then the fp context of mpmath can take into account the "rounding" context setting. Currently it's doable, but I don't know how to do this in a cross-platform-safe way (this is on Linux with Glibc 2.36):

>>> mp_d = MPContext(53, 'd')
>>> mp_u = MPContext(53, 'u')
>>> mp_d.mpf(libmp.mpf_exp(mpf(3)._mpf_, 53, 'd'))
mpf('20.085536923187664')
>>> mp_u.mpf(libmp.mpf_exp(mpf(3)._mpf_, 53, 'u'))
mpf('20.085536923187668')
>>> import ctypes
>>> libm = ctypes.CDLL('libm.so.6')
>>> libm.fesetround.argtypes = [ctypes.c_int]
>>> libm.fesetround.restype = ctypes.c_int
>>> libm.fesetround(0x400)  # magic constants might be different
0
>>> fp.exp(3)
20.085536923187664
>>> libm.fesetround(0x800)
0
>>> fp.exp(3)
20.085536923187668

There is a bug here on MS's side, but I've never seen them acknowledge it.

I suspect the same for the fmod() example above. (Is this workaround needed now?) Though, you might say that this case is different, as our docs say: "Behavior in exceptional cases follows Annex F of the C99 standard where appropriate." Yet this still invalidates your argument that "Python's math module generally intends to reproduce the platform's C library results for functions of the same name."

Also, saying "not a bug" means that MSVC will never support Annex F. They claim that "Microsoft C++ (MSVC) is consistent with the IEEE numeric standards".

For mpmath, I'd suggest calling the platform ldexp() in all cases. Then, on Windows, if the result r is such that -SMALLEST_NORM <= r <= SMALLEST_NORM

Thanks, I'm already experimenting with such an approach. This doesn't seem much better than the current mpmath/mpmath#942.

@tim-one
Member
tim-one commented Apr 26, 2025

So far, CPython has no interface to change current rounding mode - thus we probably can assume default rounding.

CPython internals do assume nearest/even, and it's deliberate that Python offers no portable way to change the HW mode. Doing so would be a bottomless rat hole. If the rounding mode can change, CPython internals that rely on nearest/even would have to change/restore the rounding mode on the fly. Besides being endlessly tedious, it's also expensive.

I suspect same for fmod() example above. (Is this workaround needed now?)

No idea, and no interest in finding out 😉.

Though, you might say that this case is different as our docs says: "Behavior in exceptional cases follows Annex F of the C99 standard where appropriate."

I absolutely say this case is different: we document up front what we're doing. And in real life almost nobody ever cares about the sign of a zero anyway 😉.

I could certainly live with CPython taking over ldexp on Windows too, provided that the docs clearly say we're doing so. Indeed, while barely anyone in real life cares about denorms for 8-byte floats (they do care for 4-byte floats, which have such relatively low precision), it would be of more real value than the tiny change to fmod().

Also, saying "not a bug" - means that MSVC will never support Annex F. They claims that "Microsoft C++ (MSVC) is consistent with the IEEE numeric standards".

In context, that page is talking about "IEEE Floating-Point Representation". They do fully conform with the standards in how floats are represented. Which is trivial on most boxes, because they're using the native HW representations, which themselves follow the standards for how floats are represented.

@tim-one
Member
tim-one commented Apr 27, 2025

A possible improvement for my workaround function (which has now survived more substantial testing): the "shift down to the smallest normal binade" part is needed to ensure that ldexp(1.0, n) can't underflow to 0.0 in cases where that would matter. But, in fact, that's probably rarely needed, and can be skipped like so first:

    if n >= -1074:  # hardcoded TINY_EXP - 1
        # Close enough to +inf that 2**n can't underflow.
        factor = ldexp(1.0, n)
        assert factor
        return x * factor

Of course factor can be removed and the body compressed to a single multiply when you gain confidence that the assert never fails. If the tests are any good 😉, the assert will eventually fail if you replace -1074 with -1075 (or any other more-negative int).

Note about the "-1": literature about the fp standards usually views the mantissa as being in the range [1.0, 2.0). But we get our exponents from frexp(), which uses the range [0.5, 1.0) instead. As a result, it's 0.5 * 2**TINY_EXP that's the smallest denorm, which is the same as 2**(TINY_EXP - 1).
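The frexp() convention can be seen directly (a quick check using only math):

```python
import math

tiny = math.nextafter(0.0, 1.0)   # smallest positive denorm
m, e = math.frexp(tiny)

# frexp() normalizes the mantissa into [0.5, 1.0), so
# tiny == 0.5 * 2**e == 2**(e - 1), with e == TINY_EXP == -1073.
print(m, e)                       # 0.5 -1073
assert tiny == math.ldexp(1.0, e - 1)
```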

@skirpichev
Contributor

If rounding mode can change, CPYthon internals that rely on nearest/even would have to change/restore rounding mode on the fly.

I'm not sure there are that many places to do this. Formatting, integer division, did I miss something else? The rest is a free lunch (assuming that the platform libm supports custom rounding modes correctly). Though, that's not the current issue.

provided that the docs clearly say we're doing so

Well, we can say that the rounding direction mode for the math module's functions is round-to-nearest, i.e. the default for C99+. As a CPython implementation detail, like for special values.

@tim-one
Member
tim-one commented Apr 27, 2025

If rounding mode can change, CPYthon internals that rely on nearest/even would have to change/restore rounding mode on the fly.

I'm not sure if there are too many places to do this. Formatting, integer division, did I miss something else?

Certainly, but I'm not spending time doing exhaustive code review for something I have no interest in. The intent of rounding mode is to round the final result in the specified direction. On the way to that end, internal algorithms generally want the best rounding (nearest-even) they can get for intermediate results. mpmath internals can freely use extra precision at will to absorb intermediate rounding errors, but CPython's C code cannot. Native C double is it.

So review all the internal algorithms in statistics.py, review all uses of Veltkamp splitting and Dekker products (techniques @rhettinger frequently used to emulate extended precision), ...

EDIT: the very useful Veltkamp and Dekker techniques don't care about the "even" part of "nearest/even", but do rely on the "nearest" part. Alternative splitting methods can be contrived that work with other rounding modes, but that's another level of the rat hole I'm not diving into 😉.

The rest is a free lunch (assuming that platform libm supports custom rounding modes correctly).

As noted before, switching rounding modes is expensive, so I expect most libm authors also assume nearest/even for their internal algorithms.

Though, it's not the current issue.

No, it certainly isn't 😉.

@zooba
Member
zooba commented Apr 28, 2025

Tested on my own machine:

Python 3.14.0a7+ (heads/pymanager-dirty:9156ec6020f, Apr 23 2025, 12:31:57) [MSC v.1944 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import math
>>> math.ldexp(6993274598585239, -1126)
1e-323

We're going to need more information. Specifically, we need the exact Windows version, and if it's easy enough, find the exact version of ucrtbase.dll (which likely has the implementation in it).

It's very likely that any fix here is going to be "update your OS to a working one". As a general rule, we prefer to expose native functionality directly, so that we aren't responsible for providing bug fixes for issues that belong to the OS.

@zooba
Member
zooba commented Apr 28, 2025

if it's easy enough, find the exact version of ucrtbase.dll (which likely has the implementation in it).

Actually, it is easy enough. Grab this script from our repo and run it with python validate_ucrtbase.py ucrtbase. It'll print the exact version of the C runtime you've got.

For reference, mine is 10.0.26100.3624. If yours is earlier than that, the fix is to update your OS.

@skirpichev skirpichev added the pending The issue will be closed if no feedback is provided label Apr 28, 2025
@skirpichev
Contributor

@zooba you might have missed this in the thread:
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] on win32
So, at least the OP's CPython was compiled by a different MSC version.

@zooba
Member
zooba commented Apr 28, 2025

you might have missed this in the thread

I didn't, but it's irrelevant. The implementation of this function is part of the OS, not part of the compiler, so updating the OS will fix it and Python doesn't have to do a thing.

@tim-one
Member
tim-one commented Apr 28, 2025

@zooba

you're on a much older version of Windows 10 (update 2004, which means released around April 2020). And in fact, it's now out of support, so we don't support it for Python either.

Are you sure? I first installed Win10 some years ago, but I've applied every update Windows Update ever offered, at least twice almost every month since. Most recently last week, updates to the OS and to .NET. Never a word about being out of support.

I can't install Win11, in part because I don't have a TPM, and more infuriatingly because it complains about my CPU. Which is a fast 4-core Intel CPU far more powerful than Windows needs (and 16 GB of RAM, terabytes of free disk space, both hard drive and SSD - it's a substantial box).

I could buy a new box ... but the prospect of spending days reinstalling apps is utterly depressing ☹ Luckily, I don't personally care about whether ldexp() rounds denorms 😉.

@zooba
Member
zooba commented Apr 28, 2025

I can't install Win11, in part because I don't have a TPM, and more infuriatingly because it complains about my CPU

The TPM requirement is what was later removed, but it has apparently decided it doesn't like your CPU (which seems weird... Intel CPUs should basically all be fine - the motherboard would be more likely).

Unfortunately, Windows does this thing where when it decides your machine can't handle an update, it never actually tells you. I assumed one of mine was up to date, but it turns out they'd blocked major updates for a year because of an audio driver (which I only found out after I bricked it...)

I wouldn't tell you to buy a new box, but it's possible that if you get the installation assistant from https://www.microsoft.com/software-download/windows11/ it'll tell you that you can now get the update, or point at exactly what the issue is.

I don't particularly want to fill Python up with OS version-specific workarounds for bugs that have been resolved (but not necessarily installed), but if this particular case is upsetting people and we can special-case the inputs (rather than the OS), then sure, why not.

@chris-eibl
Member

Now I got curious :)

I have old and new hardware. All are updated regularly, last Windows update this morning.

Old desktop:

  • Win10 22H2 (build 19045)
  • ucrtbase 10.0.19041.3636

New laptop:

  • Win11 24H2 26100.3915
  • ucrtbase 10.0.26100.3912
  • 1e-323

@chris-eibl
Member
chris-eibl commented Apr 28, 2025

Ah yeah, you're on a much older version of Windows 10 (update 2004, which means released around April 2020). And in fact, it's now out of support, so we don't support it for Python either.

I am pretty sure Tim is on 22H2, too. Seems one cannot infer the Windows version from the ucrtbase version?
@tim-one: in a console, you can use ver. Or in the start menu enter winver to find out.

BTW:
Quote from https://learn.microsoft.com/lifecycle/products/windows-10-home-and-pro

Windows 10 will reach end of support on October 14, 2025

Quote from PEP 745 – Python 3.14 Release Schedule

3.14.0 final: Tuesday, 2025-10-07

OOI: just a few days before Windows 10 EOL. According to https://peps.python.org/pep-0011/#microsoft-windows it should be supported? Is there already a decision whether 3.14 will drop Win10 support?

@tim-one
Member
tim-one commented Apr 28, 2025

@chris-eibl

I am pretty sure, Tim is on 22H2,

So am I 😉. That's what winver says. ver just gives the build number:

C:\Code\Python>ver
Microsoft Windows [Version 10.0.19045.5796]

So appears to be the same as yours. We also have the same ucrtbase version numbers.

@zooba

The TPM requirement is what was later removed, but if it has decided it doesn't like your CPU (which seems weird... Intel CPUs should basically all be fine - motherboard would be more likely).

Alas, following the link you gave just ended up downloading the "PC Health Check" app, which I've tried all along over the last year. The output is always the same:

TPM 2.0 must be supported and enabled on this PC.

That shows up with a yellow circle around an exclamation mark. I understand there are "unadvertised" ways to install Win11 anyway, but there's another complaint:

The processor isn't currently supported for Windows 11.
Intel Core i7-4790 CPU @ 3.60GHz

That shows up with a red circle around an X.

@chris-eibl said "MS requires at least 8th gen", so that could be it.

I'm hosed 😉.

Thanks for the eyeballs! I appreciate the effort.

@zooba
Copy link
Member
zooba commented Apr 28, 2025

ucrtbase 10.0.19041.3636; interesting 19041 vs 19045. Seems, this does not update to the same degree as Windows?

Windows runs a daily build and caches entire binaries, so if the sources are unchanged a binary won't be rebuilt, and the cached copy will eventually be shipped. This is what broke our old version detection for the OS - we were reading it from kernel32.dll, and in one release it wasn't rebuilt on the last day, so its version number was one build behind. (To be fair, Windows only got the caching working this well around this time, and previously everything was always rebuilt, so the version number was reliable.)

chris-eibl said "MS requires at least 8th gen", so that could be it.

I was skeptical, but I just checked my 8 year old laptop and it's 8th gen, so I guess that may be it. Sorry!

That would, however, put the bug outside our supported range of versions. We don't get to detect it at compile time, unfortunately, and I'd rather not add a (relatively slow) version check for all Windows users for all time because of this. If it's just this one set of inputs that's affected, then special-casing that case would be fine, though that seems unlikely.

Do we have a numeric stability policy for this module? In general, I'd expect "do what the OS does" to be our policy, like elsewhere, and would need a very compelling reason to bypass that.

@tim-one
Copy link
Member
tim-one commented Apr 29, 2025

@zooba, it's a puzzle. What should happen: MS should push a fix into a Win10 update before support ends. Since they did fix it for Win11, there's no plausible "but we can't!" argument to be made. It's a standard C library function, and the bug isn't really subtle: it's plainly wrong by all relevant standards, doesn't match MS's own docs, and is implemented correctly on all other known platforms.

I'm told Win10 still has outright majority share across all desktop OSes. Since I can't upgrade to Win11, I'll probably stick with Win10 until my HW dies (I've already, e.g., replaced two failing drives, a failed SSD "accelerator", and a dead cooling fan). Hard to believe I won't be joined by millions in that. Win10 will be with us for years yet in reality.

It's not so much that end users care about ldexp(), but that it's a basic building block for people writing numeric libraries. Errors in ldexp() pollute every library function that builds on it.

If MS won't fix it, I don't believe CPython should ignore it. As @skirpichev pointed out, we took over part of fmod() on Windows to spare users from a less consequential bug, and that was technically trivial to do. I agree with exposing the platform libm, but not really in cases where the platform is inarguably dead wrong by all metrics. Unless the platform itself dies out, it will eventually get fixed by the platform. The difference between Win10 and Win11 is crucial to Microsoft, but to plain users they're just gratuitously different names for "Windows" 😉.
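
For reference, the correctly rounded answer for the reported case can be verified from Python alone via exact rational arithmetic (a sketch; `ldexp_exact` is a hypothetical helper, relying on the fact that `float(Fraction)` rounds to nearest, ties to even - the same rule a conforming ldexp() must apply):

```python
from fractions import Fraction

def ldexp_exact(x, n):
    """Correctly rounded x * 2**n, computed via exact rational arithmetic."""
    # float(Fraction) rounds to nearest/even, so no intermediate
    # precision is lost regardless of how subnormal the result is.
    return float(Fraction(x) * Fraction(2) ** n)

# The case from this report: a conforming ldexp() must return 1e-323,
# not the 5e-324 that the buggy Windows 10 CRT produces.
print(ldexp_exact(6993274598585239, -1126))  # 1e-323
```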

@skirpichev
Copy link
Contributor

Do we have a numeric stability policy for this module?

We have (as CPython implementation detail): "The math module consists mostly of thin wrappers around the platform C math library functions. Behavior in exceptional cases follows Annex F of the C99 standard where appropriate."

So, fixes for exceptional cases (zeros, infinities, NaNs) are expected (see the modf() workaround for Windows). Technically, this issue doesn't fit into that.

Though, for a long time after this remark was added (by Mark Dickinson), CPython had (until fa26245) its own implementations for a number of libm functions (e.g. acosh). So, I would interpret "follows Annex F" in a wider sense: not just special values, but the rest of Annex F as well, i.e. rounding.

I'd expect "do what the OS does" to be our policy

It isn't that helpful. It's much better if people can expect module functions to follow some known standard, like the decimal module does. I don't think pure-Python modules like mpmath are the right place to fix OS-specific quirks.

BTW, we already require support for IEEE 754 floating-point numbers to build CPython. While platforms may not declare support for Annex F (does Windows 11 do this?), I think it's reasonable to believe that things will evolve toward better support.

So, as long as Windows 10 is supported by CPython, I think we should put in a workaround (included conditionally for all Windows systems, like the one for fmod()). I'll come up with a patch, based on @tim-one's code, after more benchmarks.

@skirpichev skirpichev added the needs backport to 3.13 bugs and security fixes label Apr 29, 2025
@skirpichev
Copy link
Contributor

Here are my results (the 1st case corresponds to the "normal" situation, where the workaround is not triggered):

Benchmark ref patch
ldexp(0x1.0000000000000p+0, 1) 220 ns 227 ns: 1.03x slower
ldexp(0x1.0000000000000p-1022, -1) 312 ns 411 ns: 1.32x slower
ldexp(0x1.8d858a043f397p+52, -1126) 297 ns 408 ns: 1.37x slower
Geometric mean (ref) 1.23x slower

I'd appreciate it if someone could repeat these tests on a less noisy system, but it seems we have a slowdown only for denormal results, as expected.

BTW, nowadays we can use PyLong_AsInt() in ldexp(). That will simplify the function and remove the need for a type cast of the exp. Does it make sense? If so, I'll include this refactoring in the final patch; it can be backported to 3.13 too.

A patch (tested on Linux, of course):

diff --git a/Modules/mathmodule.c b/Modules/mathmodule.c
index 11d9b7418a..dfee7865e6 100644
--- a/Modules/mathmodule.c
+++ b/Modules/mathmodule.c
@@ -2163,6 +2163,23 @@ math_ldexp_impl(PyObject *module, double x, PyObject *i)
     } else {
         errno = 0;
         r = ldexp(x, (int)exp);
+#if 1 // _MSC_VER
+        if (r && r > -DBL_MIN && r < DBL_MIN) {
+            int original_exp;
+            frexp(x, &original_exp);
+            if (original_exp > DBL_MIN_EXP) {
+                /* Shift down to the smallest normal binade.  No bits lost. */
+                x = ldexp(x, DBL_MIN_EXP - original_exp);
+                exp += original_exp - DBL_MIN_EXP;
+            }
+            /* Multiplying by 2**exp finishes the job, and the HW will round as
+               appropriate.  Note: if exp < -DBL_MANT_DIG, all of x is shifted to
+               be < 1/2 ULP of smallest denorm, so should be thrown away.  If exp
+               is so very negative that ldexp underflows to 0, that's fine; no
+               need to check in advance. */
+            r = x*ldexp(1.0, (int)exp);
+        }
+#endif
         if (isinf(r))
             errno = ERANGE;
     }
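
The same two-step idea can be sketched in pure Python (assuming IEEE 754 doubles; `ldexp_fixed` is a hypothetical name for illustration, not part of the patch):

```python
import math
import sys

DBL_MIN = sys.float_info.min          # 2**-1022, smallest normal double
DBL_MIN_EXP = sys.float_info.min_exp  # -1021, as in C's <float.h>

def ldexp_fixed(x, exp):
    r = math.ldexp(x, exp)
    if r != 0.0 and -DBL_MIN < r < DBL_MIN:
        # Subnormal result: redo the scaling so the final rounding is
        # performed by the hardware multiply, not by a possibly-buggy ldexp().
        m, e = math.frexp(x)
        if e > DBL_MIN_EXP:
            # Shift x down to the smallest normal binade; no bits lost.
            x = math.ldexp(x, DBL_MIN_EXP - e)
            exp += e - DBL_MIN_EXP
        # If exp is so negative that ldexp(1.0, exp) underflows to 0,
        # the product is 0 too, which is the correct answer.
        r = x * math.ldexp(1.0, exp)
    return r

print(ldexp_fixed(6993274598585239.0, -1126))  # 1e-323, even on Windows 10
```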

A benchmark:

# bench.py
import pyperf
from math import ldexp
tiny = float.fromhex('0x1p-1022')  # smallest normal double
runner = pyperf.Runner()
for x, i in [(1.0, 1),
             (tiny, -1),
             (6993274598585239.0, -1126)]:
    s = f"ldexp({x.hex()}, {i})"
    runner.bench_func(s, ldexp, x, i)

@zooba
Copy link
Member
zooba commented Apr 29, 2025

Okay, sure, but at least include a comment clarifying that it's a known issue in some older versions of the C runtime, and it's not worth trying to identify the version being used at runtime.

I know the process involved in getting non-security fixes applied to Windows 10 at this stage, it's very unlikely anyone is going to consider this worthwhile for servicing (unless we can show it's actually impacting some big paid piece of software). Pragmatism beats purity in this area.

@skirpichev skirpichev removed the pending The issue will be closed if no feedback is provided label Apr 29, 2025
skirpichev added a commit to skirpichev/cpython that referenced this issue Apr 29, 2025
Co-authored-by: Tim Peters <tim.peters@gmail.com>
@skirpichev
Copy link
Contributor

PR is ready for review: #133135

@skirpichev skirpichev removed their assignment Apr 29, 2025
tim-one added a commit that referenced this issue May 26, 2025
* gh-132876: workaround broken ldexp() on Windows 10

ldexp() fails to round subnormal results before Windows 11,
so hide their bug.

Co-authored-by: Tim Peters <tim.peters@gmail.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 26, 2025
…3135)

* pythongh-132876: workaround broken ldexp() on Windows 10

ldexp() fails to round subnormal results before Windows 11,
so hide their bug.
(cherry picked from commit cf8941c)

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Co-authored-by: Tim Peters <tim.peters@gmail.com>
tim-one added a commit that referenced this issue May 26, 2025
…#134684)

gh-132876: workaround broken ldexp() on Windows 10 (GH-133135)

* gh-132876: workaround broken ldexp() on Windows 10

ldexp() fails to round subnormal results before Windows 11,
so hide their bug.
(cherry picked from commit cf8941c)

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Co-authored-by: Tim Peters <tim.peters@gmail.com>
tim-one added a commit that referenced this issue May 26, 2025
…#134685)

* gh-132876: workaround broken ldexp() on Windows 10

ldexp() fails to round subnormal results before Windows 11,
so hide their bug.
(cherry picked from commit cf8941c)

Co-authored-by: Tim Peters <tim.peters@gmail.com>
@skirpichev
Copy link
Contributor

Fixed and backported.

@picnixz picnixz added 3.13 bugs and security fixes 3.14 bugs and security fixes and removed needs backport to 3.13 bugs and security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes labels Jun 1, 2025