repr doesn't roundtrip for float32 dtype · Issue #9360 · numpy/numpy · GitHub
Closed
mdickinson opened this issue Jul 4, 2017 · 7 comments
Comments

mdickinson (Contributor) commented Jul 4, 2017

It seems that the np.float32 type doesn't have a round-trippable repr:

>>> x = np.float32(1024 - 2**-14)
>>> y = np.float32(1024 - 2**-13)
>>> x == y  # get False, as expected
False
>>> repr(x) == repr(y)  # expecting False
True
>>> np.float32(repr(x)) == x  # expecting True
False
>>> np.float32(repr(y)) == y  # get True, as expected
True

Looking at the source, 8 significant digits are used for the repr of an np.float32, but the IEEE 754 binary32 format requires 9 digits to roundtrip correctly.
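A quick sketch of the problem (the digit counts here are the 8 mentioned above and the 9 that binary32 requires; the `%g` formatting is just an illustration, not how NumPy builds its repr):

```python
import numpy as np

# Two adjacent float32 values just below 1024; their 8-digit strings collide:
x = np.float32(1024 - 2**-14)
y = np.float32(1024 - 2**-13)
print('%.8g' % x, '%.8g' % y)   # both print 1023.9999
# 9 significant digits are enough to distinguish any two binary32 values:
print('%.9g' % x, '%.9g' % y)   # 1023.99994 1023.99988
assert np.float32('%.9g' % x) == x and np.float32('%.9g' % y) == y
```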

Perhaps this is intentional, but it seems surprising.

[Versions: Python 3.6.1, numpy 1.13.0, macOS 10.10.5]

seberg (Member) commented Jul 4, 2017

Frankly, I find this slightly disturbing. It must have been there for many, many years, but it should be fixed in any case, in my opinion.

eric-wieser (Member) commented

the IEEE 754 binary32 format requires 9 digits to roundtrip correctly.

If that's the case, why is np.finfo(np.float32).precision == 6?

mdickinson (Contributor, Author) commented Jul 4, 2017

@eric-wieser: the np.finfo precision values are the maximum decimal precisions for which decimal -> binary -> decimal recovers the original value; for repr, you want something different: the minimum decimal precision for which binary -> decimal -> binary recovers the value. (Assuming IEEE 754 formats, the relevant values are 3 and 5 for float16, 6 and 9 for float32 and 15 and 17 for float64.)
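To make the distinction concrete, here is a small sketch comparing np.finfo's precision with the larger repr-roundtrip digit counts quoted above (the repr counts are hard-coded from this comment, since finfo doesn't expose them):

```python
import numpy as np

# finfo.precision: max decimal digits where decimal -> binary -> decimal
#                  recovers the original decimal string.
# repr_digits:     min decimal digits where binary -> decimal -> binary
#                  recovers the original binary value (what repr needs).
for dtype, repr_digits in [(np.float16, 5), (np.float32, 9), (np.float64, 17)]:
    print(dtype.__name__, np.finfo(dtype).precision, repr_digits)
```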

eric-wieser (Member) commented

Well explained. Seems like we should also expose those larger numbers in finfo too then (and perhaps be more precise in the documentation for precision). Can you think of a suitable name?

Also, can you cite a source for those numbers?

mdickinson (Contributor, Author) commented

As far as sources go, the C99 standard has the relevant formulas in section 5.2.4.2.2p9: for a binary -> decimal -> binary roundtrip of a binary format with precision p (which is what we want for repr), the formula is 1 + ceil(p * log10(2)); for p = 11, 24 and 53 this gives 5, 9 and 17 respectively. For the precision, we want floor((p - 1) * log10(2)), which is where the 3, 6 and 15 values come from. IEEE 754-2008 also gives the 5, 9 and 17 values explicitly in section 5.12.2.
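The two formulas above can be checked directly (the function names here are just for illustration):

```python
from math import ceil, floor, log10

def roundtrip_digits(p):
    # binary -> decimal -> binary (C99 5.2.4.2.2p9): 1 + ceil(p * log10(2))
    return 1 + ceil(p * log10(2))

def safe_decimal_digits(p):
    # decimal -> binary -> decimal: floor((p - 1) * log10(2))
    return floor((p - 1) * log10(2))

# p = significand bits of binary16/32/64, including the implicit bit
for p in (11, 24, 53):
    print(p, safe_decimal_digits(p), roundtrip_digits(p))
# -> 11 3 5 / 24 6 9 / 53 15 17
```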

I've never found non-paywalled proofs of those formulas, but they're not hard to prove directly: here are some proofs I wrote up last year, after getting annoyed at not finding anything online.

Seems like we should also expose those larger numbers in finfo too then

Sounds good in principle. The catch would be that C99 provides the precision numbers directly for float, double and long double, without any assumption of IEEE 754, under the names FLT_DIG, DBL_DIG and LDBL_DIG; I assume that that's where np.finfo is getting them from. In the other direction, it only defines one number, DECIMAL_DIG, which is the:

number of decimal digits, n, such that any floating-point number in the widest supported floating type with pmax radix b digits can be rounded to a floating-point number with n decimal digits and back again without change to the value,

C11 does provide separate FLT_DECIMAL_DIG, DBL_DECIMAL_DIG and LDBL_DECIMAL_DIG macros, but I don't know how well current compilers support those.

Can you think of a suitable name?

Not right now. Naming things is hard. :-)

eric-wieser (Member) commented

for a binary format with precision p

Here you're using precision to refer to the number of bits in the mantissa, right?

The catch would be that C99 provides the precision numbers directly for float, double and long double, without any assumption of IEEE 754, under the names FLT_DIG, DBL_DIG and LDBL_DIG; I assume that that's where np.finfo is getting them from.

Nope, they're hard-coded in the Python code, calculated from finfo.eps, so this isn't a problem. I don't think the calculations there match your equations, though, so perhaps they should be fixed.

mdickinson (Contributor, Author) commented

Here you're using precision to refer to the number of bits in the mantissa, right?

Aargh, yes. Too many precisions. Yes, the number of bits in the significand, including the implicit bit where relevant.

Nope, they're hard-coded in the python code, calculated from finfo.eps

Ah, right. I was making bad assumptions, then.

I don't think the calculations there match your equations though

At a quick glance, it looks the same to me: for a binary format with (binary) precision p, eps should be 2**(1-p), so self.precision = int(-log10(self.eps)) is computing floor((p-1) * log10(2)).
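A quick check that the eps-based calculation really does agree with the floor formula for the three IEEE 754 formats (a sketch, assuming eps = 2**(1-p) as stated above):

```python
from math import log10
import numpy as np

for dtype, p in [(np.float16, 11), (np.float32, 24), (np.float64, 53)]:
    eps = np.finfo(dtype).eps
    # eps == 2**(1-p), so int(-log10(eps)) == floor((p-1) * log10(2))
    assert eps == 2.0 ** (1 - p)
    assert int(-log10(eps)) == np.finfo(dtype).precision
    print(dtype.__name__, p, np.finfo(dtype).precision)
```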
