Incorrect conversion to Int64 by loadtxt (traced to _getconv in numpy.lib.io) (Trac #1163) #1761

thouis · 2012-10-19T20:50:28Z

Original ticket http://projects.scipy.org/numpy/ticket/1163 on 2009-07-09 by trac user onsi, assigned to unknown.

I'm running version 1.2.1 but this error should also occur in 1.3.0 based on the source currently in the trunk.

I try importing the following ASCII data stored in "sample.csv":

9007200000000000,670927001710,0.010190886
9007200000000001,660927001348,0.00976051
9007200000000002,650883003926,0.009154096

using (maximal verbosity for clarity):

import numpy
arr = numpy.loadtxt("sample.csv",
                    dtype=[('id0', numpy.int64), ('id1', numpy.int64), ('flt', numpy.float32)],
                    delimiter=',', comments='#')

I get:

[(9007200000000000L, 670927001710L, 0.010190886445343494)
 (9007200000000000L, 660927001348L, 0.0097605101764202118)
 (9007200000000002L, 650883003926L, 0.009154096245765686)]

After some digging, I found the culprit to be the converter used by loadtxt to convert strings to dtypes. lib.io._getconv (line 352 in trunk) returns:

lambda x: int(float(x))

as the converter for any dtype that is a subclass of int, which int64 is. Unfortunately, a float (a double with a 53-bit mantissa) cannot represent every integer above 2**53 = 9007199254740992, so 9007200000000001 gets rounded to 9007200000000000.

This is fairly serious as int64s are often used as IDs in various numerical/simulation contexts. Changing the converter to int() should resolve this problem -- though then some error checking needs to take place to ensure that int is fed an integer string.
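The rounding can be reproduced without numpy. The following is a minimal sketch using only built-in Python; the names `via_float` and `direct` are illustrative and do not appear in numpy.

```python
# Illustrative only: compare the int(float(x)) round trip with a direct int(x).
ids = ["9007200000000000", "9007200000000001", "9007200000000002"]

# What the converter described above effectively computes for each field.
via_float = [int(float(text)) for text in ids]

# What a direct integer parse gives.
direct = [int(text) for text in ids]

for text, lossy, exact in zip(ids, via_float, direct):
    status = "ok" if lossy == exact else "CORRUPTED"
    print(text, "->", lossy, status)
```

Only the middle ID is changed by the round trip through float, which matches the loadtxt output shown above.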

numpy-gitbot · 2012-10-23T02:41:36Z

Milestone changed to 1.4.0 by @cournape on 2009-11-25

numpy-gitbot · 2012-10-23T02:41:36Z

@WarrenWeckesser wrote on 2010-08-18

There is a thread in the mailing list about this problem:
http://mail.scipy.org/pipermail/numpy-discussion/2010-August/052289.html

numpy-gitbot · 2012-10-23T02:41:36Z

Milestone changed to Unscheduled by @mwiebe on 2011-03-24

numpy-gitbot · 2012-10-23T02:41:37Z

trac user dynetrekk wrote on 2011-03-26

I suggest a patch here. It handles both long integers and strings in float notation that need to be converted to ints.

diff -r 2763b87dd7e8 -r 8eaaeb6ed8f3 numpy/lib/npyio.py
--- a/numpy/lib/npyio.py    Fri Mar 25 22:37:19 2011 -0600
+++ b/numpy/lib/npyio.py    Sat Mar 26 12:40:26 2011 +0100
@@ -566,7 +566,20 @@
     if issubclass(typ, np.bool_):
         return lambda x: bool(int(x))
     if issubclass(typ, np.integer):
-        return lambda x: int(float(x))
+        def _intconv(x):
+            try:
+                # This works for long integer, for example:
+                # >>> int('123456789123456789123456789123456789')
+                # 123456789123456789123456789123456789L
+                
+                y = int(x)
+            except ValueError:
+                # This will work if the number is a float, for example:
+                # >>> int(float('1.23e45'))
+                # 1229999999999999973814869011019624571608236032L
+                y = int(float(x))
+            return y
+        return _intconv
     elif issubclass(typ, np.floating):
         return float
     elif issubclass(typ, np.complex):
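For readers who want to try the idea outside numpy, here is a rough standalone sketch of the converter from the patch above. The function name `int_converter` is hypothetical; in the patch the equivalent code is the nested `_intconv` inside `_getconv`.

```python
def int_converter(text):
    """Hypothetical standalone version of the patch's _intconv.

    Try an exact integer parse first; fall back to float only when the
    string is not a plain integer literal (for example "1e3" or "2.0").
    """
    try:
        return int(text)
    except ValueError:
        # The fallback can still lose precision for very large values,
        # but it only runs for input that int() cannot parse directly.
        return int(float(text))


print(int_converter("9007200000000001"))  # exact integer parse
print(int_converter("1e3"))               # falls back to float
```

Trying `int` first means plain integer strings never pass through a float, so large IDs survive unchanged, while input in float notation is still accepted as before.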

numpy-gitbot · 2012-10-23T02:41:37Z

@rgommers wrote on 2011-03-31

It would be helpful if the patch includes a test.

numpy-gitbot · 2012-10-23T02:41:37Z

Milestone changed to 1.6.0 by @rgommers on 2011-03-31

numpy-gitbot · 2012-10-23T02:41:37Z

@rgommers wrote on 2011-04-02

Closing as duplicate of #2162. This one's older, but #2162 has a more complete patch and more discussion.

This should be fixed for 1.6.0.

thouis closed this as completed Oct 19, 2012

numpy-gitbot mentioned this issue Oct 23, 2012

loadtxt fails to load large unsigned int64 integers. (Trac #1565) #2162

Closed
