8000 Allow long numbers in numpy.rec.array formats string by cgohlke · Pull Request #376 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

Allow long numbers in numpy.rec.array formats string #376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 2, 2012
Merged

Allow long numbers in numpy.rec.array formats string #376

merged 4 commits into from
Sep 2, 2012

Conversation

cgohlke
Copy link
Contributor
@cgohlke cgohlke commented Aug 6, 2012

On win-amd64, some pyfits 3.0.9 tests fail because the np.rec.array formats argument may contain Python long numbers. A reduced example is:

>>> import numpy as np
>>> np.rec.array(None, shape=1L, formats='(1L, 1L)i4')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "X:\Python27-x64\lib\site-packages\numpy\core\records.py", line 749, in array
    aligned, byteorder)._descr
  File "X:\Python27-x64\lib\site-packages\numpy\core\records.py", line 143, in __init__
    self._parseFormats(formats, aligned)
  File "X:\Python27-x64\lib\site-packages\numpy\core\records.py", line 157, in _parseFormats
    dtype = sb.dtype(formats, aligned)
  File "X:\Python27-x64\lib\site-packages\numpy\core\_internal.py", line 234, in _commastring
    newitem = (dtype, eval(repeats))
  File "<string>", line 1
    (1
     ^
SyntaxError: unexpected EOF while parsing

Tested with numpy 1.6.2 on win-amd64-py2.7 and win32-py2.7. Will work on a unit test later...

@travisbot
Copy link
8000 travisbot commented Aug 6, 2012

This pull request passes (merged d009f0f into fcdbcac).

@travisbot
Copy link

This pull request passes (merged 1079fa4 into fcdbcac).

@njsmith
Copy link
Member
njsmith commented Aug 7, 2012

Can you elaborate on why this is a bug in numpy ("bug: it should accept Python's weird integer repr formats") instead of a bug in pyfits ("bug: it should use standard integer formats")?

(Presumably the fix in pyfits would be to just use %s instead of %r somewhere. str() on integers produces standard format, repr() can do odd things like this.)

@cgohlke
Copy link
Contributor Author
cgohlke commented Aug 8, 2012

It is common to end up with long numbers in strings:

>>> str((1L,))
'(1L,)'

Or, probably more relevant, on win-amd64-py2.x:

>>> str(np.zeros((1,)).shape)
'(1L,)'

Besides that, I find it surprising that np.rec.array(None, shape=1, formats='(1L, 1L)i4') should raise a SyntaxError while eval("np.dtype([('', 'i4', (1L, 1L))])") succeeds.

@cgohlke
Copy link
Contributor Author
cgohlke commented Aug 8, 2012

Just noticed that np.dtype('(1L,)i4') also fails with the same SyntaxError. So the test should probably be simplified and moved.

@travisbot
Copy link

This pull request fails (merged e391a44 into fcdbcac).

@travisbot
Copy link

This pull request passes (merged 1e7979f into fcdbcac).

@certik
Copy link
Contributor
certik commented Aug 31, 2012

This looks good to me. Can somebody with push access merge this?

@njsmith
Copy link
Member
njsmith commented Aug 31, 2012

I really think that as a general principle, we should be encouraging people to use the nice clean Python-data-structure ways of specifying dtypes, instead of adding more elaborate hacks to the domain-specific language parser. The bug here is that PyFITS has some nice Python representation of what it wants, but then instead of just giving us that representation it tries to serialize it into some string form, and its serialization code is, of course, buggy. This is a bad idea in exactly the same way that constructing source code by string manipulation for eval() is a bad idea.

That said, we might as well merge this, given that np.rec.array is itself a grungy old interface that can't be adapted to handle anything except the bad old string-style formats= argument.

(Can we deprecate formats=?)

@certik
Copy link
Contributor
certik commented Aug 31, 2012

The way I see it, is that if (and this is the big if) we have a code like this:

format_re = re.compile(asbytes(
                           r'(?P<order1>[<>|=]?)'
                           r'(?P<repeats> *[(]?[ ,0-9]*[)]? *)'
                           r'(?P<order2>[<>|=]?)'
                           r'(?P<dtype>[A-Za-z0-9.]*(?:\[[a-zA-Z0-9,.]+\])?)'))

then we can just as well have a code like this:

format_re = re.compile(asbytes(
                           r'(?P<order1>[<>|=]?)'
                           r'(?P<repeats> *[(]?[ ,0-9L]*[)]? *)'
                           r'(?P<order2>[<>|=]?)'
                           r'(?P<dtype>[A-Za-z0-9.]*(?:\[[a-zA-Z0-9,.]+\])?)'))

Nothing has changed in terms of maintenance, simplicity, clarity or speed. But a bug was fixed.

However, we can of course discuss whether we should have such a code in the first place, but that's a separate issue as I see it.

@certik
Copy link
Contributor
certik commented Aug 31, 2012

@njsmith, let me know if you are ok with merging this PR.

@njsmith
Copy link
Member
njsmith commented Sep 1, 2012

I already said I was :-)

("That said, we might as well merge this...")

certik added a commit that referenced this pull request Sep 2, 2012
Allow long numbers in numpy.rec.array formats string
@certik certik merged commit 9652574 into numpy:master Sep 2, 2012
luyahan pushed a commit to plctlab/numpy that referenced this pull request Apr 25, 2024
feat: Add vclt[q|s|d]_[s64|u64|f32|f64]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0