BUG: MemoryError when indexing 2D StringDType array with a list index · Issue #27737 · numpy/numpy · GitHub

Closed
SamAdamDay opened this issue Nov 11, 2024 · 1 comment · Fixed by #27715
Labels
00 - Bug, component: numpy.strings (String dtypes and functions)

Comments

@SamAdamDay

Describe the issue:

Indexing a StringDType array of shape (1, 1) with a list raises a MemoryError when the single string is longer than 15 characters. The same happens when indexing with an array.

Specifically, the error appears when the indexed result is printed or, more directly, when it is accessed at (-1, -1).

Possibly related to #27710.

The issue does not appear when:

  • The single string has length 15
  • The array has shape (1, )

Additionally, I get SystemError: error return without exception set.

Reproduce the code example:

import numpy as np
from numpy.dtypes import StringDType

ok  = np.array([["abcdefghijklmno"]], dtype=StringDType())
bad = np.array([["abcdefghijklmnop"]], dtype=StringDType())

ok[[0]][-1, -1]
bad[[0]][-1, -1]

# These also raise errors:
bad[np.array([0])][-1, -1]
repr(bad[[0]])

# However this does not:
ok_2 = np.array(["abcdefghijklmnop"], dtype=StringDType())
repr(ok_2[[0]])

Error message:

Traceback (most recent call last):
  File "bug.py", line 8, in <module>
    bad[[0]][-1, -1]
    ~~~~~~~~^^^^^^^^
MemoryError: Failed to load string in StringDType getitem
Traceback (most recent call last):
  File "bug.py", line 8, in <module>
    bad[[0]][-1, -1]
    ~~~~~~~~^^^^^^^^
SystemError: error return without exception set

Python and NumPy Versions:

2.2.0.dev0+git20241111.fd4f467
3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0]

Runtime Environment:

[{'numpy_version': '2.2.0.dev0+git20241111.fd4f467',
'python': '3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0]',
'uname': uname_result(system='Linux', node='laozi', release='6.8.0-47-generic', version='#47-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 21:40:26 UTC 2024', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL',
'AVX512_SPR']}}]

Context for the issue:

I found the bug because I wanted to randomly permute an array of strings. When I tried to print the result, I got a memory error.

Being able to index a StringDType array using an array seems like important functionality, because I can't think of a workaround other than iterating over the index array using a Python loop.

@ngoldbaum
Member

Thanks for the report! This is fixed by #27715.

🐍 Launching Python with PYTHONPATH="/Users/goldbaum/Documents/numpy/build-install/usr/lib/python3.13/site-packages"
$ /Users/goldbaum/.pyenv/versions/3.13.0/bin/python -P ../numpy-experiments/test.py
abcdefghijklmno
abcdefghijklmnop
abcdefghijklmnop
array([['abcdefghijklmnop']], dtype=StringDType())

goldbaum at Nathans-MBP in ~/Documents/numpy on fix-generic-fancy-index-cast
± cat ../numpy-experiments/test.py
import numpy as np
from numpy.dtypes import StringDType

ok  = np.array([["abcdefghijklmno"]], dtype=StringDType())
bad = np.array([["abcdefghijklmnop"]], dtype=StringDType())

print(ok[[0]][-1, -1])
print(bad[[0]][-1, -1])

# These also raise errors:
print(bad[np.array([0])][-1, -1])
print(repr(bad[[0]]))
