BUG: numpy._core.memmap Error when offset is a multiple of allocation granularity · Issue #27722 · numpy/numpy · GitHub
BUG: numpy._core.memmap Error when offset is a multiple of allocation granularity #27722
@laarohi

Describe the issue:

numpy.memmap fails when attempting to create an empty memmap that has an offset which is a multiple of mmap.ALLOCATIONGRANULARITY.

This happens because, under these conditions, numpy.memmap calls mmap.mmap with length=0. NumPy assumes this means a zero-length map; however, as described in the Python mmap docs:

If length is 0, the maximum length of the map will be the current size of the file when mmap is called.
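To see why only aligned offsets are affected, here is a minimal sketch of the arithmetic. It assumes np.memmap rounds the requested offset down to a multiple of mmap.ALLOCATIONGRANULARITY and maps only the bytes from that point on. The helper name `mmap_call_args` is hypothetical and not part of NumPy, and the granularity of 4096 is taken from the reproduction below rather than guaranteed on every platform.

```python
import mmap


def mmap_call_args(offset, nbytes, granularity=mmap.ALLOCATIONGRANULARITY):
    """Return the (start, length) pair that would be handed to mmap.mmap.

    Assumption: the mapping must begin on a granularity boundary, so the
    requested offset is rounded down and the length covers the remainder.
    """
    start = offset - offset % granularity   # aligned start of the mapping
    length = offset + nbytes - start        # bytes from aligned start to end of data
    return start, length


# An empty array contributes nbytes=0, as in the reproduction below.
for offset in (4321, 4096, 2 * 4096):
    print(offset, mmap_call_args(offset, nbytes=0, granularity=4096))
# 4321 (4096, 225)
# 4096 (4096, 0)
# 8192 (8192, 0)
```

For an unaligned offset the length stays positive, so the call is harmless. For an aligned offset the length collapses to 0, which mmap.mmap interprets as "map up to the current end of the file" rather than "map nothing".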

Reproduce the code example:

import numpy as np

def empty_memmap(offset):
    fname = "test.dat"
    a = np.array([])  # empty array 
    with open(fname, 'wb+') as f:
        f.write(b'c'*offset)
        mm = np.memmap(f, shape=a.shape, dtype=a.dtype, offset=offset, mode='r+')
        print(mm)

# This works
empty_memmap(4321)

# This fails
empty_memmap(4096)

# This also fails
empty_memmap(2*4096)

Error message:

Traceback (most recent call last):
  File "/home/luke/m.py", line 15, in <module>
    empty_memmap(4096)
  File "/home/luke/m.py", line 8, in empty_memmap
    mm = np.memmap(f, shape=a.shape, dtype=a.dtype, offset=offset, mode='r+')
  File "/home/luke/user310/lib/python3.10/site-packages/numpy/_core/memmap.py", line 280, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
ValueError: mmap offset is greater than file size
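The same behaviour can be observed without NumPy. The following is a small standard-library sketch, assuming a scratch file created with `tempfile`; the helper name `try_zero_length_map` is hypothetical. It calls mmap.mmap with length=0 at an aligned offset, once with data after the offset and once with the offset exactly at the end of the file, as in the failing case.

```python
import mmap
import os
import tempfile


def try_zero_length_map(file_size, offset):
    """Map with length=0 at `offset` in a scratch file of `file_size` bytes.

    Returns the mapped length on success, or the ValueError message on failure.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"c" * file_size)
        try:
            with mmap.mmap(fd, 0, access=mmap.ACCESS_READ, offset=offset) as mm:
                return len(mm)
        except ValueError as exc:
            return str(exc)
    finally:
        os.close(fd)
        os.remove(path)


g = mmap.ALLOCATIONGRANULARITY

# Data exists past the offset: length=0 maps everything up to the end of the file.
print(try_zero_length_map(2 * g, g))

# The offset sits exactly at the end of the file: nothing is left to map.
print(try_zero_length_map(g, g))
```

In the first call the map is one granule long rather than empty. The second call fails with a ValueError, which matches the situation in the traceback above, where the file is exactly `offset` bytes long.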

Python and NumPy Versions:

2.1.3
3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]

Runtime Environment:

[{'numpy_version': '2.1.3',
'python': '3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]',
'uname': uname_result(system='Linux', node='titan', release='5.15.0-122-generic', version='#132-Ubuntu SMP Thu Aug 29 13:45:52 UTC 2024', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}}]

Context for the issue:

While this is an extreme edge case, it is still important to resolve. In my case, a program that had been running reliably in production for a number of years crashed when the offset of my memmap happened to be a multiple of the allocation granularity.

The fix to this issue is fairly straightforward and I will be submitting a pull request shortly.
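The pull request is not shown here, so the following is only one possible approach and not necessarily what the fix does. The sketch assumes the mapping start is the offset rounded down to the allocation granularity, and avoids a zero-length request by stepping the start back one granule when the length would otherwise be 0. The name `safe_mmap_call_args` is hypothetical.

```python
import mmap


def safe_mmap_call_args(offset, nbytes, granularity=mmap.ALLOCATIONGRANULARITY):
    """Return (start, length, array_offset) without ever asking for length 0
    at a non-zero start.

    array_offset is where the array data begins inside the mapped region.
    """
    start = offset - offset % granularity
    length = offset + nbytes - start
    if length == 0 and start > 0:
        # Map the preceding granule instead, so the request has an explicit
        # positive length and ends exactly where the empty array begins.
        start -= granularity
        length += granularity
    return start, length, offset - start


print(safe_mmap_call_args(4321, 0, 4096))  # (4096, 225, 225)
print(safe_mmap_call_args(4096, 0, 4096))  # (0, 4096, 4096)
print(safe_mmap_call_args(8192, 0, 4096))  # (4096, 4096, 4096)
```

The cost is mapping one extra granule in the aligned case. An empty array at offset 0 still produces a length of 0, so that case would need separate handling.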
