Description
When used with multiprocessing (e.g. when pickling awkward arrays to transmit them from a subprocess to the host), I see issues with importlib_metadata. This was originally observed by @richeldichel as multiprocessing.pool.MaybeEncodingError: Error sending result: followed by an appended BadZipFile error or OSError. I could reduce this to the following minimal working example, which sometimes (roughly every second to every tenth try) reproduces the underlying error:
import importlib_metadata
dists = importlib_metadata.MetadataPathFinder().find_distributions()
eps = [dist.entry_points for dist in dists]

import multiprocessing


def process(i):
    importlib_metadata.entry_points()
    return

with multiprocessing.Pool(processes=8) as pool:
    for _ in pool.imap_unordered(process, range(100)):
        pass
(Edit: here is a further simplified version:)
import importlib_metadata
dists = importlib_metadata.MetadataPathFinder().find_distributions()
eps = [dist.entry_points for dist in dists]

import multiprocessing


def process(i):
    dists = importlib_metadata.MetadataPathFinder().find_distributions()
    [dist._normalized_name for dist in dists]
    return

with multiprocessing.Pool(processes=8) as pool:
    for _ in pool.imap_unordered(process, range(100)):
        pass
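Since the failure is intermittent, a small harness like the one below may help to estimate how often it occurs in a given environment. This is only a sketch and not part of the original reproduction: the helper name failure_rate and the script name repro.py are hypothetical, and it assumes the example above is saved as a standalone script that exits with a non-zero status when a worker raises.

```python
import subprocess
import sys


def failure_rate(cmd, runs=20):
    """Run `cmd` `runs` times and return the fraction of runs that exit non-zero."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    failures = 0
    for _ in range(runs):
        # Output is discarded; only the exit status is inspected.
        completed = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if completed.returncode != 0:
            failures += 1
    return failures / runs


# Hypothetical usage, assuming the reproduction was saved as repro.py:
# print(failure_rate([sys.executable, "repro.py"], runs=20))
```

Each run starts a fresh interpreter, so state cached by one run cannot influence the next.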
Tested with Python 3.10.12 with importlib-metadata==8.7.0 and Python 3.8.10 with importlib-metadata==8.5.0. Note that in the latter test environment, the issue only occurs for me when running export PYTHONPATH=/home/.../.local/lib/python3.8/site-packages/ beforehand.
Below is an example error log:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/user/delthis/test6.py", line 9, in process
    importlib_metadata.entry_points()
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1094, in entry_points
    return EntryPoints(eps).select(**params)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1091, in <genexpr>
    eps = itertools.chain.from_iterable(
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/_itertools.py", line 17, in unique_everseen
    k = key(element)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/compat/py39.py", line 23, in normalized_name
    return dist._normalized_name
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1016, in _normalized_name
    or super()._normalized_name
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 552, in _normalized_name
    return Prepared.normalize(self.name)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 547, in name
    return md_none(self.metadata)['Name']
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 528, in metadata
    or self.read_text('PKG-INFO')
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 998, in read_text
    return self._path.joinpath(filename).read_text(encoding='utf-8')
  File "/home/user/.local/lib/python3.10/site-packages/zipp/__init__.py", line 382, in read_text
    with self.open('r', encoding, *args, **kwargs) as strm:
  File "/home/user/.local/lib/python3.10/site-packages/zipp/__init__.py", line 348, in open
    stream = self.root.open(self.at, zip_mode, pwd=pwd)
  File "/usr/lib/python3.10/zipfile.py", line 1546, in open
    raise BadZipFile("Bad magic number for file header")
zipfile.BadZipFile: Bad magic number for file header
I also saw zipfile.BadZipFile: Overlapped entries: 'EGG-INFO/PKG-INFO' (possible zip bomb), as well as the following:
  File "/home/.../.local/lib/python3.8/site-packages/zipp/__init__.py", line 385, in read_text
    return strm.read()
  File "/usr/lib/python3.8/zipfile.py", line 928, in read
    buf += self._read1(self.MAX_N)
  File "/usr/lib/python3.8/zipfile.py", line 1010, in _read1
    data += self._read2(n - len(data))
  File "/usr/lib/python3.8/zipfile.py", line 1042, in _read2
    data = self._fileobj.read(n)
  File "/usr/lib/python3.8/zipfile.py", line 765, in read
    self._file.seek(self._pos)
OSError: [Errno 22] Invalid argument
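I have not confirmed a root cause. One possible explanation, stated here purely as an assumption, is that a zip file opened in the parent before the pool is created is inherited by the forked workers, so that all processes share one file offset and their seeks and reads interfere with each other. The sketch below only illustrates that general POSIX behaviour with a plain temporary file; it does not touch importlib_metadata or zipp, the helper name offset_after_child_seek is hypothetical, and it requires a platform that provides os.fork.

```python
import os
import tempfile


def offset_after_child_seek(child_offset=5):
    """Return the parent's file offset after a forked child moved it."""
    # buffering=0 gives a raw file object, so no userspace buffer hides the effect.
    with tempfile.TemporaryFile(buffering=0) as f:
        f.write(b"0123456789")
        f.seek(0)
        pid = os.fork()
        if pid == 0:
            # Child: move the offset on the inherited descriptor, then exit
            # immediately without running any cleanup handlers.
            os.lseek(f.fileno(), child_offset, os.SEEK_SET)
            os._exit(0)
        os.waitpid(pid, 0)
        # Parent: query the current offset without moving it.
        return os.lseek(f.fileno(), 0, os.SEEK_CUR)


# The parent never seeks away from 0, yet observes the child's position.
print(offset_after_child_seek(5))
```

If this assumption holds for the zip files involved here, creating the pool from multiprocessing.get_context("spawn") might avoid the problem, because spawned workers do not inherit the parent's open files; I have not verified this against the reproduction above.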