-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-88887: Cleanup multiprocessing.resource_tracker.ResourceTracker
upon deletion
#130429
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# see https://github.com/python/cpython/issues/88887 | ||
try: | ||
self._stop() | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching some exceptions is fine. But catching any exception is not ok. You should not ignore KeyboardInterrupt for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to catch OSError, TypeError, AttributeError
only
multiprocessing.resource_tracker.ResourceTracker
upon deletion
…n into fix_resource_tracker_leak
Co-authored-by: Victor Stinner <vstinner@python.org>
@vstinner good to merge? |
# see https://github.com/python/cpython/issues/88887 | ||
try: | ||
self._stop() | ||
except (OSError, TypeError, AttributeError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not confortable with these exceptions. In which case we should expect these exceptions? I would prefer a direct call to self._stop() without try/except.
_stop() should be modified to do nothing if self._pid
is None.
def _stop(self):
with self._lock:
if self._pid is None:
return
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to catching AttributeError only which is likely caused by __del__
:
del() can be executed during interpreter shutdown. As a consequence, the global variables it needs to access (including other modules) may already have been deleted or set to None.
https://docs.python.org/3/reference/datamodel.html#object.del
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not catching AttributeError here fails some of the tests as the os
module is already None
when _stop
runs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change mostly LGTM, but here is a review on minor issues.
@@ -0,0 +1 @@ | |||
Fixing multiprocessing Resource Tracker process leaking |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing multiprocessing Resource Tracker process leaking | |
Fixing multiprocessing Resource Tracker process leaking. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the issue specific to Python running as the pid 1? If yes, it might be useful to mention it in the Changelog entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the leaking happens regardless of PID 1. Python relies on kernel's process handling code to clean up the process, which works fine for most cases.
A case where it won't work is when running Python as PID 1 and there's no signal processing handler to reap zombies on client code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Sorry, @luccabb and @vstinner, I could not cleanly backport this to
|
…cker` upon deletion (pythonGH-130429) (cherry picked from commit f53e7de) Co-authored-by: luccabb <32229669+luccabb@users.noreply.github.com> Co-authored-by: Victor Stinner <vstinner@python.org> Co-authored-by: Gregory P. Smith <greg@krypto.org>
GH-131516 is a backport of this pull request to the 3.13 branch. |
Merged, thank you. @luccabb: Automated backport to 3.12 failed, do you want to try to backport the change manually to 3.12? |
…acker` upon deletion (GH-130429) (#131516) gh-88887: Cleanup `multiprocessing.resource_tracker.ResourceTracker` upon deletion (GH-130429) (cherry picked from commit f53e7de) Co-authored-by: luccabb <32229669+luccabb@users.noreply.github.com> Co-authored-by: Victor Stinner <vstinner@python.org> Co-authored-by: Gregory P. Smith <greg@krypto.org>
GH-131530 is a backport of this pull request to the 3.12 branch. |
Hello, We noticed this change in the logs of The I think in normal behavior, it should probably only be called by the main process no? Sorry if this is just because I misunderstood the PR by reading too fast. |
Thanks, please open a new issue with a test scenario and reproduction details. |
Sorry for the noise, I read too fast... The behavior is still the right one. |
…cker` upon deletion (python#130429) Co-authored-by: Victor Stinner <vstinner@python.org> Co-authored-by: Gregory P. Smith <greg@krypto.org>
fixing #88887: multiprocessing: the Resource Tracker process is never reaped
#88887 is only visible when running Python with PID 1. This is because Python relies on kernel's process handling code to clean up terminated processes, which works fine for most cases. But it won't work when running Python as PID 1 and there's no signal processing handler to reap zombies on client code
So to test changes we need a docker instance that starts with python code that leaks the process. For that I'm using https://github.com/viktorvia/python-multi-issue but with a custom Dockerfile that installs my branch of cpython instead: https://gist.github.com/luccabb/a0a48b15cd30746e7619ecb7168ce439
Test plan
before
docker build
observe
[python3] <defunct>
processes increasing by 1 every 1s:after
update Dockerfile to get code from this branch: https://gist.github.com/luccabb/a0a48b15cd30746e7619ecb7168ce439
docker build
observe no
[python3] <defunct>
processes: