8000 gh-132917: Check resident set size (RSS) before GC trigger. by nascheme · Pull Request #133399 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

gh-132917: Check resident set size (RSS) before GC trigger. #133399

New issue 8000

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
May 5, 2025

Conversation

nascheme
Copy link
Member
@nascheme nascheme commented May 4, 2025

This greatly reduces the time to run the script in GH-132917. For the default build, I get 0.19 seconds to run. For free-threaded build main branch, 1.9 seconds. With this PR applied, 0.2 seconds.

I've tried to implement a get_current_rss() function that works on Linux, MacOS, Windows, FreeBSD and OpenBSD. If the platform doesn't have an implementation or the function fails, gc_should_collect() reverts to the old behavior.

I did not implement this for the default build GC yet. I think we should try it in 3.14 for only the free-threaded build. If it seems to work okay, we can add a similar RSS check for the default GC collection threshold check.

As conditions, I'm using RSS increase of 10% or young object count increase of 40 x threshold. I think this is quite conservative in terms of running the GC often enough.

@nascheme nascheme changed the title Check resident set size (RSS) before GC trigger. gh-132917: Check resident set size (RSS) before GC trigger. May 4, 2025
@nascheme nascheme added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 4, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @nascheme for commit 5e965ab 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133399%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 4, 2025
@nascheme nascheme requested a review from pablogsal May 4, 2025 21:29
@nascheme nascheme added topic-free-threading performance Performance or resource usage labels May 4, 2025
@nascheme nascheme marked this pull request as ready for review May 4, 2025 23:13
@nascheme nascheme requested a review from colesbury May 4, 2025 23:13
Copy link
Contributor
@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments below, but looks good to me.

Copy link
Member
@pablogsal pablogsal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I like this approach is simple and elegant and will work most of the time. I am slightly worried about Python objects that hold other resources that are not just memory not being accounted for and maybe how SWAP can affect this numbers.

For example in my mac laptop and I can prevent a GC run by opening many chrome tabs and forcing a swap to disk causing the RSS to drop and raise again. I don't think is a problem but maybe in the future we can use heap memory instead or a combined value?

@nascheme nascheme enabled auto-merge (squash) May 5, 2025 16:55
@nascheme
Copy link
Member Author
nascheme commented May 5, 2025

For example in my mac laptop and I can prevent a GC run by opening many chrome tabs and forcing a swap to disk causing the RSS to drop and raise again. I don't think is a problem but maybe in the future we can use heap memory instead or a combined value?

Yeah, that's probably a good idea. Instead of get_current_rss() it could return some estimate of total memory usage by the process. The deferred_count > 40*threshold condition will hopefully save us in the case of swapping. The GC will eventually run.

@nascheme nascheme merged commit 5c245ff into python:main May 5, 2025
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage topic-free-threading
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0