-
Notifications
You must be signed in to change notification settings - Fork 814
Reset multiproc values when the pid changes. #169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This puts us at ~2000ns for an operation, which is ~300ns than previously but still slightly better than we were before we switched to a single lock. This is twice as slow as without multiproc mode.
I did some testing and gauge files would sometimes get left behind. There is a race condition where the So, gauge livesums could remain. Counters are getting a new file for each pid which I guess will stay around forever? Can we use a single file for those? |
Can you expand on this?
It's intended that you clean them out when the process restarts.
No, that'd require coordination between pids which I'm trying to avoid for performance reasons. |
Test setup:
The race condition is caused by The counter files are worrisome. On active servers these could pile up. Also I prefer not to create too many local files when I'm running inside containers. This could easily get into 100s or 1000s of files. Could we do something like this:
Need to think about this more, just thinking out loud here. |
First off the good news is it sounds like the code is working as expected. I haven't given too much thought to high churn situations. Do you have a notion of the performance impact? |
The fixes in this PR did seem to keep the files aligned with the worker process pids so that is good. I didn't really look at performance. I don't think the fixes I made would have any impact to the performance of this PR. There is just the one Also, the race condition I was experiencing was caused by the high churn rate in my test. But the likelihood of this failure would also increase as more and more metric files are left to be processed. That would increase the time between getting the list of files and processing the last file which would increase the chance the collector would create a file it shouldn't. |
The question for now is are the right metrics being exposed, this is the first attempt at supporting all the various modes of multi-processing. |
I'm pretty sure if the race condition is triggered a gauge with |
I don't think so, as the metric will be missing from the file rather than having the value 0. |
You are right. The potential issue would actually be with |
This puts us at ~2000ns for an operation,
which is ~300ns than previously but still slightly
better than we were before we switched to a single lock.
This is twice as slow as without multiproc mode.