CQ shared store: Delete from index on remove or roll over #13959
Conversation
Sorry to keep nitpicking :) During healthy operation, when messages are mostly consumed in the same order as they are written, most of the rdq files can be deleted before any compaction happens on them. So I agree that it's more important to make delete_file efficient than compact_file. I just wanted to mention the trade-off in compaction performance.
Thanks a lot, I'll keep an eye on this. I've just fixed (?) a bug that actually crashed the message store during compaction. :)
Right, it's likely to make compaction itself slower. But it might be better overall when compaction and deletes are both happening, because deletes used to be very slow. There are other cases where it is likely slower (such as mass-deleting messages).
It was expensive to delete files because we had to clean up the index, and to get the messages in a file we have to scan it.

Instead of cleaning up the index on file delete, this commit deletes from the index as soon as possible. There are two scenarios: messages that are removed from the current write file, and messages that are removed from other files. In the latter case, we can just delete the index entry on remove. For messages in the current write file, we want to keep the entry in case fanout is used, because we don't want to write the fanout message multiple times if we can avoid it. So we keep track of removes in the current write file and clean up these entries on file roll over.

Compared to the previous implementation, we will no longer increase the ref_count of messages that are not in the current write file, meaning we may do more writes in fanout scenarios. But at the same time the file delete operation is much cheaper.

Additionally, we prioritise delete calls in rabbit_msg_store_gc. Without that change, if compaction was lagging behind, file deletion requests could be queued behind many compaction requests, leading to many unnecessary compactions of files that could already be deleted.

Co-authored-by: Michal Kuratczyk <michal.kuratczyk@broadcom.com>
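The remove/roll-over bookkeeping described above can be sketched as follows. This is a minimal, hypothetical Python model, not the actual Erlang implementation in rabbit_msg_store; all names are invented for illustration.

```python
# Hypothetical sketch of the index-cleanup strategy: delete index entries
# immediately for messages in already-closed files, but defer deletion for
# the current write file (so fanout can reuse the entry) until roll over.

class MsgStoreIndex:
    def __init__(self):
        self.index = {}               # msg_id -> file number
        self.current_file = 0
        self.pending_removes = set()  # removes seen in the current write file

    def write(self, msg_id):
        # Keep an index entry while the file is current so a fanout message
        # does not have to be written to disk again.
        self.index[msg_id] = self.current_file

    def remove(self, msg_id):
        file = self.index.get(msg_id)
        if file is None:
            return
        if file == self.current_file:
            # Defer: the entry may still be reused by fanout.
            self.pending_removes.add(msg_id)
        else:
            # Other files: delete immediately, so deleting the file
            # later requires no index cleanup or file scan.
            del self.index[msg_id]

    def roll_over(self):
        # Flush deferred removes once the write file is closed.
        for msg_id in self.pending_removes:
            self.index.pop(msg_id, None)
        self.pending_removes.clear()
        self.current_file += 1
```

Under this scheme the expensive part of file deletion (scanning the file to find which index entries to clean up) disappears, at the cost of possibly re-writing fanout messages that live in older files.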
Test results (with this change applied on top of …). Highlights:
The throughput is comparable in most cases. The downside is slower compaction, since we no longer skip over consumed messages. This should be less of a problem now that we prioritise deletion requests, avoiding unnecessary compactions of files we can already delete. Deletions are generally much more common, so it seems worth having faster deletions and slower compactions rather than the other way round. On my machine we can still do a few compactions per second.
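The prioritisation of deletion requests mentioned above can be modeled as a two-queue scheduler. This is a hypothetical Python sketch (the real rabbit_msg_store_gc is an Erlang process); names and structure are invented for illustration.

```python
# Hypothetical sketch: a GC worker that always drains delete requests
# before compaction requests, and drops a queued compaction for a file
# that is about to be deleted anyway.
from collections import deque

class MsgStoreGC:
    def __init__(self):
        self.deletes = deque()
        self.compactions = deque()
        self.log = []  # records the order work was performed in

    def request_delete(self, file):
        self.deletes.append(file)
        # Deleting the file makes any pending compaction of it pointless.
        try:
            self.compactions.remove(file)
        except ValueError:
            pass

    def request_compaction(self, file):
        self.compactions.append(file)

    def run_one(self):
        # Deletes take strict priority over compactions.
        if self.deletes:
            self.log.append(("delete", self.deletes.popleft()))
        elif self.compactions:
            self.log.append(("compact", self.compactions.popleft()))
```

Without the priority, a lagging compaction backlog would delay cheap deletions and waste work compacting files that were already eligible for deletion.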
@lhoguin This is now ready for review. While we discussed ideas for speeding up compaction, I feel it's too risky a change to include here. With 4.1.1 coming soon, I'd rather include this change without further complications. For future reference, the idea to speed up compaction was to:
But any bug in zeroing means data corruption of potentially valid data, so I don't want to rush it.
GH doesn't want me to approve my own PR, but I approve it nevertheless. I didn't double-check the CI failures though.
(cherry picked from commit 0278980)
CQ shared store: Delete from index on remove or roll over (backport #13959)