Expose the publishAllOutstanding() method as public #3093
Conversation
So that external systems that need to ensure that all messages have been published can clear a batch.
This looks good to me. @mdietz94 do you have an opinion?

Why can't users just check whether messages have been published already via the callbacks? That's meant to be the method by which users handle flow control / checkpointing.
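For anyone landing here later, the callback path being referred to is the per-message future returned by `publish()`. A minimal sketch, assuming the `com.google.cloud.pubsub.v1.Publisher` client and hypothetical project/topic names:

```java
import com.google.api.core.ApiFuture;
import com.google.api.core.ApiFutureCallback;
import com.google.api.core.ApiFutures;
import com.google.cloud.pubsub.v1.Publisher;
import com.google.common.util.concurrent.MoreExecutors;
import com.google.protobuf.ByteString;
import com.google.pubsub.v1.PubsubMessage;
import com.google.pubsub.v1.TopicName;

public class CallbackExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical project/topic names for illustration.
    Publisher publisher =
        Publisher.newBuilder(TopicName.of("my-project", "my-topic")).build();

    PubsubMessage message =
        PubsubMessage.newBuilder().setData(ByteString.copyFromUtf8("payload")).build();

    // publish() returns a per-message future; the callback fires once the
    // message has actually been sent (or has failed), not when it is batched.
    ApiFuture<String> future = publisher.publish(message);
    ApiFutures.addCallback(
        future,
        new ApiFutureCallback<String>() {
          @Override
          public void onSuccess(String messageId) {
            System.out.println("published with id " + messageId);
          }

          @Override
          public void onFailure(Throwable t) {
            System.err.println("publish failed: " + t);
          }
        },
        MoreExecutors.directExecutor());

    publisher.shutdown();
  }
}
```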
So my use case is as follows. We are streaming a large amount of data through the publisher, so we want generous batch sizes for throughput reasons. But every so often our system emits checkpoints so that the system synchronizes to a certain moment in time for recovery purposes. This allows our various processing blocks to persist state information to disk for rollback. I could use the callbacks to know which messages have yet to be published, but if they haven't been, then it's a failure in our processing and I'd need to roll back to the previous checkpoint because the publish was not acknowledged. How do I know the difference between a message lost because of a protocol-stack error and a message waiting on the batch time or size triggers? I could always use small batch timeouts or sync them with our checkpoint times, but exposing the ability to flush the batch (like any …
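To make the use case concrete, here is a rough sketch of what the checkpoint path could look like once `publishAllOutstanding()` is public. The `CheckpointSketch` class and the `pendingFutures` list are hypothetical scaffolding maintained by the streaming pipeline, not part of the library:

```java
import com.google.api.core.ApiFuture;
import com.google.api.core.ApiFutures;
import com.google.cloud.pubsub.v1.Publisher;
import java.util.ArrayList;
import java.util.List;

class CheckpointSketch {
  private final Publisher publisher;
  // Futures returned by publish(); populated elsewhere as messages are emitted.
  private final List<ApiFuture<String>> pendingFutures = new ArrayList<>();

  CheckpointSketch(Publisher publisher) {
    this.publisher = publisher;
  }

  void onCheckpoint() throws Exception {
    // Force any partially filled batches onto the wire instead of waiting
    // for the element-count / byte / delay thresholds to trigger.
    publisher.publishAllOutstanding();

    // Now a pending future only means "in flight": if this get() throws, it
    // is a genuine publish failure and we roll back to the last checkpoint.
    ApiFutures.allAsList(pendingFutures).get();
    pendingFutures.clear();
  }
}
```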
What is your batch timeout? Even for large batching we usually use 50 ms, which does not seem too long to wait on, especially since the actual time to make the RPC to the server is >100 ms at the 99th percentile.

Right now we've been using record-count or request-byte thresholds and setting the batch timeout higher to ensure throughput; currently it is 10 seconds. We don't really care about latency; it's the throughput we need. Unfortunately, although I'm sure your benchmark times are correct, we are trying to build a system that survives our clients' screwed-up networks around the world. It's the 1% we worry about. Frankly I'm surprised at the pushback against what is effectively a flush call. Isn't that the point of …
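For context on those thresholds, the publisher's batching is configured through gax's `BatchingSettings`. The numbers below are illustrative, not the actual production values from this thread, and depending on the client version the `Duration` type is `org.threeten.bp.Duration` or `java.time.Duration`:

```java
import com.google.api.gax.batching.BatchingSettings;
import com.google.cloud.pubsub.v1.Publisher;
import com.google.pubsub.v1.TopicName;
import org.threeten.bp.Duration;

public class BatchingConfigExample {
  public static void main(String[] args) throws Exception {
    // Whichever threshold is hit first triggers the batch: element count,
    // accumulated request bytes, or the delay since the first queued message.
    BatchingSettings batchingSettings =
        BatchingSettings.newBuilder()
            .setElementCountThreshold(1000L)
            .setRequestByteThreshold(5_000_000L)
            .setDelayThreshold(Duration.ofSeconds(10))
            .build();

    Publisher publisher =
        Publisher.newBuilder(TopicName.of("my-project", "my-topic"))
            .setBatchingSettings(batchingSettings)
            .build();

    // ... publish messages ...

    publisher.shutdown();
  }
}
```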
That's a fair point. My concern just comes from trying to limit the API as much as possible, but you have fair points, and there are similar use cases such as shutting down, where you want to avoid waiting for batching. I think this is safe to approve. Thanks.
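For the shutdown case mentioned above, a flush would slot in just before tearing the publisher down. A sketch (the `closeQuietly` helper is made up for illustration, not a library method):

```java
import com.google.cloud.pubsub.v1.Publisher;
import java.util.concurrent.TimeUnit;

final class ShutdownSketch {
  // Flush any partially filled batches, then stop the publisher and wait
  // for in-flight RPCs to finish.
  static void closeQuietly(Publisher publisher) throws InterruptedException {
    publisher.publishAllOutstanding();
    publisher.shutdown();
    publisher.awaitTermination(1, TimeUnit.MINUTES);
  }
}
```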
One nit on the doc, but otherwise LGTM.
Thank you for the PR!
```diff
   }
-  private void publishAllOutstanding() {
+  /** publish any outstanding batches if non-empty */
+  public void publishAllOutstanding() {
```
Done.
So that external systems that need to ensure that all messages have been published can clear a batch. We need this because our system checkpoints every so often and we'd like to ensure that there aren't messages stuck in a batch buffer.