8000 Expose the publishAllOutstanding() method as public by j256 · Pull Request #3093 · googleapis/google-cloud-java · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@j256
Copy link
@j256 j256 commented Mar 27, 2018

So that external systems that need to ensure that all messages have been published can clear a batch. We need this because out system checkpoints every so often and we'd like to ensure that there aren't messages stuck in a batch buffer.

So that external systems that need to ensure that all messages have been
published can clear a batch.
@j256 j256 requested a review from pongad as a code owner March 27, 2018 20:13
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Mar 27, 2018
@pongad
Copy link
Contributor
pongad commented Mar 28, 2018

This looks good to me. @mdietz94 do you have an opinion?

@mdietz94
Copy link

Why can't users just check whether messages have been published already via the callbacks? That's meant to be the method by which users handle flow control / checkpointing.

@pongad
Copy link
Contributor
pongad commented Mar 29, 2018

I might have misunderstood @j256 . My understanding is that we want "we're checkpointing, so let's just publish whatever we have right now".

@j256 Could you explain your use case a little more. Sorry I should have asked this earlier.

@j256
Copy link
Author
j256 commented Mar 29, 2018

So my use case is as follows. We are streaming some large amount of data through the publisher so we want to have generous batch sizes for throughput reasons. But every so often our system emits checkpoints so that the system synchronizes to a certain moment in time for recovery purposes. This allows our various processing blocks to persist state information to disk for roll back reasons.

I could use the callback to know that messages have yet to be published but if they aren't then it's a failure in our processing and I'd need to rollback to the previous checkpoint because the publish was not acknowledged.

How do I know the difference between a message lost because of a protocol stack error versus to a message waiting for the batch time or size triggers? I could always put small batch timeouts or sync them with our checkpoint times but exposing the ability to flush the batch (like any stream.flush() call) made more sense to me.

@mdietz94
Copy link

What is your batch timeout? For even large batching we usually use 50ms, which does not seem to be too high to wait on, especially since the actual time to make the RPC to the server is >100ms at 99%ile.

@j256
Copy link
Author
j256 commented Mar 29, 2018

Right now we've been using record-count or request-byte thresholds and setting the batch timeout to be higher to ensure throughput. Right now it is 10 seconds. We don't really care about latency. It's the throughput we need.

Unfortunately although I'm sure your bench times are correct, we are trying to build a system which survives our clients' screwed up networks around the world. It's the 1% we worry about.

Frankly I'm surprised about the pushback from effectively a flush call. Isn't that the point of bufferedOutputStream.flush()? You have a buffer with limits but every so often you need to ensure that the bytes are written to the wire?

@mdietz94
Copy link
mdietz94 commented Mar 29, 2018

That's a fair point. My concern just comes from trying to limit the API as much as possible, but I think you have fair points, and there could be similar use cases such as shutting down where you want to avoid waiting for batching. I think this is safe to approve.

Thanks,

Copy link
Contributor
@pongad pongad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit on the doc, but otherwise LGTM.

Thank you for the PR!

}

private void publishAllOutstanding() {
/** publish any outstanding batches if non-empty */

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@j256
Copy link
Author
j256 commented Apr 3, 2018

Done.

@pongad pongad merged commit 1e79087 into googleapis:master Apr 3, 2018
@j256 j256 deleted the gw-make-publish-all-outstanding-be-public branch April 4, 2018 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes This human has signed the Contributor License Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

0