Expose the publishAllOutstanding() method as public #3093

j256 · 2018-03-27T20:13:14Z

So that external systems that need to ensure that all messages have been published can clear a batch. We need this because our system checkpoints every so often and we'd like to ensure that there aren't messages stuck in a batch buffer.

pongad · 2018-03-28T02:46:36Z

This looks good to me. @mdietz94 do you have an opinion?

mdietz94 · 2018-03-28T19:50:35Z

Why can't users just check whether messages have been published already via the callbacks? That's meant to be the method by which users handle flow control / checkpointing.

pongad · 2018-03-29T00:12:12Z

I might have misunderstood @j256 . My understanding is that we want "we're checkpointing, so let's just publish whatever we have right now".

@j256 Could you explain your use case a little more? Sorry, I should have asked this earlier.

j256 · 2018-03-29T16:06:09Z

So my use case is as follows. We are streaming some large amount of data through the publisher so we want to have generous batch sizes for throughput reasons. But every so often our system emits checkpoints so that the system synchronizes to a certain moment in time for recovery purposes. This allows our various processing blocks to persist state information to disk for roll back reasons.

I could use the callback to learn that messages have not yet been published, but if they haven't been, that counts as a failure in our processing and I'd need to roll back to the previous checkpoint because the publish was not acknowledged.

How do I know the difference between a message lost because of a protocol stack error and a message still waiting for the batch time or size triggers? I could always use small batch timeouts or sync them with our checkpoint times, but exposing the ability to flush the batch (like any stream.flush() call) made more sense to me.
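To illustrate the distinction being drawn here, the following is a minimal sketch of a count-triggered batcher with an explicit flush. `SketchBatcher` and all of its members are hypothetical names made up for this illustration; it is not the Pub/Sub `Publisher` and it does no network I/O.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a batching publisher; not the Pub/Sub client. */
class SketchBatcher {
    private final int countThreshold;
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> sent = new ArrayList<>();

    SketchBatcher(int countThreshold) {
        this.countThreshold = countThreshold;
    }

    /** Buffers a message and sends the batch once the count threshold is reached. */
    synchronized void publish(String message) {
        pending.add(message);
        if (pending.size() >= countThreshold) {
            sendPending();
        }
    }

    /** Sends whatever is buffered right now, e.g. just before a checkpoint. */
    synchronized void flush() {
        if (!pending.isEmpty()) {
            sendPending();
        }
    }

    /** Number of messages still waiting for a trigger. */
    synchronized int pendingCount() {
        return pending.size();
    }

    /** Copy of the batches handed off so far. */
    synchronized List<List<String>> sentBatches() {
        return new ArrayList<>(sent);
    }

    private void sendPending() {
        // A real client would issue an RPC here; the sketch only records the batch.
        sent.add(new ArrayList<>(pending));
        pending.clear();
    }
}
```

With a threshold of 3, two published messages stay in the buffer until either a third arrives or `flush()` is called. After a flush at checkpoint time, anything still unacknowledged can no longer be explained by a batch trigger that has not fired yet.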

mdietz94 · 2018-03-29T16:49:26Z

What is your batch timeout? For even large batching we usually use 50ms, which does not seem to be too high to wait on, especially since the actual time to make the RPC to the server is >100ms at 99%ile.

j256 · 2018-03-29T17:31:49Z

Right now we've been using record-count or request-byte thresholds and setting the batch timeout to be higher to ensure throughput. Right now it is 10 seconds. We don't really care about latency. It's the throughput we need.

Unfortunately although I'm sure your bench times are correct, we are trying to build a system which survives our clients' screwed up networks around the world. It's the 1% we worry about.

Frankly I'm surprised about the pushback on what is effectively a flush call. Isn't that the point of BufferedOutputStream.flush()? You have a buffer with limits but every so often you need to ensure that the bytes are written to the wire.
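The JDK analogy can be shown with a small sketch using only `java.io`. The class name `FlushAnalogy`, its method, and the 1024-byte buffer size are arbitrary choices for the illustration.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FlushAnalogy {
    /** Returns {bytes in the sink before flush, bytes in the sink after flush}. */
    static int[] bytesBeforeAndAfterFlush() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 1024);
        try {
            // Three bytes are far below the buffer size, so they stay buffered.
            buffered.write(new byte[] {1, 2, 3});
            int before = sink.size();
            // flush() forces the buffered bytes through to the underlying stream.
            buffered.flush();
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The underlying stream sees nothing until `flush()` is called, which is the behavior being asked of the publisher's batch buffer.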

mdietz94 · 2018-03-29T21:12:34Z

That's a fair point. My concern just comes from trying to limit the API as much as possible, but I think you have fair points, and there could be similar use cases such as shutting down where you want to avoid waiting for batching. I think this is safe to approve.

Thanks,

pongad

One nit on the doc, but otherwise LGTM.

Thank you for the PR!

google-cloud-pubsub/src/main/java/com/google/cloud/pubsub/v1/Publisher.java

  }

-  private void publishAllOutstanding() {
+  /** publish any outstanding batches if non-empty */


j256 · 2018-04-03T04:09:32Z

Done.

Expose the publishAllOutstanding() method as public

724414f

So that external systems that need to ensure that all messages have been published can clear a batch.

j256 requested a review from pongad as a code owner March 27, 2018 20:13

googlebot added the "cla: yes" label Mar 27, 2018

pongad approved these changes Apr 2, 2018

expanded comment from feedback

8f2f5b3

pongad merged commit 1e79087 into googleapis:master Apr 3, 2018

j256 deleted the gw-make-publish-all-outstanding-be-public branch April 4, 2018 12:59

pongad mentioned this pull request Apr 10, 2018

Need to flush the Google Pub/Sub batch so that it delivers immediately? #3081

Closed
