fix: Handle max concurrent queries errors by erezrokah · Pull Request #20907 · cloudquery/cloudquery · GitHub

fix: Handle max concurrent queries errors #20907


Merged · erezrokah merged 8 commits into main from fix/handle_force_migration_false_positives on Jul 2, 2025

Conversation

@erezrokah (Member) commented Jun 24, 2025

Summary

This PR fixes an issue that's more evident on older ClickHouse versions (e.g. 22, which we no longer support), where max_concurrent_queries defaulted to 100. On newer versions the default is 1500.

The result of hitting the limit is that some tables can be dropped unexpectedly: checkPartitionOrOrderByChanged, when called from checkForced here

if err := c.checkForced(ctx, have, want, messages); err != nil {

will not return an error, while the same checkPartitionOrOrderByChanged can return an error (for example, when ClickHouse rejects the query) when called further down the line here

if unsafe := unsafeChanges(changes); len(unsafe) > 0 || c.checkPartitionOrOrderByChanged(ctx, want, c.spec.Partition, c.spec.OrderBy) != nil {

where any non-nil error is treated as a change that requires force migration.

I had to override the max_concurrent_queries configuration to reproduce this, so I'm using the docker compose file for that.
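For reference, one way to lower the limit is to mount a config.d fragment into the ClickHouse container. This is a minimal sketch; the service name, file names, and image tag are assumptions, not the actual files changed in this PR:

```yaml
# docker-compose.yml (sketch): run ClickHouse with a lowered
# max_concurrent_queries so the limit is easy to hit in tests.
services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "9000:9000"
    volumes:
      # config.d fragments are merged into the server config at startup
      - ./max_concurrent_queries.xml:/etc/clickhouse-server/config.d/max_concurrent_queries.xml
```

where the mounted max_concurrent_queries.xml contains something like `<clickhouse><max_concurrent_queries>100</max_concurrent_queries></clickhouse>`.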

The solution implemented in this PR has 2 parts:

  1. Save the state from when we check which tables need to be force migrated, so we don't re-run the same queries again.
  2. Retry ClickHouse operations (I tried to create a connection wrapper for this purpose, but that's a bit tricky since the error can surface on connection.Exec, row.Scan, and so on, so I'd basically need to wrap all of ClickHouse's driver interfaces).

@cq-bot (Contributor) commented Jun 24, 2025

@erezrokah erezrokah force-pushed the fix/handle_force_migration_false_positives branch 3 times, most recently from 6778728 to 56298c6 Compare June 30, 2025 11:18
@erezrokah erezrokah marked this pull request as draft July 1, 2025 15:12
@erezrokah erezrokah removed the request for review from maaarcelino July 1, 2025 15:13
@erezrokah erezrokah force-pushed the fix/handle_force_migration_false_positives branch from 56298c6 to 10badf5 Compare July 1, 2025 17:55
@erezrokah erezrokah changed the title from "test: Add test that simulates multiple concurrent syncs" to "fix: Handle max concurrent queries errors" Jul 1, 2025
return nil
}

have := have.Get(want.Name)
if have == nil {
tableName := want.Name
@erezrokah (Member, Author) commented:

This is the part about reusing the table-changes state.

@@ -103,23 +142,21 @@ func unsafeChanges(changes []schema.TableColumnChange) []schema.TableColumnChang
}

func (c *Client) createTable(ctx context.Context, table *schema.Table, partition []spec.PartitionStrategy, orderBy []spec.OrderByStrategy) (err error) {
c.logger.Debug().Str("table", table.Name).Msg("Table doesn't exist, creating")
@erezrokah (Member, Author) commented:

This was confusing: force migration calls dropTable and then createTable, so this log was printed even when the table already existed.

@erezrokah erezrokah marked this pull request as ready for review July 1, 2025 18:10
@marianogappa (Contributor) left a comment

Congrats on finding and fixing this bug 👍

My only concern here is the retry logic.

Obviously, without more context there's no smartness possible in the retry (like we have on InsertSplitter), so we agree it's not going to get much smarter than "try running the same query again later".

However, retrying up to 10 times, 1s +/- .5s later in the case of "Too many simultaneous queries" sounds 💣 🤔 no? I think this will by default increase congestion. My proposed remediation isn't very smart...I'd 10x the delay or something like that.

@erezrokah (Member, Author) commented Jul 1, 2025

> However, retrying up to 10 times, 1s +/- .5s later in the case of "Too many simultaneous queries" sounds 💣 🤔 no? I think this will by default increase congestion. My proposed remediation isn't very smart... I'd 10x the delay or something like that.

The default strategy is backoff + random jitter, but I don't mind increasing the retry delay. Based on my tests with the current values and 2000 concurrent syncs (from the test), we hardly ever retry more than twice.

Comment on lines +29 to +31
retry.Attempts(5),
retry.Delay(3 * time.Second),
retry.MaxJitter(1 * time.Second),
A contributor commented:

Would it make sense to expose these variables in the spec? Not blocking, just thinking out loud.

@erezrokah (Member, Author) replied:

Not sure, actually; I think we can expose them based on user feedback. It could be too much control. The default retry logic is backoff + random jitter, so that should cover quite a lot of cases (especially on newer versions).

@maaarcelino (Contributor) left a comment

LGTM, one minor non-blocking suggestion

@erezrokah erezrokah added the automerge Automatically merge once required checks pass label Jul 2, 2025
@kodiakhq kodiakhq bot merged commit 92c2827 into main Jul 2, 2025
17 checks passed
@kodiakhq kodiakhq bot deleted the fix/handle_force_migration_false_positives branch July 2, 2025 17:02
kodiakhq bot pushed a commit that referenced this pull request Jul 2, 2025
🤖 I have created a release *beep* *boop*
---


## [7.1.1](plugins-destination-clickhouse-v7.1.0...plugins-destination-clickhouse-v7.1.1) (2025-07-02)


### Bug Fixes

* **deps:** Update golang.org/x/exp digest to b7579e2 ([#20935](#20935)) ([aac340d](aac340d))
* **deps:** Update module github.com/cloudquery/codegen to v0.3.29 ([#20947](#20947)) ([af179be](af179be))
* Handle max concurrent queries errors ([#20907](#20907)) ([92c2827](92c2827))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Labels
area/ci area/plugin/destination/clickhouse automerge Automatically merge once required checks pass
5 participants