chore: retry postgres connection on reset by peer in tests by spikecurtis · Pull Request #18632 · coder/coder

chore: retry postgres connection on reset by peer in tests #18632

Merged on Jun 27, 2025 (2 commits)

@spikecurtis (Contributor) commented Jun 27, 2025

Fixes coder/internal#695

Retries the initial connection to Postgres in tests for up to 3 seconds if we see "reset by peer", which probably means that another test process has just started the container.


@spikecurtis force-pushed the spike/internal-695-retry-pg-reset-by-peer branch from abc4510 to 5c7efe5 June 27, 2025 11:48
@spikecurtis requested a review from hugodutka June 27, 2025 11:49
@spikecurtis marked this pull request as ready for review June 27, 2025 11:49
@hugodutka (Contributor) commented:

I believe a much simpler fix is this: #18423. openContainer uses a flock to serialize multiple binaries trying to start a postgres container at the same time, and has an early exit path if the container was already started. Do you see a reason this wouldn't be sufficient?

@spikecurtis (Contributor Author) commented:

That assumes that "reset by peer" can only happen if there is already a Docker container with the name we expect starting on the port.

If the container uses a different name, or a non-Dockerized Postgres is responsible for the "reset by peer", then starting the container will fail due to a port conflict, no?

@hugodutka (Contributor) left a comment:

We talked about it on Zoom and concluded that it is good practice to handle retriable and non-retriable errors separately.

Co-authored-by: Hugo Dutka <hugo@coder.com>
@spikecurtis enabled auto-merge (squash) June 27, 2025 12:51
@spikecurtis merged commit f0251df into main Jun 27, 2025
35 checks passed
@spikecurtis deleted the spike/internal-695-retry-pg-reset-by-peer branch June 27, 2025 13:03
@github-actions bot locked and limited conversation to collaborators Jun 27, 2025
Linked issue (may be closed by merging this pull request): flake: postgresql connection reset
2 participants