libnetwork/networkdb: prioritize local table broadcasts over event rebroadcasts #50193
Conversation
The map key for nDB.networks is the network ID. The struct field is not actually used anywhere in practice.

Signed-off-by: Cory Snider <csnider@mirantis.com>
When joining a network that was previously joined but not yet reaped, NetworkDB replaces the network struct value with a zeroed-out one with the entries count copied over. This is also the case when joining a network that is currently joined! Consequently, joining a network has the side effect of clearing the broadcast queue. If the queue is cleared while messages are still pending broadcast, convergence may be delayed until the next bulk sync cycle.

Make it an error to join a network twice without leaving. Retain the existing broadcast queue when rejoining a network that has not yet been reaped.

Signed-off-by: Cory Snider <csnider@mirantis.com>
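A minimal sketch of the join guard this commit describes. The names here (joinNetwork, network, leaving) are simplified stand-ins based on the commit message, and a plain slice stands in for the memberlist broadcast queue; the real NetworkDB code differs in detail:

```go
package main

import (
	"errors"
	"fmt"
)

// network is a stand-in for NetworkDB's per-network state; the broadcast
// queue is modelled as a plain slice instead of memberlist's queue type.
type network struct {
	leaving         bool
	tableBroadcasts [][]byte
}

type networkDB struct {
	networks map[string]*network // keyed by network ID
}

// joinNetwork sketches the behavior described above: joining twice without
// leaving is an error, and rejoining a not-yet-reaped network keeps the
// existing broadcast queue instead of replacing it with a zeroed value.
func (db *networkDB) joinNetwork(nid string) error {
	if n, ok := db.networks[nid]; ok {
		if !n.leaving {
			return errors.New("network already joined")
		}
		n.leaving = false // rejoin; pending broadcasts are retained
		return nil
	}
	db.networks[nid] = &network{}
	return nil
}

func main() {
	db := &networkDB{networks: map[string]*network{}}
	fmt.Println(db.joinNetwork("n1")) // <nil>
	fmt.Println(db.joinNetwork("n1")) // network already joined
}
```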
NetworkDB uses a multi-dimensional map of struct network to keep track of network attachments for both remote nodes and the local node. Only a subset of the struct fields are used for remote nodes' network attachments. The tableBroadcasts pointer field in particular is always initialized for network values representing local attachments (read: nDB.networks[nDB.config.NodeID]) and always nil for remote attachments. Consequently, unnecessary defensive nil-pointer checks are peppered throughout the code despite the aforementioned invariant.

Enshrine the invariant that tableBroadcasts is initialized iff the network attachment is for the local node in the type system. Pare down struct network to only the fields needed for remote network attachments and move the local-only fields into a new struct thisNodeNetwork. Elide the unnecessary nil-checks.

Signed-off-by: Cory Snider <csnider@mirantis.com>
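The refactor might be sketched as follows. The type and field names come from the commit message, but the queue type and the exact fields are simplified stand-ins (the real queue is a *memberlist.TransmitLimitedQueue):

```go
package main

import "fmt"

// broadcastQueue is a simplified stand-in for *memberlist.TransmitLimitedQueue.
type broadcastQueue struct{ msgs [][]byte }

// network holds only the fields needed for remote nodes' attachments;
// it no longer carries a (possibly nil) broadcast queue pointer.
type network struct {
	leaving bool
	entries int
}

// thisNodeNetwork embeds network and adds the local-only state, so the
// type system guarantees tableBroadcasts exists exactly when the
// attachment is the local node's. No nil checks are needed.
type thisNodeNetwork struct {
	network
	tableBroadcasts broadcastQueue
}

func main() {
	var n thisNodeNetwork
	// A local attachment always has its queue; use it directly.
	n.tableBroadcasts.msgs = append(n.tableBroadcasts.msgs, []byte("event"))
	fmt.Println(len(n.tableBroadcasts.msgs))
}
```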
Log more details when assertions fail to provide a more complete picture of what went wrong when TestCRUDTableEntries fails. Log the state of each NetworkDB instance at various points in TestCRUDTableEntries to provide an even more complete picture. Increase the global logger verbosity in tests so warnings and debug logs are printed to the test log.

Signed-off-by: Cory Snider <csnider@mirantis.com>
A network node is responsible both for broadcasting table events for entries it owns and for rebroadcasting table events it has received from other nodes. Table events to be broadcast are added to a single queue per network, including events for rebroadcasting. As the memberlist TransmitLimitedQueue is (to a first approximation) LIFO, a flood of events from other nodes could delay the broadcasting of locally-generated events indefinitely.

Prioritize broadcasting local events by splitting up the queues and only pulling from the rebroadcast queue if there is free space in the gossip packet after draining the local-broadcast queue.

Signed-off-by: Cory Snider <csnider@mirantis.com>
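The dequeue order this commit describes can be sketched as below. The helper name getBroadcasts matches the one added in this PR, but the body and the toy sliceQueue are illustrative, with an interface standing in for memberlist's TransmitLimitedQueue (whose GetBroadcasts method takes a per-message overhead and a byte limit):

```go
package main

import "fmt"

// broadcaster abstracts the GetBroadcasts method of
// memberlist.TransmitLimitedQueue for this sketch.
type broadcaster interface {
	// GetBroadcasts returns queued messages totalling at most limit bytes,
	// charging overhead bytes per message.
	GetBroadcasts(overhead, limit int) [][]byte
}

// getBroadcasts drains queues in order: earlier queues (the local broadcast
// queue) get priority, and later queues (the rebroadcast queue) only see
// whatever space is left in the gossip packet.
func getBroadcasts(overhead, limit int, queues ...broadcaster) [][]byte {
	var out [][]byte
	for _, q := range queues {
		msgs := q.GetBroadcasts(overhead, limit)
		for _, m := range msgs {
			limit -= overhead + len(m)
		}
		out = append(out, msgs...)
	}
	return out
}

// sliceQueue is a toy broadcaster for demonstration purposes only.
type sliceQueue [][]byte

func (q sliceQueue) GetBroadcasts(overhead, limit int) [][]byte {
	var out [][]byte
	for _, m := range q {
		if overhead+len(m) > limit {
			break
		}
		limit -= overhead + len(m)
		out = append(out, m)
	}
	return out
}

func main() {
	local := sliceQueue{[]byte("local1"), []byte("local2")}
	remote := sliceQueue{[]byte("remote1")}
	// Large packet budget: both local messages plus the rebroadcast fit.
	fmt.Println(len(getBroadcasts(2, 30, local, remote))) // 3
	// Tight budget: only a local message makes it into the packet.
	fmt.Println(len(getBroadcasts(2, 10, local, remote))) // 1
}
```

The key property is that the rebroadcast queue is consulted only with the leftover budget, so a flood of remote events can no longer starve local ones.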
Pull Request Overview
This PR improves the convergence rate of NetworkDB by prioritizing local event broadcasts over remote rebroadcasts during gossip cycles. Key changes include:
- Refactoring local network attachments to use a new thisNodeNetworks structure.
- Splitting broadcast queues into local tableBroadcasts and remote tableRebroadcasts.
- Enhancing test diagnostics with additional logging and table dumps.
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| libnetwork/networkdb/networkdbdiagnostic.go | Updated diagnostic lookup to use thisNodeNetworks. |
| libnetwork/networkdb/networkdb_test.go | Modified tests to use the new network map and added dumpTable for debugging info. |
| libnetwork/networkdb/networkdb.go | Refactored Join/Leave logic to use thisNodeNetworks and introduced dual broadcast queues. |
| libnetwork/networkdb/delegate.go | Adjusted event handling to account for the new broadcast queue split. |
| libnetwork/networkdb/cluster.go | Updated network reaping and gossip functions to work with thisNodeNetworks. |
| libnetwork/networkdb/broadcast.go | Added a helper (getBroadcasts) to merge broadcasts from multiple queues. |
Comments suppressed due to low confidence (2)
libnetwork/networkdb/delegate.go:290
- Update the accompanying comment to reflect that the nil check for tableBroadcasts was removed and that remote rebroadcasts are now handled via 'tableRebroadcasts'.
if !ok || n.leaving {
libnetwork/networkdb/broadcast.go:163
- [nitpick] Consider adding detailed function comments describing the behavior, parameters, and return value of 'getBroadcasts' to improve maintainability.
func getBroadcasts(overhead, limit int, queues ...*memberlist.TransmitLimitedQueue) [][]byte {
No backport for 28.3.2 - we can backport it for the new release along with #50344
- What I did
- How I did it
The flaky TestNetworkDBCRUDTableEntries only waits long enough for a handful of gossip cycles to execute before asserting that the entries owned by one node exist on the other nodes. It does not wait long enough for a bulk sync to be performed between the nodes. The test therefore effectively asserts that gossip alone is sufficient to reliably converge in a lossless network scenario. The flaky failures are always due to the CREATE event for a single table entry not being propagated through gossip. The stats and debug logs (omitted from this PR because they are very spammy, with a low signal-to-noise ratio) revealed that when the test flakes, the CREATE event for the offending entry was enqueued but never received by the other node, and the owning node's queue is holding a large number of messages. Rebroadcasts of messages received from other nodes are delaying the broadcasting of the local node's messages.
Improve the convergence rate of NetworkDB by preferentially broadcasting event messages for entries owned by the local node before rebroadcasting received messages. Enqueue the messages for rebroadcasting into a separate queue which is only pulled from if there is space left over in the gossip packet after filling it with local event messages.
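The enqueue side of this change can be pictured roughly as follows. The queueEvent method and the slice-backed queue are hypothetical illustrations (the real code enqueues into memberlist TransmitLimitedQueues); only the two queue names come from the PR:

```go
package main

import "fmt"

// simpleQueue stands in for memberlist.TransmitLimitedQueue in this sketch.
type simpleQueue struct{ msgs [][]byte }

func (q *simpleQueue) enqueue(m []byte) { q.msgs = append(q.msgs, m) }

// thisNodeNetwork holds the two queues described above: one for events on
// entries this node owns, one for events received from other nodes.
type thisNodeNetwork struct {
	tableBroadcasts   simpleQueue // locally-generated events (high priority)
	tableRebroadcasts simpleQueue // received events awaiting rebroadcast
}

// queueEvent routes an event to the appropriate queue based on ownership.
func (n *thisNodeNetwork) queueEvent(msg []byte, ownedLocally bool) {
	if ownedLocally {
		n.tableBroadcasts.enqueue(msg)
		return
	}
	n.tableRebroadcasts.enqueue(msg)
}

func main() {
	var n thisNodeNetwork
	n.queueEvent([]byte("local CREATE"), true)
	n.queueEvent([]byte("remote CREATE"), false)
	fmt.Println(len(n.tableBroadcasts.msgs), len(n.tableRebroadcasts.msgs)) // 1 1
}
```

At gossip time the rebroadcast queue is only drawn from after the local queue, using whatever packet space remains.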
- How to verify it
$ go test -count=100 -run CRUDTableEntries -failfast ./libnetwork/networkdb
- Human readable description for the release notes
- Improve the convergence rate of NetworkDB, part of the management plane for overlay networking, after bursts of updates.
- A picture of a cute animal (not mandatory but encouraged)