E546 Fix possible overlapping IPs by dperny · Pull Request #42432 · moby/moby · GitHub
[go: up one dir, main page]

Skip to content

Conversation

dperny
Copy link
Contributor
@dperny dperny commented May 27, 2021

- What I did

Fix a possibility where overlapping IP addresses could exist as a result of the node failing to clean up its old loadbalancer IPs.

Fixes #40989, supercedes #41062.
closes #41062

- How I did it

A node is no longer using its load balancer IP address when it no longer has tasks that use the network that requires that load balancer. When this occurs, the swarmkit manager will free that IP in IPAM, and may reassign it.

When a task shuts down cleanly, it attempts removal of the networks it uses, and if it is the last task using those networks, this removal will succeed:

https://github.com/dperny/docker/blob/abe0a10c6839ce301bc5a0e43bde4633af01f50d/daemon/cluster/executor/container/controller.go#L378-L384

However, this behavior is absent if the container fails. Removal of the network is never attempted.

To address this issue, I amend the executor. Whenever a node load balancer IP is removed or changed, that information is passed to the executor by way of the Configure method. By keeping track of the set of node NetworkAttachments from the previous call to Configure, we can determine which, if any, have been removed or changed.

At first, this seems to create a race, by which a task can be attempting to start and the network removed right out from under it. However, closer inspection of the code shows that this is already addressed in the controller:

https://github.com/dperny/docker/blob/abe0a10c6839ce301bc5a0e43bde4633af01f50d/daemon/cluster/executor/container/controller.go#L207-L225

The controller will attempt again to create the network as long as the network is missing.

- How to verify it

I'm not certain how to write a sane test for this. I've mostly verified it by logic.

- Description for the changelog

Fixed a bug where swarm could have duplicate IPs when the last task on a network on a node failed.

@thaJeztah
Copy link
Member

@dperny could you update the commit message with your description?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check if e.nodeObj is not nil here? (I guess it would panic in that case)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also wondering if there's concurrency here (if a (RW)Mutex is needed for accessing e.nodeObj (no idea, but putting it here as a comment 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the nil-check. e.nodeObj will be nil on the first call to Configure. Incredibly embarrassed to have missed that.

I've added a comment explaining why nodeObj does not need to be guarded by a mutex.

@thaJeztah
Copy link
Member

@dperny needs a rebase, now that libnetwork was removed into this repository

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"github.com/docker/libnetwork"
"github.com/docker/docker/libnetwork"

A node is no longer using its load balancer IP address when it no longer
has tasks that use the network that requires that load balancer. When
this occurs, the swarmkit manager will free that IP in IPAM, and may
reaassign it.

When a task shuts down cleanly, it attempts removal of the networks it
uses, and if it is the last task using those networks, this removal
succeeds, and the load balancer IP is freed.

However, this behavior is absent if the container fails. Removal of the
networks is never attempted.

To address this issue, I amend the executor. Whenever a node load
balancer IP is removed or changed, that information is passedd to the
executor by way of the Configure method. By keeping track of the set of
node NetworkAttachments from the previous call to Configure, we can
determine which, if any, have been removed or changed.

At first, this seems to create a race, by which a task can be attempting
to start and the network is removed right out from under it. However,
this is already addressed in the controller. The controller will attempt
to recreate missing networks before starting a task.

Signed-off-by: Drew Erny <derny@mirantis.com>
@dperny
Copy link
Contributor Author
dperny commented Jun 18, 2021

@thaJeztah I bumped this and changed the libnetwork import path. It should be good to go.

Copy link
Member
@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

sorry, though I already approved ☺️

@thaJeztah
Copy link
Member

@cpuguy83 @AkihiroSuda ptal

@AkihiroSuda AkihiroSuda merged commit 7729ebf into moby:master Jun 18, 2021
@thaJeztah thaJeztah added this to the 21.xx milestone Jul 12, 2021
@maricn
Copy link
maricn commented Oct 4, 2021

What's the relationship between this PR and #41011?
This PR supercedes #41062 that has a commit with comment "Bring over #41011 to make a integrated fix.". Is that integrated fix also incorporated in this PR and if so should the referenced #41011 and related tickets be closed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Node LB not deleted after container exits itself (cause IP overlapping)
4 participants
0