Make container removal fail if platform-specific cleanups fail by wk8 · Pull Request #80320 · kubernetes/kubernetes

@wk8 (Contributor) commented Jul 18, 2019

What type of PR is this?
/kind feature

What this PR does / why we need it:

#74737 introduced a new in-memory map in the dockershim that could, in pathological cases, cause memory leaks: for containers that use GMSA cred specs and are created successfully but then never started nor removed, the map entry is never released.

This patch addresses the issue by making container removal fail altogether when the platform-specific cleanup fails: this allows the cleanup to be retried later, when the kubelet attempts to remove the container again.

Which issue(s) this PR fixes:
Fixes #74843

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 18, 2019
@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Jul 18, 2019
@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. labels Jul 18, 2019
@cblecker (Member)

@wk8 Why are the previous comments being deleted? Are you still scripting retests against kubernetes/kubernetes?

@wk8 (Contributor, author) commented Jul 18, 2019

/test pull-kubernetes-bazel-test

@wk8 (Contributor, author) commented Jul 18, 2019

@cblecker : deactivated the automated process. Sorry.

@mattjmcnaughton (Contributor) left a comment

Thanks for your pr :)

Before I start a review, can you share more about this line in your pr:

that could potentially (in pathological cases) cause memory leaks - for containers that use GMSA cred specs, get created successfully, but then never get started nor removed.

Can you share more about these pathological cases and the size/scope of the memory leaks?

@wk8 (Contributor, author) commented Jul 19, 2019

@mattjmcnaughton : the only way cred specs can leak is if a container is created with a cred spec and then somehow never gets either started or removed - and even then, a single containerCreationCleanupInfo struct is leaked, which is really just a string. So for that leak to become significant, you'd need a large number of containers using cred specs that somehow get stuck forever between creation and start time, all over a single kubelet lifespan.

That's not very likely to occur, but it was deemed a sufficient concern to require some kind of GC before GMSA support is considered mature enough to be graduated to beta (see #74737 (comment)).

Please also note that this whole thing should become moot when we can drop support for versions of docker older than the current latest release, which includes moby/moby#38777, a patch that makes it possible to pass cred specs directly to docker instead of having to write them somewhere and then clean up afterwards.

@mattjmcnaughton (Contributor)

Gotcha, thanks @wk8 - just wanted to confirm that the failure was bad/common enough to warrant a fix and it sounds like we've already had that discussion.

cc @lavalamp how much context do you have on this part of the code base? I have fairly little. Happy to try and wade in if necessary, but just wanted to check with you first :)

@yujuhong yujuhong self-assigned this Jul 23, 2019
Contributor

nit - it might be good to prefix this with unsafe so it's consistent with the other functions that assume the caller holds the mutex

Contributor Author

Good idea, done, thanks :)

@yujuhong (Contributor)

That's not very likely to occur, but it was deemed a sufficient concern to require some kind of GC before GMSA support is considered mature enough to be graduated to beta (see #74737 (comment)).

Please also note that this whole thing should become moot and superfluous when we can deprecate support for versions of docker older than the current latest release, which includes moby/moby#38777, a patch that makes it possible to pass cred specs directly to docker instead of having to write somewhere, and then clean up.

@wk8 thanks for the PR. For the reasons you stated above, I think a simpler cleanup routine would suffice.

We don't need to worry about containers that were never started or removed -- the kubelet should try to remove them, and that would trigger the cleanup anyway. The reviewer comments from the previous PR were mainly about the case where, if the cleanup failed (for whatever reason), dockershim would never attempt to clean up again. To handle this specific corner case, dockershim can periodically look at the cleanups that failed previously and attempt them again. This should be a much smaller code change.

/cc @Random-Liu @liyanhui1228

@k8s-ci-robot k8s-ci-robot requested a review from Random-Liu July 30, 2019 04:30
@k8s-ci-robot (Contributor)

@yujuhong: GitHub didn't allow me to request PR reviews from the following users: liyanhui1228.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to the /cc above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@wk8 (Contributor, author) commented Aug 5, 2019

dockershim will never attempt to clean up again. To handle the specific corner, dockershim can periodically look at the cleanup that failed previously and attempt again. This should be a much smaller code change.

@yujuhong : that's indeed what this change is trying to accomplish, in a way that also stays reasonably efficient if for whatever reason we end up with a large number of such failures. Could you be a little more specific as to which parts should be simplified? Thank you!

@wk8 wk8 force-pushed the wk8/gmsa_cleanup branch 2 times, most recently from 870531c to 60671ff Compare August 5, 2019 13:07
@wk8 (Contributor, author) commented Aug 6, 2019

/test pull-kubernetes-integration

@wk8 (Contributor, author) commented Aug 7, 2019

/test pull-kubernetes-integration

@wk8 (Contributor, author) commented Aug 9, 2019

@yujuhong : simplified the implementation at the cost of slightly decreased performance; please let me know what you think, thanks!

Contributor

cleanup associated -> associated cleanup

Contributor Author

Thanks, fixed :)

@yujuhong (Contributor)

/assign @Random-Liu

@wk8 (Contributor, author) commented Aug 22, 2019

After discussing it with @Random-Liu on Slack, it seems that just running cleanups at container removal, and failing the removal if the cleanup fails, is the cleaner way to go here :)

Will make that change as soon as @yujuhong okays it.

@yujuhong (Contributor)

After discussing it with @Random-Liu on Slack, it seems just running clean ups at container removals, and failing those if the clean up fails, is the cleaner way to go here

SGTM. Thanks!

@wk8 wk8 force-pushed the wk8/gmsa_cleanup branch from 3757cff to b041a04 Compare August 23, 2019 00:59
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 23, 2019
@wk8 wk8 changed the title from "Make dockershim periodically clean up its containerCleanupInfos" to "Make container removal fail if platform-specific cleanups fail" Aug 23, 2019
@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Aug 23, 2019
kubernetes#74737 introduced a new in-memory map in the dockershim that could, in pathological cases, cause memory leaks: for containers that use GMSA cred specs and are created successfully but then never started nor removed, the map entry is never released.

This patch addresses the issue by making container removal fail altogether when the platform-specific cleanup fails: this allows the cleanup to be retried later, when the kubelet attempts to remove the container again.

Resolves issue kubernetes#74843.

Signed-off-by: Jean Rouge <rougej+github@gmail.com>
@wk8 wk8 force-pushed the wk8/gmsa_cleanup branch from b041a04 to 4d4edcb Compare August 23, 2019 01:04
@wk8 (Contributor, author) commented Aug 23, 2019

@yujuhong & @Random-Liu : amended as requested to fail container removal when the clean up fails. PTAL, thanks!

@Random-Liu (Member) left a comment

LGTM with one question

```diff
 if cleanupInfo, present := ds.containerCleanupInfos[containerID]; present {
-	ds.performPlatformSpecificContainerCleanupAndLogErrors(containerID, cleanupInfo)
+	errors = ds.performPlatformSpecificContainerCleanupAndLogErrors(containerID, cleanupInfo)
 	delete(ds.containerCleanupInfos, containerID)
```
Member

Shouldn't you call performPlatformSpecificContainerCleanup instead?

Or do you intentionally want to log the error?

Contributor Author

Yes, that's to log any errors that might occur; it seems better to do it here as performPlatformSpecificContainerCleanup lives in platform-specific files, so logging here centralizes it for all platforms.

@PatrickLang (Contributor)

/priority important-soon
/sig windows
/milestone v1.16

@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Aug 23, 2019
@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/windows Categorizes an issue or PR as relevant to SIG Windows. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 23, 2019
@wk8 (Contributor, author) commented Aug 26, 2019

/test pull-kubernetes-e2e-gce
/test pull-kubernetes-integration

@Random-Liu (Member)

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 27, 2019
@PatrickLang (Contributor)

/lgtm
/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: PatrickLang, Random-Liu, wk8

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wk8 (Contributor, author) commented Aug 27, 2019

/test pull-kubernetes-integration
/test pull-kubernetes-e2e-gce-100-performance

@wk8 (Contributor, author) commented Aug 27, 2019

/test pull-kubernetes-e2e-gce-100-performance

@k8s-ci-robot k8s-ci-robot merged commit 6cf7f3c into kubernetes:master Aug 28, 2019
wk8 added a commit to wk8/kubernetes that referenced this pull request Aug 29, 2019
Concerns from alpha-phase reviews have been addressed in
kubernetes#80320
and
kubernetes#82109
and early adopters have given positive feedback; so it seems there are
no blockers to graduate GMSA support to beta.

This patch also enables GMSA support by default, mainly for the sake of
making it easier for cluster admins to use, as well as for e2e tests in
nightly builds.

Signed-off-by: Jean Rouge <rougej+github@gmail.com>