Fixing a small bug with GMSA support #74737
@@ -167,20 +167,28 @@ func (ds *dockerService) CreateContainer(_ context.Context, r *runtimeapi.Create
 	if err != nil {
 		return nil, err
 	}
-	defer func() {
-		for _, err := range ds.performPlatformSpecificContainerCreationCleanup(cleanupInfo) {
-			klog.Warningf("error when cleaning up after container %v's creation: %v", containerName, err)
-		}
-	}()
 
 	createResp, createErr := ds.client.CreateContainer(createConfig)
 	if createErr != nil {
 		createResp, createErr = recoverFromCreationConflictIfNeeded(ds.client, createConfig, createErr)
 	}
 
 	if createResp != nil {
-		return &runtimeapi.CreateContainerResponse{ContainerId: createResp.ID}, nil
+		containerID := createResp.ID
+
+		if cleanupInfo != nil {
+			// we don't perform the clean up just yet at that could destroy information
+			// needed for the container to start (e.g. Windows credentials stored in
+			// registry keys); instead, we'll clean up after the container successfully
+			// starts or gets removed
+			ds.containerCleanupInfos[containerID] = cleanupInfo
+		}
+		return &runtimeapi.CreateContainerResponse{ContainerId: containerID}, nil
 	}
 
+	// the creation failed, let's clean up right away
+	ds.performPlatformSpecificContainerCleanupAndLogErrors(containerName, cleanupInfo)
Review thread:
- previously this would have always been called in the defer, but now it's not. this seems awkwardly structured and/or confusing. i need to see what this actually was cleaning up.
- actually, nevermind, i think this makes sense.
- @derekwaynecarr : better suggestions welcome :) This is meant to handle any clean up that might be needed after a container's creation, which as of now is only the registry keys used to pass GMSA credential specs down to Docker (see previous PR). It used to be in a defer here, and so performed right after the container's creation; turned out that wasn't good enough, as Docker needs these reg keys to be present when it starts the container instead; so this clean up needs to happen after that.
- @wk8 could you make it clear in the comment that the key is needed for starting the container?
- @yujuhong : tried to make that clear at https://github.com/kubernetes/kubernetes/pull/74737/files#diff-c7dd39479fd733354254e70845075db5R180 ; please let me know if you think that's okay.
 
 	return nil, createErr
 }
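To make the lifecycle described in the review thread above easier to follow, here is a minimal, self-contained Go sketch of the bookkeeping pattern; every name in it is hypothetical rather than taken from dockershim. Cleanup information recorded at creation time is kept in a map and only acted on once the container has started or been removed, instead of in a defer that fires as soon as the create call returns.

package main

import "fmt"

// cleanupInfo stands in for whatever needs to be undone later, e.g. the name
// of a registry entry that holds a GMSA credential spec for Docker.
type cleanupInfo struct{ resource string }

type shim struct {
	// pending cleanups, keyed by container ID
	cleanups map[string]*cleanupInfo
}

// createContainer records the cleanup instead of performing it: the resource
// must still exist when the container is later started.
func (s *shim) createContainer(id string, info *cleanupInfo) {
	if info != nil {
		s.cleanups[id] = info
	}
}

// containerStartedOrRemoved performs the pending cleanup (best effort) and
// forgets about it, so the map only ever holds containers that were created
// but not yet started or removed.
func (s *shim) containerStartedOrRemoved(id string) {
	if info, ok := s.cleanups[id]; ok {
		fmt.Printf("cleaning up %q for container %s\n", info.resource, id)
		delete(s.cleanups, id)
	}
}

func main() {
	s := &shim{cleanups: make(map[string]*cleanupInfo)}
	s.createContainer("c1", &cleanupInfo{resource: "cred-spec-value-c1"})
	// ... Docker starts the container here, while the resource still exists ...
	s.containerStartedOrRemoved("c1") // cleanup happens only now
}

The design point is the same one made in the thread: a defer-based cleanup would run before Docker ever reads the resource, so the cleanup has to be tied to the start/remove events instead.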
@@ -267,6 +275,8 @@ func (ds *dockerService) StartContainer(_ context.Context, r *runtimeapi.StartCo
 		return nil, fmt.Errorf("failed to start container %q: %v", r.ContainerId, err)
 	}
 
+	ds.performPlatformSpecificContainerForContainer(r.ContainerId)
+
 	return &runtimeapi.StartContainerResponse{}, nil
 }
 
@@ -281,6 +291,8 @@ func (ds *dockerService) StopContainer(_ context.Context, r *runtimeapi.StopCont
 
 // RemoveContainer removes the container.
 func (ds *dockerService) RemoveContainer(_ context.Context, r *runtimeapi.RemoveContainerRequest) (*runtimeapi.RemoveContainerResponse, error) {
+	ds.performPlatformSpecificContainerForContainer(r.ContainerId)
+
 	// Ideally, log lifecycle should be independent of container lifecycle.
 	// However, docker will remove container log after container is removed,
 	// we can't prevent that now, so we also clean up the symlink here.
@@ -438,3 +450,20 @@ func (ds *dockerService) UpdateContainerResources(_ context.Context, r *runtimea
 	}
 	return &runtimeapi.UpdateContainerResourcesResponse{}, nil
 }
+
+func (ds *dockerService) performPlatformSpecificContainerForContainer(containerID string) {
+	if cleanupInfo, present := ds.containerCleanupInfos[containerID]; present {
+		ds.performPlatformSpecificContainerCleanupAndLogErrors(containerID, cleanupInfo)
+		delete(ds.containerCleanupInfos, containerID)
+	}
+}
+
+func (ds *dockerService) performPlatformSpecificContainerCleanupAndLogErrors(containerNameOrID string, cleanupInfo *containerCleanupInfo) {
+	if cleanupInfo == nil {
+		return
+	}
+
+	for _, err := range ds.performPlatformSpecificContainerCleanup(cleanupInfo) {
+		klog.Warningf("error when cleaning up after container %q: %v", containerNameOrID, err)
+	}
+}
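For a sense of what the platform-specific cleanup called above amounts to on Windows, here is a hedged sketch rather than the actual dockershim implementation: the registry key path, value naming, and function name below are assumptions made up for illustration. The idea is that the GMSA credential spec handed to Docker lives in the registry, and cleanup deletes that entry once the container has started or been removed.

package credspeccleanup

// Illustrative only; builds on Windows. The key path below is an assumption,
// not the real dockershim constant.

import (
	"fmt"

	"golang.org/x/sys/windows/registry"
)

// assumed registry key under which credential specs are staged for Docker
const credSpecsKeyPath = `SOFTWARE\Hypothetical\CredentialSpecs`

// removeCredSpecValue deletes the registry value that held a container's
// GMSA credential spec. The error is returned so the caller can log it,
// since this cleanup is best effort.
func removeCredSpecValue(valueName string) error {
	key, err := registry.OpenKey(registry.LOCAL_MACHINE, credSpecsKeyPath, registry.SET_VALUE)
	if err != nil {
		return fmt.Errorf("unable to open registry key %q: %v", credSpecsKeyPath, err)
	}
	defer key.Close()

	if err := key.DeleteValue(valueName); err != nil {
		return fmt.Errorf("unable to delete registry value %q under %q: %v", valueName, credSpecsKeyPath, err)
	}
	return nil
}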
@@ -155,7 +155,7 @@ type dockerNetworkHost struct {
 	*portMappingGetter
 }
 
-var internalLabelKeys []string = []string{containerTypeLabelKey, containerLogPathLabelKey, sandboxIDLabelKey}
+var internalLabelKeys = []string{containerTypeLabelKey, containerLogPathLabelKey, sandboxIDLabelKey}
 
 // ClientConfig is parameters used to initialize docker client
 type ClientConfig struct {
@@ -186,6 +186,7 @@ func NewDockerClientFromConfig(config *ClientConfig) libdocker.Interface {
 	return nil
 }
 
+// NewDockerService creates a new `DockerService` struct.
 // NOTE: Anything passed to DockerService should be eventually handled in another way when we switch to running the shim as a different process.
 func NewDockerService(config *ClientConfig, podSandboxImage string, streamingConfig *streaming.Config, pluginSettings *NetworkPluginSettings,
 	cgroupsName string, kubeCgroupDriver string, dockershimRootDir string, startLocalStreamingServer bool) (DockerService, error) {
@@ -211,6 +212,7 @@ func NewDockerService(config *ClientConfig, podSandboxImage string, streamingCon
 		checkpointManager:         checkpointManager,
 		startLocalStreamingServer: startLocalStreamingServer,
 		networkReady:              make(map[string]bool),
+		containerCleanupInfos:     make(map[string]*containerCleanupInfo),
Review thread:
- how big can this map grow? it looks like we ignore errors after cleanup and always remove from the map. so it's basically: we try to clean up once, and after that, oh well?
- Only as big as the number of containers which have been successfully created, and neither successfully started nor removed yet; and which also have any clean up to be performed, i.e. that use GMSA cred specs. Yes, this is just best effort. These reg keys don't contain any secrets anyhow. Also, we do try to clean up again at each dockershim start.
- I think we may need periodic cleanup to ensure memory is not leaked permanently.
 	}
 
 	// check docker version compatibility.
@@ -305,6 +307,12 @@ type dockerService struct {
 	// startLocalStreamingServer indicates whether dockershim should start a
 	// streaming server on localhost.
 	startLocalStreamingServer bool
+
+	// containerCleanupInfos maps container IDs to the `containerCleanupInfo` structs
+	// needed to clean up after containers have been started or removed.
+	// (see `applyPlatformSpecificDockerConfig` and `performPlatformSpecificContainerCleanup`
+	// methods for more info).
+	containerCleanupInfos map[string]*containerCleanupInfo
 }
 
 // TODO: handle context.
@@ -411,7 +419,7 @@ func (ds *dockerService) Start() error {
 // initCleanup is responsible for cleaning up any crufts left by previous
 // runs. If there are any errros, it simply logs them.
 func (ds *dockerService) initCleanup() {
-	errors := ds.platformSpecificContainerCreationInitCleanup()
+	errors := ds.platformSpecificContainerInitCleanup()
 
 	for _, err := range errors {
 		klog.Warningf("initialization error: %v", err)
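The "clean up again at each dockershim start" mentioned in the earlier review thread, handled here by platformSpecificContainerInitCleanup, could look roughly like the following Windows-side sketch. This is an illustration under assumed naming conventions (the key path and value prefix are hypothetical), not the real implementation: values left behind by a previous run are recognized by a prefix and removed best effort.

package credspeccleanup

import (
	"strings"

	"golang.org/x/sys/windows/registry"
)

// assumed prefix used when naming credential-spec registry values
const managedValuePrefix = "k8s-cred-spec-"

// cleanupLeftoverCredSpecs removes any credential-spec values left behind by a
// previous run; all errors are collected and returned so they can be logged.
func cleanupLeftoverCredSpecs() []error {
	// same assumed key path as in the previous sketch
	const credSpecsKeyPath = `SOFTWARE\Hypothetical\CredentialSpecs`

	key, err := registry.OpenKey(registry.LOCAL_MACHINE, credSpecsKeyPath, registry.QUERY_VALUE|registry.SET_VALUE)
	if err != nil {
		return []error{err}
	}
	defer key.Close()

	names, err := key.ReadValueNames(-1) // n <= 0 means "all value names"
	if err != nil {
		return []error{err}
	}

	var errs []error
	for _, name := range names {
		if !strings.HasPrefix(name, managedValuePrefix) {
			continue // not one of ours, leave it alone
		}
		if err := key.DeleteValue(name); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}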
Review thread:
- i do not think i reviewed the prior pr, but i am happy the gate was added. do we know it's covering all the impacted areas?
- @derekwaynecarr : the feature gate was already in the prior PR, and protecting all the impacted areas. I 'just' had forgotten to also add it to this list, much to my shame.
- nothing in this pr appears to be gating on this feature gate. can we ensure gating happens by adding the feature gate with the reader in a separate pr?
- I'm sorry, I'm not sure I understand what you're suggesting?
- For reference, this was the earlier PR where the feature gate WindowsGMSA was introduced and where the feature gating is done to enable the logic to configure pods and containers with GMSA: https://github.com/kubernetes/kubernetes/pull/73726/files#diff-4d8e81d13b02ca9a4b8919ef8cf8b6f9R40
- Sorry for not catching that in the previous review.
- @yujuhong : hardly your fault :)
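As context for the gating discussion above, this is roughly what a WindowsGMSA feature-gate check looks like in the Kubernetes codebase. The surrounding function and parameters here are placeholders invented for illustration; only the gate check itself, utilfeature.DefaultFeatureGate.Enabled(features.WindowsGMSA), reflects the real API, and this sketch does not claim to show where the actual check lives.

package gmsagating

import (
	utilfeature "k8s.io/apiserver/pkg/util/feature"

	"k8s.io/kubernetes/pkg/features"
)

// applyGMSAConfigIfEnabled is a placeholder showing where a feature-gate check
// would sit: the GMSA-specific configuration is only applied when the
// WindowsGMSA gate is turned on, and the function is a no-op otherwise.
func applyGMSAConfigIfEnabled(credSpec string, apply func(credSpec string) error) error {
	if !utilfeature.DefaultFeatureGate.Enabled(features.WindowsGMSA) {
		// gate off: skip the GMSA-specific logic entirely
		return nil
	}
	return apply(credSpec)
}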