Fixing a small bug with GMSA support by wk8 · Pull Request #74737 · kubernetes/kubernetes
Merged
1 change: 1 addition & 0 deletions pkg/features/kube_features.go
@@ -480,6 +480,7 @@ var defaultKubernetesFeatureGates = map[utilfeature.Feature]utilfeature.FeatureS
ProcMountType: {Default: false, PreRelease: utilfeature.Alpha},
TTLAfterFinished: {Default: false, PreRelease: utilfeature.Alpha},
KubeletPodResources: {Default: false, PreRelease: utilfeature.Alpha},
WindowsGMSA: {Default: false, PreRelease: utilfeature.Alpha},
Member (@derekwaynecarr):

I do not think I reviewed the prior PR, but I am happy the gate was added.

Do we know it's covering all the impacted areas?

Contributor Author (@wk8):

@derekwaynecarr : the feature gate was already in the prior PR, and protects all the impacted areas. I "just" had forgotten to also add it to this list, much to my shame.
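The gating the reviewers are discussing follows the usual Kubernetes feature-gate pattern: a code path checks the gate before acting. A minimal standalone sketch of that pattern (the `FeatureGate` type here is a simplified stand-in for `utilfeature.FeatureGate`, not the real API, and `applyGMSAConfigIfEnabled` is a hypothetical helper):

```go
// Sketch of feature-gate gating, with a simplified stand-in for the real
// utilfeature.FeatureGate interface.
package main

import "fmt"

// FeatureGate mimics the subset of utilfeature.FeatureGate used for gating.
type FeatureGate map[string]bool

// Enabled reports whether the named feature is turned on.
func (fg FeatureGate) Enabled(feature string) bool { return fg[feature] }

const WindowsGMSA = "WindowsGMSA"

// applyGMSAConfigIfEnabled (hypothetical) only honors the cred spec when the
// gate is on; this is the "gating" the reviewers are asking about.
func applyGMSAConfigIfEnabled(fg FeatureGate, credSpec string) string {
	if !fg.Enabled(WindowsGMSA) {
		return "" // feature off: the annotation is ignored
	}
	return credSpec
}

func main() {
	gates := FeatureGate{WindowsGMSA: false} // Default: false, PreRelease: Alpha
	fmt.Println(applyGMSAConfigIfEnabled(gates, "my-cred-spec") == "") // prints "true"
}
```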

Member:

Nothing in this PR appears to be gating on this feature gate. Can we ensure gating happens by adding the feature gate with the reader in a separate PR?

Contributor Author (@wk8):

I'm sorry, I'm not sure I understand what you're suggesting?

Contributor (@ddebroy, Mar 1, 2019):

For reference, this was the earlier PR where the feature gate WindowsGMSA was introduced and where the feature gating is done to enable the logic to configure pods and containers with GMSA: https://github.com/kubernetes/kubernetes/pull/73726/files#diff-4d8e81d13b02ca9a4b8919ef8cf8b6f9R40

Contributor (@yujuhong):

Sorry for not catching that in the previous review.

Contributor Author (@wk8):

@yujuhong : hardly your fault :)


// inherited features from generic apiserver, relisted here to get a conflict if it is changed
// unintentionally on either side:
41 changes: 35 additions & 6 deletions pkg/kubelet/dockershim/docker_container.go
@@ -167,20 +167,28 @@ func (ds *dockerService) CreateContainer(_ context.Context, r *runtimeapi.Create
if err != nil {
return nil, err
}
defer func() {
for _, err := range ds.performPlatformSpecificContainerCreationCleanup(cleanupInfo) {
klog.Warningf("error when cleaning up after container %v's creation: %v", containerName, err)
}
}()

createResp, createErr := ds.client.CreateContainer(createConfig)
if createErr != nil {
createResp, createErr = recoverFromCreationConflictIfNeeded(ds.client, createConfig, createErr)
}

if createResp != nil {
return &runtimeapi.CreateContainerResponse{ContainerId: createResp.ID}, nil
containerID := createResp.ID

if cleanupInfo != nil {
// we don't perform the clean up just yet as that could destroy information
// needed for the container to start (e.g. Windows credentials stored in
// registry keys); instead, we'll clean up after the container successfully
// starts or gets removed
ds.containerCleanupInfos[containerID] = cleanupInfo
}
return &runtimeapi.CreateContainerResponse{ContainerId: containerID}, nil
}

// the creation failed, let's clean up right away
ds.performPlatformSpecificContainerCleanupAndLogErrors(containerName, cleanupInfo)
Member (@derekwaynecarr):

Previously this would have always been called in the defer, but now it's not. This seems awkwardly structured and/or confusing. I need to see what this actually was cleaning up.

Member (@derekwaynecarr):

Actually, never mind, I think this makes sense.

Contributor Author (@wk8):

@derekwaynecarr : better suggestions welcome :)

This is meant to handle any clean up that might be needed after a container creation - which as of now is only the registry keys used to pass cred specs down to Docker (see previous PR).

It used to be in a defer here, and so was performed right after the container's creation; it turned out that wasn't good enough, as Docker needs these reg keys to still be present when it starts the container; so this clean up needs to happen after that instead.
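The lifecycle described above can be sketched as a small standalone program (hypothetical simplified types, not the actual dockershim code): the cleanup info is stashed in a map at creation time, and only replayed once the container starts or is removed, whichever comes first.

```go
// Sketch of the deferred-cleanup pattern: instead of cleaning up in a defer
// right after creation, the cleanup info survives until start or removal.
package main

import "fmt"

type cleanupInfo struct{ registryValue string }

type service struct {
	cleanups map[string]*cleanupInfo
}

func (s *service) create(id string, info *cleanupInfo) {
	// keep the info around: Docker still needs the registry value at start time
	if info != nil {
		s.cleanups[id] = info
	}
}

func (s *service) start(id string) {
	s.cleanupFor(id) // safe to clean now: the container has started
}

func (s *service) remove(id string) {
	s.cleanupFor(id) // also clean when the container is removed before starting
}

func (s *service) cleanupFor(id string) {
	if info, present := s.cleanups[id]; present {
		fmt.Printf("cleaning up %s (%s)\n", id, info.registryValue)
		delete(s.cleanups, id)
	}
}

func main() {
	s := &service{cleanups: map[string]*cleanupInfo{}}
	s.create("c1", &cleanupInfo{registryValue: "k8s-cred-spec-abc"})
	s.start("c1")
	fmt.Println(len(s.cleanups)) // the map is empty again after start
}
```

Calling `cleanupFor` from both `start` and `remove`, and deleting the entry either way, is what keeps the map from holding onto entries for containers that never started.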

Contributor (@yujuhong):

@wk8 could you make it clear in the comment that the key is needed for starting the container?

Contributor Author (@wk8):

@yujuhong : tried to make that clear at https://github.com/kubernetes/kubernetes/pull/74737/files#diff-c7dd39479fd733354254e70845075db5R180 ; please let me know if you think that's okay.


return nil, createErr
}

@@ -267,6 +275,8 @@ func (ds *dockerService) StartContainer(_ context.Context, r *runtimeapi.StartCo
return nil, fmt.Errorf("failed to start container %q: %v", r.ContainerId, err)
}

ds.performPlatformSpecificContainerForContainer(r.ContainerId)

return &runtimeapi.StartContainerResponse{}, nil
}

@@ -281,6 +291,8 @@ func (ds *dockerService) StopContainer(_ context.Context, r *runtimeapi.StopCont

// RemoveContainer removes the container.
func (ds *dockerService) RemoveContainer(_ context.Context, r *runtimeapi.RemoveContainerRequest) (*runtimeapi.RemoveContainerResponse, error) {
ds.performPlatformSpecificContainerForContainer(r.ContainerId)

// Ideally, log lifecycle should be independent of container lifecycle.
// However, docker will remove container log after container is removed,
// we can't prevent that now, so we also clean up the symlink here.
@@ -438,3 +450,20 @@ func (ds *dockerService) UpdateContainerResources(_ context.Context, r *runtimea
}
return &runtimeapi.UpdateContainerResourcesResponse{}, nil
}

func (ds *dockerService) performPlatformSpecificContainerForContainer(containerID string) {
if cleanupInfo, present := ds.containerCleanupInfos[containerID]; present {
ds.performPlatformSpecificContainerCleanupAndLogErrors(containerID, cleanupInfo)
delete(ds.containerCleanupInfos, containerID)
}
}

func (ds *dockerService) performPlatformSpecificContainerCleanupAndLogErrors(containerNameOrID string, cleanupInfo *containerCleanupInfo) {
if cleanupInfo == nil {
return
}

for _, err := range ds.performPlatformSpecificContainerCleanup(cleanupInfo) {
klog.Warningf("error when cleaning up after container %q: %v", containerNameOrID, err)
}
}
29 changes: 19 additions & 10 deletions pkg/kubelet/dockershim/docker_container_unsupported.go
@@ -23,26 +23,35 @@ import (
runtimeapi "k8s.io/kubernetes/pkg/kubelet/apis/cri/runtime/v1alpha2"
)

type containerCreationCleanupInfo struct{}
type containerCleanupInfo struct{}

// applyPlatformSpecificDockerConfig applies platform-specific configurations to a dockertypes.ContainerCreateConfig struct.
// The containerCreationCleanupInfo struct it returns will be passed as is to performPlatformSpecificContainerCreationCleanup
// after the container has been created.
func (ds *dockerService) applyPlatformSpecificDockerConfig(*runtimeapi.CreateContainerRequest, *dockertypes.ContainerCreateConfig) (*containerCreationCleanupInfo, error) {
// The containerCleanupInfo struct it returns will be passed as is to performPlatformSpecificContainerCleanup
// after either:
// * the container creation has failed
// * the container has been successfully started
// * the container has been removed
// whichever happens first.
func (ds *dockerService) applyPlatformSpecificDockerConfig(*runtimeapi.CreateContainerRequest, *dockertypes.ContainerCreateConfig) (*containerCleanupInfo, error) {
return nil, nil
}

// performPlatformSpecificContainerCreationCleanup is responsible for doing any platform-specific cleanup
// after a container creation. Any errors it returns are simply logged, but do not fail the container
// creation.
func (ds *dockerService) performPlatformSpecificContainerCreationCleanup(cleanupInfo *containerCreationCleanupInfo) (errors []error) {
// performPlatformSpecificContainerCleanup is responsible for doing any platform-specific cleanup
// after either:
// * the container creation has failed
// * the container has been successfully started
// * the container has been removed
// whichever happens first.
// Any errors it returns are simply logged, but do not prevent the container from being started or
// removed.
func (ds *dockerService) performPlatformSpecificContainerCleanup(cleanupInfo *containerCleanupInfo) (errors []error) {
return
}

// platformSpecificContainerCreationInitCleanup is called when dockershim
// platformSpecificContainerInitCleanup is called when dockershim
// is starting, and is meant to clean up any cruft left by previous runs
// creating containers.
// Errors are simply logged, but don't prevent dockershim from starting.
func (ds *dockerService) platformSpecificContainerCreationInitCleanup() (errors []error) {
func (ds *dockerService) platformSpecificContainerInitCleanup() (errors []error) {
return
}
35 changes: 22 additions & 13 deletions pkg/kubelet/dockershim/docker_container_windows.go
@@ -33,15 +33,19 @@ import (
"k8s.io/kubernetes/pkg/kubelet/kuberuntime"
)

type containerCreationCleanupInfo struct {
type containerCleanupInfo struct {
gMSARegistryValueName string
}

// applyPlatformSpecificDockerConfig applies platform-specific configurations to a dockertypes.ContainerCreateConfig struct.
// The containerCreationCleanupInfo struct it returns will be passed as is to performPlatformSpecificContainerCreationCleanup
// after the container has been created.
func (ds *dockerService) applyPlatformSpecificDockerConfig(request *runtimeapi.CreateContainerRequest, createConfig *dockertypes.ContainerCreateConfig) (*containerCreationCleanupInfo, error) {
cleanupInfo := &containerCreationCleanupInfo{}
// The containerCleanupInfo struct it returns will be passed as is to performPlatformSpecificContainerCleanup
// after either:
// * the container creation has failed
// * the container has been successfully started
// * the container has been removed
// whichever happens first.
func (ds *dockerService) applyPlatformSpecificDockerConfig(request *runtimeapi.CreateContainerRequest, createConfig *dockertypes.ContainerCreateConfig) (*containerCleanupInfo, error) {
cleanupInfo := &containerCleanupInfo{}

if err := applyGMSAConfig(request.GetConfig(), createConfig, cleanupInfo); err != nil {
return nil, err
@@ -58,7 +62,7 @@ func (ds *dockerService) applyPlatformSpecificDockerConfig(request *runtimeapi.C
// When docker supports passing a credential spec's contents directly, we should switch to using that
// as it will avoid cluttering the registry - there is a moby PR out for this:
// https://github.com/moby/moby/pull/38777
func applyGMSAConfig(config *runtimeapi.ContainerConfig, createConfig *dockertypes.ContainerCreateConfig, cleanupInfo *containerCreationCleanupInfo) error {
func applyGMSAConfig(config *runtimeapi.ContainerConfig, createConfig *dockertypes.ContainerCreateConfig, cleanupInfo *containerCleanupInfo) error {
credSpec := config.Annotations[kuberuntime.GMSASpecContainerAnnotationKey]
if credSpec == "" {
return nil
@@ -156,18 +160,23 @@ func randomString(length int) (string, error) {
return hex.EncodeToString(randBytes), nil
}

// performPlatformSpecificContainerCreationCleanup is responsible for doing any platform-specific cleanup
// after a container creation. Any errors it returns are simply logged, but do not fail the container
// creation.
func (ds *dockerService) performPlatformSpecificContainerCreationCleanup(cleanupInfo *containerCreationCleanupInfo) (errors []error) {
// performPlatformSpecificContainerCleanup is responsible for doing any platform-specific cleanup
// after either:
// * the container creation has failed
// * the container has been successfully started
// * the container has been removed
// whichever happens first.
// Any errors it returns are simply logged, but do not prevent the container from being started or
// removed.
func (ds *dockerService) performPlatformSpecificContainerCleanup(cleanupInfo *containerCleanupInfo) (errors []error) {
if err := removeGMSARegistryValue(cleanupInfo); err != nil {
errors = append(errors, err)
}

return
}

func removeGMSARegistryValue(cleanupInfo *containerCreationCleanupInfo) error {
func removeGMSARegistryValue(cleanupInfo *containerCleanupInfo) error {
if cleanupInfo == nil || cleanupInfo.gMSARegistryValueName == "" {
return nil
}
@@ -184,11 +193,11 @@ func removeGMSARegistryValue(cleanupInfo *containerCreationCleanupInfo) error {
return nil
}

// platformSpecificContainerCreationInitCleanup is called when dockershim
// platformSpecificContainerInitCleanup is called when dockershim
// is starting, and is meant to clean up any cruft left by previous runs
// creating containers.
// Errors are simply logged, but don't prevent dockershim from starting.
func (ds *dockerService) platformSpecificContainerCreationInitCleanup() (errors []error) {
func (ds *dockerService) platformSpecificContainerInitCleanup() (errors []error) {
return removeAllGMSARegistryValues()
}

16 changes: 8 additions & 8 deletions pkg/kubelet/dockershim/docker_container_windows_test.go
@@ -82,7 +82,7 @@ func TestApplyGMSAConfig(t *testing.T) {
defer setRandomReader(randomBytes)()

createConfig := &dockertypes.ContainerCreateConfig{}
cleanupInfo := &containerCreationCleanupInfo{}
cleanupInfo := &containerCleanupInfo{}
err := applyGMSAConfig(containerConfigWithGMSAAnnotation, createConfig, cleanupInfo)

assert.Nil(t, err)
@@ -105,7 +105,7 @@ func TestApplyGMSAConfig(t *testing.T) {
defer setRegistryCreateKeyFunc(t, &dummyRegistryKey{})()

createConfig := &dockertypes.ContainerCreateConfig{}
cleanupInfo := &containerCreationCleanupInfo{}
cleanupInfo := &containerCleanupInfo{}
err := applyGMSAConfig(containerConfigWithGMSAAnnotation, createConfig, cleanupInfo)

assert.Nil(t, err)
@@ -127,15 +127,15 @@ func TestApplyGMSAConfig(t *testing.T) {
t.Run("when there's an error generating the random value name", func(t *testing.T) {
defer setRandomReader([]byte{})()

err := applyGMSAConfig(containerConfigWithGMSAAnnotation, &dockertypes.ContainerCreateConfig{}, &containerCreationCleanupInfo{})
err := applyGMSAConfig(containerConfigWithGMSAAnnotation, &dockertypes.ContainerCreateConfig{}, &containerCleanupInfo{})

require.NotNil(t, err)
assert.Contains(t, err.Error(), "error when generating gMSA registry value name: unable to generate random string")
})
t.Run("if there's an error opening the registry key", func(t *testing.T) {
defer setRegistryCreateKeyFunc(t, &dummyRegistryKey{}, fmt.Errorf("dummy error"))()

err := applyGMSAConfig(containerConfigWithGMSAAnnotation, &dockertypes.ContainerCreateConfig{}, &containerCreationCleanupInfo{})
err := applyGMSAConfig(containerConfigWithGMSAAnnotation, &dockertypes.ContainerCreateConfig{}, &containerCleanupInfo{})

require.NotNil(t, err)
assert.Contains(t, err.Error(), "unable to open registry key")
@@ -145,7 +145,7 @@ func TestApplyGMSAConfig(t *testing.T) {
key.setStringValueError = fmt.Errorf("dummy error")
defer setRegistryCreateKeyFunc(t, key)()

err := applyGMSAConfig(containerConfigWithGMSAAnnotation, &dockertypes.ContainerCreateConfig{}, &containerCreationCleanupInfo{})
err := applyGMSAConfig(containerConfigWithGMSAAnnotation, &dockertypes.ContainerCreateConfig{}, &containerCleanupInfo{})

if assert.NotNil(t, err) {
assert.Contains(t, err.Error(), "unable to write into registry value")
@@ -155,7 +155,7 @@ func TestApplyGMSAConfig(t *testing.T) {
t.Run("if there is no GMSA annotation", func(t *testing.T) {
createConfig := &dockertypes.ContainerCreateConfig{}

err := applyGMSAConfig(&runtimeapi.ContainerConfig{}, createConfig, &containerCreationCleanupInfo{})
err := applyGMSAConfig(&runtimeapi.ContainerConfig{}, createConfig, &containerCleanupInfo{})

assert.Nil(t, err)
assert.Nil(t, createConfig.HostConfig)
@@ -164,7 +164,7 @@

func TestRemoveGMSARegistryValue(t *testing.T) {
valueName := "k8s-cred-spec-1900254518529e2a3dedb85cdec03ce270559647459ab531f07af5eb1c5495fda709435ce82ab89c"
cleanupInfoWithValue := &containerCreationCleanupInfo{gMSARegistryValueName: valueName}
cleanupInfoWithValue := &containerCleanupInfo{gMSARegistryValueName: valueName}

t.Run("it does remove the registry value", func(t *testing.T) {
key := &dummyRegistryKey{}
@@ -204,7 +204,7 @@ func TestRemoveGMSARegistryValue(t *testing.T) {
key := &dummyRegistryKey{}
defer setRegistryCreateKeyFunc(t, key)()

err := removeGMSARegistryValue(&containerCreationCleanupInfo{})
err := removeGMSARegistryValue(&containerCleanupInfo{})

assert.Nil(t, err)
assert.Equal(t, 0, len(key.deleteValueArgs))
12 changes: 10 additions & 2 deletions pkg/kubelet/dockershim/docker_service.go
Expand Up @@ -155,7 +155,7 @@ type dockerNetworkHost struct {
*portMappingGetter
}

var internalLabelKeys []string = []string{containerTypeLabelKey, containerLogPathLabelKey, sandboxIDLabelKey}
var internalLabelKeys = []string{containerTypeLabelKey, containerLogPathLabelKey, sandboxIDLabelKey}

// ClientConfig is parameters used to initialize docker client
type ClientConfig struct {
@@ -186,6 +186,7 @@ func NewDockerClientFromConfig(config *ClientConfig) libdocker.Interface {
return nil
}

// NewDockerService creates a new `DockerService` struct.
// NOTE: Anything passed to DockerService should be eventually handled in another way when we switch to running the shim as a different process.
func NewDockerService(config *ClientConfig, podSandboxImage string, streamingConfig *streaming.Config, pluginSettings *NetworkPluginSettings,
cgroupsName string, kubeCgroupDriver string, dockershimRootDir string, startLocalStreamingServer bool) (DockerService, error) {
@@ -211,6 +212,7 @@ func NewDockerService(config *ClientConfig, podSandboxImage string, streamingCon
checkpointManager: checkpointManager,
startLocalStreamingServer: startLocalStreamingServer,
networkReady: make(map[string]bool),
containerCleanupInfos: make(map[string]*containerCleanupInfo),
Member:

How big can this map grow? It looks like we ignore errors after cleanup and always remove from the map. So it's basically: we try to clean up once, and after that oh well?

Contributor Author (@wk8):

how big can this map grow?

Only as big as the number of containers which have been successfully created, but neither successfully started nor removed yet; and which also have any clean up to be performed, i.e. that use GMSA cred specs.

it looks like we ignore errors after cleanup and always remove from the map. so its basically we try to clean up once, and after that oh well?

Yes this is just best effort. These reg keys don't contain any secrets anyhow. Also, we do try to clean up again at each dockershim start.
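The "clean up again at each dockershim start" fallback can be sketched as follows (a standalone illustration with hypothetical types: the real `platformSpecificContainerInitCleanup` walks actual registry values via `removeAllGMSARegistryValues` rather than a slice; the `k8s-cred-spec-` prefix matches the value names seen in this PR's tests):

```go
// Standalone sketch of the init-time sweep: any value carrying the well-known
// prefix is removed, even if the run that created it died before performing
// its own best-effort cleanup.
package main

import (
	"fmt"
	"strings"
)

const gMSAValuePrefix = "k8s-cred-spec-"

// initCleanup returns the values that survive the sweep; in the real code the
// matching registry values would be deleted instead.
func initCleanup(values []string) []string {
	var kept []string
	for _, v := range values {
		if !strings.HasPrefix(v, gMSAValuePrefix) {
			kept = append(kept, v)
		}
	}
	return kept
}

func main() {
	leftovers := []string{"k8s-cred-spec-aaa", "unrelated-value", "k8s-cred-spec-bbb"}
	fmt.Println(initCleanup(leftovers)) // prints "[unrelated-value]"
}
```

Keying the sweep off a reserved name prefix is what makes it safe to delete values belonging to containers from previous, possibly crashed, runs without any bookkeeping surviving the restart.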

Contributor (@yujuhong):

I think we may need periodic cleanup to ensure memory is not leaked permanently.
I think this is okay for alpha, but would like this to be addressed by beta. Please add a comment and file an issue and/or update the KEP to reflect this (if @derekwaynecarr agrees with not doing this for alpha)

Contributor Author (@wk8):

@yujuhong : good call. Created #74843 and assigned it to myself.

}

// check docker version compatibility.
@@ -305,6 +307,12 @@ type dockerService struct {
// startLocalStreamingServer indicates whether dockershim should start a
// streaming server on localhost.
startLocalStreamingServer bool

// containerCleanupInfos maps container IDs to the `containerCleanupInfo` structs
// needed to clean up after containers have been started or removed.
// (see `applyPlatformSpecificDockerConfig` and `performPlatformSpecificContainerCleanup`
// methods for more info).
containerCleanupInfos map[string]*containerCleanupInfo
}

// TODO: handle context.
@@ -411,7 +419,7 @@ func (ds *dockerService) Start() error {
// initCleanup is responsible for cleaning up any cruft left by previous
// runs. If there are any errors, it simply logs them.
func (ds *dockerService) initCleanup() {
errors := ds.platformSpecificContainerCreationInitCleanup()
errors := ds.platformSpecificContainerInitCleanup()

for _, err := range errors {
klog.Warningf("initialization error: %v", err)