DRA resourceslice controller: fix recreation after quick delete #132683
Conversation
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@@ -573,12 +574,14 @@ func testPrioritizedList(tCtx ktesting.TContext, enabled bool) {
	}).WithTimeout(10 * time.Second).WithPolling(time.Second).Should(schedulingAttempted)
}

-func testPublishResourceSlices(tCtx ktesting.TContext, disabledFeatures ...featuregate.Feature) {
+func testPublishResourceSlices(tCtx ktesting.TContext, haveLatestAPI bool, disabledFeatures ...featuregate.Feature) {
We don't have event handler tests in the unit tests of the package. Testing work queue behavior is tricky there, and there's always the question whether the implemented behavior works "in the real world".
Therefore I added an integration test instead. I also used the opportunity to test the "update slice after unwanted update" scenario.
Both tests need the v1beta2 API, whereas the original "create slices" doesn't.
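For illustration, the quick-delete part of such an integration test could look roughly like this; the helper names, the slice name, and the timeout are assumptions for the sketch, not the PR's actual test code:

```go
// Delete a slice the controller just published and expect it to be
// recreated even though the deletion happens within the mutation cache TTL.
slices := clientSet.ResourceV1beta2().ResourceSlices()
tCtx.ExpectNoError(slices.Delete(tCtx, sliceName, metav1.DeleteOptions{}), "delete published slice")

gomega.NewWithT(tCtx).Eventually(func(g gomega.Gomega) {
	list, err := slices.List(tCtx, metav1.ListOptions{})
	g.Expect(err).NotTo(gomega.HaveOccurred())
	g.Expect(list.Items).NotTo(gomega.BeEmpty(), "slice should be recreated")
}).WithTimeout(2 * time.Minute).WithPolling(time.Second).Should(gomega.Succeed())
```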
	disabled = append(disabled, string(feature))
	// Manually turn into the expected slices, considering that some fields get dropped.
	expectedResources := resources.DeepCopy()
	var expectedSliceSpecs []resourcev1beta2.ResourceSliceSpec
This comparison was avoided earlier because it depends on a specific API version, but it's useful to have it at least for one of them (more specifically, the latest one).
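As a sketch of what such a version-specific comparison involves, assuming a simplified shape for `resources` (named pools with slices) and only a subset of the published fields; `driverName` and `nodeName` are likewise assumed:

```go
// Convert the driver-side pools into the v1beta2 specs the controller is
// expected to publish. Fields dropped for disabled features would be
// cleared here as well. Structure and field selection are illustrative.
expectedResources := resources.DeepCopy()
var expectedSliceSpecs []resourcev1beta2.ResourceSliceSpec
for poolName, pool := range expectedResources.Pools {
	for _, slice := range pool.Slices {
		expectedSliceSpecs = append(expectedSliceSpecs, resourcev1beta2.ResourceSliceSpec{
			Driver:   driverName,
			Pool:     resourcev1beta2.ResourcePool{Name: poolName, Generation: pool.Generation},
			NodeName: &nodeName,
			Devices:  slice.Devices,
		})
	}
}
```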
}),
"NodeName": matchPointer(spec.NodeName),
"NodeSelector": matchPointer(spec.NodeSelector),
"AllNodes": gstruct.PointTo(gomega.BeTrue()),
The linter hints that this should be BeTrueBecause. That hint can be ignored here; inside a struct match we only care about the true/false comparison.
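As a minimal, self-contained illustration of the pattern (made-up struct, not the test's real types): gstruct already names the mismatched field in its failure output, so the extra message from BeTrueBecause would be redundant.

```go
package main

import (
	"fmt"

	"github.com/onsi/gomega"
	"github.com/onsi/gomega/gstruct"
)

type spec struct {
	AllNodes *bool
}

func main() {
	yes := true
	// Inside a field match, BeTrue is a plain comparison; gstruct's
	// report identifies the field on failure without BeTrueBecause.
	matcher := gstruct.MatchAllFields(gstruct.Fields{
		"AllNodes": gstruct.PointTo(gomega.BeTrue()),
	})
	ok, err := matcher.Match(spec{AllNodes: &yes})
	fmt.Println(ok, err) // true <nil>
}
```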
/test pull-kubernetes-e2e-gce
Failed to come up.

/skip
One linter hint which can be ignored.
@@ -420,11 +433,13 @@ func (c *Controller) initInformer(ctx context.Context) error {
		&cache.ListWatch{
			ListWithContextFunc: func(ctx context.Context, options metav1.ListOptions) (runtime.Object, error) {
				tweakListOptions(&options)
-				return c.resourceClient.ResourceSlices().List(ctx, options)
+				slices, err := c.resourceClient.ResourceSlices().List(ctx, options)
Why not just return directly?
At some point I had some logging inserted between the call and the return. I'll revert to directly returning the values.
Done.
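A compilable sketch of the two variants under discussion; the typed clientset accessor is real, while the wrapper functions and the debug log are assumptions for illustration:

```go
package sketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/kubernetes"
	"k8s.io/klog/v2"
)

// Final form after the revert: return the client call directly.
func listDirect(client kubernetes.Interface) func(context.Context, metav1.ListOptions) (runtime.Object, error) {
	return func(ctx context.Context, options metav1.ListOptions) (runtime.Object, error) {
		return client.ResourceV1beta2().ResourceSlices().List(ctx, options)
	}
}

// Intermediate variables only pay off while something happens between
// the call and the return, e.g. this (hypothetical) debug log.
func listWithLogging(client kubernetes.Interface) func(context.Context, metav1.ListOptions) (runtime.Object, error) {
	return func(ctx context.Context, options metav1.ListOptions) (runtime.Object, error) {
		slices, err := client.ResourceV1beta2().ResourceSlices().List(ctx, options)
		klog.FromContext(ctx).V(5).Info("Listed ResourceSlices", "err", err)
		return slices, err
	}
}
```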
	// It's not sufficient to schedule a delayed sync because
	// another sync scheduled by an event overwrites the older
	// one, so we would sync too soon and then not again.
	lastAddByPool map[string]time.Time
Can we clarify here that the key will be the name of a pool? It wasn't obvious to me until I saw the assignment.
Done.
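For context, a sketch of the kind of assignment the comment now documents; the mutex, the method name, and the lazy initialization are assumptions:

```go
// Sketch: the key is the pool name, the value the time of the most
// recent slice creation for that pool.
func (c *Controller) recordSliceAdd(poolName string) {
	c.mutex.Lock()
	defer c.mutex.Unlock()
	if c.lastAddByPool == nil {
		c.lastAddByPool = make(map[string]time.Time)
	}
	c.lastAddByPool[poolName] = time.Now()
}
```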
			},
			WatchFuncWithContext: func(ctx context.Context, options metav1.ListOptions) (watch.Interface, error) {
				tweakListOptions(&options)
-				return c.resourceClient.ResourceSlices().Watch(ctx, options)
+				w, err := c.resourceClient.ResourceSlices().Watch(ctx, options)
same as above.
Done.
Force-pushed from 009eec4 to 2a53b1c (compare)
Force-pushed from 2a53b1c to 2e96624 (compare)
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
@pohly: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/skip

/lgtm

LGTM label has been added.

Git tree hash: 010ce4b69fe631d737debe71a049696003490177
What type of PR is this?
/kind bug
What this PR does / why we need it:
If a ResourceSlice got published by the ResourceSlice controller in a DRA driver and then that ResourceSlice got deleted quickly (within one minute, the mutation cache TTL) by someone (for example, the kubelet because of a restart), then the controller did not react properly to the deletion unless some other event triggered the syncing of the pool.
Found while adding upgrade/downgrade tests with a driver which keeps running across the upgrade/downgrade.
Which issue(s) this PR is related to:
KEP: kubernetes/enhancements#4381
Special notes for your reviewer:
The exact sequence leading to this was:
- controller adds ResourceSlice, schedules a sync for one minute in the future (the TTL)
- someone else deletes the ResourceSlice
- add and delete events schedule another sync 30 seconds in the future (the delay), *overwriting* the other scheduled sync
- sync runs once, finds deleted slices in the mutation cache, does not re-create them, and also does not run again
One possible fix would be to set a resync period. But then work is done periodically, whether it's necessary or not.
Another fix is to ensure that the TTL is shorter than the delay. Then when a sync occurs, all locally stored additional slices are expired. But that renders the whole storing of recently created slices in the cache pointless.
So the fix used here is to keep track of when another sync has to run because of added slices. At the end of each sync, the next sync gets scheduled if (and only if) needed, until eventually syncing can stop.
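A rough sketch of that approach; the field and helper names here (mutationCacheTTL, the mutex, queue.AddAfter) are assumptions for illustration, not the PR's exact code:

```go
// At the end of each sync, decide whether this pool needs another sync:
// a slice added within the mutation cache TTL can still mask an external
// deletion, so sync again exactly when that cache entry expires.
func (c *Controller) maybeScheduleNextSync(poolName string) {
	c.mutex.Lock()
	defer c.mutex.Unlock()
	lastAdd, ok := c.lastAddByPool[poolName]
	if !ok {
		return // nothing added recently, no follow-up needed
	}
	if remaining := c.mutationCacheTTL - time.Since(lastAdd); remaining > 0 {
		c.queue.AddAfter(poolName, remaining) // assumed work queue helper
	} else {
		delete(c.lastAddByPool, poolName) // entry expired: syncing can stop
	}
}
```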
Does this PR introduce a user-facing change?
/assign @mortent