Skip to content

Conversation

@klihub
Copy link
Member

@klihub klihub commented Sep 30, 2025

A plugin having received a sync request is not enough to ensure it has been fully registered on the runtime side (IOW appended to the list of active plugins). When we are waiting for test plugins to start up, use an extra plugin synchronization block/unblock to ensure they are fully registered.

A plugin having received a sync request is not enough to ensure it
has been fully registered on the runtime side (IOW appended to the
list of active plugins). Use a sync block/unblock to ensure it has.

Signed-off-by: Krisztian Litkey <[email protected]>
Copy link
Contributor

@chrishenzie chrishenzie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, how did we discover this issue?

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the plugin loop need a new time.After per plugin? Using the same timeout channel for all the plugins seems sus.

@klihub
Copy link
Member Author

klihub commented Oct 2, 2025

Just curious, how did we discover this issue?

@chrishenzie In the #235, the v1beta1 API RFC/draft PR, I added a plugin shutdown test which was failing in a weird way. That got me chasing this.

The reason for the failure was that we were waiting for plugin startup (and assumed registration) by synchronizing on the plugin getting a Synchronize request. Normally this is enough, especially if the test is doing mocked pod or container creation because those are blocking/unblocking plugin registration, so they implicitly do this synchronization.

But the test there instead immediately tried to shut down the plugin by name. Which was racy without this synchronization, so the plugin was practically never fully registered yet, so we did not find it on the adaptation side, and failed to shut it down and register the test shutdown reason. So the test failed.

@klihub
Copy link
Member Author

klihub commented Oct 2, 2025

Does the plugin loop need a new time.After per plugin? Using the same timeout channel for all the plugins seems sus.

@mikebrow Well we could do this differently, but I'm not sure if it's really necessary here. That timeout is just to put an upper bound on how long we wait in case of failure. Semantically it is a failure if any plugin does not get synchronized within that time. So we don't want to give each one separately that timeout.

@klihub klihub requested a review from mikebrow October 2, 2025 16:00
@mikebrow
Copy link
Member

mikebrow commented Oct 5, 2025

two seconds for all plugins to synch.. seemed short.

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@klihub
Copy link
Member Author

klihub commented Oct 6, 2025

two seconds for all plugins to synch.. seemed short.

Yes, I understand and probably true in general. If we had a large number of real plugins to be started like this, and they all had to do some real processing during sync, then 2 seconds overall could be too short in some cases. But in this case, we only have a few of them at most at any time, and they are all almost no-op test plugins which merely record what they were sync'd with. So I think 2 seconds should be pretty generous here.

@klihub klihub merged commit 9b8befa into containerd:main Oct 6, 2025
16 checks passed
@klihub klihub deleted the fixes/tests/ensure-fully-registered-mock-plugins branch October 6, 2025 06:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants