-
Notifications
You must be signed in to change notification settings - Fork 83
Support direct editing of the intelRdt config #215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| github.com/onsi/ginkgo/v2 v2.19.1 | ||
| github.com/onsi/gomega v1.34.0 | ||
| github.com/opencontainers/runtime-spec v1.1.0 | ||
| github.com/opencontainers/runtime-spec v1.2.2-0.20250818071321-383cadbf08c0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marquiz Do we know what is the expected timeframe for cutting the first such runtime-spec release which would put 383cadbf08c0 commit behind a tag ?
I am asking because we have been reluctant lately to pull in untagged versions of the runtime spec, because that would cause extra hassle/work if we wanted to or were forced to cut a new release and bump containerd and cri-o to pull in that new version. For instance #157 and #166 are current drafts exactly for this reason: the runtime-spec they require is not available yet in a tagged release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sure. I don't think we need to hurry with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marquiz @mikebrow @chrishenzie In principle there is one way how we could get this partially in already now, to avoid having to keep rebasing this with a guaranteed conflict whenever we add something to LinuxContainer or LinuxContainerAdjustment.
It would be to split this up to 2 stacked PRs:
- PR#1: The bulk of this PR, without the OCI Spec bump + a single change to Generator.Adjust(): return an error if adjust.GetLinux().GetRdt() != nil
- draft PR#2: All the commits in PR#1 + the OCI Spec bump (eventually to a tagged version) + a Generator.Adjust() updated to do RDT adjustment if requested
Then we could merge #PR1 (and rebase PR#2 when that happens), and keep the OCI Spec-dependent bits pending until the necessary bits get tagged. We could apply the same to #157 and #166...
Maybe not worth the effort though, and we can just mark this a draft until we get a tagged OCI Spec.
|
nit: typo in the body of commit message for 'api: add rdt': repeated 'the' in "... the the IntelRdt message is sepate from LinuxResources. This is to ..." |
| a.Linux.Resources.Unified = make(map[string]string) | ||
| } | ||
| } | ||
| func (a *ContainerAdjustment) initLinuxRdt() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note: #118 has a bunch of queued similar changes, updating the result-handling code so that there is no need to always a priory create a fully constructed but empty adjustment. It would be nice to get that finally in, so I'll rebase it that on the latest main/HEAD then update with any missing init*() bits necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I'll review when that's updated. And then rebase when that's merged
|
@marquiz @mikebrow @chrishenzie This overall LGTM. My biggest question is the timeline for tagging a new opencontainers/runtime-spec so that bits we rely on here become available via a tagged release. |
b2a5d93 to
c089e9c
Compare
I'm on the same side. I think there's no immediate hurry with this, especially because even the runc/crun implementation(s) are not there, yet. I've been working on this area so I wanted to get this one off my shoulders. That said, I think this is ready for review. I could've submitted this as a draft, but decided not to because from my perspective this is complete. |
|
I just recalled that there's actually one aspect/feature missing from the new extensions: ability to remove the I couldn't come up with any nice solutions, I played with two alternatives but didn't commit either of them:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@marquiz I'd vote with a mild bias for 2 above (because it's how we handle other similar cases where removal is supported). But with the addition that on the wire-protocol this is the convention we use, but in the user-facing API we provide explicit functions for clearing any LinuxRdt, with something along the lines of this: func (a *ContainerAdjustment) RemoveLinuxRDT() {
a.initLinux()
a.Linux.Rdt = &LinuxRdt{
ClosId = MarkForRemoval("") // or MarkForRemoval("/") if you think that's better
}
}And then have the corresponding handling for this in intermediate result-calculation and final Spec generation. |
|
@klihub thanks for the feedback. The MarkForRemoval thingie was a good hint, I didn't have that in my early prototype. I'll cook up something along the lines you suggested and then we can mull over that. |
Signed-off-by: Markus Lehtonen <[email protected]>
Signed-off-by: Markus Lehtonen <[email protected]>
Add new message for directly managing the linux.intelRdt object of the runtime-spec. NOTE: the old RdtClass field resides under LinuxResources message but the IntelRdt message is sepate from LinuxResources. This is to maintain parity with the OCI runtime-spec. It is also much more logical because the LinuxResources message is used in many places (like pod overhead, update resources etc) where the RDT fields don't make sense or cannot be applied. Also note that the Rdt object cannot modified on container update (missing from UpdateContainerRequest) as this functionality is missing from the runtime spec and runtimes (runc/crun). Signed-off-by: Markus Lehtonen <[email protected]>
Add support for direct manipulation of the linuxRdt object in the container config. Signed-off-by: Markus Lehtonen <[email protected]>
Add support for direct manipulation of the intelRdt object of the container config. Signed-off-by: Markus Lehtonen <[email protected]>
Signed-off-by: Markus Lehtonen <[email protected]>
Signed-off-by: Markus Lehtonen <[email protected]>
Makes it possible to remove/override the linux.intelRdt object from the container configuration. Extend the API by adding new 'Remove' field to the LinuxRdt message which is used as a marker to fully delete/override the IntelRdt configuration. This patch also updates the adaptation and runtime-tools correspondingly. Signed-off-by: Markus Lehtonen <[email protected]>
|
Updated:
Thx for spotting this. Fixed |
I ended up with adding a separate field. I implemented both variants and the "prefix closid with |
@marquiz Yes, and sorry about the long sidetrack in our offline discussion earlier. I was in the totally wrong mental model, associated with most of the existing bits (slices and maps). You were absolutely right that we can't reasonably handle removal with an inline notation, because then it would not be possible to indicate a removal prior to an addition to avoiding conflict with an earlier plugin. I agree a separate dedicated field in the wire-protocol is the way to go here. |
This PR adds support for directly adjusting the
intelRdtobject of the container config.Previously, we have the concept of RDT class, which provides indirect manipulation of the
intelRdt.closIDfield. There the higher level runtime (containerd, cri-o) registers a resolver function which the NRI adaptation uses to translate the provided "RDT class" to actual closID (which is basically the resctrl group in the Linux resctrl filesystem).The API extensions in this PR provides direct control of the relevant fields of the
intelRdtobject. The PR also includes a sample plugin to test/demonstrate the functionality (including deployment files and CI integration to build/publish images)REFS:
NOTES:
closID: supported by runc and crunschemataenableMonitoring:OPEN QUESTION/TODO:
In this PR there is no mechanism for removing the
intelRdtobject from the config. That is probably a desirable feature.