Make irq-load-balancing.crio.io configurable per CPU#9223
Make irq-load-balancing.crio.io configurable per CPU#9223openshift-merge-bot[bot] merged 3 commits intocri-o:mainfrom
Conversation
|
Hi @andreaskaris. Thanks for your PR. I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
2ef2c88 to
6b73368
Compare
|
/ok-to-test |
0e59050 to
5d58c45
Compare
758735e to
674c9a6
Compare
|
/retest |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #9223 +/- ##
==========================================
- Coverage 67.04% 66.46% -0.59%
==========================================
Files 202 202
Lines 28159 28238 +79
==========================================
- Hits 18880 18767 -113
- Misses 7702 7870 +168
- Partials 1577 1601 +24 🚀 New features to boost your workflow:
|
|
Andreas proposes irq-load-balancing.crio.io: "{: , ...}" But it might be better to implement it using something like this I think housekeeping-cpus.crio.io/[container-name]: 0,1 cri-o would have to automatically determine the housekeeping cpus' siblings too This would allow easier reuse of the cpu list with other features like "oc exec" cpus. |
|
I'll change this PR according to Martin's suggestion |
ce8f83c to
34ad65a
Compare
| // injectHousekeepingEnv adds the HOUSEKEEPING_CPUS environment variable to the container. | ||
| // This allows the container to be aware of which CPUs are designated for housekeeping tasks. | ||
| func injectHousekeepingEnv(specgen *generate.Generator, housekeeping cpuset.CPUSet) error { | ||
| if specgen == nil || specgen.Config == nil || specgen.Config.Process == nil { | ||
| return fmt.Errorf("specgen or one of its required elements is nil, specgen: %v", specgen) | ||
| } | ||
|
|
||
| specgen.Config.Process.Env = append(specgen.Config.Process.Env, | ||
| fmt.Sprintf("%s=%s", HousekeepingCPUsEnvVar, housekeeping.String()), | ||
| ) | ||
|
|
||
| return nil | ||
| } |
There was a problem hiding this comment.
We can replace this function to https://pkg.go.dev/github.com/opencontainers/runtime-tools/generate#Generator.AddProcessEnv
There was a problem hiding this comment.
When I add:
specgen.AddProcessEnv(HousekeepingCPUsEnvVar, housekeeping.String())
That yields:
$ go test -v . -ginkgo.v -ginkgo.focus='.*setIRQLoadBalancing with housekeeping CPUs.*'
(...)
[PANICKED] Test Panicked
In [It] at: /usr/local/go/src/internal/runtime/maps/runtime_faststr_swiss.go:265 @ 10/03/25 14:04:38.558
assignment to entry in nil map
Full Stack Trace
github.com/opencontainers/runtime-tools/generate.(*Generator).addEnv(...)
/home/akaris/development/cri-o/vendor/github.com/opencontainers/runtime-tools/generate/generate.go:543
github.com/opencontainers/runtime-tools/generate.(*Generator).AddProcessEnv(0xc0000117a0, {0x2b88dbd, 0x11}, {0xc0003c6014, 0x3})
/home/akaris/development/cri-o/vendor/github.com/opencontainers/runtime-tools/generate/generate.go:519 +0x2f2
github.com/cri-o/cri-o/internal/runtimehandlerhooks.injectHousekeepingEnv(0xc0000117a0, {0x2f070b8?})
/home/akaris/development/cri-o/internal/runtimehandlerhooks/high_performance_hooks_linux.go:1576 +0x1a9
github.com/cri-o/cri-o/internal/runtimehandlerhooks.init.func1.6.4(0xc0002edd40, 0xc0001d0e00, 0x0, {0x2b88d35, 0x11}, {0x2b88d46, 0x11}, {0x2b6d115, 0x3}, 0x0)
/home/akaris/development/cri-o/internal/runtimehandlerhooks/high_performance_hooks_test.go:392 +0x5ed
github.com/cri-o/cri-o/internal/runtimehandlerhooks.init.func1.6.6.5()
/home/akaris/development/cri-o/internal/runtimehandlerhooks/high_performance_hooks_test.go:448 +0xe5
Because g.envMap is nil:
func (g *Generator) addEnv(env, key string) {
if idx, ok := g.envMap[key]; ok {
// The ENV exists in the cache, so change its value in g.Config.Process.Env
g.Config.Process.Env[idx] = env
} else {
// else the env doesn't exist, so add it and add it's index to g.envMap
g.Config.Process.Env = append(g.Config.Process.Env, env)
g.envMap[key] = len(g.Config.Process.Env) - 1
}
}
In the same file, we already have this function here which I based injectHouseKeepingEnv on:
func injectCpusetEnv(specgen *generate.Generator, isolated, shared *cpuset.CPUSet) {
spec := specgen.Config
spec.Process.Env = append(spec.Process.Env,
fmt.Sprintf("%s=%s", IsolatedCPUsEnvVar, isolated.String()),
fmt.Sprintf("%s=%s", SharedCPUsEnvVar, shared.String()))
}
Given the failure in CI (may be a false positive and I might have to somehow initialize the envmap, but dunno), I'd prefer to not change it
There was a problem hiding this comment.
Other fields are normally initialized, so I expected Env also to be initialized. It sounds like a bug in runtime-tools, and definitely not a blocker of this PR.
There was a problem hiding this comment.
After a second look ...
In theory, in order to make this work, I have to simply change the initialization in the unit test:
specgen := generate.NewFromSpec(
&specs.Spec{
Process: &specs.Process{},
})
However, my problem with this is that there are locations in the code where cri-o calls generate.Generator{} directly, instead of using the New... functions:
[akaris@workstation cri-o (numeric-irq-load-balancing)]$ grep generate.Generator{ -RI
internal/lib/restore_test.go: g := generate.Generator{Config: &spec}
internal/oci/oci_linux.go: g := &generate.Generator{
internal/runtimehandlerhooks/high_performance_hooks_test.go: return &generate.Generator{
pkg/config/workloads_test.go: g := &generate.Generator{
pkg/config/workloads_test.go: g := &generate.Generator{
[akaris@workstation cri-o (numeric-irq-load-balancing)]$ grep generate.New -RI
internal/lib/restore.go: ctrSpec, err := generate.NewFromFile(filepath.Join(ctr.Dir(), "config.json"))
internal/lib/checkpoint.go: specgen, err := generate.NewFromFile(configFile)
internal/lib/checkpoint.go: g, err := generate.NewFromFile(jsonPath)
internal/config/seccomp/seccomp_test.go: generator, err := generate.New("linux")
internal/config/seccomp/seccomp_test.go: generator, err := generate.New("linux")
internal/config/seccomp/seccomp_test.go: generator, err := generate.New("linux")
internal/factory/container/container.go: spec, err := generate.New(runtime.GOOS)
internal/runtimehandlerhooks/high_performance_hooks_test.go: specgen := generate.NewFromSpec(
So there's a legit risk that the map may not be initialized.
So this here might work:
$ git diff
diff --git a/internal/runtimehandlerhooks/high_performance_hooks_linux.go b/internal/runtimehandlerhooks/high_performance_hooks_linux.go
index c1ba7301f6..014758a484 100644
--- a/internal/runtimehandlerhooks/high_performance_hooks_linux.go
+++ b/internal/runtimehandlerhooks/high_performance_hooks_linux.go
@@ -1569,13 +1569,11 @@ func excludeHousekeepingCPUs(cpus string, housekeepingSiblings cpuset.CPUSet) (c
// injectHousekeepingEnv adds the HOUSEKEEPING_CPUS environment variable to the container.
// This allows the container to be aware of which CPUs are designated for housekeeping tasks.
func injectHousekeepingEnv(specgen *generate.Generator, housekeeping cpuset.CPUSet) error {
- if specgen == nil || specgen.Config == nil || specgen.Config.Process == nil {
- return fmt.Errorf("specgen or one of its required elements is nil, specgen: %v", specgen)
+ if specgen == nil {
+ return fmt.Errorf("specgen is nil, specgen")
}
- specgen.Config.Process.Env = append(specgen.Config.Process.Env,
- fmt.Sprintf("%s=%s", HousekeepingCPUsEnvVar, housekeeping.String()),
- )
+ specgen.AddProcessEnv(HousekeepingCPUsEnvVar, housekeeping.String())
return nil
}
diff --git a/internal/runtimehandlerhooks/high_performance_hooks_test.go b/internal/runtimehandlerhooks/high_performance_hooks_test.go
index e297e7c43b..d46c32c68a 100644
--- a/internal/runtimehandlerhooks/high_performance_hooks_test.go
+++ b/internal/runtimehandlerhooks/high_performance_hooks_test.go
@@ -383,11 +383,10 @@ var _ = Describe("high_performance_hooks", func() {
Expect(bannedCPUs).To(Equal(expectedBan))
// Also test that injection via injectHousekeepingEnv works correctly.
- specgen := generate.Generator{
- Config: &specs.Spec{
+ specgen := generate.NewFromSpec(
+ &specs.Spec{
Process: &specs.Process{},
- },
- }
+ })
if !housekeepingSiblings.IsEmpty() {
err := injectHousekeepingEnv(&specgen, housekeepingSiblings)
Expect(err).ToNot(HaveOccurred())
But I feel it's risky. I'll create a helper to initialize the missing spec and stuff manually instead of relying on specgen.AddProcessEnv(
There was a problem hiding this comment.
https://pkg.go.dev/github.com/opencontainers/runtime-tools/generate#NewFromSpec
NewFromSpec is deprecated.
There was a problem hiding this comment.
my bad, I was looking at the old version. The deprecation was cancelled.
opencontainers/runtime-tools#747
There was a problem hiding this comment.
Precisely because of The recommended replacement does not work because other code paths expect an initialized envCacheMap.. I kind of wished runtime-tools initialized that cache on the fly if it wasn't initialized ...
There was a problem hiding this comment.
Anyway, I reported the issue here:
20cfba2 to
f33ce9f
Compare
|
/lgtm cancel |
f02ac74 to
8031195
Compare
|
@bitoku So, I did the following, now - add this commit: With that, I think that all the production code is actually using the generator.New...() methods; and it's just the _test stuff that doesn't: I added another commit for the injectCPUsetEnv: And in the actual commit, I did: And in the unit test: |
8031195 to
0da5659
Compare
The Generator struct contains an internal envMap field that caches environment variable positions for performance optimization when adding environment variables. This map is critical for the proper operation of the AddProcessEnv() and AddMultipleProcessEnv() methods, which rely on it to efficiently detect and update duplicate environment variables. The previous code directly instantiated a Generator struct literal, which leaves the envMap field as nil (the zero value for maps in Go). When methods like AddProcessEnv() are subsequently called, they attempt to access g.envMap[key], which will work on a nil map for reads but creates a bug: the code path at line 543 tries to assign to g.envMap, which will panic with "assignment to entry in nil map". While the current code path in createContainerPlatform() doesn't immediately call methods that would trigger this panic, the bug represents a latent defect that could cause runtime panics if future changes call environment variable manipulation methods on this Generator instance. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Add support for the "housekeeping" annotation value in the IRQ load balancing feature. When irq-load-balancing.crio.io is set to "housekeeping", IRQ interrupts are preserved on the first CPU and its thread siblings, while being disabled on the remaining container CPUs. The implementation also injects environment variable HOUSEKEEPING_CPUS into containers. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Add a nil pointer check in isContainerRequestWholeCPU, just for safety measures, as the cSpec or several of the fields might be nil. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
|
@andreaskaris: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/retest |
|
Thanks for the reviews so far. I'm just wondering if I addressed all of your concerns @bitoku @bartwensley , otherwise please lmk and I'm happy to make the required changes |
bitoku
left a comment
There was a problem hiding this comment.
/lgtm
I think it's better to get LGTM from @MarSik or @bartwensley .
|
@MarSik: changing LGTM is restricted to collaborators DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@cri-o/cri-o-maintainers PTAL for approval. |
|
/lgtm |
|
@bartwensley: changing LGTM is restricted to collaborators DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/retest |
|
/approve thank you! |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andreaskaris, haircommander The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/cherry-pick release-1.34 |
|
@haircommander: new pull request created: #9564 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
For more details, see https://issues.redhat.com/browse/RFE-7465 | https://issues.redhat.com/browse/TELCOSTRAT-318
What type of PR is this?
/kind feature
What this PR does / why we need it:
This commit introduces new value
housekeepingfor annotationirq-load-balancing.crio.io.When
housekeepingis set:The housekeeping CPUs are chosen as the first CPU inside each container plus its thread siblings.
Reason for this change:
Which issue(s) this PR fixes:
This enhancement introduces the ability to configure the
irq-load-balancing.crio.ioannotation to use a new "housekeeping" mode within CRI-O. When set to "housekeeping", it preserves IRQs on the first CPU and its thread siblings in side each container.Special notes for your reviewer:
It is acknowledged that CPUs where IRQs are not disabled will handle IRQs for the entire system. Customers have reviewed this behavior and confirmed it is acceptable. The added flexibility and improved efficiency are seen as a worthwhile trade-off.
Does this PR introduce a user-facing change?
Smoketest and smoke test environment (single thread per core)
smoketest.sh
pods.yaml.j2
The virtual system that I'm running the smoke test on has:
Kubelet config:
crio config:
Smoketest and smoke test environment (2 thread siblings)
smoketest.sh
Everything else the same, 7 cores, 2 thread siblings (0-1, 2-3, and so on) on my test VM.