Report recommended CPU models for heterogeneous clusters #3944
Conversation
Skipping CI for Draft Pull Request.

1 similar comment
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing `/approve` in a comment.
/test all
Force-pushed from a18fa9d to 4423d64
deploy/olm-catalog/community-kubevirt-hyperconverged/1.16.0/manifests/hco00.crd.yaml
Force-pushed from 4423d64 to ffb8383
nunnatsa left a comment

Thanks for fixing the comment. Added some more inline.
Please also handle the linter issues (redefinition of the builtin function `max`).
Force-pushed from ffb8383 to 5d2791e
Pull Request Test Coverage Report for Build 20880735694

💛 - Coveralls
Hey - I've found 4 issues and left some high-level feedback:

- The `cpu_models_test.go` description and expectations talk about “weak node filtering” and excluding models that appear only on a weak node (e.g. Penryn), but `processCpuModels` only sorts and truncates by score without any explicit filtering, so either the heuristic or the test/comments should be aligned to make the intended behavior clear and actually enforced.
- In the CRD schema `CpuModelInfo.cpu` and `memory` are marked as required, but the Go struct uses `*resource.Quantity` and the code paths in `processCpuModels` can leave them nil if capacity is missing, which can violate the schema; consider either making these fields non-pointer in the API or relaxing the CRD `required` list and guarding against nil more explicitly (see the sketch after this list).
- You now have two separate equality implementations for `CpuModelInfo` (one in `recommendedModelsEqual` and one inline via `slices.EqualFunc` in `updateStatus`); it would be more robust to centralize this comparison logic in a shared helper to avoid future drift between the cache and status update behaviors.
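On the second bullet, here is a minimal sketch of the non-pointer option, assuming the field set used elsewhere in this PR (`Name`, `Benchmark`, `Nodes`, `CPU`, `Memory`); the JSON tags are illustrative, not copied from the actual API file:

```go
package v1beta1

import "k8s.io/apimachinery/pkg/api/resource"

// Sketch only: with value-typed quantities, CPU and Memory can never be nil,
// so the CRD's `required` constraint always holds; missing node capacity
// simply aggregates as a zero Quantity, which marshals as "0".
// The JSON tags are assumed from the CRD property names.
type CpuModelInfo struct {
	Name      string            `json:"name"`
	Benchmark int               `json:"benchmark"`
	Nodes     int               `json:"nodes"`
	CPU       resource.Quantity `json:"cpu"`
	Memory    resource.Quantity `json:"memory"`
}
```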
## Individual Comments
### Comment 1
<location> `pkg/internal/nodeinfo/cpu_models.go:26-27` </location>
<code_context>
+ nodeWeight = 0.15
+)
+
+// cpuModelPassMarkScores maps libvirt CPU model names to approximate PassMark scores.
+// Keys must match exactly the cpu-model.node.kubevirt.io/* label values.
+// The precise values do not matter much, this is intended only as a rough heuristic.
+var cpuModelPassMarkScores = map[string]int{
</code_context>
<issue_to_address>
**nitpick:** The comment about cpuModelPassMarkScores referring to label "values" is misleading given extractCpuModels uses the label key suffix.
`extractCpuModels` derives the model name from the suffix of the label *key* (after `CpuModelLabelPrefix`) and only checks that the label value is "true". So this map’s keys must match `strings.TrimPrefix(labelKey, CpuModelLabelPrefix)`, not the label value. Please update the comment to reflect that to avoid confusion.
</issue_to_address>
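For illustration, the key-suffix extraction described here amounts to something like the following sketch (not the PR's actual `extractCpuModels`; the `CpuModelLabelPrefix` value is assumed):

```go
package nodeinfo

import "strings"

const CpuModelLabelPrefix = "cpu-model.node.kubevirt.io/" // assumed value

// The model name comes from the label *key* suffix; the label *value* is
// only checked to be "true". The score map's keys must therefore match
// strings.TrimPrefix(labelKey, CpuModelLabelPrefix).
func extractCpuModelsSketch(labels map[string]string) []string {
	var models []string
	for key, value := range labels {
		if strings.HasPrefix(key, CpuModelLabelPrefix) && value == "true" {
			models = append(models, strings.TrimPrefix(key, CpuModelLabelPrefix))
		}
	}
	return models
}
```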
### Comment 2
<location> `pkg/internal/nodeinfo/cpu_models_test.go:120-126` </location>
<code_context>
+ Expect(modelNames).ToNot(ContainElement("Penryn"))
+ })
+
+ It("should correctly aggregate CPU across nodes", func() {
+ Expect(modelMap).To(HaveKey("Skylake"))
+ Expect(modelMap["Skylake"].CPU.String()).To(Equal("16"))
+
+ Expect(modelMap).To(HaveKey("Haswell"))
+ Expect(modelMap["Haswell"].CPU.String()).To(Equal("24")) // 16 + 8
+
+ Expect(modelMap).To(HaveKey("Broadwell"))
+ Expect(modelMap["Broadwell"].CPU.String()).To(Equal("8"))
+ })
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add coverage for the Benchmark field and the mapping from CPU model name to PassMark score
Current tests cover CPU aggregation but not that `CpuModelInfo.Benchmark` is derived correctly from `cpuModelPassMarkScores`. Please add a test that creates nodes with a few known models (e.g., `Haswell`, `Skylake`, `EPYC`), runs the handler, and asserts that the resulting `Benchmark` values match the expected scores from the map. This will help catch regressions in the score table and mapping logic that aggregation tests alone won’t detect.
```suggestion
It("should exclude models that appear only once on a weak node", func() {
modelNames := make([]string, len(recommendedCpuModels))
for i, model := range recommendedCpuModels {
modelNames[i] = model.Name
}
Expect(modelNames).ToNot(ContainElement("Penryn"))
})
It("should correctly aggregate CPU across nodes", func() {
Expect(modelMap).To(HaveKey("Skylake"))
Expect(modelMap["Skylake"].CPU.String()).To(Equal("16"))
Expect(modelMap).To(HaveKey("Haswell"))
Expect(modelMap["Haswell"].CPU.String()).To(Equal("24")) // 16 + 8
Expect(modelMap).To(HaveKey("Broadwell"))
Expect(modelMap["Broadwell"].CPU.String()).To(Equal("8"))
})
It("should set Benchmark from PassMark scores", func() {
Expect(modelMap).To(HaveKey("Skylake"))
Expect(modelMap["Skylake"].Benchmark).To(Equal(cpuModelPassMarkScores["Skylake"]))
Expect(modelMap).To(HaveKey("Haswell"))
Expect(modelMap["Haswell"].Benchmark).To(Equal(cpuModelPassMarkScores["Haswell"]))
Expect(modelMap).To(HaveKey("Broadwell"))
Expect(modelMap["Broadwell"].Benchmark).To(Equal(cpuModelPassMarkScores["Broadwell"]))
})
```
</issue_to_address>
### Comment 3
<location> `pkg/internal/nodeinfo/cpu_models_test.go:150-158` </location>
<code_context>
+ Expect(modelMap["Broadwell"].Memory.String()).To(Equal("32Gi"))
+ })
+
+ It("should correctly count nodes for each model", func() {
+ Expect(modelMap).To(HaveKey("Skylake"))
+ Expect(modelMap["Skylake"].Nodes).To(Equal(1))
+
+ Expect(modelMap).To(HaveKey("Haswell"))
+ Expect(modelMap["Haswell"].Nodes).To(Equal(2))
+
+ Expect(modelMap).To(HaveKey("Broadwell"))
+ Expect(modelMap["Broadwell"].Nodes).To(Equal(1))
+ })
+ })
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests for recommendation ordering and the top-N truncation behavior
This scenario only checks node counts, not the relative ordering or the top-4 truncation described for this feature. Please add tests that:
- Use a cluster with >4 distinct CPU models and assert that only four `CpuModelInfo` entries are returned.
- Assert that the returned models are in the expected order based on benchmark, node count, and resource distribution.
That way the tests cover the core recommendation behavior rather than just the aggregation.
Suggested implementation:
```golang
Expect(modelMap).To(HaveKey("Skylake"))
Expect(modelMap["Skylake"].Memory.String()).To(Equal("64Gi"))
Expect(modelMap).To(HaveKey("Haswell"))
Expect(modelMap["Haswell"].Memory.String()).To(Equal("96Gi")) // 64Gi + 32Gi
Expect(modelMap).To(HaveKey("Broadwell"))
Expect(modelMap["Broadwell"].Memory.String()).To(Equal("32Gi"))
})
It("should correctly count nodes for each model", func() {
Expect(modelMap).To(HaveKey("Skylake"))
Expect(modelMap["Skylake"].Nodes).To(Equal(1))
Expect(modelMap).To(HaveKey("Haswell"))
Expect(modelMap["Haswell"].Nodes).To(Equal(2))
Expect(modelMap).To(HaveKey("Broadwell"))
Expect(modelMap["Broadwell"].Nodes).To(Equal(1))
})
Context("recommendation ordering and top-N truncation", func() {
It("should only return the top 4 CPU models", func() {
// cpuModels is expected to be the ordered recommendation slice under test.
// It should contain more than 4 models before truncation; the implementation
// should truncate it to 4.
Expect(cpuModels).To(HaveLen(4))
})
It("should order models by benchmark, node count, and resource distribution", func() {
// Verify that the models are in the expected order. The exact expected model
// names here should match the test cluster setup used to build cpuModels.
// For example, given a cluster where:
// - "Skylake" has the best benchmark score
// - "Broadwell" is next
// - "Haswell" and a fourth model follow
// we can assert an explicit order.
Expect(cpuModels[0].Model).To(Equal("Skylake"))
Expect(cpuModels[1].Model).To(Equal("Broadwell"))
Expect(cpuModels[2].Model).To(Equal("Haswell"))
// The fourth entry should be the next-best model according to the scoring logic.
Expect(cpuModels[3].Model).ToNot(BeEmpty())
// Additionally, validate that the ordering is consistent with the scoring
// dimensions: benchmark score first, then node count, then resource distribution.
// These expectations assume that higher Nodes/Memory correspond to better
// recommendations when benchmark scores are equal.
Expect(cpuModels[0].Nodes).To(BeNumerically(">=", cpuModels[1].Nodes))
Expect(cpuModels[1].Nodes).To(BeNumerically(">=", cpuModels[2].Nodes))
Expect(cpuModels[0].Memory.Value()).To(BeNumerically(">=", cpuModels[1].Memory.Value()))
Expect(cpuModels[1].Memory.Value()).To(BeNumerically(">=", cpuModels[2].Memory.Value()))
})
})
})
})
```
To make these tests compile and correctly cover the behavior, you will need to:
1. Ensure that `cpuModels` is the ordered slice of `CpuModelInfo` returned by the recommendation function and that it is in scope for this `Describe`/`Context`. If the slice is currently named something else (e.g. `models` or `recommendations`), rename the references in the new tests accordingly.
2. Make sure the test fixture used to build `cpuModels` includes more than 4 distinct CPU models so that the top-4 truncation behavior is exercised. At least one additional model beyond `"Skylake"`, `"Haswell"`, and `"Broadwell"` should be present.
3. Adjust the explicit expected ordering (`"Skylake"`, `"Broadwell"`, `"Haswell"`, and the fourth model) so that it matches the actual scoring logic:
- If you have an explicit benchmark/score field on `CpuModelInfo`, you may want to assert on that field directly rather than inferring via `Nodes`/`Memory`.
- Update the per-field expectations (`Nodes`, `Memory`) so they align with how your recommendation algorithm uses benchmark, node count, and resource distribution to rank models.
4. If the scoring logic uses additional fields (e.g. CPU quantity or benchmark scores), add appropriate assertions mirroring the real ordering criteria so the tests fail when the ordering logic regresses.
</issue_to_address>
### Comment 4
<location> `pkg/internal/nodeinfo/cpu_models_test.go:30-39` </location>
<code_context>
+ BeforeEach(func(ctx context.Context) {
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a smaller, more focused unit test for GetRecommendedCpuModels and the cache behavior
Right now the cache behavior is only covered indirectly via `HandleNodeChanges` and the fake client. A dedicated test could populate the cache once, call `GetRecommendedCpuModels` twice, mutate the slice from the first call, and then assert the second call is unchanged. That would explicitly verify that the function returns a copy rather than exposing internal state.
Suggested implementation:
```golang
var modelMap map[string]v1beta1.CpuModelInfo
```
You’ll want to add a new context and spec similar to the following in `pkg/internal/nodeinfo/cpu_models_test.go`. Place it near your other `Context` blocks (for example, after the existing "3-node cluster with resource aggregation and weak node filtering" context), and adjust helper usage to match your test setup:
```go
Context("GetRecommendedCpuModels cache behavior", func() {
var (
scheme *runtime.Scheme
nodes []client.Object
cachePopulated bool
initialRecommended []v1beta1.CpuModelInfo
secondRecommended []v1beta1.CpuModelInfo
fakeClient client.Client
recommendationWatcher *CpuModelRecommender // or whatever type exposes HandleNodeChanges/GetRecommendedCpuModels
)
BeforeEach(func(ctx context.Context) {
scheme = runtime.NewScheme()
Expect(corev1.AddToScheme(scheme)).To(Succeed())
Expect(v1beta1.AddToScheme(scheme)).To(Succeed())
// Use a minimal set of nodes sufficient to generate at least one recommended model.
nodes = []client.Object{
&corev1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "node-1",
Labels: map[string]string{
nodeinfo.LabelNodeRoleWorker: "",
},
},
Status: corev1.NodeStatus{
Capacity: corev1.ResourceList{
corev1.ResourceCPU: resource.MustParse("4"),
corev1.ResourceMemory: resource.MustParse("8Gi"),
},
},
},
}
// Create fake client with the nodes
fakeClient = fake.NewClientBuilder().
WithScheme(scheme).
WithObjects(nodes...).
Build()
// Instantiate whatever component owns the cache and GetRecommendedCpuModels.
// This may need to be adjusted to match your code.
recommendationWatcher = NewCpuModelRecommender(fakeClient, scheme)
})
JustBeforeEach(func(ctx context.Context) {
// Populate the cache once, e.g. by handling node changes.
// Adjust to the actual method you use to build the cache.
Expect(recommendationWatcher.HandleNodeChanges(ctx)).To(Succeed())
cachePopulated = true
// First call: this should read from the cache and return a slice.
initialRecommended = recommendationWatcher.GetRecommendedCpuModels(ctx)
Expect(initialRecommended).ToNot(BeEmpty())
// Second call: should also read from the cache and return a fresh copy.
secondRecommended = recommendationWatcher.GetRecommendedCpuModels(ctx)
Expect(secondRecommended).ToNot(BeEmpty())
})
It("returns a defensive copy of cached CPU models", func(ctx context.Context) {
Expect(cachePopulated).To(BeTrue(), "cache should be populated in JustBeforeEach")
// Sanity: both slices have the same length and content initially.
Expect(secondRecommended).To(HaveLen(len(initialRecommended)))
for i := range initialRecommended {
Expect(secondRecommended[i]).To(Equal(initialRecommended[i]))
}
// Mutate the slice returned by the first call.
initialRecommended[0].Model = "mutated-model-id"
initialRecommended[0].BaselineModel = "mutated-baseline"
// Verify the second slice is unaffected (defensive copy semantics).
Expect(secondRecommended[0].Model).ToNot(Equal("mutated-model-id"))
Expect(secondRecommended[0].BaselineModel).ToNot(Equal("mutated-baseline"))
// Optional: call GetRecommendedCpuModels a third time and
// assert it still matches the unmodified cache state.
third := recommendationWatcher.GetRecommendedCpuModels(ctx)
Expect(third[0].Model).To(Equal(secondRecommended[0].Model))
Expect(third[0].BaselineModel).To(Equal(secondRecommended[0].BaselineModel))
})
})
```
You will likely need to adapt the following to your actual code:
1. **Constructor / type names**:
- Replace `CpuModelRecommender` and `NewCpuModelRecommender` with the real type and constructor that:
- Owns the CPU model cache.
- Exposes `HandleNodeChanges` (or equivalent) to populate that cache.
- Exposes `GetRecommendedCpuModels`.
2. **Client and scheme wiring**:
- If the existing tests already have helpers for building a fake client or the recommender/controller, reuse those instead of the inline `fake.NewClientBuilder` shown above.
- If `GetRecommendedCpuModels` is a package-level function instead of a method, call it directly, and adjust the cache-initialization call to whatever is appropriate in your design.
3. **Node setup**:
- If your recommendation logic depends on particular CPU model labels or extended resources, mirror the minimal label set used in your existing 3-node cluster tests to ensure at least one CPU model appears in the recommendations.
4. **Ginkgo version**:
- The example uses `func(ctx context.Context)` signatures for `BeforeEach`, `JustBeforeEach`, and `It`, aligned with Ginkgo v2 and your snippet. Ensure the imports and Ginkgo configuration in the file already support `ctx`; otherwise, drop the `ctx` argument and the `context.Context` parameters.
By adding this context and spec, you’ll explicitly validate that `GetRecommendedCpuModels` returns a defensive copy and does not expose the internal cache slice to callers.
</issue_to_address>
hco-e2e-operator-sdk-sno-aws lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-operator-sdk-sno-azure

hco-e2e-operator-sdk-aws lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-operator-sdk-azure, ci/prow/hco-e2e-operator-sdk-gcp

hco-e2e-upgrade-prev-operator-sdk-aws lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-upgrade-operator-sdk-azure, ci/prow/hco-e2e-upgrade-prev-operator-sdk-azure, ci/prow/hco-e2e-upgrade-prev-operator-sdk-sno-azure

hco-e2e-consecutive-operator-sdk-upgrades-aws lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure

hco-e2e-kv-smoke-gcp lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-kv-smoke-azure
nunnatsa left a comment

Thanks @dankenigsberg
Added some more comments.
```go
if cpuModels := nodeinfo.GetRecommendedCpuModels(); !slices.EqualFunc(req.Instance.Status.NodeInfo.RecommendedCpuModels, cpuModels, func(a, b hcov1beta1.CpuModelInfo) bool {
	return a.Name == b.Name && a.Benchmark == b.Benchmark && a.Nodes == b.Nodes &&
		((a.CPU == nil && b.CPU == nil) || (a.CPU != nil && b.CPU != nil && a.CPU.Equal(*b.CPU))) &&
		((a.Memory == nil && b.Memory == nil) || (a.Memory != nil && b.Memory != nil && a.Memory.Equal(*b.Memory)))
}) {
	req.Instance.Status.NodeInfo.RecommendedCpuModels = cpuModels
	req.StatusDirty = true
}
```
- we already have a comparison function in the nodeinfo package, we can't expose it for reuse, because it's in an internal package, but maybe we can add it as the CpuModel type's method, instead?
- this code is not tested. Can we have a few unit tests for it? We can use a simple mock, as done for the rest of the functions in the nodeinfo package, to return any desired value.
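A sketch of what that method could look like, reusing the exact nil-handling from the inline `slices.EqualFunc` comparison quoted above:

```go
import "k8s.io/apimachinery/pkg/api/resource"

// Equals reports whether two CpuModelInfo values are equivalent, mirroring
// the inline comparison above: nil quantities compare equal only to nil.
func (a CpuModelInfo) Equals(b CpuModelInfo) bool {
	quantityEqual := func(x, y *resource.Quantity) bool {
		if x == nil || y == nil {
			return x == y
		}
		return x.Equal(*y)
	}
	return a.Name == b.Name && a.Benchmark == b.Benchmark && a.Nodes == b.Nodes &&
		quantityEqual(a.CPU, b.CPU) && quantityEqual(a.Memory, b.Memory)
}
```

With a method like this, the status update collapses to `!slices.EqualFunc(req.Instance.Status.NodeInfo.RecommendedCpuModels, cpuModels, hcov1beta1.CpuModelInfo.Equals)` (a method expression satisfies the comparator signature), and `recommendedModelsEqual` could delegate to the same method.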
In heterogeneous clusters with multiple CPU vendors and generations, choosing the best CPU model for VMs is challenging:

- Using a newer CPU model limits VM placement to a subset of nodes
- Using an older model impacts guest performance unnecessarily

This change adds automatic CPU model recommendations to the HyperConverged status. The recommendations are sorted by a weighted score that balances:

- PassMark performance score (50%) - for guest performance
- Available CPU cores (20%) - for workload capacity
- Available memory (15%) - for workload capacity
- Node count (15%) - for availability and redundancy

The top recommendations appear in status.nodeInfo.recommendedCpuModels, helping cluster administrators make informed decisions when setting spec.defaultCPUModel.

The CPU model data is gathered from node labels (cpu-model.node.kubevirt.io/*) set by KubeVirt's virt-handler, and is automatically refreshed when nodes change or hourly.

Assisted-by: claude-4-sonnet
Signed-off-by: Dan Kenigsberg <[email protected]>
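For intuition, the weighted score described in the commit message combines the four dimensions roughly like this (a sketch only; the thread does not show how `processCpuModels` normalizes each dimension, so the pre-normalization to [0, 1] is an assumption):

```go
// modelStats is a hypothetical flattened view of one model's aggregates,
// with each dimension already normalized to [0, 1] against the cluster-wide
// maximum (an assumption; the real normalization may differ).
type modelStats struct {
	benchmark float64 // PassMark score
	cpu       float64 // total CPU cores across nodes supporting the model
	memory    float64 // total memory across nodes supporting the model
	nodes     float64 // number of nodes supporting the model
}

// score applies the weights from the commit message:
// 50% PassMark, 20% CPU cores, 15% memory, 15% node count.
func score(s modelStats) float64 {
	return 0.50*s.benchmark + 0.20*s.cpu + 0.15*s.memory + 0.15*s.nodes
}
```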
Force-pushed from 5d2791e to f435588
hco-e2e-operator-sdk-aws lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-operator-sdk-azure, ci/prow/hco-e2e-operator-sdk-gcp, ci/prow/hco-e2e-operator-sdk-sno-azure

hco-e2e-upgrade-operator-sdk-aws lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-upgrade-operator-sdk-azure

hco-e2e-consecutive-operator-sdk-upgrades-aws lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure, ci/prow/hco-e2e-upgrade-prev-operator-sdk-azure, ci/prow/hco-e2e-upgrade-prev-operator-sdk-sno-azure

@dankenigsberg: The following tests failed. Full PR test history. Your PR dashboard.

hco-e2e-upgrade-prev-operator-sdk-aws lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-upgrade-prev-operator-sdk-azure

hco-e2e-kv-smoke-gcp lane succeeded.

@hco-bot: Overrode contexts on behalf of hco-bot: ci/prow/hco-e2e-kv-smoke-azure
Thank you for this important effort. Have you considered clusters with multiple vendors or architectures? Would this recommendation still be valid in those cases?

Yes: if most of the CPU power of the cluster comes from an "exotic" architecture, the code would recommend models from that architecture.
```go
)

// cpuModelPassMarkScores maps libvirt CPU model names to approximate PassMark scores.
// Keys must match exactly the cpu-model.node.kubevirt.io/* label values.
```
Just a heads-up that the keys in this list tend to change with different libvirt versions, so this would require continued maintenance.
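If keeping the table current proves painful, one hypothetical mitigation (not part of this PR) is a lookup helper with a neutral default, so unknown libvirt model names degrade gracefully instead of disappearing from the recommendations:

```go
// Hypothetical fallback: unknown libvirt model names get a conservative
// default score instead of being silently dropped from recommendations.
const defaultPassMarkScore = 10000 // illustrative neutral value

func passMarkScore(model string) int {
	if score, ok := cpuModelPassMarkScores[model]; ok {
		return score
	}
	return defaultPassMarkScore
}
```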


