feat!: add toolset versioning by Cali0707 · Pull Request #639 · containers/kubernetes-mcp-server

Cali0707 · 2026-01-12T20:39:55Z

This PR adds toolset versioning and versioning guidelines to the server.

A couple of key points for reviewers:

- Based on the current guidelines, no toolsets are "stable". The most mature is "core", which is "beta". As such, the default version has been set to "beta".
- The helm toolset has no evals, so it is considered "alpha". As such, it is not included by default in tool lists (this is a breaking change).

Signed-off-by: Calum Murray <cmurray@redhat.com>

This is a breaking change as the helm toolset does not meet the requirements to be a "beta" toolset, so it was reverted to "alpha". This means that helm is now disabled by default.

Signed-off-by: Calum Murray <cmurray@redhat.com>


matzew · 2026-01-13T09:31:16Z

docs/TOOLSET_VERSIONING.md

+The general idea for these versions is:
+- "alpha": the toolset is not guaranteed to work well
+- "beta": the toolset is not guaranteed to work well, but we are evaluating how well it works
+- "stable": the toolset works well, and we are evaluating how well it works to avoid regressions


stable: because of the "evaluating" wording, this does not read as all that stable. The sentence is a bit vague.

matzew · 2026-01-13T09:32:58Z

docs/TOOLSET_VERSIONING.md

+### Beta
+
+For a tool/prompt/toolset to enter into "beta", we require that there are eval scenarios. For a toolset to enter "beta", there must be scenarios
+excercising all of the tools and prompts in the toolset. For individual tools and prompts to enter "beta", we only require an eval scenario


Perhaps it would be good to eventually point to a concrete example for that. But generally I like the requirement of having some sort of eval (with our toolkit).

matzew · 2026-01-13T09:33:13Z

docs/TOOLSET_VERSIONING.md

+
+### GA/Stable
+
+For a tool/prompt/toolset to enter into "stable", we require that 95% or more of the eval scenarios are passing. There is the same requirements as "beta" in terms of the number of evaluation scenarios.


Where does the 95% come from?

I think @mrunalp was mentioning this as a threshold
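
For reference, the 95% bar from the proposed docs could be checked along these lines. This is a minimal sketch, not code from this PR: the function name `meetsStableBar` and the rule that a toolset with no scenarios never qualifies are assumptions made for illustration.

```go
package main

import "fmt"

// meetsStableBar is a hypothetical helper (not part of this PR). It reports
// whether at least 95% of the eval scenarios passed. Inputs that make no
// sense (no scenarios, negative counts, more passes than scenarios) never
// qualify.
func meetsStableBar(passed, total int) bool {
	if total <= 0 || passed < 0 || passed > total {
		return false
	}
	// Integer arithmetic avoids floating-point rounding right at the boundary.
	return passed*100 >= total*95
}

func main() {
	fmt.Println(meetsStableBar(19, 20)) // 19/20 is exactly 95%
	fmt.Println(meetsStableBar(18, 20)) // 18/20 is 90%
}
```

With 20 scenarios, a single failure still meets the bar, while two failures do not.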

matzew · 2026-01-13T09:33:39Z

docs/TOOLSET_VERSIONING.md

+```toml
+default_toolset_version = "beta"
+
+toolsets = [ "core", "config", "helm:alpha" ]


I like this schema
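
To make the `name:version` entries in that schema concrete, here is a minimal sketch of how one entry might be split. The type `toolsetSelection` and the function `parseToolsetEntry` are hypothetical names, not taken from this PR; an empty version is assumed to mean "fall back to `default_toolset_version`".

```go
package main

import (
	"fmt"
	"strings"
)

// toolsetSelection is a hypothetical type describing one entry of the
// `toolsets` list, e.g. "core" or "helm:alpha".
type toolsetSelection struct {
	Name    string
	Version string // empty means "use default_toolset_version" (assumption)
}

// parseToolsetEntry splits an entry at the first ':' into name and version.
func parseToolsetEntry(entry string) toolsetSelection {
	name, version, _ := strings.Cut(strings.TrimSpace(entry), ":")
	return toolsetSelection{
		Name:    strings.TrimSpace(name),
		Version: strings.TrimSpace(version),
	}
}

func main() {
	for _, entry := range []string{"core", "config", "helm:alpha"} {
		sel := parseToolsetEntry(entry)
		fmt.Printf("%s -> name=%q version=%q\n", entry, sel.Name, sel.Version)
	}
}
```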

matzew · 2026-01-13T11:19:53Z

pkg/config/config.go

 	ServerInstructions string `toml:"server_instructions,omitempty"`

+	// Which toolset version to enable (any tools/toolsets below this will be disabled)
+	DefaultToolsetVersion api.Version `toml:"default_toolset_version"`


So, at "server level", we say stable, hence no beta (for instance) is enabled.

What about explicit enablement, given a "global default"?

E.g. something like toolsets = [ "core", "config", "helm:alpha" ] would then "win", right?

IMO there are two parts to configuring everything:

- Which toolsets you want. In my mind, this doesn't necessarily align with how mature the toolsets are, but rather with which domains you want to interact with.
- What level of maturity of tools you want to use.

So, if you set "core", "config", and "helm:alpha" in the current setup, what would happen is:

- The core and config toolsets would both pick up the default version of "stable". As they are both in "beta" currently, no tools would be selected.
- The helm toolset would use the overridden "alpha" version, and since it is in alpha, all of its tools would be available.

I wasn't 100% convinced that the way I wrote it is the most intuitive; I just want to capture somewhere that there are those two key separate ideas in the config (which toolsets/domains, which versions you are okay with). My main thought is that enabling "stable" or "beta" should not enable all the toolsets with that version.
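
The two separate ideas (which toolsets are selected, and which maturity level is acceptable) can be sketched as follows. This is a simplified illustration, not the PR's implementation: all types and names are hypothetical, versions are tracked per toolset only (the PR also allows per-tool versions), and the tool names and the "other" toolset are placeholders.

```go
package main

import "fmt"

// Version is ordered so that a higher value means more mature (assumption).
type Version int

const (
	VersionAlpha Version = iota
	VersionBeta
	VersionGA
)

// toolset and selection are hypothetical, simplified types.
type toolset struct {
	Name    string
	Version Version
	Tools   []string
}

type selection struct {
	Name     string
	Override *Version // nil means "use the default version"
}

// enabledTools returns the tools of every *selected* toolset whose maturity
// is at or above the threshold that applies to it. Toolsets that were not
// selected are never enabled, regardless of their version.
func enabledTools(available []toolset, selected []selection, def Version) []string {
	byName := make(map[string]toolset, len(available))
	for _, ts := range available {
		byName[ts.Name] = ts
	}
	var out []string
	for _, sel := range selected {
		ts, ok := byName[sel.Name]
		if !ok {
			continue
		}
		threshold := def
		if sel.Override != nil {
			threshold = *sel.Override
		}
		if ts.Version >= threshold {
			out = append(out, ts.Tools...)
		}
	}
	return out
}

func main() {
	alpha := VersionAlpha
	available := []toolset{
		{Name: "core", Version: VersionBeta, Tools: []string{"core_tool"}},
		{Name: "config", Version: VersionBeta, Tools: []string{"config_tool"}},
		{Name: "helm", Version: VersionAlpha, Tools: []string{"helm_tool"}},
		{Name: "other", Version: VersionGA, Tools: []string{"other_tool"}},
	}
	selected := []selection{{Name: "core"}, {Name: "config"}, {Name: "helm", Override: &alpha}}
	fmt.Println(enabledTools(available, selected, VersionGA)) // prints [helm_tool]
}
```

With a default of "stable", the beta toolsets core and config contribute nothing, helm is enabled through its explicit override, and the unselected "other" toolset stays off even though it is stable.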

matzew · 2026-01-13T11:22:40Z

pkg/api/toolsets.go

 	GetPrompts() []ServerPrompt
+	// GetVersion returns the version of the toolset.
+	// This version can be overridden by specific tools/prompts (e.g. a toolset may be beta, but have an alpha tool).
+	GetVersion() Version


For downstream implementations, we would then just set those?
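
If the answer is yes, a downstream toolset might look roughly like the sketch below. The reduced interface `versionedToolset`, the `GetName` method, the string-based `Version` type, and the `downstreamToolset` type are all assumptions for illustration; only `GetVersion() Version` appears in the diff above.

```go
package main

import "fmt"

// Version is modelled as a string here purely for illustration; the real
// api.Version type may differ.
type Version string

const VersionBeta Version = "beta"

// versionedToolset is a reduced, hypothetical stand-in for the real
// Toolset interface, keeping only what this sketch needs.
type versionedToolset interface {
	GetName() string
	GetVersion() Version
}

// downstreamToolset is a hypothetical toolset maintained outside this repo.
type downstreamToolset struct{}

func (downstreamToolset) GetName() string { return "example-downstream" }

// GetVersion is the one place where the downstream maintainer declares
// how mature the toolset is.
func (downstreamToolset) GetVersion() Version { return VersionBeta }

// Compile-time check that the type satisfies the reduced interface.
var _ versionedToolset = downstreamToolset{}

func main() {
	var ts versionedToolset = downstreamToolset{}
	fmt.Printf("%s is %s\n", ts.GetName(), ts.GetVersion())
}
```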

matzew · 2026-01-13T11:25:02Z

pkg/mcp/testdata/toolsets-kiali-tools.json

-    "name": "kiali_workload_logs"
-  }
-]
+[]


Oh, I think this is a mistake: because this toolset is no longer enabled by default, there are no tools, so the JSON snapshot update removed these.

Will switch it so these are enabled in the tests

manusa · 2026-01-13T13:59:57Z

Given what we discussed internally yesterday and this proposal, I'm not that sure that versioning is exactly what we want.
Also, I don't think this fits exactly the upstream purposes.

This is how I see it:

- I don't think that 3 levels (alpha, beta, stable) are really necessary, but I might be missing something (e.g. I don't see a need to differentiate between beta and alpha). (See next point.)
- Instead of providing versioning, I believe that for the downstream story, it should be something more like certified or supported.
  - This should cover the productization story where the MCP server is provided with a given set of tools that are supported for customers.
  - In case certified misses something, then I guess version is fine.
- Upstream (this repo) SHOULD provide the required infrastructure to declare a toolset as certified/versioned.
  - However, I think it SHOULD NOT take on the responsibility of declaring a toolset certified (or GA), since it doesn't really matter for the upstream world and its licensing ("AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND).
- For maintenance reasons, and also to facilitate creating the certified infrastructure upstream and leveraging it downstream, I believe that a slice of certified toolsets should be declared somewhere upstream (maybe the toolsets package), using the _overrides.go pattern so that it can be redefined downstream.
  - This is considering that entire toolsets should be certified, as opposed to the current approach, which provides granularity for individual tools or prompts (I don't really see the case for that, but I might be missing some points here too).
- I like the fact that certified toolsets (or a given version threshold in case we go that route) can be enforced while starting the server.
  - But this should really account for their intersections and edge cases (enabled toolsets, enabled/disabled tools, readonly, etc.). Maybe warnings or exceptions should be printed with informative messages (e.g. when a server is started with settings that effectively provide no toolsets).


matzew · 2026-01-13T14:41:15Z

> for the downstream story, it should be something more like certified or supported

I think for that (downstream), certified/supported versus NOT certified is good.

For upstream (here), I do see the reasoning behind coming in at a more flexible stage (e.g. alpha, beta, ga), which allows defining a level of robustness. E.g. in some CNCF projects we had a similar staging of "feature promotion", showing how robust a (new) feature is. That would still fit with the terms of the license.

@manusa I guess you are more for a lower (or different) bar for adding toolsets, like "if it passes X% of evals, it can come in"? Should all be enabled by default? (Which of course can be changed downstream too.)

Cali0707 · 2026-01-13T15:15:03Z

From discussions today the idea (as I understand it) is:

- Toolsets will either be "certified" or not (possibly under some other name, like "passing evals")
- We will have some slice of the certified toolsets
- There will be an eval requirement to make a toolset certified - if the toolset falls below the bar, we remove it from certified
- There will be some flag (default false) to enable only certified toolsets
- We will warn if a toolset is disabled by the flag

The idea then for new toolsets is that they will not be certified by default, until they get evals. For existing toolsets we can add a toolsetname_experimental toolset for adding/making large changes to existing tools
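
The points above can be sketched roughly as follows. Everything here is hypothetical: the contents of `certifiedToolsets`, the function `filterCertified`, and the warning wording are illustrative assumptions, not decisions made in this thread.

```go
package main

import "fmt"

// certifiedToolsets is a hypothetical set; which toolsets would actually be
// certified is not decided in this discussion.
var certifiedToolsets = map[string]bool{
	"core":   true,
	"config": true,
}

// filterCertified returns the toolsets that stay enabled, plus one warning
// for every toolset dropped because certifiedOnly is set. It also warns when
// the settings effectively leave no toolsets enabled.
func filterCertified(requested []string, certifiedOnly bool) (enabled, warnings []string) {
	for _, name := range requested {
		if certifiedOnly && !certifiedToolsets[name] {
			warnings = append(warnings,
				fmt.Sprintf("toolset %q is not certified and has been disabled", name))
			continue
		}
		enabled = append(enabled, name)
	}
	if len(requested) > 0 && len(enabled) == 0 {
		warnings = append(warnings, "the current settings leave no toolsets enabled")
	}
	return enabled, warnings
}

func main() {
	enabled, warnings := filterCertified([]string{"core", "helm"}, true)
	fmt.Println("enabled:", enabled)
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

With the flag off (the proposed default), every requested toolset passes through unchanged and no warnings are produced.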

nader-ziada · 2026-01-13T19:14:49Z

docs/TOOLSET_VERSIONING.md

+### Beta
+
+For a tool/prompt/toolset to enter into "beta", we require that there are eval scenarios. For a toolset to enter "beta", there must be scenarios
+excercising all of the tools and prompts in the toolset. For individual tools and prompts to enter "beta", we only require an eval scenario


typo: excercising -> exercising

nader-ziada · 2026-01-13T19:17:23Z

pkg/api/toolsets.go

+	case "beta":
+		tmp = VersionBeta
+	case "ga", "", "stable":
+		tmp = VersionGA


minor issue:
The empty string "" is grouped with "ga" and "stable", so if a user has a config like:

default_toolset_version = ""

it silently becomes VersionGA instead of the default of VersionBeta
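
One possible way to address this, sketched under assumptions: treat the empty string as its own case that maps to the default. The function name `parseVersion`, the `defaultVersion` constant, the `VersionAlpha` name, and the error on unknown input are illustrative; the diff above only shows the `"beta"` and `"ga", "", "stable"` cases.

```go
package main

import "fmt"

// Version is a simplified stand-in for api.Version.
type Version int

const (
	VersionAlpha Version = iota // name assumed; not visible in the excerpt
	VersionBeta
	VersionGA
)

// defaultVersion mirrors the default described in the PR ("beta").
const defaultVersion = VersionBeta

// parseVersion maps a config string to a Version. Unlike the reviewed code,
// the empty string falls back to the default rather than to VersionGA.
func parseVersion(s string) (Version, error) {
	switch s {
	case "":
		return defaultVersion, nil
	case "alpha":
		return VersionAlpha, nil
	case "beta":
		return VersionBeta, nil
	case "ga", "stable":
		return VersionGA, nil
	default:
		return defaultVersion, fmt.Errorf("unknown toolset version %q", s)
	}
}

func main() {
	for _, s := range []string{"", "beta", "stable", "gamma"} {
		v, err := parseVersion(s)
		fmt.Printf("%q -> %d (err: %v)\n", s, v, err)
	}
}
```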

Cali0707 added 3 commits January 12, 2026 14:54

feat(api): add versioning to tools/toolsets

6208cbd

Signed-off-by: Calum Murray <cmurray@redhat.com>

feat!: enable filtering of tools/toolsets by version

662ad9a

This is a breaking change as the helm toolset does not meet the requirements to be a "beta" toolset, so it was reverted to "alpha". This means that helm is now disabled by default.

Signed-off-by: Calum Murray <cmurray@redhat.com>

docs: explain toolset versioning

4b25711

Signed-off-by: Calum Murray <cmurray@redhat.com>

Cali0707 requested review from manusa, matzew, mrunalp and nader-ziada January 12, 2026 20:40

cleanup: fix lint errors

a352a85

Signed-off-by: Calum Murray <cmurray@redhat.com>

matzew reviewed Jan 13, 2026


fix(test): enable all tools in snapshot tests

9da490f

Signed-off-by: Calum Murray <cmurray@redhat.com>

nader-ziada reviewed Jan 13, 2026



