forked from containers/kubernetes-mcp-server
MG-34: Add oc cli like must-gather collection with ServerPrompt #51
Open: swghosh wants to merge 4 commits into openshift:main from swghosh:plan-mg-tool
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,219 @@ | ||
| # OpenShift Toolset | ||
|
|
||
| This toolset provides OpenShift-specific prompts for cluster management and troubleshooting. | ||
|
|
||
| ## Prompts | ||
|
|
||
| ### plan_mustgather | ||
|
|
||
| Plans the collection of a must-gather archive from an OpenShift cluster. Must-gather collects cluster data for debugging and troubleshooting, such as logs and Kubernetes resources. | ||
|
|
||
| This prompt generates YAML manifests for the must-gather resources that can be applied to the cluster. | ||
|
|
||
| **Arguments:** | ||
| - `node_name` (optional) - Specific node name to run must-gather pod on | ||
| - `node_selector` (optional) - Node selector in `key=value,key2=value2` format to filter nodes for the pod | ||
| - `source_dir` (optional) - Custom gather directory inside pod (default: `/must-gather`) | ||
| - `namespace` (optional) - Privileged namespace to use for must-gather (auto-generated if not specified) | ||
| - `gather_command` (optional) - Custom gather command e.g. `/usr/bin/gather_audit_logs` (default: `/usr/bin/gather`) | ||
| - `timeout` (optional) - Timeout duration for gather command (e.g., `30m`, `1h`) | ||
| - `since` (optional) - Only gather data newer than this duration (e.g., `5s`, `2m5s`, or `3h6m10s`), defaults to all data | ||
| - `host_network` (optional) - Use host network for must-gather pod (`true`/`false`) | ||
| - `keep_resources` (optional) - Keep pod resources after collection (`true`/`false`, default: `false`) | ||
| - `all_component_images` (optional) - Include must-gather images from all installed operators (`true`/`false`) | ||
| - `images` (optional) - Comma-separated list of custom must-gather container images | ||
|
|
||
| **Examples:** | ||
| ``` | ||
| # Basic must-gather collection | ||
| {} | ||
|
|
||
| # Collect with custom timeout and since | ||
| { | ||
| "timeout": "30m", | ||
| "since": "1h" | ||
| } | ||
|
|
||
| # Collect from all component images | ||
| { | ||
| "all_component_images": "true" | ||
| } | ||
|
|
||
| # Collect from specific operator image | ||
| { | ||
| "images": "registry.redhat.io/openshift-logging/cluster-logging-rhel9-operator@sha256:..." | ||
| } | ||
| ``` | ||
|
|
||
| ## Enable the OpenShift Toolset | ||
|
|
||
| ### Option 1: Command Line | ||
|
|
||
| ```bash | ||
| kubernetes-mcp-server --toolsets core,config,helm,openshift | ||
| ``` | ||
|
|
||
| ### Option 2: Configuration File | ||
|
|
||
| ```toml | ||
| toolsets = ["core", "config", "helm", "openshift"] | ||
| ``` | ||
|
|
||
| ### Option 3: MCP Client Configuration | ||
|
|
||
| ```json | ||
| { | ||
| "mcpServers": { | ||
| "kubernetes": { | ||
| "command": "npx", | ||
| "args": ["-y", "kubernetes-mcp-server@latest", "--toolsets", "core,config,helm,openshift"] | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| The OpenShift toolset requires: | ||
|
|
||
| 1. **OpenShift cluster** - These prompts are designed for OpenShift and automatically detect the cluster type | ||
| 2. **Proper RBAC** - The user/service account must have permissions to: | ||
| - Create namespaces | ||
| - Create service accounts | ||
| - Create cluster role bindings | ||
| - Create pods with privileged access | ||
| - List ClusterOperators and ClusterServiceVersions (for `all_component_images`) | ||
|
|
||
| ## How It Works | ||
|
|
||
| ### Must-Gather Plan Generation | ||
|
|
||
| The `plan_mustgather` prompt generates YAML manifests for collecting diagnostic data from an OpenShift cluster: | ||
|
|
||
| 1. **Namespace** - A temporary namespace (e.g., `openshift-must-gather-xyz`) is created unless an existing namespace is specified | ||
| 2. **ServiceAccount** - A service account with cluster-admin permissions is created for the must-gather pod | ||
| 3. **ClusterRoleBinding** - Binds the service account to the cluster-admin role | ||
| 4. **Pod** - Runs the must-gather container(s) with the specified configuration | ||
|
|
||
| ### Component Image Discovery | ||
|
|
||
| When `all_component_images` is enabled, the prompt discovers must-gather images from: | ||
| - **ClusterOperators** - Looks for the `operators.openshift.io/must-gather-image` annotation | ||
| - **ClusterServiceVersions** - Checks OLM-installed operators for the same annotation | ||
|
|
||
| ### Multiple Images Support | ||
|
|
||
| Up to 8 gather images can be run concurrently. Each image runs in a separate container within the same pod, sharing the output volume. | ||
|
|
||
| ## Common Use Cases | ||
|
|
||
| ### Basic Cluster Diagnostics | ||
|
|
||
| Collect general cluster diagnostics: | ||
| ```json | ||
| {} | ||
| ``` | ||
|
|
||
| ### Audit Logs Collection | ||
|
|
||
| Collect audit logs with a custom gather command: | ||
| ```json | ||
| { | ||
| "gather_command": "/usr/bin/gather_audit_logs", | ||
| "timeout": "2h" | ||
| } | ||
| ``` | ||
|
|
||
| ### Recent Logs Only | ||
|
|
||
| Collect logs from the last 30 minutes: | ||
| ```json | ||
| { | ||
| "since": "30m" | ||
| } | ||
| ``` | ||
|
|
||
| ### Specific Operator Diagnostics | ||
|
|
||
| Collect diagnostics for a specific operator: | ||
| ```json | ||
| { | ||
| "images": "registry.redhat.io/openshift-logging/cluster-logging-rhel9-operator@sha256:..." | ||
| } | ||
| ``` | ||
|
|
||
| ### Host Network Access | ||
|
|
||
| For gather scripts that need host-level network access: | ||
| ```json | ||
| { | ||
| "host_network": "true" | ||
| } | ||
| ``` | ||
|
|
||
| ### All Component Diagnostics | ||
|
|
||
| Collect diagnostics from all operators with must-gather images: | ||
| ```json | ||
| { | ||
| "all_component_images": "true", | ||
| "timeout": "1h" | ||
| } | ||
| ``` | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| ### Permission Errors | ||
|
|
||
| If you see permission warnings, ensure your user has the required RBAC permissions: | ||
| ```bash | ||
| oc auth can-i create namespaces | ||
| oc auth can-i create clusterrolebindings | ||
| oc auth can-i create pods --as=system:serviceaccount:openshift-must-gather-xxx:must-gather-collector | ||
| ``` | ||
|
|
||
| ### Pod Not Starting | ||
|
|
||
| Check if the node has enough resources and can pull the must-gather image: | ||
| ```bash | ||
| oc get pods -n openshift-must-gather-xxx | ||
| oc describe pod <pod-name> -n openshift-must-gather-xxx | ||
| ``` | ||
|
|
||
| ### Timeout Issues | ||
|
|
||
| For large clusters or audit log collection, increase the timeout: | ||
| ```json | ||
| { | ||
| "timeout": "2h" | ||
| } | ||
| ``` | ||
|
|
||
| ### Image Pull Errors | ||
|
|
||
| Ensure the must-gather image is accessible: | ||
| ```bash | ||
| oc get secret -n openshift-config pull-secret | ||
| ``` | ||
|
|
||
| ## Security Considerations | ||
|
|
||
| ### Privileged Access | ||
|
|
||
| The must-gather pods run with: | ||
| - `cluster-admin` ClusterRoleBinding | ||
| - `system-cluster-critical` priority class | ||
| - Tolerations for all taints | ||
| - Optional host network access | ||
|
|
||
| ### Temporary Resources | ||
|
|
||
| By default, all created resources (namespace, service account, cluster role binding) should be cleaned up after the must-gather collection is complete. Use `"keep_resources": "true"` to retain them for debugging. | ||
|
|
||
| ### Image Sources | ||
|
|
||
| The prompt uses these default images: | ||
| - **Must-gather**: `registry.redhat.io/openshift4/ose-must-gather:latest` | ||
| - **Wait container**: `registry.redhat.io/ubi9/ubi-minimal` | ||
|
|
||
| Custom images should be from trusted sources. | ||
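The README's "Multiple Images Support" section says each gather image runs in its own container, sharing one output volume, with at most 8 images. A minimal Go sketch of that planning step might look like the following. The `container` struct, the `gather-N` container names, the example image references, and the choice to return an error (rather than truncate) above the limit are all assumptions made for illustration, not the PR's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// maxGatherImages mirrors the README's limit of 8 concurrent gather images.
const maxGatherImages = 8

// defaultImage is the default must-gather image named in the README.
const defaultImage = "registry.redhat.io/openshift4/ose-must-gather:latest"

// container is a hypothetical, trimmed-down stand-in for a pod container spec.
type container struct {
	Name      string
	Image     string
	Command   string
	MountPath string
}

// planContainers splits the comma-separated `images` argument and returns one
// container per image, all mounting the same output directory.
// Rejecting more than 8 images is an assumption; the README only states the limit.
func planContainers(images, gatherCommand, sourceDir string) ([]container, error) {
	if gatherCommand == "" {
		gatherCommand = "/usr/bin/gather" // documented default
	}
	if sourceDir == "" {
		sourceDir = "/must-gather" // documented default
	}
	var list []string
	for _, img := range strings.Split(images, ",") {
		if img = strings.TrimSpace(img); img != "" {
			list = append(list, img)
		}
	}
	if len(list) == 0 {
		list = []string{defaultImage}
	}
	if len(list) > maxGatherImages {
		return nil, fmt.Errorf("got %d images, at most %d are supported", len(list), maxGatherImages)
	}
	out := make([]container, 0, len(list))
	for i, img := range list {
		out = append(out, container{
			Name:      fmt.Sprintf("gather-%d", i), // hypothetical naming scheme
			Image:     img,
			Command:   gatherCommand,
			MountPath: sourceDir,
		})
	}
	return out, nil
}

func main() {
	// The image references below are placeholders, not real must-gather images.
	cs, err := planContainers("quay.io/example/a:1, quay.io/example/b:2", "", "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range cs {
		fmt.Printf("%s runs %s from %s, writing to %s\n", c.Name, c.Command, c.Image, c.MountPath)
	}
}
```

Failing early when the limit is exceeded keeps the generated plan predictable; silently dropping images would be an equally plausible design, and the README does not say which one the prompt uses.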
IMO this seems more like an MCP prompt than an MCP tool. See, for example, openshift-mcp-server/pkg/toolsets/core/health_check.go, lines 22 to 45 in 44e41c4.
Our thinking here was that, while it does guide a workflow, the complexity of the parameters makes it better suited as a tool than a prompt. @swghosh, did we investigate this route?
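To make the "complexity of the parameters" point concrete: the README's examples pass every argument as a string (`"timeout": "30m"`, `"host_network": "true"`, `node_selector` as `key=value,key2=value2`), so either a tool or a prompt has to validate and convert them. Below is a minimal sketch of that step, assuming Go duration syntax for `timeout` and `since`; the `options` struct and `parseArgs` function are hypothetical names, not taken from the PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// options holds a few typed values derived from the string-only arguments.
// The struct and its field names are illustrative, not the PR's actual types.
type options struct {
	Timeout      time.Duration
	Since        time.Duration
	HostNetwork  bool
	NodeSelector map[string]string
}

// parseArgs validates raw string arguments and converts them to typed values.
// Absent or empty arguments keep their zero values.
func parseArgs(args map[string]string) (options, error) {
	var o options
	var err error
	if v := args["timeout"]; v != "" {
		if o.Timeout, err = time.ParseDuration(v); err != nil {
			return o, fmt.Errorf("timeout: %w", err)
		}
	}
	if v := args["since"]; v != "" {
		if o.Since, err = time.ParseDuration(v); err != nil {
			return o, fmt.Errorf("since: %w", err)
		}
	}
	if v := args["host_network"]; v != "" {
		if o.HostNetwork, err = strconv.ParseBool(v); err != nil {
			return o, fmt.Errorf("host_network: %w", err)
		}
	}
	if v := args["node_selector"]; v != "" {
		o.NodeSelector = map[string]string{}
		for _, pair := range strings.Split(v, ",") {
			// Each pair must look like key=value; an empty value is allowed.
			key, value, ok := strings.Cut(strings.TrimSpace(pair), "=")
			if !ok || key == "" {
				return o, fmt.Errorf("node_selector: invalid pair %q", pair)
			}
			o.NodeSelector[key] = value
		}
	}
	return o, nil
}

func main() {
	o, err := parseArgs(map[string]string{
		"timeout":       "30m",
		"since":         "2m5s",
		"host_network":  "true",
		"node_selector": "kubernetes.io/os=linux,node-role.kubernetes.io/worker=",
	})
	if err != nil {
		fmt.Println("invalid arguments:", err)
		return
	}
	fmt.Println(o.Timeout, o.Since, o.HostNetwork, o.NodeSelector)
}
```

The same validation would be needed under either approach, so it does not by itself settle the tool-versus-prompt question.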
At the time of initially writing this PR, the upstream MCP server lacked support for Prompts, so we ended up using the tools approach.
Also, as we had understood it earlier, prompts are mainly static descriptions/instructions that guide the agent. Contrary to that, the health_check example shared suggests we can have a fully dynamic prompt, with parameter support, generated by the MCP server to print full YAMLs (which is pretty much what we need for the planning). It sounds reasonable to investigate the agent flow by flipping Tools -> Prompt, assuming we can print the same text blurb as in the current tool response.
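As a rough illustration of a "fully dynamic prompt" that prints YAML from its parameters, the sketch below renders three of the four resources listed in the README (Namespace, ServiceAccount, ClusterRoleBinding) with Go's `text/template`. It deliberately uses no MCP library types; `renderPlan`, `planData`, the placeholder namespace, and the ClusterRoleBinding name are assumptions for illustration. The service account name `must-gather-collector` follows the README's troubleshooting example. The Pod manifest is omitted for brevity.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// planTemplate covers Namespace, ServiceAccount, and ClusterRoleBinding only.
const planTemplate = `apiVersion: v1
kind: Namespace
metadata:
  name: {{ .Namespace }}
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .ServiceAccount }}
  namespace: {{ .Namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ .Namespace }}-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: {{ .ServiceAccount }}
  namespace: {{ .Namespace }}
`

// planData carries the values substituted into the template (hypothetical type).
type planData struct {
	Namespace      string
	ServiceAccount string
}

// renderPlan turns string arguments into the text a prompt (or tool) could return.
func renderPlan(args map[string]string) (string, error) {
	ns := args["namespace"]
	if ns == "" {
		// Placeholder only; the README says the real name is auto-generated.
		ns = "openshift-must-gather-example"
	}
	tmpl, err := template.New("plan").Parse(planTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	data := planData{Namespace: ns, ServiceAccount: "must-gather-collector"}
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	text, err := renderPlan(map[string]string{"namespace": "my-debug-ns"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(text)
}
```

Because the rendered text is just a string, the same function could in principle back either a tool response or a prompt message, which is what makes flipping between the two approaches plausible.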
IMO, one concern comes to mind: OpenShift Lightspeed, one of the primary agents we're targeting for this use case, probably does not support MCP prompts at this time (only tools).
Hmm, is it worth raising this there, to get prompts supported? They are part of the MCP spec.
There is a bit of similarity with what @Cali0707 shared for "health", which is a ServerPrompt