
MG-34: Add oc cli like must-gather collection with ServerPrompt #51

Open
swghosh wants to merge 4 commits into openshift:main from swghosh:plan-mg-tool

Conversation


@swghosh swghosh commented Oct 10, 2025

plan_mustgather tool for collecting must-gather(s) from an OpenShift cluster

  • generates a pod spec that can either be applied manually by the user or used with the resource_create_or_update tool
  • alongside the pod spec, a namespace, serviceaccount, and clusterrolebinding are generated too
Details from the [MCP inspector](https://modelcontextprotocol.io/docs/tools/inspector):

Input (inferred defaults):

{
  "gather_command": "/usr/bin/gather",
  "source_dir": "/must-gather"
}
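
For reference, a minimal sketch of how these inputs could map onto a Go params struct; GatherCommand appears in the reviewed code further down, while the other field names, JSON tags, and the timeout field (which shows up in later revisions of the plan) are assumptions:

// Sketch only: possible shape of the plan_mustgather parameters.
type MustGatherParams struct {
	GatherCommand string `json:"gather_command"` // e.g. /usr/bin/gather (assumed tag)
	SourceDir     string `json:"source_dir"`     // e.g. /must-gather (assumed tag)
	Timeout       string `json:"timeout"`        // optional, e.g. 10m; appears in later revisions
}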

Output:

The generated plan contains YAML manifests for must-gather pods and required resources (namespace, serviceaccount, clusterrolebinding). Suggest how the user can apply the manifest and copy results locally (oc cp / kubectl cp).

Ask the user if they want to apply the plan

  • use the resource_create_or_update tool to apply the manifest
  • alternatively, advise the user to execute oc apply / kubectl apply instead.

Once the must-gather collection is completed, the user may wish to clean up the created resources.

  • use the resources_delete tool to delete the namespace and the clusterrolebinding
  • or, execute cleanup using kubectl delete.
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-must-gather-tn7jzk
spec: {}
status: {}
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: must-gather-collector
  namespace: openshift-must-gather-tn7jzk
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: openshift-must-gather-tn7jzk-must-gather-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: must-gather-collector
  namespace: openshift-must-gather-tn7jzk
---
apiVersion: v1
kind: Pod
metadata:
  generateName: must-gather-
  namespace: openshift-must-gather-tn7jzk
spec:
  containers:
  - command:
    - /usr/bin/gather
    image: registry.redhat.io/openshift4/ose-must-gather:latest
    imagePullPolicy: IfNotPresent
    name: gather
    resources: {}
    volumeMounts:
    - mountPath: /must-gather
      name: must-gather-output
  - command:
    - /bin/bash
    - -c
    - sleep infinity
    image: registry.redhat.io/ubi9/ubi-minimal
    imagePullPolicy: IfNotPresent
    name: wait
    resources: {}
    volumeMounts:
    - mountPath: /must-gather
      name: must-gather-output
  priorityClassName: system-cluster-critical
  restartPolicy: Never
  serviceAccountName: must-gather-collector
  tolerations:
  - operator: Exists
  volumes:
  - emptyDir: {}
    name: must-gather-output
status: {}

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 10, 2025

openshift-ci-robot commented Oct 10, 2025

@swghosh: This pull request references MG-34 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

plan_mustgather tool for collecting must-gather(s) from OpenShift cluster

  • generates a pod spec that can either be applied by user manually or used with resource_create_or_update tool
  • alongside pod spec namespace, serviceaccount, clusterrolebinding are generated too

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from Cali0707 and matzew October 10, 2025 19:36

swghosh commented Oct 10, 2025

@harche @ardaguclu referring to #38 (comment), should we move this into pkg/ocp, given this is also an OpenShift-specific tool?


swghosh commented Oct 10, 2025

/cc @Prashanth684 @shivprakashmuley

@Cali0707

@harche @ardaguclu referring to #38 (comment), should we move this into pkg/ocp? Given this is also an OpenShift specific tool.

My thoughts are we should probably be making one or more OpenShift specific toolgroups eventually


openshift-ci-robot commented Oct 10, 2025

@swghosh: This pull request references MG-34 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

plan_mustgather tool for collecting must-gather(s) from OpenShift cluster

  • generates a pod spec that can either be applied by user manually or used with resource_create_or_update tool
  • alongside pod spec namespace, serviceaccount, clusterrolebinding are generated too
[MCP inspector](https://modelcontextprotocol.io/docs/tools/inspector):

Input (inferred defaults):

{
 "gather_command": "/usr/bin/gather",
 "source_dir": "/must-gather",
 "timeout": "10m"
}

Output:

Save the following content to a file (e.g., must-gather-plan.yaml) and apply it with 'kubectl apply -f must-gather-plan.yaml'
Monitor the pod's logs to see when the must-gather process is complete:
kubectl logs -f -n openshift-must-gather-wwt74j -c gather
Once the logs indicate completion, copy the results with:
kubectl cp -n openshift-must-gather-wwt74j :/must-gather ./must-gather-output -c wait
Finally, clean up the resources with:
kubectl delete ns openshift-must-gather-wwt74j
kubectl delete clusterrolebinding openshift-must-gather-wwt74j-must-gather-collector

apiVersion: v1
kind: ServiceAccount
metadata:
 name: must-gather-collector
 namespace: openshift-must-gather-wwt74j
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
 name: openshift-must-gather-wwt74j-must-gather-collector
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: ClusterRole
 name: cluster-admin
subjects:
- kind: ServiceAccount
 name: must-gather-collector
 namespace: openshift-must-gather-wwt74j
---
apiVersion: v1
kind: Pod
metadata:
 generateName: must-gather-
 namespace: openshift-must-gather-wwt74j
spec:
 containers:
 - command:
   - /usr/bin/timeout 10m /usr/bin/gather
   image: registry.redhat.io/openshift4/ose-must-gather:latest
   imagePullPolicy: IfNotPresent
   name: gather
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 - command:
   - /bin/bash
   - -c
   - sleep infinity
   image: registry.redhat.io/ubi9/ubi-minimal
   imagePullPolicy: IfNotPresent
   name: wait
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 priorityClassName: system-cluster-critical
 restartPolicy: Never
 serviceAccountName: must-gather-collector
 tolerations:
 - operator: Exists
 volumes:
 - emptyDir: {}
   name: must-gather-output
status: {}


@Prashanth684

@harche @ardaguclu referring to #38 (comment), should we move this into pkg/ocp? Given this is also an OpenShift specific tool.

My thoughts are we should probably be making one or more OpenShift specific toolgroups eventually

yes. maybe a pkg/toolsets/ocp/must-gather or equivalent.


openshift-ci-robot commented Oct 10, 2025

@swghosh: This pull request references MG-34 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

plan_mustgather tool for collecting must-gather(s) from OpenShift cluster

  • generates a pod spec that can either be applied by user manually or used with resource_create_or_update tool
  • alongside pod spec namespace, serviceaccount, clusterrolebinding are generated too
[MCP inspector](https://modelcontextprotocol.io/docs/tools/inspector):

Input (inferred defaults):

{
 "gather_command": "/usr/bin/gather",
 "source_dir": "/must-gather",
 "timeout": "10m"
}

Output:

Save the following content to a file (e.g., must-gather-plan.yaml) and apply it with 'kubectl apply -f must-gather-plan.yaml'
Monitor the pod's logs to see when the must-gather process is complete:
kubectl logs -f -n openshift-must-gather-wwt74j -c gather
Once the logs indicate completion, copy the results with:
kubectl cp -n openshift-must-gather-wwt74j :/must-gather ./must-gather-output -c wait
Finally, clean up the resources with:
kubectl delete ns openshift-must-gather-wwt74j
kubectl delete clusterrolebinding openshift-must-gather-wwt74j-must-gather-collector

apiVersion: v1
kind: Namespace
metadata:
 name: openshift-must-gather-vhph8d
spec: {}
status: {}
---
apiVersion: v1
kind: ServiceAccount
metadata:
 name: must-gather-collector
 namespace: openshift-must-gather-vhph8d
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
 name: openshift-must-gather-vhph8d-must-gather-collector
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: ClusterRole
 name: cluster-admin
subjects:
- kind: ServiceAccount
 name: must-gather-collector
 namespace: openshift-must-gather-vhph8d
---
apiVersion: v1
kind: Pod
metadata:
 generateName: must-gather-
 namespace: openshift-must-gather-vhph8d
spec:
 containers:
 - command:
   - /usr/bin/timeout 10m /usr/bin/gather
   image: registry.redhat.io/openshift4/ose-must-gather:latest
   imagePullPolicy: IfNotPresent
   name: gather
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 - command:
   - /bin/bash
   - -c
   - sleep infinity
   image: registry.redhat.io/ubi9/ubi-minimal
   imagePullPolicy: IfNotPresent
   name: wait
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 priorityClassName: system-cluster-critical
 restartPolicy: Never
 serviceAccountName: must-gather-collector
 tolerations:
 - operator: Exists
 volumes:
 - emptyDir: {}
   name: must-gather-output
status: {}



openshift-ci-robot commented Oct 10, 2025

@swghosh: This pull request references MG-34 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

plan_mustgather tool for collecting must-gather(s) from OpenShift cluster

  • generates a pod spec that can either be applied by user manually or used with resource_create_or_update tool
  • alongside pod spec namespace, serviceaccount, clusterrolebinding are generated too
[MCP inspector](https://modelcontextprotocol.io/docs/tools/inspector):

Input (inferred defaults):

{
 "gather_command": "/usr/bin/gather",
 "source_dir": "/must-gather",
 "timeout": "10m"
}

Output:

Save the following content to a file (e.g., must-gather-plan.yaml) and apply it with 'kubectl apply -f must-gather-plan.yaml'
Monitor the pod's logs to see when the must-gather process is complete:
kubectl logs -f -n openshift-must-gather-fzq7f5 -c gather
Once the logs indicate completion, copy the results with:
kubectl cp -n openshift-must-gather-fzq7f5 :/must-gather ./must-gather-output -c wait
Finally, clean up the resources with:
kubectl delete ns openshift-must-gather-fzq7f5
kubectl delete clusterrolebinding openshift-must-gather-fzq7f5-must-gather-collector

apiVersion: v1
kind: Namespace
metadata:
 name: openshift-must-gather-vhph8d
spec: {}
status: {}
---
apiVersion: v1
kind: ServiceAccount
metadata:
 name: must-gather-collector
 namespace: openshift-must-gather-vhph8d
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
 name: openshift-must-gather-vhph8d-must-gather-collector
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: ClusterRole
 name: cluster-admin
subjects:
- kind: ServiceAccount
 name: must-gather-collector
 namespace: openshift-must-gather-vhph8d
---
apiVersion: v1
kind: Pod
metadata:
 generateName: must-gather-
 namespace: openshift-must-gather-vhph8d
spec:
 containers:
 - command:
   - /usr/bin/timeout 10m /usr/bin/gather
   image: registry.redhat.io/openshift4/ose-must-gather:latest
   imagePullPolicy: IfNotPresent
   name: gather
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 - command:
   - /bin/bash
   - -c
   - sleep infinity
   image: registry.redhat.io/ubi9/ubi-minimal
   imagePullPolicy: IfNotPresent
   name: wait
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 priorityClassName: system-cluster-critical
 restartPolicy: Never
 serviceAccountName: must-gather-collector
 tolerations:
 - operator: Exists
 volumes:
 - emptyDir: {}
   name: must-gather-output
status: {}




openshift-ci-robot commented Oct 10, 2025

@swghosh: This pull request references MG-34 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

plan_mustgather tool for collecting must-gather(s) from OpenShift cluster

  • generates a pod spec that can either be applied by user manually or used with resource_create_or_update tool
  • alongside pod spec namespace, serviceaccount, clusterrolebinding are generated too
[MCP inspector](https://modelcontextprotocol.io/docs/tools/inspector):

Input (inferred defaults):

{
 "gather_command": "/usr/bin/gather",
 "source_dir": "/must-gather",
 "timeout": "10m"
}

Output:

Save the following content to a file (e.g., must-gather-plan.yaml) and apply it with 'kubectl apply -f must-gather-plan.yaml'
Monitor the pod's logs to see when the must-gather process is complete:
kubectl logs -f -n openshift-must-gather-jkbn9p -c gather
Once the logs indicate completion, copy the results with:
kubectl cp -n openshift-must-gather-jkbn9p :/must-gather ./must-gather-output -c wait
Finally, clean up the resources with:
kubectl delete ns openshift-must-gather-jkbn9p
kubectl delete clusterrolebinding openshift-must-gather-jkbn9p-must-gather-collector

apiVersion: v1
kind: Namespace
metadata:
 name: openshift-must-gather-jkbn9p
spec: {}
status: {}
---
apiVersion: v1
kind: ServiceAccount
metadata:
 name: must-gather-collector
 namespace: openshift-must-gather-jkbn9p
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
 name: openshift-must-gather-jkbn9p-must-gather-collector
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: ClusterRole
 name: cluster-admin
subjects:
- kind: ServiceAccount
 name: must-gather-collector
 namespace: openshift-must-gather-jkbn9p
---
apiVersion: v1
kind: Pod
metadata:
 generateName: must-gather-
 namespace: openshift-must-gather-jkbn9p
spec:
 containers:
 - command:
   - /usr/bin/timeout 10m /usr/bin/gather
   image: registry.redhat.io/openshift4/ose-must-gather:latest
   imagePullPolicy: IfNotPresent
   name: gather
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 - command:
   - /bin/bash
   - -c
   - sleep infinity
   image: registry.redhat.io/ubi9/ubi-minimal
   imagePullPolicy: IfNotPresent
   name: wait
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 priorityClassName: system-cluster-critical
 restartPolicy: Never
 serviceAccountName: must-gather-collector
 tolerations:
 - operator: Exists
 volumes:
 - emptyDir: {}
   name: must-gather-output
status: {}



openshift-ci-robot commented Oct 10, 2025

@swghosh: This pull request references MG-34 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

plan_mustgather tool for collecting must-gather(s) from OpenShift cluster

  • generates a pod spec that can either be applied by user manually or used with resource_create_or_update tool
  • alongside pod spec namespace, serviceaccount, clusterrolebinding are generated too
[MCP inspector](https://modelcontextprotocol.io/docs/tools/inspector):

Input (inferred defaults):

{
 "gather_command": "/usr/bin/gather",
 "source_dir": "/must-gather"
}

Output:

Save the following content to a file (e.g., must-gather-plan.yaml) and apply it with 'kubectl apply -f must-gather-plan.yaml'
Monitor the pod's logs to see when the must-gather process is complete:
kubectl logs -f -n openshift-must-gather-jkbn9p -c gather
Once the logs indicate completion, copy the results with:
kubectl cp -n openshift-must-gather-jkbn9p :/must-gather ./must-gather-output -c wait
Finally, clean up the resources with:
kubectl delete ns openshift-must-gather-jkbn9p
kubectl delete clusterrolebinding openshift-must-gather-jkbn9p-must-gather-collector

apiVersion: v1
kind: Namespace
metadata:
 name: openshift-must-gather-jkbn9p
spec: {}
status: {}
---
apiVersion: v1
kind: ServiceAccount
metadata:
 name: must-gather-collector
 namespace: openshift-must-gather-jkbn9p
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
 name: openshift-must-gather-jkbn9p-must-gather-collector
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: ClusterRole
 name: cluster-admin
subjects:
- kind: ServiceAccount
 name: must-gather-collector
 namespace: openshift-must-gather-jkbn9p
---
apiVersion: v1
kind: Pod
metadata:
 generateName: must-gather-
 namespace: openshift-must-gather-jkbn9p
spec:
 containers:
 - command:
   - /usr/bin/gather
   image: registry.redhat.io/openshift4/ose-must-gather:latest
   imagePullPolicy: IfNotPresent
   name: gather
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 - command:
   - /bin/bash
   - -c
   - sleep infinity
   image: registry.redhat.io/ubi9/ubi-minimal
   imagePullPolicy: IfNotPresent
   name: wait
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 priorityClassName: system-cluster-critical
 restartPolicy: Never
 serviceAccountName: must-gather-collector
 tolerations:
 - operator: Exists
 volumes:
 - emptyDir: {}
   name: must-gather-output
status: {}


@swghosh swghosh force-pushed the plan-mg-tool branch 2 times, most recently from c9616ae to f2b231f on October 10, 2025 20:44

openshift-ci-robot commented Oct 13, 2025

@swghosh: This pull request references MG-34 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

plan_mustgather tool for collecting must-gather(s) from OpenShift cluster

  • generates a pod spec that can either be applied by user manually or used with resource_create_or_update tool
  • alongside pod spec namespace, serviceaccount, clusterrolebinding are generated too
[MCP inspector](https://modelcontextprotocol.io/docs/tools/inspector):

Input (inferred defaults):

{
 "gather_command": "/usr/bin/gather",
 "source_dir": "/must-gather"
}

Output:

The generated plan contains YAML manifests for must-gather pods and required resources (namespace, serviceaccount, clusterrolebinding). Suggest how the user can apply the manifest and copy results locally (oc cp / kubectl cp).

Ask the user if they want to apply the plan

  • use the resource_create_or_update tool to apply the manifest
  • alternatively, advise the user to execute oc apply / kubectl apply instead.

Once the must-gather collection is completed, the user may wish to clean up the created resources.

  • use the resources_delete tool to delete the namespace and the clusterrolebinding
  • or, execute cleanup using kubectl delete.
apiVersion: v1
kind: Namespace
metadata:
 name: openshift-must-gather-tn7jzk
spec: {}
status: {}
---
apiVersion: v1
kind: ServiceAccount
metadata:
 name: must-gather-collector
 namespace: openshift-must-gather-tn7jzk
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
 name: openshift-must-gather-tn7jzk-must-gather-collector
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: ClusterRole
 name: cluster-admin
subjects:
- kind: ServiceAccount
 name: must-gather-collector
 namespace: openshift-must-gather-tn7jzk
---
apiVersion: v1
kind: Pod
metadata:
 generateName: must-gather-
 namespace: openshift-must-gather-tn7jzk
spec:
 containers:
 - command:
   - /usr/bin/gather
   image: registry.redhat.io/openshift4/ose-must-gather:latest
   imagePullPolicy: IfNotPresent
   name: gather
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 - command:
   - /bin/bash
   - -c
   - sleep infinity
   image: registry.redhat.io/ubi9/ubi-minimal
   imagePullPolicy: IfNotPresent
   name: wait
   resources: {}
   volumeMounts:
   - mountPath: /must-gather
     name: must-gather-output
 priorityClassName: system-cluster-critical
 restartPolicy: Never
 serviceAccountName: must-gather-collector
 tolerations:
 - operator: Exists
 volumes:
 - emptyDir: {}
   name: must-gather-output
status: {}



matzew commented Jan 8, 2026

yes. maybe a pkg/toolsets/ocp/must-gather or equivalent.

I guess we never did this, but the "core" here is also just one file: mustgather.go ?

CC @Cali0707 @Prashanth684

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 8, 2026

matzew commented Jan 8, 2026

There is a test failing - also it would be nice to do a rebase

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 29, 2026

openshift-ci bot commented Jan 29, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: swghosh
Once this PR has been reviewed and has the lgtm label, please assign cali0707 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


swghosh commented Jan 29, 2026

/test images


swghosh commented Jan 29, 2026

/retest

@swghosh swghosh force-pushed the plan-mg-tool branch 5 times, most recently from fa5be4e to b61dc09 on February 3, 2026 20:09

## Tools

### plan_mustgather


IMO this seems more like an MCP prompt than an MCP tool. See for example

func initHealthChecks() []api.ServerPrompt {
	return []api.ServerPrompt{
		{
			Prompt: api.Prompt{
				Name:        "cluster-health-check",
				Title:       "Cluster Health Check",
				Description: "Perform comprehensive health assessment of Kubernetes/OpenShift cluster",
				Arguments: []api.PromptArgument{
					{
						Name:        "namespace",
						Description: "Optional namespace to limit health check scope (default: all namespaces)",
						Required:    false,
					},
					{
						Name:        "check_events",
						Description: "Include recent warning/error events (true/false, default: true)",
						Required:    false,
					},
				},
			},
			Handler: clusterHealthCheckHandler,
		},
	}
}


Our thinking here was that while it does guide a workflow, the complexity of the parameters makes it better suited as a tool rather than a prompt - @swghosh did we investigate this route?


@swghosh swghosh Feb 4, 2026


At the time of initially writing this PR, the upstream MCP server lacked support for Prompts so we ended up using the tools approach.

Also, per what we had understood earlier, prompts are mainly static descriptions/instructions to guide the agent; unlike that, the health_check example shared suggests we can have a fully dynamic prompt with params support, generated by the MCP server, that prints full YAMLs (which is pretty much what we need in the planning). It sounds reasonable to investigate the agent flow by flipping the Tool -> Prompt, assuming we can print the same text blurb as the current tool response.
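
For illustration, a rough sketch of what a plan-mustgather ServerPrompt registration might look like, following the health-check example quoted above; the prompt name, handler, and exact argument set are assumptions, not the PR's actual code:

// Hypothetical sketch only, mirroring the initHealthChecks() pattern shown above.
// Argument names follow the plan_mustgather inputs; the handler name is assumed.
func initMustGatherPrompts() []api.ServerPrompt {
	return []api.ServerPrompt{
		{
			Prompt: api.Prompt{
				Name:        "plan-mustgather",
				Title:       "Plan a must-gather collection",
				Description: "Generate YAML manifests (namespace, serviceaccount, clusterrolebinding, pod) to collect a must-gather from the cluster",
				Arguments: []api.PromptArgument{
					{Name: "gather_command", Description: "Gather command to run (default: /usr/bin/gather)", Required: false},
					{Name: "source_dir", Description: "Directory inside the pod to collect from (default: /must-gather)", Required: false},
					{Name: "timeout", Description: "Optional timeout for the gather command (e.g. 10m)", Required: false},
				},
			},
			Handler: planMustGatherHandler, // assumed handler, analogous to clusterHealthCheckHandler
		},
	}
}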


@swghosh swghosh Feb 4, 2026


It sounds reasonable to investigate the agent flow by flipping the Tools -> Prompt

IMO one concern comes to mind: OpenShift Lightspeed, one of the primary agents we're targeting for this use case, probably does not support MCP prompts at this time (only tools).

Member


Hmm, worth raising this there, for supporting prompts? They are part of the MCP spec.

There is a bit of similarity to what @Cali0707 shared for the "health" check, as being a ServerPrompt

var _ api.Toolset = (*Toolset)(nil)

func (t *Toolset) GetName() string {
	return "openshift"


I believe for any toolsets that will be included in the openshift-mcp-server payload, we will require passing evals (written with https://github.com/mcpchecker). In particular, the evals will need to work when the openshift and core toolsets are both enabled (but no others).

Feel free to ping @matzew and I for help getting this set up!

Member Author


@Prashanth684 given that the prompt implementation will be backed by proper evals for achieving what we want (i.e. letting users collect a must-gather from the cluster through minimal conversation, with MCP aiding the flow), it should be a good idea to investigate the Tool -> Prompt transition.

Member


re: evals, yeah - I commented on this one yesterday: #69 (comment)

swghosh added a commit to swghosh/openshift-mcp-server that referenced this pull request Feb 3, 2026
- Add plan_mustgather

Signed-off-by: Swarup Ghosh <swghosh@redhat.com>

matzew commented Feb 4, 2026

Is the goal here to land this and #38 individually? Or combined, like in #69?


swghosh commented Feb 4, 2026

Is the goal here to land this and #38 individually?

The goal is to land the plan_mustgather implementation individually first; however, that requires us to start adding the openshift toolset here as a must.


matzew commented Feb 4, 2026 via email

Signed-off-by: Swarup Ghosh <swghosh@redhat.com>
Signed-off-by: Swarup Ghosh <swghosh@redhat.com>
@swghosh swghosh changed the title MG-34: Add oc cli like must-gather collection to plan_mustgather tool MG-34: Add oc cli like must-gather collection with ServerPrompt Feb 11, 2026
Signed-off-by: Swarup Ghosh <swghosh@redhat.com>

swghosh commented Feb 11, 2026

Have updated the PR to use ServerPrompt instead of ServerTool,
@matzew PTAL

(also closed #69 to avoid confusions)

Thanks in advance!

Signed-off-by: Swarup Ghosh <swghosh@redhat.com>

openshift-ci bot commented Feb 11, 2026

@swghosh: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

if err != nil {
	return "", fmt.Errorf("timeout duration is not valid")
}
gatherCmd = fmt.Sprintf("/usr/bin/timeout %s %s", timeout, gatherCmd)


shouldn't this be a []string{}?
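
For illustration, a minimal sketch of the slice-based alternative this suggests; the function and parameter names are assumptions, and it only reuses the timeout-validation idea from the snippet above:

// Sketch only (names assumed): build the container command as discrete argv
// elements instead of one space-joined string, so the kubelet execs
// /usr/bin/timeout directly.
func buildGatherCommand(gatherCommand, timeout string) ([]string, error) {
	cmd := []string{"/usr/bin/gather"} // default gather entrypoint
	if gatherCommand != "" {
		cmd = []string{gatherCommand}
	}
	if timeout != "" {
		if _, err := time.ParseDuration(timeout); err != nil {
			return nil, fmt.Errorf("timeout duration is not valid: %w", err)
		}
		// /usr/bin/timeout and its duration stay separate argv entries.
		cmd = append([]string{"/usr/bin/timeout", timeout}, cmd...)
	}
	return cmd, nil
}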

Comment on lines +406 to +407
// ParseNodeSelector parses a comma-separated key=value selector string into a map.
func ParseNodeSelector(selector string) map[string]string {


this returns a non-nil map for the empty string case, which would serialize to an empty node selector (instead of no selector).

I think we probably want to return nil instead
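
A minimal sketch of the suggested behavior, assuming the comma-separated key=value format described in the doc comment; the handling of malformed entries is a guess:

// Sketch only: return nil (no selector) for empty input so the field is
// omitted, and parse "key=value,key2=value2" into a map otherwise.
func ParseNodeSelector(selector string) map[string]string {
	if strings.TrimSpace(selector) == "" {
		return nil // no selector rather than an empty one
	}
	result := map[string]string{}
	for _, pair := range strings.Split(selector, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			continue // skip malformed entries in this sketch
		}
		result[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
	}
	return result
}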

Comment on lines +64 to +67
gatherCmd := params.GatherCommand
if gatherCmd == "" {
	gatherCmd = defaultGatherCmd
}


do we want to do any validation on this command? Otherwise, could this not just run an arbitrary command (which IMO we don't want)?


This is especially important given that this command will run with cluster-admin privileges. Maybe we don't allow setting the gather command, or only allow setting params that should be passed to the command?
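
For illustration, a minimal allowlist-style check along the lines being discussed; the allowed entries and function name are assumptions, not the PR's actual policy:

// Sketch only: restrict the gather command to a small allowlist of known
// gather entrypoints instead of accepting arbitrary strings.
var allowedGatherCommands = map[string]bool{
	"/usr/bin/gather": true, // default must-gather entrypoint (assumed allowlist)
}

func validateGatherCommand(cmd string) error {
	if cmd == "" {
		return nil // empty means "use the default"
	}
	if !allowedGatherCommands[cmd] {
		return fmt.Errorf("gather command %q is not in the allowlist", cmd)
	}
	return nil
}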

Comment on lines +123 to +125
var gatherContainers = []corev1.Container{
	*gatherContainerTemplate.DeepCopy(),
}


do we need this deepcopy here? It seems that if any images are there, we reassign this immediately. Maybe we can invert the logic (i.e. make this deepcopy the fallback if there are no images)?
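
A rough sketch of the suggested inversion; the images loop and container-naming details are assumptions about the surrounding code:

// Sketch only: build one gather container per requested image, and fall back
// to a single copy of the template only when no images were specified.
var gatherContainers []corev1.Container
for i, image := range images {
	c := *gatherContainerTemplate.DeepCopy()
	c.Name = fmt.Sprintf("gather-%d", i) // container names must be unique (assumed naming)
	c.Image = image
	gatherContainers = append(gatherContainers, c)
}
if len(gatherContainers) == 0 {
	gatherContainers = []corev1.Container{*gatherContainerTemplate.DeepCopy()}
}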

Comment on lines +223 to +244
clusterRoleBindingName := fmt.Sprintf("%s-must-gather-collector", namespace)
clusterRoleBinding := &rbacv1.ClusterRoleBinding{
	TypeMeta: metav1.TypeMeta{
		APIVersion: "rbac.authorization.k8s.io/v1",
		Kind:       "ClusterRoleBinding",
	},
	ObjectMeta: metav1.ObjectMeta{
		Name: clusterRoleBindingName,
	},
	RoleRef: rbacv1.RoleRef{
		APIGroup: "rbac.authorization.k8s.io",
		Kind:     "ClusterRole",
		Name:     "cluster-admin",
	},
	Subjects: []rbacv1.Subject{
		{
			Kind:      "ServiceAccount",
			Name:      serviceAccountName,
			Namespace: namespace,
		},
	},
}


I get that this is normal for must gather, but binding to cluster admin on a command that is AI-triggerable is a little scary 😅

let's make sure we validate exactly what is being run by the agent when this rolebinding is present, otherwise this seems like a huge security vuln

cc @matzew @manusa @mrunalp

Member Author


@Prashanth684 too, any thoughts?

The way this works right now is as a prompt, and the user still has to approve the tool call for resources_create_or_update before proceeding with adding the RBAC binding.


@swghosh swghosh Feb 11, 2026


@Cali0707 an alternative approach is to have more fine-grained RBAC that allows only "get"-style access, which is more suitable for must-gather (since it's essentially just a collection tool), but we do require nodes/exec to run some performance analysis (which is part of the default must-gather collection). An example role for reference.

The downside with the granular RBAC is that users cannot trigger custom must-gather images, like a must-gather image specifically targeted at an operator (implemented in the all_images flow), because that will require some other privileges.
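
For illustration, a hedged sketch of what such a granular role could look like next to the existing ClusterRoleBinding code; the rule set is an assumption, and, as noted, node-related access would still be needed for some collectors:

// Sketch only: a restrictive ClusterRole alternative to binding cluster-admin.
// The exact rule set is an assumption; the default must-gather still needs
// extra verbs (e.g. on nodes) for some collectors.
readOnlyGatherRole := &rbacv1.ClusterRole{
	TypeMeta: metav1.TypeMeta{
		APIVersion: "rbac.authorization.k8s.io/v1",
		Kind:       "ClusterRole",
	},
	ObjectMeta: metav1.ObjectMeta{
		Name: fmt.Sprintf("%s-must-gather-reader", namespace),
	},
	Rules: []rbacv1.PolicyRule{
		{
			APIGroups: []string{"*"},
			Resources: []string{"*"},
			Verbs:     []string{"get", "list", "watch"},
		},
	},
}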


My main concern is that the must gather command (which runs in the pod with all these privileges) is currently something that the agent can set. So, the agent could in theory get around any RBAC/resource protections that are in place through this


Agree with these concerns. What we can do here (a rough sketch of the registry check follows after this list):

  • add an allowlist of commands
  • add a registry allowlist (for non-custom images; for custom images, it is up to the user, but we still need to check that the desired image is used)
  • interactive confirmation
  • explicitly show security warnings
  • restrict this functionality to cluster admins
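
For illustration only, a minimal sketch of the registry allowlist idea, complementing the command allowlist sketched earlier; the listed registries and function name are assumptions, not an agreed policy:

// Sketch only: check the requested must-gather image against a registry
// allowlist before it is placed in the pod spec. Values shown are assumptions.
var allowedImageRegistries = []string{
	"registry.redhat.io/",
	"quay.io/openshift-release-dev/",
}

func validateGatherImage(image string) error {
	for _, prefix := range allowedImageRegistries {
		if strings.HasPrefix(image, prefix) {
			return nil
		}
	}
	return fmt.Errorf("image %q is not from an allowed registry", image)
}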


Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants
