Skip to content

NE-2450: query_prometheus tool#115

Merged
matzew merged 5 commits intoopenshift:mainfrom
bentito:feat/netedge-prometheus
Feb 10, 2026
Merged

NE-2450: query_prometheus tool#115
matzew merged 5 commits intoopenshift:mainfrom
bentito:feat/netedge-prometheus

Conversation

@bentito
Copy link

@bentito bentito commented Jan 23, 2026

query_prometheus MCP tool gathers prometheus data relevant to network edge diagnostics. Uses the shared prometheus querying capabilty introduced into this server in #117

some more info at: https://issues.redhat.com/browse/NE-2450

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 23, 2026
@openshift-ci
Copy link

openshift-ci bot commented Jan 23, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci
Copy link

openshift-ci bot commented Jan 23, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bentito
Once this PR has been reviewed and has the lgtm label, please assign matzew for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Cali0707
Copy link

@bentito what is the difference between the query_prometheus tool coming in here and the prometheus_query tool seen in #117 (from @sakhoury )

Maybe we can align around prometheus tooling between the two to not have duplicate/similar tools?

@sakhoury
Copy link

@Cali0707 Thanks for flagging the overlap. I've reviewed both approaches and wanted to share what my PR #117 introduces:

My observability toolset (#117) provides:

  • prometheus_query - Instant PromQL queries against Thanos Querier
  • prometheus_query_range - Range queries with support for relative times (e.g., -1h, -30m, now) and configurable step resolution
  • alertmanager_alerts - Query active, silenced, and inhibited alerts with filter support

Key differences from the query_prometheus in the netedge toolset:

  1. Auto-discovery: My implementation discovers the Thanos Querier and Alertmanager endpoints via OpenShift routes, so no manual prometheus_url configuration is required
  2. Range queries: Support for historical/time-series analysis via prometheus_query_range
  3. Alertmanager integration: Direct access to cluster alerts

The goal of my PR is to establish a general-purpose observability toolset that can house all monitoring/alerting tools in one place. This aligns with how toolsets are organized in this project (e.g., core, helm, kiali).

Question for @bentito: Would the tools in my PR satisfy your use case for NetEdge diagnostics? If there are specific Prometheus query patterns or additional parameters your workflow needs that aren't covered, I'm open to extending the observability toolset to support them. Alternatively, if NetEdge needs a completely separate toolset with domain-specific logic beyond just Prometheus queries, we could potentially have your netedge toolset consume the shared observability infrastructure.

Let me know your thoughts - I'd rather we collaborate and avoid duplicating effort.

@bentito
Copy link
Author

bentito commented Jan 23, 2026

@bentito what is the difference between the query_prometheus tool coming in here and the prometheus_query tool seen in #117 (from @sakhoury )

Maybe we can align around prometheus tooling between the two to not have duplicate/similar tools?

Yeah we should definitely have shared prometheus machinery and then the separate tools can have unique promql to do the things needed. I am about to put up a commit that will add that specialization. I'll still put that up, and then let's consider the following:

@sakhoury what do think about combining to a single tool and then have a "mode" parameter flag? This allows a single set of code for prom querying then whatever promql you want can go into your mode. For instance we'd expect agents running the tool for our usage to run it as "query_prometheus --mode netedge", in the tool code we can branch in cmd, have a share pkg for setting up to call prometheus and separate packages holding the promql or other logic.

@bentito
Copy link
Author

bentito commented Jan 26, 2026

@sakhoury does my mode triggered by a flag like --mode netedge idea work? Maybe relocate the prometheus code to a /toolsets/util location just so it's obvious it's shared functionality and more than one group is responsible for maintaining the code?

@sakhoury
Copy link

@bentito Thanks for the follow-up. To answer your questions directly:

On the --mode netedge flag: I don't think a mode flag on a shared tool is the right approach. MCP tools take JSON parameters, not CLI flags, and coupling domain-specific logic (NetEdge queries) into a general tool would make it harder to maintain as more teams add their own modes.

That said, looking at your latest commit, what you actually implemented - a diagnostic_target parameter with pre-defined queries for ingress/dns - is a better pattern. It keeps the NetEdge-specific logic in the NetEdge toolset where it belongs. I think that approach makes more sense than a shared tool with modes.

On relocating prometheus code to /toolsets/util: I agree we should share the Prometheus client code. I'd suggest pkg/prometheus/ rather than /toolsets/util since it's infrastructure that could be used beyond just toolsets (and matches the existing layout like pkg/helm/, pkg/kubernetes/). But I'm not strongly attached to the location - happy to go with whatever the maintainers prefer.

So my proposal:

  1. Extract shared Prometheus HTTP client to a common package (pkg/prometheus/ or pkg/toolsets/util/)
  2. Keep separate toolsets: observability for general-purpose tools, netedge for domain-specific diagnostics
  3. Both toolsets import the shared client

Does that work for you?

@bentito
Copy link
Author

bentito commented Jan 26, 2026

So my proposal:

1. Extract shared Prometheus HTTP client to a common package (`pkg/prometheus/` or `pkg/toolsets/util/`)

2. Keep separate toolsets: observability for general-purpose tools, netedge for domain-specific diagnostics

3. Both toolsets import the shared client

Yes @sakhoury, that would be great, and yes, I meant CLI flag-like functionality within LLM calling tool via JSON

We'll avoid redundant code, yay!

Your suggestion for pkg/prometheus seems good. Since you have better general code for prom already, would you want to restructure your PR for that and I'll wait until it merges and then restructure mine to import and use it?

@sakhoury
Copy link

Your suggestion for pkg/prometheus seems good. Since you have better general code for prom already, would you want to restructure your PR for that and I'll wait until it merges and then restructure mine to import and use it?

Sounds good @bentito! I'll restructure my PR to extract the shared Prometheus client to pkg/prometheus/.

@bentito bentito force-pushed the feat/netedge-prometheus branch from cd03f62 to 2aa93ea Compare January 28, 2026 19:06
@bentito bentito changed the title WIP: Establish Netedge Toolset - prom query tool init [Netedge Toolset] - query_prom tool Feb 2, 2026
@bentito bentito changed the title [Netedge Toolset] - query_prom tool [Netedge Toolset] query_prometheus tool Feb 2, 2026
@bentito bentito marked this pull request as ready for review February 2, 2026 19:11
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 2, 2026
@openshift-ci openshift-ci bot requested review from Cali0707 and manusa February 2, 2026 19:13
@openshift-ci
Copy link

openshift-ci bot commented Feb 2, 2026

@bentito: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@bentito bentito changed the title [Netedge Toolset] query_prometheus tool NE-2450: query_prometheus tool Feb 4, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 4, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Feb 4, 2026

@bentito: This pull request references NE-2450 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

query_prometheus MCP tool gathers prometheus data relevant to network edge diagnostics. Uses the shared prometheus querying capabilty introduced into this server in #117

some more info at: https://issues.redhat.com/browse/NE-2450

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@matzew matzew merged commit 5e4eb74 into openshift:main Feb 10, 2026
7 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

Comments