MGMT-21387: Add metrics for just the assisted service API calls by carbonin · Pull Request #67 · openshift-assisted/assisted-service-mcp

carbonin · 2025-08-15T13:44:13Z

Ideally this should be nearly the whole request, but if there is a difference we want to know about it.

Also skew the buckets toward the smaller end since we don't expect many 10s or longer requests

https://issues.redhat.com/browse/MGMT-21387

Summary by CodeRabbit

New Features
- Added detailed per-method API latency metrics, enabling finer-grained observability in dashboards.
- Refined latency histogram buckets for tool requests to improve measurement accuracy.
Refactor
- Centralized API call timing through a common wrapper for consistent latency tracking.
- Reorganized imports to use a dedicated metrics module; no runtime behavior changes.
Chores
- Streamlined metrics module exposure and cleaned up redundant public exports to simplify usage.

Ideally this should be nearly the whole request, but if there is a difference we want to know about it. Also skew the buckets toward the smaller end since we don't expect many 10s or longer requests https://issues.redhat.com/browse/MGMT-21387

openshift-ci-robot · 2025-08-15T13:44:17Z

@carbonin: This pull request references MGMT-21387 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.20.0" version, but no target version was set.

Details

In response to this:

Ideally this should be nearly the whole request, but if there is a difference we want to know about it.

Also skew the buckets toward the smaller end since we don't expect many 10s or longer requests

https://issues.redhat.com/browse/MGMT-21387

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai · 2025-08-15T13:44:20Z

Walkthrough

Consolidates metrics into a dedicated metrics package, updates histogram buckets, adds API_CALL_LATENCY, centralizes API call timing via a new internal _api_call wrapper, and adjusts imports accordingly without changing public method signatures or runtime behavior.

Changes

Cohort / File(s)	Summary
Metrics package exports `metrics/__init__.py`	Adds package initializer that re-exports initiate_metrics, metrics, track_tool_usage via all.
Metrics definitions and buckets `metrics/metrics.py`	Refines MCP tool request histogram buckets; introduces API_CALL_LATENCY histogram labeled by api_method.
Server import adjustments `server.py`	Switches metrics imports from service_client to metrics; imports log from service_client.logger explicitly.
Service client public API surface `service_client/__init__.py`	Removes metrics-related re-exports; all now only includes InventoryClient and log.
InventoryClient API timing wrapper `service_client/assisted_service_api.py`	Adds private _api_call to wrap asyncio.to_thread calls and time via API_CALL_LATENCY; updates all API methods to use the wrapper; public signatures unchanged.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant InventoryClient
  participant _api_call
  participant SDK_Function
  participant Metrics

  Caller->>InventoryClient: get_cluster(...)/list_clusters(...)/...
  InventoryClient->>_api_call: _api_call(func, *args)
  _api_call->>Metrics: API_CALL_LATENCY.labels(api_method).time()
  _api_call->>SDK_Function: asyncio.to_thread(func, *args)
  SDK_Function-->>_api_call: result
  _api_call-->>InventoryClient: result
  InventoryClient-->>Caller: result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15–20 minutes

Possibly related PRs

Implement Prometheus metrics endpoint #35: Earlier metrics instrumentation and track_tool_usage groundwork related to current metrics consolidation.
initiate metrics #37: Changes around initiate_metrics and server wiring that align with this refactor to the metrics package.

Suggested labels

size/L, ok-to-test

Suggested reviewers

jhernand

Poem

A whisk of whiskers, timers set—tick-tock!
I hop through calls, I measure every clock.
New buckets brim, API beats align,
A metric nibble, latency so fine.
Carrot in paw, I chart the line—
Hop, track, report—our service shines! 🥕🐇

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a78492e and 3c157d2.

📒 Files selected for processing (5)

metrics/__init__.py (1 hunks)
metrics/metrics.py (1 hunks)
server.py (1 hunks)
service_client/__init__.py (1 hunks)
service_client/assisted_service_api.py (19 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (2)

server.py (2)

service_client/assisted_service_api.py (1)

InventoryClient (25-536)

metrics/metrics.py (3)

metrics (69-71)

track_tool_usage (51-65)

initiate_metrics (44-48)

service_client/assisted_service_api.py (1)

metrics/metrics.py (1)

metrics (69-71)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Red Hat Konflux / assisted-service-mcp-saas-main-on-pull-request

🔇 Additional comments (7)

service_client/__init__.py (1)

11-11: LGTM! Clean separation of concerns.

The removal of metrics-related exports aligns with the architectural refactoring to consolidate metrics functionality into a dedicated package. This improves modularity and follows the single responsibility principle.

server.py (1)

19-21: LGTM! Import reorganization aligns with metrics consolidation.

The import changes successfully implement the architectural refactoring:

Metrics-related functionality (metrics, track_tool_usage, initiate_metrics) now imported from the dedicated metrics package

log explicitly imported from service_client.logger for clarity

InventoryClient remains the primary export from service_client

This improves separation of concerns and makes the dependency structure more explicit.

metrics/__init__.py (1)

1-10: LGTM! Clean public API establishment for metrics package.

The module properly establishes a public API by:

Adding clear documentation about the metrics functionality

Re-exporting the core metrics functions (initiate_metrics, metrics, track_tool_usage)

Using __all__ to explicitly define the public interface

This enables clean imports from external modules while maintaining encapsulation.

metrics/metrics.py (1)

32-41: LGTM! Histogram bucket adjustments align with PR objectives.

The changes successfully implement the PR requirements:

Updated REQUEST_LATENCY buckets (Line 32): Changed from (0.1, 1.0, 10.0, 30.0, float("inf")) to (0.05, 0.1, 0.5, 1.0, 5.0, 10.0, float("inf")) - properly skewed toward smaller values as requested

New API_CALL_LATENCY histogram (Lines 36-41): Added with appropriate configuration for tracking assisted service API call durations, using the same optimized bucket structure

The bucket configuration correctly addresses the PR objective of skewing toward smaller values since long requests (10s+) are not expected, with most requests likely falling in the sub-second range.

service_client/assisted_service_api.py (3)

11-11: LGTM! Clean imports for generic typing and metrics integration.

The additions properly support the new _api_call wrapper:

Callable and TypeVar for generic method signature support

API_CALL_LATENCY import from the metrics package for latency instrumentation

T = TypeVar("T") for generic return type preservation

Also applies to: 20-22

46-62: LGTM! Centralized API call wrapper with proper latency tracking.

The _api_call method provides excellent centralization of concerns:

Strengths:

Consistent latency tracking: Uses API_CALL_LATENCY with method-specific labels

Preserved typing: Generic T return type maintains type safety

Thread safety: Properly uses asyncio.to_thread for synchronous API calls

Error handling: Maintains existing exception handling through decorators

Implementation details:

Method name extraction via getattr(func, "__name__", "unknown_api_method") provides meaningful metrics labels

Context manager pattern ensures accurate timing measurement

Generic typing preserves API contract integrity

This approach successfully centralizes latency instrumentation while maintaining the existing public API signatures and behavior.

137-143: LGTM! Consistent API call instrumentation across all methods.

All public API methods have been systematically updated to use the centralized _api_call wrapper, which provides:

Benefits:

Uniform latency tracking: Every assisted service API call is now properly instrumented

Preserved functionality: All methods maintain their original signatures and behavior

Consistent error handling: The @sanitize_exceptions decorators continue to work as expected

Type safety: Return types are preserved through the generic _api_call wrapper

Coverage verification: The instrumentation covers all major API operations:

Cluster management (get, list, create, update, install)

Infrastructure environments (get, list, create, update, download URLs)

Events (cluster and host events)

Versions and operators (list, get, add bundles)

Host management (update)

Credentials (presigned URLs)

This comprehensive instrumentation aligns perfectly with the PR objective to "capture metrics representing nearly the whole request body for assisted service API calls."

Also applies to: 154-156, 190-200, 214-218, 232-239, 274-278, 298-303, 321-327, 358-364, 378-382, 398-403, 414-416, 433-433, 469-477, 499-509, 528-536

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

openshift-ci · 2025-08-15T13:44:20Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: carbonin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [carbonin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot · 2025-08-15T13:46:39Z

@carbonin: This pull request references MGMT-21387 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.20.0" version, but no target version was set.

Details

In response to this:

Ideally this should be nearly the whole request, but if there is a difference we want to know about it.

Also skew the buckets toward the smaller end since we don't expect many 10s or longer requests

https://issues.redhat.com/browse/MGMT-21387

Summary by CodeRabbit

New Features

Added detailed per-method API latency metrics, enabling finer-grained observability in dashboards.

Refined latency histogram buckets for tool requests to improve measurement accuracy.

Refactor

Centralized API call timing through a common wrapper for consistent latency tracking.

Reorganized imports to use a dedicated metrics module; no runtime behavior changes.

Chores

Streamlined metrics module exposure and cleaned up redundant public exports to simplify usage.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

carbonin · 2025-08-15T14:34:34Z

/retest

eranco74

/lgtm

carbonin added 2 commits August 15, 2025 09:42

move metrics out of service_client

83631c0

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 15, 2025

openshift-ci Bot requested review from eranco74 and omertuc August 15, 2025 13:44

openshift-ci Bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 15, 2025

eranco74 reviewed Aug 17, 2025

View reviewed changes

openshift-ci Bot assigned eranco74 Aug 17, 2025

openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Aug 17, 2025

openshift-merge-bot Bot merged commit 8463fc5 into openshift-assisted:master Aug 17, 2025
15 checks passed

maorfr mentioned this pull request Aug 18, 2025

copy metrics to container #71

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MGMT-21387: Add metrics for just the assisted service API calls#67

MGMT-21387: Add metrics for just the assisted service API calls#67
openshift-merge-bot[bot] merged 2 commits intoopenshift-assisted:masterfrom
carbonin:assisted-service-metrics

carbonin commented Aug 15, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

openshift-ci-robot commented Aug 15, 2025 •

edited by openshift-ci Bot

Loading

Uh oh!

coderabbitai Bot commented Aug 15, 2025 •

edited

Loading

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Status, Documentation and Community

Uh oh!

openshift-ci Bot commented Aug 15, 2025

Uh oh!

openshift-ci-robot commented Aug 15, 2025 •

edited by openshift-ci Bot

Loading

Summary by CodeRabbit

Uh oh!

carbonin commented Aug 15, 2025

Uh oh!

eranco74 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

carbonin commented Aug 15, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

openshift-ci-robot commented Aug 15, 2025 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Aug 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

openshift-ci Bot commented Aug 15, 2025

Uh oh!

openshift-ci-robot commented Aug 15, 2025 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

carbonin commented Aug 15, 2025

Uh oh!

eranco74 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

carbonin commented Aug 15, 2025 •

edited by coderabbitai Bot

Loading

openshift-ci-robot commented Aug 15, 2025 •

edited by openshift-ci Bot

Loading

coderabbitai Bot commented Aug 15, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

openshift-ci-robot commented Aug 15, 2025 •

edited by openshift-ci Bot

Loading