feat: api sends metrics and traces to local otel#2961

Merged
chronark merged 3 commits into main from local-otel
Mar 14, 2025

Conversation

@chronark
Collaborator

@chronark chronark commented Mar 13, 2025

Summary by CodeRabbit

  • New Features

    • Introduced comprehensive observability capabilities including distributed tracing, metrics collection, and log management.
    • Rolled out preconfigured monitoring dashboards and data sources for streamlined system oversight.
    • Enhanced HTTP request metrics reporting within the middleware for better observability.
  • Refactor

    • Optimized service initialization and error handling to enhance overall performance and reliability.
    • Upgraded HTTP metrics tracking for improved operational insights.

@vercel

vercel bot commented Mar 13, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| engineering | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Mar 14, 2025 7:42am |
| play | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Mar 14, 2025 7:42am |
| www | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Mar 14, 2025 7:42am |

1 Skipped Deployment

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| dashboard | ⬜️ Ignored (Inspect) | Visit Preview | | Mar 14, 2025 7:42am |

@coderabbitai
Contributor

coderabbitai bot commented Mar 13, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

This pull request introduces several new deployment configurations and service definitions aimed at enhancing observability and tracing. New services such as Tempo, OpenTelemetry Collector, Prometheus, and Loki are defined in the docker-compose file along with updates to environment variables and volumes. Additional configuration files for Grafana datasources, Loki, Prometheus, Tempo, and the OpenTelemetry Collector are provided. The PR also refines the initialization and metrics registration in various Go modules by updating method signatures, error handling, and introducing new metrics interfaces and observable types, while removing deprecated configurations.

Changes

| File(s) | Change Summary |
| --- | --- |
| deployment/docker-compose.yaml | Added new services (tempo, otel-collector, prometheus, loki), updated environment variables for apiv2, added volumes, and removed commented-out sections. |
| deployment/grafana/provisioning/datasources/datasources.yaml | Introduced new Grafana datasource configuration defining Prometheus, Tempo, and Loki with respective connection settings. |
| deployment/loki/config.yaml, deployment/otel-collector-config.yaml, deployment/prometheus/config.yaml, deployment/tempo/config.yaml | Added new configuration files for Loki, OpenTelemetry Collector, Prometheus, and Tempo with server, scrape, and pipeline settings. |
| deployment/prometheus/prometheus.yaml | Removed legacy Prometheus configuration file. |
| go/apps/api/run.go, go/pkg/otel/grafana.go | Streamlined Grafana initialization via OpenTelemetry; updated function signatures, removed intermediate variables, and added new configuration fields (NodeID, CloudRegion). |
| go/pkg/cache/cache.go, go/pkg/cache/cache_test.go, go/pkg/cache/simulation_test.go | Enhanced error handling in cache initialization; updated metric registration methods and corresponding tests to capture and validate errors. |
| go/pkg/clickhouse/client.go | Removed the crypto/tls import and disabled TLS configuration by commenting out related code. |
| go/pkg/cluster/cluster.go | Introduced a new registerMetrics method to report cluster member counts via OpenTelemetry metrics. |
| go/pkg/membership/serf.go | Added a new parseHost function to resolve hostnames to IP addresses for membership configuration. |
| go/pkg/otel/metrics/interface.go, go/pkg/otel/metrics/metrics.go, go/pkg/otel/metrics/observable.go | Added metrics interfaces and refactored implementations using observable gauges; introduced a new Cluster metric for size monitoring. |
| go/pkg/zen/middleware_metrics.go | Integrated HTTP middleware with observability by reporting request metrics with attributes such as host, method, path, and status. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Telemetry Client
    participant Collector as OTLP Collector
    participant Batch as Batch Processor
    participant Tempo as Tempo Exporter
    participant Prom as Prometheus Exporter
    participant Debug as Debug Exporter

    Client->>Collector: Send Telemetry Data (traces/metrics/logs)
    Collector->>Batch: Receive via OTLP (HTTP/gRPC)
    Batch->>Collector: Process data in batches
    alt Traces Pipeline
        Collector->>Tempo: Export traces
        Collector->>Debug: Export traces for debugging
    else Metrics Pipeline
        Collector->>Prom: Export metrics
        Collector->>Debug: Export metrics for debugging
    else Logs Pipeline
        Collector->>Debug: Export logs for debugging
    end
sequenceDiagram
    participant API as API Runner
    participant Otel as otel.InitGrafana
    participant Resource as Resource Setup
    participant Exporter as Exporter Setup
    participant Shutdown as Shutdown Manager

    API->>Otel: Initialize Grafana with config (including NodeID, CloudRegion)
    Otel->>Resource: Create and configure resource attributes
    Otel->>Exporter: Set up trace and metric exporters (with compression, insecure options)
    Exporter->>Shutdown: Register shutdown callbacks
    Otel-->>API: Return error status (if any)

Possibly related PRs

  • feat: use otel #2901: Involves similar changes for OpenTelemetry integration and enhancements in observability setups with modifications to InitGrafana.
  • feat: api v2 init #2832: Introduces a new apiv2 service in the docker-compose, directly modifying the same configuration structure.

Suggested reviewers

  • mcstepp
  • ogzhanolguncu
  • perkinsjr
  • MichaelUnkey

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 golangci-lint (1.62.2)

Error: can't load config: the Go language version (go1.23) used to build golangci-lint is lower than the targeted Go version (1.24.0)
Failed executing command with error: can't load config: the Go language version (go1.23) used to build golangci-lint is lower than the targeted Go version (1.24.0)

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1047bb4 and 3e56843.

📒 Files selected for processing (1)
  • go/pkg/otel/metrics/metrics.go (9 hunks)

@github-actions
Contributor

github-actions bot commented Mar 13, 2025

Thank you for following the naming conventions for pull request titles! 🙏

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

🧹 Nitpick comments (11)
deployment/prometheus/config.yaml (1)

1-20: Configure scrape settings based on application needs.

The configuration looks good for setting up Prometheus to scrape metrics from the different services. However, consider if the default 15s scrape interval is appropriate for your use case - shorter intervals provide more granular data but increase load.

Also consider adding job-specific scrape intervals if different components have different requirements:

  - job_name: "prometheus"
+   scrape_interval: 15s
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "tempo"
+   scrape_interval: 30s  # Adjust based on your needs
    static_configs:
      - targets: ["tempo:3200"]
go/pkg/membership/serf.go (1)

101-113: Well-implemented hostname resolution function

The parseHost function correctly resolves hostnames to IP addresses and includes proper error handling. Nice work!

One minor consideration: the function always selects the first IP address from the resolved list. In environments where a host resolves to multiple IP addresses, it might not always be the correct one to use (though this is typically fine for Docker Compose environments).

You might consider adding a comment explaining why you're selecting the first address or enhancing the function to be more selective about which IP address to use if that becomes necessary in the future.

go/pkg/otel/metrics/observable.go (1)

1-17: Clean implementation of the Int64ObservableGauge interface

The implementation is well-structured and follows OpenTelemetry's patterns for observable metrics. The compile-time assertion var _ Int64ObservableGauge = (*int64ObservableGauge)(nil) is a good practice: it verifies at build time that the type implements the interface.

Since the int64ObservableGauge struct is unexported but implements an exported interface, consider adding a factory function to create instances of this type, like:

// NewInt64ObservableGauge creates a new Int64ObservableGauge with the given meter, name, and options
func NewInt64ObservableGauge(meter metric.Meter, name string, opts ...metric.Int64ObservableGaugeOption) Int64ObservableGauge {
    return &int64ObservableGauge{
        m:    meter,
        name: name,
        opts: opts,
    }
}
go/pkg/otel/grafana.go (2)

93-97: Consider making TLS verification configurable.

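The code uses WithInsecure(), which disables client transport security entirely (plain HTTP instead of HTTPS). While this is commented as being for local development, it could pose a security risk in production environments. Because WithInsecure() takes no arguments, the option has to be appended conditionally (opts below is an assumed name for a slice of exporter options):

- otlptracehttp.WithInsecure(), // For local development
+ if config.GrafanaEndpoint == "localhost" || strings.HasPrefix(config.GrafanaEndpoint, "127.0.0.1") {
+     opts = append(opts, otlptracehttp.WithInsecure()) // Only disable TLS for local endpoints
+ }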

Alternatively, add a boolean flag to the Config struct to explicitly control this behavior.


120-124: Consider making TLS verification configurable.

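Similar to the trace exporter, the metric exporter uses WithInsecure(), which should be made configurable based on the environment. Because WithInsecure() takes no arguments, the option has to be appended conditionally (opts below is an assumed name for a slice of exporter options):

- otlpmetrichttp.WithInsecure(), // For local development
+ if config.GrafanaEndpoint == "localhost" || strings.HasPrefix(config.GrafanaEndpoint, "127.0.0.1") {
+     opts = append(opts, otlpmetrichttp.WithInsecure()) // Only disable TLS for local endpoints
+ }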
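
A slightly more robust version of the endpoint check might look like the sketch below. The helper name isLocalEndpoint is hypothetical and not part of the PR; it uses only the standard library and leaves the actual exporter wiring to the caller.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// isLocalEndpoint reports whether endpoint ("host" or "host:port") refers to
// the local machine. It is a hypothetical helper, not part of the PR.
func isLocalEndpoint(endpoint string) bool {
	host := endpoint
	// SplitHostPort fails when no port is present; fall back to the raw value.
	if h, _, err := net.SplitHostPort(endpoint); err == nil {
		host = h
	}
	if strings.EqualFold(host, "localhost") {
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	for _, ep := range []string{"localhost:4318", "127.0.0.1:4318", "otel-collector:4318"} {
		// A caller would append the exporter's WithInsecure() option only
		// when this returns true.
		fmt.Printf("%s insecure=%v\n", ep, isLocalEndpoint(ep))
	}
}
```

Note that a Docker Compose service name such as otel-collector:4318 is not a loopback address, so a heuristic like this would leave TLS enabled against the local collector. That is one argument for the explicit boolean flag on the Config struct rather than inferring the setting from the endpoint.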
deployment/docker-compose.yaml (6)

173-186: Pin the OpenTelemetry Collector version.

Using otel/opentelemetry-collector-contrib:latest may introduce unpredictability if the image changes upstream. Pin the image to a specific version to ensure consistent builds.

-    image: otel/opentelemetry-collector-contrib:latest
+    image: otel/opentelemetry-collector-contrib:<pinned-version>

187-199: Consider pinned Prometheus version and retention settings.

“latest” can cause unexpected upgrades. Also, you might want to specify a retention period (e.g., --storage.tsdb.retention.time=7d) to manage disk usage.


200-212: Secure your Grafana admin credentials and pin its version.

Storing admin credentials in plain text is fine for local testing, but for production, consider environment variable management or secrets. Also, pinning Grafana to a known version prevents unexpected upgrades.


213-223: Pin the Tempo version.

Using grafana/tempo:latest may introduce instability when new versions roll out. Pin a specific version to maintain consistent behavior.


224-233: Pin the Loki version.

Similarly, consider pinning grafana/loki:latest to a stable version to avoid unintended breakage.


237-238: Review volumes for data retention strategy.

tempo and loki volumes will grow over time. Confirm whether you need a retention policy or an automatic cleanup approach in your local environment for these new volumes.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75e10f6 and 06bcdc9.

📒 Files selected for processing (19)
  • deployment/docker-compose.yaml (3 hunks)
  • deployment/grafana/provisioning/datasources/datasources.yaml (1 hunks)
  • deployment/loki/config.yaml (1 hunks)
  • deployment/otel-collector-config.yaml (1 hunks)
  • deployment/prometheus/config.yaml (1 hunks)
  • deployment/prometheus/prometheus.yaml (0 hunks)
  • deployment/tempo/config.yaml (1 hunks)
  • go/apps/api/run.go (1 hunks)
  • go/pkg/cache/cache.go (2 hunks)
  • go/pkg/cache/cache_test.go (6 hunks)
  • go/pkg/cache/simulation_test.go (1 hunks)
  • go/pkg/clickhouse/client.go (1 hunks)
  • go/pkg/cluster/cluster.go (3 hunks)
  • go/pkg/membership/serf.go (3 hunks)
  • go/pkg/otel/grafana.go (3 hunks)
  • go/pkg/otel/metrics/interface.go (1 hunks)
  • go/pkg/otel/metrics/metrics.go (9 hunks)
  • go/pkg/otel/metrics/observable.go (1 hunks)
  • go/pkg/zen/middleware_metrics.go (3 hunks)
💤 Files with no reviewable changes (1)
  • deployment/prometheus/prometheus.yaml
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Test Packages / Test ./internal/clickhouse
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Test Agent Local / test_agent_local
  • GitHub Check: Build / Build
  • GitHub Check: autofix
🔇 Additional comments (31)
deployment/grafana/provisioning/datasources/datasources.yaml (1)

1-20: LGTM! Good observability stack setup.

The Grafana datasources configuration correctly sets up Prometheus, Tempo, and Loki as data sources, which is essential for the metrics and traces implementation. Prometheus is appropriately set as the default datasource.

go/pkg/cache/simulation_test.go (1)

108-116: LGTM! Improved error handling.

Good improvement to handle errors from the cache initialization rather than letting potential errors propagate silently. This makes the test more robust.

deployment/tempo/config.yaml (1)

1-24: Nice configuration for local Tempo setup!

The configuration is well structured with appropriate sections for server, distributor, storage, ingester, and compactor. The OTLP receiver endpoints are properly configured for both HTTP and gRPC protocols on standard ports.

Note that using /tmp/tempo for storage means trace data will be lost on system restarts. While this is acceptable for a local development environment, a more persistent storage location would be needed for any production-like environments.

go/pkg/otel/metrics/interface.go (1)

1-15: Well-designed interfaces for OpenTelemetry metrics

These interfaces provide a clean abstraction over the OpenTelemetry metrics API. The Int64Counter and Int64ObservableGauge interfaces are minimal and focused on their specific responsibilities, which follows good interface design principles.

go/pkg/membership/serf.go (1)

49-53: Good improvement for hostname resolution

Adding hostname resolution is a good enhancement for Docker Compose environments where containers often have dynamic hostnames that need to be resolved to IP addresses.

go/pkg/cluster/cluster.go (3)

10-13: LGTM! Appropriate imports for the metrics system.

The added imports provide all the necessary dependencies for the metrics implementation, including the custom metrics package and OpenTelemetry's attribute and metric packages.


55-58: Good error handling for metrics registration.

Registering metrics during cluster initialization with proper error propagation ensures that metrics are correctly set up before the cluster begins operating.


92-109: Well-implemented metrics registration.

The registerMetrics method:

  1. Properly registers a callback with the metrics system
  2. Reports the cluster size by counting members
  3. Includes the node ID as an attribute for better observability
  4. Handles errors appropriately both in registration and during execution

This implementation aligns well with the PR objective of sending metrics to OpenTelemetry.
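
The callback-based reporting described in this comment can be illustrated with a stripped-down, standard-library-only model. The gauge, observer, and cluster types below are simplified stand-ins invented for this sketch; the real code uses the OpenTelemetry metric API and the PR's own metrics package.

```go
package main

import (
	"errors"
	"fmt"
)

// observer receives a value plus attributes at collection time. It loosely
// mirrors the role of an OpenTelemetry observer; the type is illustrative.
type observer func(value int64, attrs map[string]string)

// gauge is a hypothetical observable gauge that stores registered callbacks.
type gauge struct {
	callbacks []func(observer) error
}

// RegisterCallback stores cb so it runs on every collection.
func (g *gauge) RegisterCallback(cb func(observer) error) error {
	if cb == nil {
		return errors.New("callback must not be nil")
	}
	g.callbacks = append(g.callbacks, cb)
	return nil
}

// Collect runs every callback, as a metrics reader would on each export cycle.
func (g *gauge) Collect(o observer) error {
	for _, cb := range g.callbacks {
		if err := cb(o); err != nil {
			return err
		}
	}
	return nil
}

// cluster is a stand-in for the real cluster type.
type cluster struct {
	nodeID  string
	members []string
}

// registerMetrics reports the current member count, tagged with the node ID.
func (c *cluster) registerMetrics(size *gauge) error {
	return size.RegisterCallback(func(o observer) error {
		o(int64(len(c.members)), map[string]string{"nodeID": c.nodeID})
		return nil
	})
}

func main() {
	size := &gauge{}
	c := &cluster{nodeID: "node_1", members: []string{"a", "b", "c"}}
	if err := c.registerMetrics(size); err != nil {
		panic(err)
	}
	_ = size.Collect(func(v int64, attrs map[string]string) {
		fmt.Println("cluster_size", v, attrs)
	})
}
```

Because the value is read inside the callback, each collection sees the member count at that moment, with no background goroutine needed to keep a stored value fresh.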

deployment/loki/config.yaml (1)

1-62: LGTM! Properly configured Loki for local development.

The configuration is suitable for a local development environment with:

  • Authentication disabled for simplicity
  • Server listening on standard ports (HTTP 3100, gRPC 9096)
  • Debug logging enabled
  • Local filesystem storage
  • Single-node configuration (replication factor 1)
  • In-memory key-value store for the ring
  • Embedded cache for query results
  • TSDB storage with a v13 schema

The anonymous usage reporting is properly documented with clear instructions on how to disable it if needed.

deployment/otel-collector-config.yaml (1)

1-44: LGTM! Well-configured OpenTelemetry Collector.

The configuration establishes a complete observability pipeline:

  1. Receivers: Properly configured to accept both HTTP and gRPC OTLP on standard ports
  2. Processors: Batch processing configured with appropriate limits
  3. Exporters: Well-defined exporters for:
    • Traces to Tempo
    • Metrics to Prometheus
    • Debug output for troubleshooting
  4. Pipelines: Properly structured for traces, metrics, and logs

The insecure TLS settings are acceptable for a local development environment.

go/pkg/zen/middleware_metrics.go (2)

10-13: LGTM! Appropriate imports for metrics integration.

The added imports provide all the necessary dependencies for implementing HTTP request metrics with OpenTelemetry.


65-70: LGTM! Well-implemented HTTP metrics collection.

The implementation:

  1. Counts HTTP requests with the OpenTelemetry counter
  2. Attaches useful attributes:
    • Host
    • HTTP method
    • Path
    • Status code

These metrics will provide valuable insights into API performance and usage patterns.

go/apps/api/run.go (1)

65-73: Good implementation of new OTel configuration fields.

The updated call to InitGrafana correctly implements the new function signature by:

  1. Adding the new NodeID and CloudRegion fields to enhance observability
  2. Passing the shutdowns instance directly, which aligns with the new approach to resource cleanup

This change improves service identification in metrics and traces, making it easier to troubleshoot issues across multiple service instances.

go/pkg/cache/cache_test.go (4)

18-26: Proper error handling for cache initialization.

The test has been correctly updated to capture and verify errors from the cache.New function.


36-45: Proper error handling for cache initialization.

The test has been correctly updated to capture and verify errors from the cache.New function.


59-68: Proper error handling for cache initialization.

The test has been correctly updated to capture and verify errors from the cache.New function.


83-91: Proper error handling for cache initialization.

The test has been correctly updated to capture and verify errors from the cache.New function.

go/pkg/cache/cache.go (2)

58-95: Improved error handling in cache initialization.

The New function now properly returns errors instead of panicking, which is a significant improvement in error handling. This change allows calling code to gracefully handle initialization failures, making the application more robust.
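
The difference for callers can be shown with a simplified constructor. Config, Cache, and the MaxSize validation below are stand-ins chosen for this sketch and do not reflect the real cache package's fields.

```go
package main

import (
	"errors"
	"fmt"
)

// Config and Cache are simplified stand-ins for the real cache types.
type Config struct {
	MaxSize int
}

type Cache struct {
	maxSize int
	data    map[string]string
}

// New validates cfg and returns an error instead of panicking, so the caller
// decides how to react to a failed initialization.
func New(cfg Config) (*Cache, error) {
	if cfg.MaxSize <= 0 {
		return nil, errors.New("cache: MaxSize must be positive")
	}
	return &Cache{maxSize: cfg.MaxSize, data: make(map[string]string)}, nil
}

func main() {
	if _, err := New(Config{MaxSize: 0}); err != nil {
		fmt.Println("initialization failed:", err)
	}
}
```

Tests can then assert on the returned error directly, which is what the updated cache tests in this PR do.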


98-144: Enhanced metrics registration using callbacks.

The refactored registerMetrics method (previously collectMetrics) improves the metrics collection approach by:

  1. Using callback registration instead of a ticker-based approach, which is more efficient
  2. Providing proper error handling for each metric registration
  3. Following OpenTelemetry best practices for observability

This implementation will better integrate with the new OpenTelemetry Collector service introduced in this PR.

go/pkg/otel/grafana.go (3)

25-31: Good addition of service identification fields.

The new NodeID and CloudRegion fields enhance observability by providing better context for metrics and traces. This makes it easier to:

  1. Distinguish between multiple instances of the same service
  2. Identify regional performance patterns or issues
  3. Correlate metrics across distributed services

78-90: Improved resource creation with enriched attributes.

The creation of a resource with common attributes including service name, version, instance ID, and region follows OpenTelemetry best practices. This provides richer context for all telemetry data.


104-146: Well-structured shutdown management.

The code correctly registers shutdown handlers for all created resources using the provided shutdowns instance. This is a cleaner approach than returning a slice of functions and ensures proper cleanup of resources during application termination.

deployment/docker-compose.yaml (2)

55-55: Confirm dependency on Tempo service.

Adding tempo as a dependency to apiv2 can block apiv2 from starting if Tempo is unavailable. Consider whether a hard dependency is needed or if a soft/fallback approach is possible.

Please verify whether apiv2 truly requires Tempo to be running at startup or if you can safely allow it to continue without blocking.


65-65: Clarify endpoint vs. comment.

The comment says “Point directly to Tempo,” but the value references otel-collector:4318. Verify whether the comment is correct or if the endpoint should indeed point to the collector.

go/pkg/otel/metrics/metrics.go (7)

24-24: No substantive changes detected.


39-39: Transition to custom Int64Counter looks fine.

The Requests metric remains a counter, consistent with measuring event frequency.


133-133: Appropriate use of gauge for “Size”.

Cache size is a point-in-time measurement that can fluctuate, making a gauge a suitable choice.


135-149: Appropriate introduction of a gauge for “Capacity”.

“Capacity” is also an instantaneous measurement, so using a gauge is appropriate for monitoring current available space.


151-167: New “Cluster” struct and gauge for “Size”.

Using a gauge to observe cluster node count is sensible for monitoring dynamic scaling.


240-246: Gauge usage for “cache_size”.

A dynamic measure of current cache occupancy is an appropriate gauge.


248-254: Gauge usage for “cluster_size”.

Measuring the number of nodes is a point-in-time value, so a gauge is suitable.

Comment on lines +68 to +71
// if opts.TLS == nil {
//
// opts.TLS = new(tls.Config)
// }

⚠️ Potential issue

Security concern: TLS configuration has been commented out.

The code that initializes TLS configuration for the ClickHouse connection has been commented out. While this might be intentional for a local development environment where TLS isn't required, it could pose security risks in production deployments.

Consider making this configuration conditional based on the environment rather than commenting it out:

-	//	if opts.TLS == nil {
-	//
-	//		opts.TLS = new(tls.Config)
-	//	}
+	if env != "local" && opts.TLS == nil {
+		opts.TLS = new(tls.Config)
+	}

Don't forget to add the crypto/tls import back if you restore this code.

@changeset-bot

changeset-bot bot commented Mar 13, 2025

⚠️ No Changeset found

Latest commit: 3e56843

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06bcdc9 and 1047bb4.

📒 Files selected for processing (4)
  • go/pkg/otel/metrics/doc.go (1 hunks)
  • go/pkg/otel/metrics/interface.go (1 hunks)
  • go/pkg/otel/metrics/metrics.go (9 hunks)
  • go/pkg/otel/metrics/observable.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • go/pkg/otel/metrics/doc.go
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Test Packages / Test ./internal/clickhouse
  • GitHub Check: Build / Build
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test Agent Local / test_agent_local
  • GitHub Check: autofix
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (18)
go/pkg/otel/metrics/interface.go (2)

9-21: Interfaces for counters look solid.

The documentation is thorough and properly clarifies concurrency safety for the Add method.


23-38: Observable interface design is clear.

The interface reflects typical usage in OpenTelemetry for registering callbacks. The concurrency note is helpful for guiding initialization usage.

go/pkg/otel/metrics/observable.go (2)

5-34: Gauge observable implementation looks correct.

The design matches the Int64Observable interface. The method doc clarifies concurrency constraints and intended usage, which is good practice.


36-66: Counter observable implementation is well structured.

Likewise, the counter version adequately ensures the callback is registered. The monotonic behavior note aligns with OpenTelemetry guidelines.

go/pkg/otel/metrics/metrics.go (14)

24-24: No change to review.


39-39: No significant comment.


51-57: No further concerns.


66-72: No further concerns.


81-87: No further concerns.


98-104: No further concerns.


127-133: No further concerns.


151-167: No issues identified.


200-205: Correct approach to use a counter for cache hits.


208-214: Correct approach to use a counter for cache misses.


216-222: Good to track cache writes with a counter.


224-230: Using a counter for evictions is appropriate.


240-245: Gauge usage for cache size is suitably chosen.


248-254: Gauge for cluster size is a good fit.

@chronark chronark enabled auto-merge March 14, 2025 07:41
@chronark chronark disabled auto-merge March 14, 2025 07:42
@chronark chronark merged commit 3cafc43 into main Mar 14, 2025
23 of 28 checks passed
@chronark chronark deleted the local-otel branch March 14, 2025 07:42