
chore: add metrics and logs to debug db replica usage#3606

Merged
Flo4604 merged 3 commits into main from 07-16-chore_add_metrics_and_logs_to_debug_db_replica_usage
Jul 16, 2025

Conversation

@chronark (Collaborator) commented Jul 16, 2025

What does this PR do?

This PR adds Prometheus metrics for database operations to track performance and usage patterns. It instruments all database operations (exec, prepare, query, query_row, begin) to measure latency and count total operations with labels for replica type (rw/ro), operation type, and status (success/error). The PR also adds logging to indicate whether the database is configured with a separate read replica.

Fixes #

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Chore (refactoring code, technical debt, workflow improvements)
  • Enhancement (small improvements)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How should this be tested?

  • Verify that Prometheus metrics are being collected for database operations by checking the metrics endpoint
  • Test database operations with both primary and read replica configurations to ensure metrics are properly labeled
  • Verify that log messages appear indicating whether a read replica is configured

Checklist

Required

  • Filled out the "How to test" section in this PR
  • Read Contributing Guide
  • Self-reviewed my own code
  • Commented on my code in hard-to-understand areas
  • Ran pnpm build
  • Ran pnpm fmt
  • Checked for warnings, there are none
  • Removed all console.logs
  • Merged the latest changes from main onto my branch with git pull origin main
  • My changes don't cause any responsiveness issues

Appreciated

  • If a UI change was made: Added a screen recording or screenshots to this PR
  • Updated the Unkey Docs if changes were necessary

Summary by CodeRabbit

  • New Features

    • Added Prometheus metrics to monitor database operation latency and total operation counts, with detailed labels for replica type, operation type, and status.
    • Introduced new metrics for improved visibility into database performance and error rates.
  • Other Improvements

    • Enhanced logging during database initialization to indicate whether a separate read replica is configured.


vercel bot commented Jul 16, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

2 Skipped Deployments
| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| dashboard | ⬜️ Ignored (Inspect) | Visit Preview | | Jul 16, 2025 2:39pm |
| engineering | ⬜️ Ignored (Inspect) | Visit Preview | | Jul 16, 2025 2:39pm |


changeset-bot bot commented Jul 16, 2025

⚠️ No Changeset found

Latest commit: ac3ffb0

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


Contributor

coderabbitai bot commented Jul 16, 2025

Walkthrough

The changes introduce Prometheus-based metrics for database operations, including latency and operation counts, by instrumenting the Replica struct's public methods. Informational logging is added to the database initialization process, and new metric collectors are defined for database monitoring in a new metrics source file.

Changes

| File(s) | Change Summary |
| --- | --- |
| go/pkg/db/database.go | Added informational logging to the database initialization regarding read replica configuration. |
| go/pkg/db/replica.go | Instrumented all public methods of Replica with Prometheus metrics for latency and operation counts. |
| go/pkg/prometheus/metrics/database.go | Added new metric collectors: DatabaseOperationLatency (histogram) and DatabaseOperationTotal (counter). |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Replica
    participant DB
    participant PrometheusMetrics

    Client->>Replica: ExecContext / QueryContext / etc.
    Replica->>PrometheusMetrics: Record start time
    Replica->>DB: Perform DB operation
    DB-->>Replica: Return result/error
    Replica->>PrometheusMetrics: Observe latency, increment count (with labels)
    Replica-->>Client: Return result/error
sequenceDiagram
    participant App
    participant DatabaseInit
    participant Logger

    App->>DatabaseInit: Initialize database (New)
    DatabaseInit->>Logger: Log replica configuration info
    DatabaseInit-->>App: Return database instance

Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@chronark chronark marked this pull request as ready for review July 16, 2025 09:56
@vercel vercel bot temporarily deployed to Preview – engineering July 16, 2025 09:57 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard July 16, 2025 09:57 Inactive
Contributor

github-actions bot commented Jul 16, 2025

Thank you for following the naming conventions for pull request titles! 🙏


graphite-app bot commented Jul 16, 2025

Video gif. An elderly man in suspenders smiles as he holds up two thumbs as if saying good luck.  (Added via Giphy)


graphite-app bot commented Jul 16, 2025

Graphite Automations

"Post a GIF when PR approved" took an action on this PR • (07/16/25)

1 gif was posted to this PR based on Andreas Thomas's automation.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0c8fd65 and faba70c.

📒 Files selected for processing (3)
  • go/pkg/db/database.go (1 hunks)
  • go/pkg/db/replica.go (6 hunks)
  • go/pkg/prometheus/metrics/database.go (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: chronark
PR: unkeyed/unkey#3420
File: go/pkg/hydra/store/gorm/gorm.go:486-498
Timestamp: 2025-07-02T11:51:58.572Z
Learning: The Hydra package (go/pkg/hydra) is planned to be migrated from GORM to sqlc for database operations, which explains why raw SQL queries are acceptable in the current implementation.
🧬 Code Graph Analysis (2)
go/pkg/db/database.go (1)
go/pkg/otel/logging/interface.go (1)
  • Logger (11-116)
go/pkg/db/replica.go (1)
go/pkg/prometheus/metrics/database.go (2)
  • DatabaseOperationLatency (24-33)
  • DatabaseOperationTotal (42-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Build / Build
  • GitHub Check: Test Agent Local / test_agent_local
  • GitHub Check: Test Packages / Test ./packages/nextjs
  • GitHub Check: Test Packages / Test ./packages/hono
  • GitHub Check: Test Packages / Test ./apps/dashboard
  • GitHub Check: Test Packages / Test ./internal/clickhouse
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (7)
go/pkg/prometheus/metrics/database.go (1)

14-51: Well-designed metrics structure for database monitoring.

The metrics definitions follow Prometheus best practices with appropriate labels and metric types. The histogram for latency and counter for operations will provide valuable insights into database performance patterns.
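
The collector definitions in go/pkg/prometheus/metrics/database.go are not shown in this conversation. As a rough illustration of what a labeled counter plus a cumulative latency histogram records, here is a minimal stdlib-only sketch. The `labels`, `series`, and `registry` types, the label field names, and the bucket bounds are all assumptions made for the example; they are stand-ins, not the Prometheus client API and not the PR's code.

```go
package main

import (
	"fmt"
	"sync"
)

// labels mirrors the three label dimensions named in the PR description.
// The field names are assumptions; the real label names are not shown here.
type labels struct {
	Replica   string // "rw" or "ro"
	Operation string // "exec", "prepare", "query", "query_row", "begin"
	Status    string // "success" or "error"
}

// series holds the operation count and cumulative histogram for one label set.
type series struct {
	count   uint64
	sum     float64
	buckets []uint64 // buckets[i] counts observations <= bounds[i]
}

// registry is a hypothetical stand-in for the two Prometheus collectors.
type registry struct {
	mu     sync.Mutex
	bounds []float64 // bucket upper bounds in seconds, ascending
	data   map[labels]*series
}

func newRegistry(bounds []float64) *registry {
	return &registry{bounds: bounds, data: make(map[labels]*series)}
}

// observe counts one operation and adds its latency to every bucket
// whose upper bound is at least the observed value.
func (r *registry) observe(l labels, seconds float64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	s, ok := r.data[l]
	if !ok {
		s = &series{buckets: make([]uint64, len(r.bounds))}
		r.data[l] = s
	}
	s.count++
	s.sum += seconds
	for i, bound := range r.bounds {
		if seconds <= bound {
			s.buckets[i]++
		}
	}
}

// snapshot returns the count and a copy of the buckets for one label set.
func (r *registry) snapshot(l labels) (uint64, []uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	s, ok := r.data[l]
	if !ok {
		return 0, make([]uint64, len(r.bounds))
	}
	return s.count, append([]uint64(nil), s.buckets...)
}

func main() {
	reg := newRegistry([]float64{0.005, 0.05, 0.5})
	ro := labels{Replica: "ro", Operation: "query", Status: "success"}
	reg.observe(ro, 0.003)
	reg.observe(ro, 0.020)
	count, buckets := reg.snapshot(ro)
	fmt.Println(count, buckets)
}
```

Keeping replica, operation, and status as separate label values is what lets a dashboard split read-only traffic from read-write traffic, which is the stated goal of the PR.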

go/pkg/db/database.go (1)

77-80: Helpful operational logging for database configuration.

These log messages provide clear visibility into the database configuration, which is valuable operational context for understanding the metrics that will be collected from read-write vs read-only replicas.
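
The exact log lines added in database.go are not quoted here. A minimal sketch of the idea, assuming a hypothetical `config` struct with `PrimaryDSN` and `ReadOnlyDSN` fields and using the standard `log/slog` package in place of the repository's own Logger interface, might look like this:

```go
package main

import (
	"log/slog"
	"os"
)

// config is a hypothetical shape; the real configuration fields
// are not shown in this conversation.
type config struct {
	PrimaryDSN  string
	ReadOnlyDSN string // empty means no separate read replica
}

// replicaMode reports whether a separate read replica is configured
// and returns the message to log at initialization.
func replicaMode(c config) (separate bool, msg string) {
	if c.ReadOnlyDSN == "" {
		return false, "no read replica configured, reads will use the primary"
	}
	return true, "separate read replica configured"
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
	separate, msg := replicaMode(config{PrimaryDSN: "primary-dsn"})
	logger.Info(msg, "separate_read_replica", separate)
}
```

Logging this once at startup gives context for reading the metrics: if no separate replica is configured, traffic labeled as read-only is still served by the primary.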

go/pkg/db/replica.go (5)

35-48: Consistent and well-implemented metrics instrumentation.

The metrics instrumentation follows a clean pattern: capture start time, execute operation, determine status, and record metrics. This provides accurate latency measurements and operation counts.
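
The four steps named above (capture start time, execute, determine status, record) can be sketched as a small helper. This is an illustration under assumptions, not the code in replica.go: `recorder` is a hypothetical callback standing in for the histogram and counter calls, and `instrument` and `errExample` are names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// recorder is a hypothetical hook standing in for observing the latency
// histogram and incrementing the operation counter.
type recorder func(replica, operation, status string, seconds float64)

// errExample is a sample failure used only in this sketch.
var errExample = errors.New("connection refused")

// instrument runs op, derives the status label from the returned error,
// and reports the elapsed time together with the labels.
func instrument(rec recorder, replica, operation string, op func() error) error {
	start := time.Now()
	err := op()
	status := "success"
	if err != nil {
		status = "error"
	}
	rec(replica, operation, status, time.Since(start).Seconds())
	return err
}

func main() {
	rec := func(replica, operation, status string, seconds float64) {
		fmt.Printf("replica=%s operation=%s status=%s seconds=%.6f\n",
			replica, operation, status, seconds)
	}
	_ = instrument(rec, "rw", "exec", func() error { return nil })
	_ = instrument(rec, "ro", "query", func() error { return errExample })
}
```

The error is returned unchanged, so wrapping a call this way does not alter what the caller sees.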


113-124: Consider potential limitation with QueryRowContext error tracking.

The QueryRowContext method always records operations as "success" because it returns a *sql.Row rather than an error. Any SQL error is deferred until Scan is called on that row, so failed queries are counted as successes and the success rate in these metrics can be misleading.

Consider whether this is acceptable for your monitoring needs, or if you need a different approach to track actual query failures for QueryRowContext operations. You might want to:

  1. Accept this limitation and document it
  2. Implement a wrapper that tracks scan errors
  3. Use a different status like "completed" instead of "success" for QueryRowContext

What's your preference for handling this scenario?
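
Option 2 could be sketched roughly as follows. This is one possible shape, not something the PR implements: `trackedRow`, `rowScanner`, `fakeRow`, `errScan`, and the `record` hook are hypothetical names, and counting `sql.ErrNoRows` as a success is a judgment call made for the example.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// rowScanner is the subset of *sql.Row that the wrapper needs.
type rowScanner interface {
	Scan(dest ...any) error
}

// Compile-time check that *sql.Row satisfies rowScanner.
var _ rowScanner = (*sql.Row)(nil)

// errScan is a sample failure used only in this sketch.
var errScan = errors.New("scan failed")

// trackedRow defers the status decision until Scan reports the outcome.
type trackedRow struct {
	row    rowScanner
	record func(status string) // hypothetical hook, e.g. a counter increment
}

// Scan forwards to the underlying row and records the resulting status.
func (t trackedRow) Scan(dest ...any) error {
	err := t.row.Scan(dest...)
	switch {
	case err == nil, errors.Is(err, sql.ErrNoRows):
		// An empty result is treated as a successful query here.
		t.record("success")
	default:
		t.record("error")
	}
	return err
}

// fakeRow simulates a row whose Scan returns a fixed error.
type fakeRow struct{ err error }

func (f fakeRow) Scan(dest ...any) error { return f.err }

func main() {
	record := func(status string) { fmt.Println("query_row status:", status) }
	_ = trackedRow{row: fakeRow{}, record: record}.Scan()
	_ = trackedRow{row: fakeRow{err: sql.ErrNoRows}, record: record}.Scan()
	_ = trackedRow{row: fakeRow{err: errScan}, record: record}.Scan()
}
```

The trade-off is that the method would return a wrapper instead of a *sql.Row, so callers that depend on the concrete type would need to change.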


61-76: Metrics instrumentation follows consistent pattern.

The implementation correctly measures operation latency and tracks success/error status for the prepare operation.


87-102: Proper metrics tracking for query operations.

The query operation metrics are implemented correctly with accurate timing and status determination.


135-150: Transaction begin operations properly instrumented.

The Begin method correctly tracks transaction start latency and success/error status.

@vercel vercel bot temporarily deployed to Preview – engineering July 16, 2025 13:35 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard July 16, 2025 13:35 Inactive
@Flo4604 Flo4604 added this pull request to the merge queue Jul 16, 2025
Merged via the queue into main with commit be3f201 Jul 16, 2025
17 of 18 checks passed
@Flo4604 Flo4604 deleted the 07-16-chore_add_metrics_and_logs_to_debug_db_replica_usage branch July 16, 2025 16:07
2 participants