
feat: expose worker run options and wire Prometheus metrics by default #87

Merged
ankurs merged 4 commits into main from fix/wire-worker-metrics
May 1, 2026

Conversation

Member

@ankurs ankurs commented Apr 30, 2026

Summary

  • core.Run() was invoking workers.Run with no RunOptions, so workers fell back to BaseMetrics{} (no-op) and every worker_started_total, worker_panicked_total, worker_active_count, etc. silently stayed at zero in any service using CBWorkerProvider.
  • Adds core.AddWorkerRunOptions(opts ...workers.RunOption) for init-time configuration of run-level options (metrics, default jitter, run-level interceptors).
  • Prepends workers.WithMetrics(workers.NewPrometheusMetrics(AppName)) by default when APP_NAME is set and Prometheus is not disabled. A user-supplied WithMetrics overrides the default because workers.WithMetrics overwrites runConfig.metrics on each apply.
  • Empty APP_NAME skips the default to avoid ambiguous unprefixed metric names; users can still opt in by calling AddWorkerRunOptions(workers.WithMetrics(workers.NewPrometheusMetrics(""))) explicitly.
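The override rule in the summary can be sketched in isolation. This is a minimal model, not the workers package itself: `Metrics`, `noopMetrics`, `labelledMetrics`, `runConfig`, and `resolve` are stand-ins invented for illustration, and the real `workers.Metrics` interface has more methods than shown here.

```go
package main

import "fmt"

// Metrics is a stand-in for workers.Metrics (assumption: reduced to one method).
type Metrics interface {
	WorkerStarted(name string)
}

// noopMetrics plays the role of BaseMetrics{}: every call is discarded.
type noopMetrics struct{}

func (noopMetrics) WorkerStarted(string) {}

// labelledMetrics is a hypothetical implementation that only remembers
// which backend it represents, so the winner is easy to inspect.
type labelledMetrics struct{ backend string }

func (labelledMetrics) WorkerStarted(string) {}

// runConfig and RunOption model the functional-options shape.
type runConfig struct{ metrics Metrics }

type RunOption func(*runConfig)

// WithMetrics overwrites the metrics field every time it is applied,
// so the last WithMetrics in the option list wins.
func WithMetrics(m Metrics) RunOption {
	return func(c *runConfig) { c.metrics = m }
}

// resolve applies the options in order on top of the no-op fallback.
func resolve(opts ...RunOption) Metrics {
	cfg := runConfig{metrics: noopMetrics{}}
	for _, o := range opts {
		o(&cfg)
	}
	return cfg.metrics
}

func main() {
	def := WithMetrics(labelledMetrics{backend: "prometheus"})
	user := WithMetrics(labelledMetrics{backend: "custom"})

	fmt.Printf("%T\n", resolve())         // no options: the no-op fallback
	fmt.Printf("%v\n", resolve(def))       // framework default only
	fmt.Printf("%v\n", resolve(def, user)) // default prepended, user option wins
}
```

With no options the no-op fallback stays in place, which is the failure mode the PR describes. Because the default is prepended rather than appended, a later user-supplied option always has the final say.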

Why run-level (not Worker.WithMetrics)

Worker.WithMetrics exists, but a framework default belongs at run level — otherwise every app adopting CBWorkerProvider has to remember to apply metrics on every worker. Per-worker overrides still work for callers that want to.
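One way to picture the run-level default alongside per-worker overrides is a simple fallback. The precedence shown here (a worker's own metrics first, then the run-level value) is an assumption about how the library resolves the two, and `Worker`, `named`, and `effectiveMetrics` are hypothetical names used only for this sketch.

```go
package main

import "fmt"

// Metrics is reduced to a single method so the chosen backend can be printed.
type Metrics interface{ Name() string }

type named string

func (n named) Name() string { return string(n) }

// Worker is a hypothetical stand-in; a nil metrics field means
// "inherit whatever was configured at run level".
type Worker struct {
	name    string
	metrics Metrics
}

// effectiveMetrics prefers the worker's own metrics and otherwise
// falls back to the run-level value.
func effectiveMetrics(w Worker, runLevel Metrics) Metrics {
	if w.metrics != nil {
		return w.metrics
	}
	return runLevel
}

func main() {
	runLevel := named("prometheus")
	ws := []Worker{
		{name: "poller"},                            // inherits the run-level default
		{name: "billing", metrics: named("custom")}, // per-worker override
	}
	for _, w := range ws {
		fmt.Printf("%s -> %s\n", w.name, effectiveMetrics(w, runLevel).Name())
	}
}
```

Under this model only workers that need something different carry their own configuration; everything else picks up the framework default without extra code.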

Test plan

  • make build
  • make test (race + cover, 80.4%)
  • make lint (0 issues, govulncheck clean)
  • New workers_test.go covers default Prometheus path, DisablePrometheus / deprecated DisablePormetheus / empty AppName skip paths, multi-add append, and an end-to-end core.Run() test asserting WorkerStarted fires on a workers.Metrics recorder injected via AddWorkerRunOptions.
  • make doc regenerated README.md.
  • docs.coldbrew.cloud/howto/workers.md — added a "Metrics" subsection under "ColdBrew Integration" (separate submodule; will land in the docs repo).

Manual smoke

Build a service with a CBWorkerProvider, set APP_NAME=demo, scrape /metrics, confirm demo_worker_started_total is present and non-zero.
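The "present and non-zero" check can also be scripted against a scraped payload. The sketch below parses Prometheus text exposition format by hand; the sample body and its `worker` label are invented for illustration, and the parser assumes samples carry no trailing timestamp.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sumMetric adds up every sample of the named metric in a text-format
// payload and reports whether at least one sample was found.
func sumMetric(body, name string) (total float64, found bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || !strings.HasPrefix(line, name) {
			continue
		}
		// The name must be followed by a space or a label set, otherwise
		// a longer metric name that shares the prefix would match.
		rest := line[len(name):]
		if rest == "" || (rest[0] != ' ' && rest[0] != '{') {
			continue
		}
		fields := strings.Fields(rest)
		v, err := strconv.ParseFloat(fields[len(fields)-1], 64)
		if err != nil {
			continue
		}
		total += v
		found = true
	}
	return total, found
}

// sample is a made-up scrape result; real label names may differ.
const sample = `# TYPE demo_worker_started_total counter
demo_worker_started_total{worker="poller"} 2
demo_worker_started_total{worker="mailer"} 1
demo_worker_active_count 2
`

func main() {
	total, found := sumMetric(sample, "demo_worker_started_total")
	fmt.Println("found:", found, "total:", total)
}
```

In a real smoke test the body would come from an HTTP GET against the service's /metrics endpoint instead of the constant used here.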

Summary by CodeRabbit

  • New Features

    • Added support for registering worker run options during initialization.
    • Implemented automatic Prometheus metrics collection for worker lifecycle events.
  • Documentation

    • Updated API reference documentation for new configuration capabilities.
  • Tests

    • Expanded test suite with comprehensive coverage for worker options and metrics scenarios.

core.Run() invoked workers.Run with no RunOptions, so the workers package
silently fell back to BaseMetrics{} (no-op) and every worker_* Prometheus
metric stayed at zero for any service using CBWorkerProvider.

Add AddWorkerRunOptions for init-time configuration and prepend a default
workers.WithMetrics(workers.NewPrometheusMetrics(AppName)) when APP_NAME
is set and Prometheus is not disabled. User-supplied WithMetrics still
wins because workers.WithMetrics overwrites runConfig.metrics on each
apply.
Copilot AI review requested due to automatic review settings April 30, 2026 06:37

coderabbitai Bot commented Apr 30, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bfa30d79-9a9b-4066-81ab-322ed41a0153

📥 Commits

Reviewing files that changed from the base of the PR and between 9b99708 and c506e33.

📒 Files selected for processing (3)
  • README.md
  • workers.go
  • workers_test.go
📝 Walkthrough

Walkthrough

A new framework-level mechanism is added to register default worker run options via AddWorkerRunOptions, which are incorporated during worker startup in (*cb).Run through a new buildWorkerRunOpts method. This enables Prometheus metrics to be automatically applied to workers while allowing user-supplied options to override defaults.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Worker Options Mechanism (workers.go, workers_test.go) | Added AddWorkerRunOptions for registering global worker run options and buildWorkerRunOpts to construct final option lists, with Prometheus metrics prepended and user options appended. Tests verify Prometheus integration across multiple scenarios (disabled/enabled states, empty AppName, deprecated DisablePormetheus field) and validate option override behavior. |
| Worker Startup Integration (core.go) | Modified (*cb).Run to call buildWorkerRunOpts() and pass constructed options to workers.Run as variadic arguments, replacing the previous direct worker list call. |
| API Reference Documentation (README.md) | Updated README to document the new AddWorkerRunOptions function with usage notes (init-time only, not concurrency-safe, Prometheus default behavior) and adjusted anchor URL for the New function. |

Sequence Diagram

sequenceDiagram
    participant User as User Code
    participant Core as (*cb).Run
    participant BuildOpts as buildWorkerRunOpts()
    participant WorkersRun as workers.Run()

    User->>User: AddWorkerRunOptions(opts...) [during init]
    Note over User: Registers options in global workerRunOpts slice

    User->>Core: Run(ctx)
    activate Core
    Core->>BuildOpts: buildWorkerRunOpts()
    activate BuildOpts
    Note over BuildOpts: 1. Prepend Prometheus metrics if enabled<br/>2. Append registered workerRunOpts<br/>(user options override defaults)
    BuildOpts-->>Core: []workers.RunOption
    deactivate BuildOpts
    Core->>WorkersRun: workers.Run(ctx, workerList, opts...)
    activate WorkersRun
    WorkersRun-->>Core: error or nil
    deactivate WorkersRun
    deactivate Core
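The flow in the diagram can be approximated with a small registry. This is a sketch under assumptions: `RunOption` is modelled as a plain string label so the assembled list is easy to print, `config` is a hypothetical stand-in for the framework configuration, and `buildWorkerRunOpts` is written as a free function here although the walkthrough describes it as a method.

```go
package main

import "fmt"

// RunOption is a stand-in label; the real workers.RunOption is not a string.
type RunOption string

// config models only the fields this sketch needs (names assumed from the PR text).
type config struct {
	AppName           string
	DisablePrometheus bool
	DisablePormetheus bool // deprecated misspelled field, still honored
}

// workerRunOpts is the global slice filled at init time. Like the
// documented API, this sketch is not safe for concurrent use.
var workerRunOpts []RunOption

// AddWorkerRunOptions appends options; repeated calls accumulate.
func AddWorkerRunOptions(opts ...RunOption) {
	workerRunOpts = append(workerRunOpts, opts...)
}

// buildWorkerRunOpts prepends the default metrics option when it applies,
// then appends the registered options so they are applied last.
func buildWorkerRunOpts(c config) []RunOption {
	out := make([]RunOption, 0, len(workerRunOpts)+1)
	disabled := c.DisablePrometheus || c.DisablePormetheus
	if c.AppName != "" && !disabled {
		out = append(out, RunOption("metrics:prometheus("+c.AppName+")"))
	}
	return append(out, workerRunOpts...)
}

func main() {
	AddWorkerRunOptions("jitter:default")
	AddWorkerRunOptions("metrics:custom")

	fmt.Println(buildWorkerRunOpts(config{AppName: "demo"}))
	fmt.Println(buildWorkerRunOpts(config{})) // empty AppName: no default
	fmt.Println(buildWorkerRunOpts(config{AppName: "demo", DisablePrometheus: true}))
}
```

Building a fresh slice on every call keeps the registered options untouched, so the default never leaks into the global list.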

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through worker paths so fine,
Adding options that in defaults align,
Prometheus metrics hop in with care,
While user tweaks float lighter through air,
The Run flow bounds—thump thump—more fair! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main changes: exposing worker run options via AddWorkerRunOptions() and enabling Prometheus metrics by default for workers. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@workers_test.go`:
- Around line 132-150: The test can hang on the blocking receive from errCh and
doesn't detect early exit of Run(); change the logic around errCh/started to use
bounded waits: in the startup loop that currently polls rec.started.Load(), also
select on errCh to fail fast if instance.Run() returned early, and replace the
final blocking receive "err := <-errCh" with a select that waits for either err
from errCh or a timeout/context deadline (e.g., 2–5s) and fails the test on
timeout; reference symbols: errCh, instance.Run, instance.Stop, rec.started,
WorkerStarted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 070039dc-1757-4b85-a0f2-670a937bf7c9

📥 Commits

Reviewing files that changed from the base of the PR and between f36bb4e and 9b99708.

📒 Files selected for processing (4)
  • README.md
  • core.go
  • workers.go
  • workers_test.go

Comment thread workers_test.go
Contributor

Copilot AI left a comment


Pull request overview

Exposes framework-level worker run configuration in core and ensures worker Prometheus metrics are wired by default (when enabled) so services using CBWorkerProvider emit worker lifecycle metrics without needing per-worker setup.

Changes:

  • Added core.AddWorkerRunOptions(...workers.RunOption) and a buildWorkerRunOpts() helper to assemble run options (including default Prometheus metrics when APP_NAME is set and Prometheus isn’t disabled).
  • Wired core.Run() to pass the assembled run options into workers.Run(...).
  • Added unit + end-to-end tests covering default/disabled paths and verifying that AddWorkerRunOptions reaches workers.Run; regenerated README.md.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| workers.go | Introduces global run options + default Prometheus metrics wiring logic. |
| core.go | Passes run options into workers.Run(...) during service startup. |
| workers_test.go | Adds tests for default option behavior and end-to-end wiring through core.Run(). |
| README.md | Documents the new AddWorkerRunOptions API (generated). |


Comment thread workers_test.go Outdated
Comment thread workers.go Outdated
Comment thread workers_test.go Outdated
- harden TestRun_WorkerMetricsWired: bound the startup wait on errCh,
  bound the post-Stop receive, and guarantee Stop+drain via t.Cleanup so
  a failing assertion never leaks the Run goroutine
- workers.go: spell "behavior" in American English to match the rest of
  the package; regenerate README.md
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.



Comment thread workers_test.go
Adds TestRun_UserMetricsOverridesDefaultPrometheus and extracts the
shared lifecycle scaffolding into runWithRecorder. The new test runs
the full Run lifecycle with a unique AppName (so the default
NewPrometheusMetrics is prepended) and a caller-supplied
workers.WithMetrics(rec); asserting rec.WorkerStarted fires proves the
caller-supplied metrics is the effective implementation.
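The idea behind this test, using a recorder to prove which metrics implementation is effective, can be shown with a toy run loop. Everything here is a stand-in: `recorder`, `run`, and the one-method `Metrics` interface are illustrative, and `run` only mimics the option handling, not the real worker lifecycle.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Metrics is reduced to the single callback the test asserts on.
type Metrics interface{ WorkerStarted(name string) }

// recorder counts WorkerStarted calls; the atomic counter keeps it safe
// to read from the test goroutine while workers run elsewhere.
type recorder struct{ started atomic.Int32 }

func (r *recorder) WorkerStarted(string) { r.started.Add(1) }

type runConfig struct{ metrics Metrics }

type RunOption func(*runConfig)

// WithMetrics overwrites any previously configured metrics.
func WithMetrics(m Metrics) RunOption {
	return func(c *runConfig) { c.metrics = m }
}

// run resolves the options in order, then reports one start per worker
// name to whichever metrics implementation ended up in the config.
func run(names []string, opts ...RunOption) {
	var cfg runConfig
	for _, o := range opts {
		o(&cfg)
	}
	if cfg.metrics == nil {
		return
	}
	for _, n := range names {
		cfg.metrics.WorkerStarted(n)
	}
}

func main() {
	def, user := &recorder{}, &recorder{}
	// The default is prepended; the caller-supplied option comes last.
	run([]string{"poller"}, WithMetrics(def), WithMetrics(user))
	fmt.Println("default:", def.started.Load(), "user:", user.started.Load())
}
```

Seeing the user recorder fire while the default stays at zero is the observable evidence that the later option replaced the earlier one.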
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.



Comment thread workers_test.go
The cleanup's drain previously fell through silently on the 2s timeout,
which could leak the Run goroutine and surface as a less actionable
goleak error later. Report the timeout via t.Errorf so the failure points
directly at the cleanup path.
@ankurs ankurs merged commit 752f100 into main May 1, 2026
7 checks passed
@ankurs ankurs deleted the fix/wire-worker-metrics branch May 1, 2026 14:21
ankurs added a commit to go-coldbrew/docs.coldbrew.cloud that referenced this pull request May 1, 2026
* docs: document worker metrics defaults and AddWorkerRunOptions

Adds a Metrics subsection under ColdBrew Integration explaining that core
wires workers.NewPrometheusMetrics(APP_NAME) automatically when APP_NAME
is set and DISABLE_PROMETHEUS is not, and how to override via
core.AddWorkerRunOptions for custom metrics backends or run-level options.

Tracks go-coldbrew/core#87.

* docs: address review feedback on metrics subsection

- rename "### Metrics" to "### Metrics defaults" so the anchor doesn't
  collide with the existing "## Metrics" earlier in the document
- update the in-text reference to be unambiguous
- pass &myDatadogMetrics{} (pointer) to match the pointer-receiver
  example shown earlier in the same document

* docs: document opt-in worker middleware (Recover, Tracing, Slog, LogContext)

Add a "Tracing and observability middleware (opt-in)" subsection to the
ColdBrew Integration section explaining that core does not wire workers
middleware automatically and showing the AddWorkerRunOptions one-liner
to enable middleware.DefaultInterceptors() — plus a conservative subset
(Recover + LogContext) for fast-tick workers where per-cycle spans/logs
would be noisy.

* docs: address review feedback on metrics/middleware section

- core.Run() -> cb.Run() to match the canonical entrypoint shown
  elsewhere in the docs (cb := core.New(cfg); cb.Run())
- "behaviour" -> "behavior" for consistency with the rest of the page

* docs: reconcile Slog log-lines-per-cycle wording

The middleware section earlier in this document describes Slog as
emitting two log lines per cycle (start + end/error). The new opt-in
section was paraphrasing as "one log line per cycle" — update to match.
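The pointer fix mentioned in the review-feedback commit above follows from Go's method-set rules. In this sketch the type name `myDatadogMetrics` comes from the docs commit, while its body, the one-method `Metrics` interface, and the `record` helper are invented for illustration.

```go
package main

import "fmt"

// Metrics is a one-method stand-in for the real interface.
type Metrics interface{ WorkerStarted(name string) }

// myDatadogMetrics declares its method on the pointer receiver, so only
// *myDatadogMetrics has WorkerStarted in its method set.
type myDatadogMetrics struct{ started int }

func (m *myDatadogMetrics) WorkerStarted(string) { m.started++ }

// record accepts anything that satisfies Metrics.
func record(m Metrics, names ...string) {
	for _, n := range names {
		m.WorkerStarted(n)
	}
}

func main() {
	m := &myDatadogMetrics{}
	record(m, "poller", "mailer")
	fmt.Println("started:", m.started)

	// record(myDatadogMetrics{}, "poller") would not compile: the value
	// type does not implement Metrics because the method has a pointer receiver.
}
```

Passing the pointer also means the counter mutated inside `record` is the same one the caller inspects afterwards.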