Skip to content

fix: add retries to getKey operation#3825

Merged
chronark merged 10 commits intomainfrom
feat/db_retries
Aug 25, 2025
Merged

fix: add retries to getKey operation#3825
chronark merged 10 commits intomainfrom
feat/db_retries

Conversation

@Flo4604
Copy link
Member

@Flo4604 Flo4604 commented Aug 21, 2025

What does this PR do?

Fixes # (issue)

If there is not an issue for this, please create one first. This is used to tracking purposes and also helps use understand why this PR exists

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Chore (refactoring code, technical debt, workflow improvements)
  • Enhancement (small improvements)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How should this be tested?

  • Test A
  • Test B

Checklist

Required

  • Filled out the "How to test" section in this PR
  • Read Contributing Guide
  • Self-reviewed my own code
  • Commented on my code in hard-to-understand areas
  • Ran pnpm build
  • Ran pnpm fmt
  • Checked for warnings, there are none
  • Removed all console.logs
  • Merged the latest changes from main onto my branch with git pull origin main
  • My changes don't cause any responsiveness issues

Appreciated

  • If a UI change was made: Added a screen recording or screenshots to this PR
  • Updated the Unkey Docs if changes were necessary

Summary by CodeRabbit

  • Bug Fixes

    • Reduced intermittent failures by adding automatic retries around transient database operations, improving reliability for key verification, rate limits, and usage updates.
    • More consistent handling of not-found and duplicate conditions to avoid unnecessary retries.
  • New Features

    • Test mode can inject a request timestamp via header to reproduce time-dependent rate-limit scenarios.
  • Refactor

    • Centralized retry utilities to standardize resilience across services.
  • Tests

    • Added comprehensive unit and integration tests for retry behavior.

@changeset-bot
Copy link

changeset-bot bot commented Aug 21, 2025

⚠️ No Changeset found

Latest commit: def1fa6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@vercel
Copy link

vercel bot commented Aug 21, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
dashboard Ready Ready Preview Comment Aug 25, 2025 8:56am
1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
engineering Ignored Ignored Preview Aug 25, 2025 8:56am

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 21, 2025

📝 Walkthrough

Walkthrough

Adds a generic DB retry helper and integrates it into multiple DB query sites (ratelimit handler, key verification, usage limiter, cache ops); extends internal retry framework with a ShouldRetry predicate; and adds unit + integration tests for retry behavior. Also adds test-time header handling in the ratelimit handler and small formatting tweaks.

Changes

Cohort / File(s) Summary
DB retry helper (new)
go/pkg/db/retry.go, go/pkg/db/retry_test.go
Adds generic db.WithRetry[T] and DefaultBackoff; implements 3-attempt exponential backoff with a ShouldRetry predicate that skips retries for NotFound and DuplicateKey errors. Includes unit and integration tests covering retry/no-retry paths and generic types.
Retry framework updates
go/pkg/retry/retry.go, go/pkg/retry/retry_test.go
Makes retry option type public (Apply), adds ShouldRetry(...) option, and updates Do to honor non-retryable errors and avoid sleeping after the final attempt. Tests updated for new backoff semantics and ShouldRetry behavior.
Service integrations: DB retries & test-time header
go/apps/api/routes/v2_ratelimit_limit/handler.go, go/internal/services/keys/get.go, go/internal/services/usagelimiter/limit.go, go/internal/services/usagelimiter/redis.go
Replaces direct DB calls with db.WithRetry(...) wrappers (some reads switched to RO handle). In the ratelimit handler, adds test-mode support to parse X-Test-Time as milliseconds into the request timestamp. Control flow, error handling, and public signatures unchanged.
Cache not-found handling
go/internal/services/caches/op.go
Replaces errors.Is(err, sql.ErrNoRows) with db.IsNotFound(err) and updates imports; behavior otherwise unchanged.
Formatting
go/pkg/zen/middleware_errors.go
Whitespace-only change (inserted empty line); no functional change.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant S as Service (handler/keys/limiter)
  participant R as db.WithRetry[T]
  participant Rt as retry (policy)
  participant DB as Database

  S->>R: Call WithRetry(fn)
  R->>Rt: Configure Attempts, Backoff, ShouldRetry
  loop up to N attempts
    Rt->>DB: fn() → query/update
    DB-->>Rt: value, err
    alt err == nil
      Rt-->>R: return value, nil
      R-->>S: value
    else shouldRetry(err) == true
      Rt->>Rt: backoff (sleep if not last attempt)
    else shouldRetry(err) == false
      Rt-->>R: return err (no further retries)
      R-->>S: err
    end
  end
  Rt-->>R: final err after attempts
  R-->>S: value (zero if failed), err
Loading
sequenceDiagram
  autonumber
  participant C as Client
  participant H as v2_ratelimit_limit Handler
  participant R as db.WithRetry
  participant DB as Database

  C->>H: POST /v2/ratelimit/limit (+X-Test-Time?)
  alt TestMode && X-Test-Time present
    H->>H: Parse header (ms) → set limitReq.Time
  end
  H->>R: Fetch namespace (FindRatelimitNamespaceRow)
  R->>DB: Query
  DB-->>R: row or error
  R-->>H: row or error
  H->>H: Process overrides/permissions
  H-->>C: Response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

Bug

Suggested reviewers

  • imeyer
  • perkinsjr

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/db_retries

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@vercel vercel bot temporarily deployed to Preview – engineering August 21, 2025 15:03 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard August 21, 2025 15:03 Inactive
@Flo4604 Flo4604 marked this pull request as ready for review August 21, 2025 15:34
@github-actions
Copy link
Contributor

github-actions bot commented Aug 21, 2025

Thank you for following the naming conventions for pull request titles! 🙏

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
go/pkg/retry/retry.go (2)

27-63: Outdated docstring still references Build; update to New and fix examples

The top-level constructor docs reference retry.Build while the exported function is New. This violates our Go doc guideline (public identifiers must be correctly documented) and will confuse consumers.

Apply this diff to correct the docstring and examples:

-// Build creates a new retry instance with default configuration.
+// New creates a new retry instance with default configuration.
@@
-//	r := retry.Build()
+//	r := retry.New()
@@
-// The retry behavior can be customized using Attempts() and Backoff():
+// The retry behavior can be customized using Attempts(), Backoff(), and ShouldRetry():
@@
-// r := retry.Build(
+// r := retry.New(
 //
 //	retry.Attempts(5),
 //	retry.Backoff(func(n int) time.Duration {
 //		return time.Duration(1<<uint(n)) * time.Second // exponential backoff
 //	}),
+//  retry.ShouldRetry(func(err error) bool { return true }),
 //
 // )

Also applies to: 63-74


132-156: Consider a context-aware variant to allow canceling sleeps/backoff

Today, long backoffs can’t be canceled when the parent request context is done. Add a DoCtx to respect ctx.Done() during backoff. Backward compatible; existing Do can delegate.

Example implementation (new method):

// DoCtx is like Do but respects context cancellation during backoff.
func (r *retry) DoCtx(ctx context.Context, fn func() error) error {
	if r.attempts < 1 {
		return fmt.Errorf("attempts must be greater than 0")
	}
	var err error
	for i := 1; i <= r.attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if r.shouldRetry != nil && !r.shouldRetry(err) {
			return err
		}
		if i < r.attempts {
			d := r.backoff(i)
			if d <= 0 {
				continue
			}
			t := time.NewTimer(d)
			select {
			case <-t.C:
			case <-ctx.Done():
				if !t.Stop() {
					<-t.C
				}
				return ctx.Err()
			}
		}
	}
	return err
}

Follow-up: optionally add retry.NewWithContext(...) helpers in db layer to wire ctx through.

go/internal/services/usagelimiter/redis.go (2)

156-158: Compilation error: cannot range over int (replayWorkers)

for range replayWorkers is invalid; replayWorkers is an int. This won’t compile.

Apply this diff:

-for range replayWorkers {
-	go s.replayRequests()
-}
+for i := 0; i < replayWorkers; i++ {
+	go s.replayRequests()
+}

239-245: Optional: log retry attempts/latency for observability

Consider emitting a debug log or counter for the number of attempts taken by db.WithRetry to aid SRE during incidents (e.g., prometheus counter usagelimiter_db_read_retries_total).

go/internal/services/caches/op.go (1)

8-9: Doc nit: broaden wording beyond “sql error”

Since this now relies on db.IsNotFound, the comment should avoid tying to sql. Minor clarity improvement.

Apply this diff:

-// DefaultFindFirstOp returns the appropriate cache operation based on the sql error
+// DefaultFindFirstOp returns the appropriate cache operation based on the database error
go/apps/api/routes/v2_ratelimit_limit/handler.go (1)

193-203: Test header handling is gated behind TestMode; minor logging nit.

The X-Test-Time parsing under TestMode is safe. Consider including the parse error in the warning for easier forensics, e.g., Warn("invalid test time", "header", header, "err", parseErr).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2809b5e and 3ee9ea8.

📒 Files selected for processing (10)
  • go/apps/api/routes/v2_ratelimit_limit/handler.go (3 hunks)
  • go/internal/services/caches/op.go (2 hunks)
  • go/internal/services/keys/get.go (1 hunks)
  • go/internal/services/usagelimiter/limit.go (2 hunks)
  • go/internal/services/usagelimiter/redis.go (1 hunks)
  • go/pkg/db/retry.go (1 hunks)
  • go/pkg/db/retry_test.go (1 hunks)
  • go/pkg/retry/retry.go (4 hunks)
  • go/pkg/retry/retry_test.go (4 hunks)
  • go/pkg/zen/middleware_errors.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.go: Follow comprehensive documentation guidelines for Go code as described in go/GO_DOCUMENTATION_GUIDELINES.md
Every public function/type in Go code must be documented
Prefer interfaces for testability in Go code
Use AIDEV-* comments for complex/important code in Go services

Files:

  • go/internal/services/usagelimiter/redis.go
  • go/pkg/zen/middleware_errors.go
  • go/internal/services/usagelimiter/limit.go
  • go/pkg/db/retry_test.go
  • go/internal/services/caches/op.go
  • go/pkg/retry/retry.go
  • go/apps/api/routes/v2_ratelimit_limit/handler.go
  • go/internal/services/keys/get.go
  • go/pkg/db/retry.go
  • go/pkg/retry/retry_test.go
**/*.{env,js,ts,go}

📄 CodeRabbit inference engine (CLAUDE.md)

All environment variables must follow the format: UNKEY_<SERVICE_NAME>_VARNAME

Files:

  • go/internal/services/usagelimiter/redis.go
  • go/pkg/zen/middleware_errors.go
  • go/internal/services/usagelimiter/limit.go
  • go/pkg/db/retry_test.go
  • go/internal/services/caches/op.go
  • go/pkg/retry/retry.go
  • go/apps/api/routes/v2_ratelimit_limit/handler.go
  • go/internal/services/keys/get.go
  • go/pkg/db/retry.go
  • go/pkg/retry/retry_test.go
**/*_test.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*_test.go: Use table-driven tests in Go
Organize Go integration tests with real dependencies
Organize Go tests by HTTP status codes

Files:

  • go/pkg/db/retry_test.go
  • go/pkg/retry/retry_test.go
🧠 Learnings (2)
📓 Common learnings
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/workflows/runbook-freshness-check.yaml:157-173
Timestamp: 2025-08-08T14:59:52.283Z
Learning: Repo unkeyed/unkey: When a CI/workflow fix is deferred, imeyer prefers a thorough GitHub issue be opened with sections (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References) and assigned to imeyer, including backlinks to the originating PR and comment.
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/workflows/job_detect_changes.yaml:111-114
Timestamp: 2025-08-08T16:10:00.224Z
Learning: Repo: unkeyed/unkey — When imeyer writes “make me an issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: Repo: unkeyed/unkey — Preference: If imeyer comments “issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
📚 Learning: 2025-08-19T09:42:40.919Z
Learnt from: Flo4604
PR: unkeyed/unkey#3800
File: go/internal/services/keys/validation.go:45-52
Timestamp: 2025-08-19T09:42:40.919Z
Learning: In go/internal/services/keys/validation.go, keys with unlimited usage (RemainingRequests.Valid = false) have an early return that bypasses the usage limiter entirely. The usage limiter is only called for keys with finite remaining request counts.

Applied to files:

  • go/internal/services/usagelimiter/limit.go
🧬 Code graph analysis (9)
go/internal/services/usagelimiter/redis.go (1)
go/pkg/db/retry.go (1)
  • WithRetry (28-59)
go/internal/services/usagelimiter/limit.go (2)
go/pkg/db/retry.go (1)
  • WithRetry (28-59)
go/pkg/db/key_update_credits_decrement.sql_generated.go (1)
  • UpdateKeyCreditsDecrementParams (22-25)
go/pkg/db/retry_test.go (9)
go/pkg/db/retry.go (1)
  • WithRetry (28-59)
go/pkg/retry/retry.go (1)
  • New (63-74)
go/pkg/db/handle_err_no_rows.go (1)
  • IsNotFound (8-10)
go/pkg/db/handle_err_duplicate_key.go (1)
  • IsDuplicateKeyError (7-13)
go/pkg/testutil/containers/containers.go (1)
  • MySQL (88-99)
go/pkg/db/workspace_insert.sql_generated.go (1)
  • InsertWorkspaceParams (37-42)
go/pkg/db/key_insert.sql_generated.go (1)
  • InsertKeyParams (51-67)
go/pkg/hash/sha256.go (1)
  • Sha256 (28-33)
go/pkg/db/key_find_for_verification.sql_generated.go (1)
  • FindKeyForVerificationRow (92-119)
go/internal/services/caches/op.go (1)
go/pkg/db/handle_err_no_rows.go (1)
  • IsNotFound (8-10)
go/pkg/retry/retry.go (1)
go/pkg/clickhouse/client.go (1)
  • New (59-157)
go/apps/api/routes/v2_ratelimit_limit/handler.go (2)
go/pkg/db/retry.go (1)
  • WithRetry (28-59)
go/pkg/db/ratelimit_namespace_find.sql_generated.go (2)
  • FindRatelimitNamespaceRow (37-45)
  • FindRatelimitNamespaceParams (32-35)
go/internal/services/keys/get.go (2)
go/pkg/db/retry.go (1)
  • WithRetry (28-59)
go/internal/services/caches/op.go (1)
  • DefaultFindFirstOp (9-22)
go/pkg/db/retry.go (3)
go/pkg/retry/retry.go (4)
  • New (63-74)
  • Attempts (81-86)
  • Backoff (91-97)
  • ShouldRetry (116-121)
go/pkg/db/handle_err_no_rows.go (1)
  • IsNotFound (8-10)
go/pkg/db/handle_err_duplicate_key.go (1)
  • IsDuplicateKeyError (7-13)
go/pkg/retry/retry_test.go (1)
go/pkg/retry/retry.go (2)
  • New (63-74)
  • ShouldRetry (116-121)
🪛 golangci-lint (2.2.2)
go/pkg/db/retry.go

[high] 33-33: G115: integer overflow conversion int -> uint

(gosec)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Build / Build
  • GitHub Check: Test API / API Test Local
🔇 Additional comments (9)
go/pkg/zen/middleware_errors.go (1)

204-206: No-op formatting change — OK to keep

The blank line after the log call slightly improves readability and has no behavioral impact. LGTM.

go/pkg/retry/retry.go (1)

99-121: Solid addition: ShouldRetry short-circuits non-transient errors

This option is well-scoped and integrates cleanly with Do(). Nice.

go/pkg/retry/retry_test.go (1)

11-121: Good table-driven coverage and accurate sleep accounting after final attempt

The tests validate success/failure paths, custom attempts, and the “no sleep after final attempt” rule. LGTM.

go/internal/services/usagelimiter/redis.go (1)

235-238: All key lookup calls use db.WithRetry consistently
I ran the ripgrep checks and confirmed there are no direct db.Query.FindKeyCredits or db.Query.FindKeyForVerification calls outside of a db.WithRetry block. Key locations verified include:

  • internal/services/usagelimiter/limit.go
  • internal/services/usagelimiter/redis.go
  • internal/services/keys/get.go

The codebase uniformly wraps these DB reads in retry logic, so no further changes are required.

go/internal/services/caches/op.go (1)

5-6: Good centralization of NotFound detection via db.IsNotFound

Shifting sql.ErrNoRows checks behind db.IsNotFound keeps the cache layer decoupled from sql. LGTM.

Also applies to: 15-18

go/internal/services/usagelimiter/limit.go (1)

16-18: RO read wrapped with retry is appropriate.

Using s.db.RO() with db.WithRetry for the credits read is a good fit; NotFound is treated as non-retryable by the helper and maps cleanly to {Valid:false} here.

go/apps/api/routes/v2_ratelimit_limit/handler.go (2)

79-84: LGTM: Retry-wrapped namespace lookup respects NotFound semantics.

Using db.WithRetry around FindRatelimitNamespace with h.DB.RO() is consistent with other read paths; NotFound will bypass retries and flow into the SWR op correctly.


224-235: Response building is correct; optional: avoid pre-initializing OverrideId to nil.

Current pattern is fine. Alternatively, only set OverrideId when present to reduce nil churn.
[ suggest_nitpick ]

go/pkg/db/retry_test.go (1)

49-76: Good coverage of non-retryable classification (NotFound/DuplicateKey).

These assertions validate the helper’s predicates well.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (2)
go/pkg/db/retry_test.go (2)

18-29: Prefer a table-driven test for the basic success vs transient-retry paths; also make unit tests parallelizable

Combining these two nearly identical tests reduces duplication and aligns with the repo guideline “*_test.go uses table-driven tests.” Marking each subtest with t.Parallel() speeds the suite without affecting correctness.

Apply this diff to consolidate the first two tests:

-func TestWithRetry_Success(t *testing.T) {
-	callCount := 0
-
-	result, err := WithRetry(func() (string, error) {
-		callCount++
-		return "success", nil
-	})
-
-	require.NoError(t, err)
-	require.Equal(t, "success", result)
-	require.Equal(t, 1, callCount, "should succeed on first try")
-}
-
-func TestWithRetry_RetriesTransientErrors(t *testing.T) {
-	callCount := 0
-	transientErr := errors.New("connection timeout")
-
-	result, err := WithRetry(func() (string, error) {
-		callCount++
-		if callCount < 3 {
-			return "", transientErr
-		}
-		return "success", nil
-	})
-
-	require.NoError(t, err)
-	require.Equal(t, "success", result)
-	require.Equal(t, 3, callCount, "should retry twice then succeed")
-}
+func TestWithRetry_StringPaths(t *testing.T) {
+	t.Parallel()
+	cases := []struct {
+		name      string
+		setup     func(*int) func() (string, error)
+		want      string
+		wantErr   error
+		wantCalls int
+	}{
+		{
+			name: "success_first_try",
+			setup: func(calls *int) func() (string, error) {
+				return func() (string, error) {
+					*calls++
+					return "success", nil
+				}
+			},
+			want:      "success",
+			wantErr:   nil,
+			wantCalls: 1,
+		},
+		{
+			name: "transient_then_success",
+			setup: func(calls *int) func() (string, error) {
+				transientErr := errors.New("connection timeout")
+				return func() (string, error) {
+					*calls++
+					if *calls < 3 {
+						return "", transientErr
+					}
+					return "success", nil
+				}
+			},
+			want:      "success",
+			wantErr:   nil,
+			wantCalls: 3,
+		},
+	}
+	for _, tc := range cases {
+		tc := tc
+		t.Run(tc.name, func(t *testing.T) {
+			t.Parallel()
+			callCount := 0
+			got, err := WithRetry(tc.setup(&callCount))
+			if tc.wantErr != nil {
+				require.ErrorIs(t, err, tc.wantErr)
+			} else {
+				require.NoError(t, err)
+			}
+			require.Equal(t, tc.want, got)
+			require.Equal(t, tc.wantCalls, callCount)
+		})
+	}
+}

Also applies to: 31-46


126-137: Integration harness assumes an externally running MySQL and preloaded schema; add a reachability check to auto-skip

As written, this will fail locally/CI unless Docker/MySQL is up and the “unkey” schema exists. Add a fast ping to skip gracefully when the dependency is unavailable.

  // Create database instance
  dbInstance, err := New(Config{
    PrimaryDSN: mysqlCfg.FormatDSN(),
    Logger:     logging.NewNoop(),
  })
  require.NoError(t, err)
  defer dbInstance.Close()
+
+ // Skip if DB is not reachable or schema is missing
+ pingCtx, cancel := context.WithTimeout(ctx, 3*time.Second)
+ defer cancel()
+ if err := dbInstance.RW().PingContext(pingCtx); err != nil {
+   t.Skipf("skipping integration test: database not reachable or schema missing: %v", err)
+ }

Follow-up (preferred): evolve go/pkg/testutil/containers.MySQL(t) to actually start a disposable MySQL via testcontainers-go and apply schema before returning a DSN. That will make these tests hermetic and CI-friendly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3ee9ea8 and e86f734.

📒 Files selected for processing (1)
  • go/pkg/db/retry_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.go: Follow comprehensive documentation guidelines for Go code as described in go/GO_DOCUMENTATION_GUIDELINES.md
Every public function/type in Go code must be documented
Prefer interfaces for testability in Go code
Use AIDEV-* comments for complex/important code in Go services

Files:

  • go/pkg/db/retry_test.go
**/*_test.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*_test.go: Use table-driven tests in Go
Organize Go integration tests with real dependencies
Organize Go tests by HTTP status codes

Files:

  • go/pkg/db/retry_test.go
**/*.{env,js,ts,go}

📄 CodeRabbit inference engine (CLAUDE.md)

All environment variables must follow the format: UNKEY_<SERVICE_NAME>_VARNAME

Files:

  • go/pkg/db/retry_test.go
🧠 Learnings (4)
📓 Common learnings
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/workflows/runbook-freshness-check.yaml:157-173
Timestamp: 2025-08-08T14:59:52.283Z
Learning: Repo unkeyed/unkey: When a CI/workflow fix is deferred, imeyer prefers a thorough GitHub issue be opened with sections (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References) and assigned to imeyer, including backlinks to the originating PR and comment.
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/workflows/job_detect_changes.yaml:111-114
Timestamp: 2025-08-08T16:10:00.224Z
Learning: Repo: unkeyed/unkey — When imeyer writes “make me an issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: Repo: unkeyed/unkey — Preference: If imeyer comments “issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
📚 Learning: 2025-08-04T07:44:39.438Z
Learnt from: CR
PR: unkeyed/unkey#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-04T07:44:39.438Z
Learning: Applies to **/*_test.go : Use table-driven tests in Go

Applied to files:

  • go/pkg/db/retry_test.go
📚 Learning: 2025-08-14T16:25:48.167Z
Learnt from: Flo4604
PR: unkeyed/unkey#3785
File: go/apps/api/routes/v2_keys_reroll_key/401_test.go:52-61
Timestamp: 2025-08-14T16:25:48.167Z
Learning: User Flo4604 requested creation of a GitHub issue to track converting all test files to use table-driven test patterns as a broader codebase improvement, following the suggestion made during review of go/apps/api/routes/v2_keys_reroll_key/401_test.go.

Applied to files:

  • go/pkg/db/retry_test.go
📚 Learning: 2025-08-04T07:44:39.438Z
Learnt from: CR
PR: unkeyed/unkey#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-04T07:44:39.438Z
Learning: Applies to **/*_test.go : Organize Go integration tests with real dependencies

Applied to files:

  • go/pkg/db/retry_test.go
🧬 Code graph analysis (1)
go/pkg/db/retry_test.go (10)
go/pkg/db/retry.go (1)
  • WithRetry (28-59)
go/pkg/retry/retry.go (1)
  • New (63-74)
go/pkg/db/handle_err_no_rows.go (1)
  • IsNotFound (8-10)
go/pkg/db/handle_err_duplicate_key.go (1)
  • IsDuplicateKeyError (7-13)
go/pkg/testutil/containers/containers.go (1)
  • MySQL (88-99)
go/pkg/uid/uid.go (3)
  • WorkspacePrefix (20-20)
  • KeyAuthPrefix (21-21)
  • KeyPrefix (16-16)
go/pkg/db/workspace_insert.sql_generated.go (1)
  • InsertWorkspaceParams (37-42)
go/pkg/db/key_insert.sql_generated.go (1)
  • InsertKeyParams (51-67)
go/pkg/hash/sha256.go (1)
  • Sha256 (28-33)
go/pkg/db/key_find_for_verification.sql_generated.go (1)
  • FindKeyForVerificationRow (92-119)
🔇 Additional comments (1)
go/pkg/db/retry_test.go (1)

139-156: No custom Keyring prefix exists; KeyAuthPrefix is correct

I searched the UID definitions and found no KeyringPrefix in go/pkg/uid—only KeyAuthPrefix is defined for the key_auth table. All existing call sites (including bulk and single inserts into key_auth) consistently use uid.KeyAuthPrefix. Since the SQL table is named key_auth, using KeyAuthPrefix aligns with the schema and there are no FK or check constraints expecting a different prefix. No changes are required here.

Copy link
Collaborator

@chronark chronark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some comments

@Flo4604 Flo4604 requested a review from chronark August 21, 2025 16:47
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
go/apps/api/routes/v2_ratelimit_limit/handler.go (1)

193-203: Harden X-Test-Time parsing (alias + bounds) to reduce flaky tests.

Current behavior silently ignores bad values. For test hygiene, add a bounded range check and accept a namespaced header alias while keeping the old one for backward-compat.

Apply this diff within the test-mode block:

-  header := s.Request().Header.Get("X-Test-Time")
-  if header != "" {
-    i, parseErr := strconv.ParseInt(header, 10, 64)
-    if parseErr != nil {
-      h.Logger.Warn("invalid test time", "header", header)
-    } else {
-      limitReq.Time = time.UnixMilli(i)
-    }
-  }
+  // Prefer namespaced header but keep legacy for compatibility
+  var header string
+  if v := s.Request().Header.Get("X-Unkey-Test-Time-Ms"); v != "" {
+    header = v
+  } else {
+    header = s.Request().Header.Get("X-Test-Time")
+  }
+  if header != "" {
+    i, parseErr := strconv.ParseInt(header, 10, 64)
+    if parseErr != nil {
+      h.Logger.Warn("invalid test time", "header", header)
+    } else {
+      // Guard against absurd values to avoid flaky test behavior
+      const hundredYearsMs = int64(100 * 365 * 24 * 60 * 60 * 1000)
+      nowMs := time.Now().UnixMilli()
+      if i < 0 || i > nowMs+hundredYearsMs {
+        h.Logger.Warn("test time out of reasonable range", "value", i)
+      } else {
+        limitReq.Time = time.UnixMilli(i)
+      }
+    }
+  }
♻️ Duplicate comments (2)
go/pkg/db/retry.go (2)

59-66: API ergonomics: consider a DoWithResult to avoid closure capture.

This was suggested earlier and tracked; repeating here for visibility—no action required in this PR.


43-56: Classify context cancellation/deadline as non-retryable errors.

Prevents retry loops after the caller’s context is done. Safe and common practice.

-		retry.ShouldRetry(func(err error) bool {
+		retry.ShouldRetry(func(err error) bool {
+			// Respect caller cancellation/deadline
+			if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
+				return false
+			}
 			// Don't retry if resource is not found - this is a valid response
 			if IsNotFound(err) {
 				return false
 			}
 
 			// Don't retry duplicate key errors - these won't succeed on retry
 			if IsDuplicateKeyError(err) {
 				return false
 			}
 
 			// Retry all other errors (network issues, timeouts, deadlocks, etc.)
 			return true
 		}),
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e86f734 and ca88e57.

📒 Files selected for processing (3)
  • go/apps/api/routes/v2_ratelimit_limit/handler.go (3 hunks)
  • go/internal/services/usagelimiter/limit.go (1 hunks)
  • go/pkg/db/retry.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.go: Follow comprehensive documentation guidelines for Go code as described in go/GO_DOCUMENTATION_GUIDELINES.md
Every public function/type in Go code must be documented
Prefer interfaces for testability in Go code
Use AIDEV-* comments for complex/important code in Go services

Files:

  • go/pkg/db/retry.go
  • go/apps/api/routes/v2_ratelimit_limit/handler.go
  • go/internal/services/usagelimiter/limit.go
**/*.{env,js,ts,go}

📄 CodeRabbit inference engine (CLAUDE.md)

All environment variables must follow the format: UNKEY_<SERVICE_NAME>_VARNAME

Files:

  • go/pkg/db/retry.go
  • go/apps/api/routes/v2_ratelimit_limit/handler.go
  • go/internal/services/usagelimiter/limit.go
🧠 Learnings (8)
📓 Common learnings
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/workflows/runbook-freshness-check.yaml:157-173
Timestamp: 2025-08-08T14:59:52.283Z
Learning: Repo unkeyed/unkey: When a CI/workflow fix is deferred, imeyer prefers a thorough GitHub issue be opened with sections (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References) and assigned to imeyer, including backlinks to the originating PR and comment.
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/workflows/job_detect_changes.yaml:111-114
Timestamp: 2025-08-08T16:10:00.224Z
Learning: Repo: unkeyed/unkey — When imeyer writes “make me an issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: Repo: unkeyed/unkey — Preference: If imeyer comments “issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
📚 Learning: 2025-08-08T16:10:00.224Z
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/workflows/job_detect_changes.yaml:111-114
Timestamp: 2025-08-08T16:10:00.224Z
Learning: Repo: unkeyed/unkey — When imeyer writes “make me an issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.

Applied to files:

  • go/pkg/db/retry.go
  • go/internal/services/usagelimiter/limit.go
📚 Learning: 2025-08-08T15:10:46.436Z
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: Repo: unkeyed/unkey — Preference: If imeyer comments “issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.

Applied to files:

  • go/pkg/db/retry.go
  • go/internal/services/usagelimiter/limit.go
📚 Learning: 2025-08-08T14:59:52.283Z
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/workflows/runbook-freshness-check.yaml:157-173
Timestamp: 2025-08-08T14:59:52.283Z
Learning: Repo unkeyed/unkey: When a CI/workflow fix is deferred, imeyer prefers a thorough GitHub issue be opened with sections (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References) and assigned to imeyer, including backlinks to the originating PR and comment.

Applied to files:

  • go/pkg/db/retry.go
  • go/internal/services/usagelimiter/limit.go
📚 Learning: 2025-08-08T15:10:46.436Z
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: For repo unkeyed/unkey and PR review workflows: When imeyer comments "issue" on a thread, automatically create a thorough GitHub issue (sections: Summary, Impact, Where, Repro/Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and the specific comment, and assign the issue to imeyer.

Applied to files:

  • go/pkg/db/retry.go
  • go/internal/services/usagelimiter/limit.go
📚 Learning: 2025-08-14T16:25:48.167Z
Learnt from: Flo4604
PR: unkeyed/unkey#3785
File: go/apps/api/routes/v2_keys_reroll_key/401_test.go:52-61
Timestamp: 2025-08-14T16:25:48.167Z
Learning: User Flo4604 requested creation of a GitHub issue to track converting all test files to use table-driven test patterns as a broader codebase improvement, following the suggestion made during review of go/apps/api/routes/v2_keys_reroll_key/401_test.go.

Applied to files:

  • go/pkg/db/retry.go
📚 Learning: 2025-08-21T15:54:45.183Z
Learnt from: chronark
PR: unkeyed/unkey#3825
File: go/internal/services/usagelimiter/limit.go:38-0
Timestamp: 2025-08-21T15:54:45.183Z
Learning: In go/internal/services/usagelimiter/limit.go, the UpdateKeyCreditsDecrement operation cannot be safely wrapped with db.WithRetry due to the lack of idempotency mechanisms in the current tech stack. Retrying this non-idempotent write operation risks double-charging users if the first attempt commits but the client sees a transient error.

Applied to files:

  • go/apps/api/routes/v2_ratelimit_limit/handler.go
  • go/internal/services/usagelimiter/limit.go
📚 Learning: 2025-08-19T09:42:40.919Z
Learnt from: Flo4604
PR: unkeyed/unkey#3800
File: go/internal/services/keys/validation.go:45-52
Timestamp: 2025-08-19T09:42:40.919Z
Learning: In go/internal/services/keys/validation.go, keys with unlimited usage (RemainingRequests.Valid = false) have an early return that bypasses the usage limiter entirely. The usage limiter is only called for keys with finite remaining request counts.

Applied to files:

  • go/internal/services/usagelimiter/limit.go
🧬 Code graph analysis (3)
go/pkg/db/retry.go (3)
go/pkg/retry/retry.go (4)
  • New (63-74)
  • Attempts (81-86)
  • Backoff (91-97)
  • ShouldRetry (116-121)
go/pkg/db/handle_err_no_rows.go (1)
  • IsNotFound (8-10)
go/pkg/db/handle_err_duplicate_key.go (1)
  • IsDuplicateKeyError (7-13)
go/apps/api/routes/v2_ratelimit_limit/handler.go (2)
go/pkg/db/retry.go (1)
  • WithRetry (28-67)
go/pkg/db/ratelimit_namespace_find.sql_generated.go (2)
  • FindRatelimitNamespaceRow (37-45)
  • FindRatelimitNamespaceParams (32-35)
go/internal/services/usagelimiter/limit.go (1)
go/pkg/db/retry.go (1)
  • WithRetry (28-67)
🔇 Additional comments (1)
go/internal/services/usagelimiter/limit.go (1)

38-44: Confirmed: No retry wrappers around non-idempotent decrement operation

Verified via ripgrep that none of the UpdateKeyCreditsDecrement call sites are enclosed in db.WithRetry, ensuring the decrement stays outside retry loops and avoids double-charging. The primary call sites are:

  • apps/api/routes/v2_keys_update_credits/handler.go
  • internal/services/usagelimiter/limit.go
  • internal/services/usagelimiter/redis.go

With no unexpected retry wrappers found, these changes are approved. For future enhancements, introduce a CAS-based or idempotency token design (see the already-opened issue) before enabling retries on this operation.

@vercel vercel bot temporarily deployed to Preview – engineering August 22, 2025 11:01 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard August 22, 2025 11:01 Inactive
@graphite-app
Copy link

graphite-app bot commented Aug 22, 2025

Graphite Automations

"Notify author when CI fails" took an action on this PR • (08/22/25)

1 teammate was notified to this PR based on Andreas Thomas's automation.

"Post a GIF when PR approved" took an action on this PR • (08/25/25)

1 gif was posted to this PR based on Andreas Thomas's automation.

@vercel vercel bot temporarily deployed to Preview – dashboard August 25, 2025 08:56 Inactive
Copy link
Collaborator

@chronark chronark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ogzhanolguncu have a look, we need a 2nd review

@Flo4604 Flo4604 requested a review from ogzhanolguncu August 25, 2025 14:39
Copy link
Contributor

yes sir

@graphite-app
Copy link

graphite-app bot commented Aug 25, 2025

Sports gif. A WWE official avatar swings a rubbery arm around and gives an elongated thumbs up. (Added via Giphy)

@chronark chronark added this pull request to the merge queue Aug 25, 2025
Merged via the queue into main with commit 84aefd8 Aug 25, 2025
17 checks passed
@chronark chronark deleted the feat/db_retries branch August 25, 2025 16:02
@coderabbitai coderabbitai bot mentioned this pull request Oct 6, 2025
18 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants