Skip to content

feat: domain rollbacks#3999

Merged
chronark merged 20 commits intomainfrom
domain-rollbacks
Sep 19, 2025
Merged

feat: domain rollbacks#3999
chronark merged 20 commits intomainfrom
domain-rollbacks

Conversation

@chronark
Copy link
Collaborator

@chronark chronark commented Sep 19, 2025

This has become much larger than i wanted. however a lot of it is just generated code.

I fixed the rollbacks, we now rollback the correct domains and I added visual indicators to the UI.
Rolled back domains become locked and new deployments do not move the rolled back domains.

The next step is to add promotions, to escape the locked state.

Run it locally, deploy 2 times and then click the rollback button on the deployments list page.

Before flo gets mad at me: This is not the most efficient, I cared about correctness first.

Summary by CodeRabbit

  • New Features

    • Rollback dialog now highlights the Live Deployment and its domains, with a clear Target Deployment selection and single primary action.
    • Deployment list shows “Live” and “Rolled Back” badges; rolled-back rows get distinct styling.
  • Improvements

    • Deployments auto-refresh every 5 seconds.
    • Clearer status and environment display; actions enable only when rollback is valid.
  • Refactor

    • Terminology updated from Current/Active to Live across the dashboard.
  • Chores

    • Backend changes improve reliability of rollback and domain reassignment.

@changeset-bot
Copy link

changeset-bot bot commented Sep 19, 2025

⚠️ No Changeset found

Latest commit: 78feb9c

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@vercel
Copy link

vercel bot commented Sep 19, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
dashboard Ready Ready Preview Comment Sep 19, 2025 3:17pm
1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
engineering Ignored Ignored Preview Sep 19, 2025 3:17pm

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 19, 2025

📝 Walkthrough

Walkthrough

Renames “active” to “live” across dashboard and APIs, adds rollback workflow via DeploymentService RPC, introduces domain “sticky” and rolled-back flags, updates DB schemas/queries for domains, deployments, and gateways, removes the legacy Routing service/proto, and adjusts UI to display live/rolled-back states with updated actions and dialogs.

Changes

Cohort / File(s) Summary
Dashboard UI — Rollback & Actions
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx, .../components/table/components/actions/deployment-list-table-action.popover.constants.tsx
RollbackDialog props changed to target/live deployment; fetches live domains; updates action popover to use selected/live deployments; removes hostname and router usage; adjusts rollback availability and menu items.
Dashboard UI — Table & Badges
.../components/table/deployments-list.tsx, .../components/table/components/env-status-badge.tsx, .../components/table/utils/get-row-class.ts
Switches active→live, adds rolled-back visuals; shows environment slug; simplifies status rendering; adds rolledBack badge variant and row styles.
Dashboard Hook
.../hooks/use-deployments.ts
Returns { deployments, liveDeployment }; project and live deployment fetching simplified with .data.at(0).
Dashboard Collections/Types
apps/dashboard/lib/collections/deploy/deployments.ts, apps/dashboard/lib/collections/deploy/domains.ts
Deployment gains isRolledBack; Domains gain deploymentId and sticky; deployments list refetches every 5s.
Dashboard TRPC — Deploy
apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts, .../deploy/deployment/list.ts, .../deploy/domains/list.ts
Rollback endpoint now uses Connect client to DeploymentService; input only targetDeploymentId; returns domains. Deployments list includes isRolledBack. Domains list selects deploymentId, sticky.
Go Ctrl Service — Deployment workflows
go/apps/ctrl/services/deployment/rollback.go, .../deploy_workflow.go, .../domains.go
Adds Rollback RPC implementation; domain generation now returns domain+sticky and supports CLI upload randomness; gateway config creation extracted; gateway upserts include DeploymentID.
Go Ctrl App Wiring
go/apps/ctrl/run.go
Removes Routing service registration; minor domain insert timestamp change.
Routing Removal
go/apps/ctrl/services/routing/service.go, go/proto/ctrl/v1/routing.proto
Deletes Routing service implementation and its proto definitions.
Protos (Deployment)
go/proto/ctrl/v1/deployment.proto
Adds SourceType enum; adds RollbackRequest/Response; adds DeploymentService.Rollback RPC.
DB Schema (Core, Go)
go/pkg/db/schema.sql, .../queries/*.sql, .../*.sql_generated.go, .../models_generated.go, .../querier_generated.go
Adds sticky and is_rolled_back to domains; is_rolled_back to deployments; unique domain index and deployment index; new queries for domain reassign and deployment rollback; selects include sticky/rolled-back; insert paths include sticky; bulk domain insert updated.
Partition DB (Gateways)
go/pkg/partition/db/schema.sql, .../queries/gateway_upsert.sql, .../gateway_upsert.sql_generated.go, .../bulk_gateway_upsert.sql_generated.go, .../queries/gateway_find_by_deployment_id.sql, .../gateway_find_by_deployment_id.sql_generated.go, .../models_generated.go, .../querier_generated.go
Gateways gain deployment_id column/index; upsert/bulk-upsert include DeploymentID; adds find-by-deployment-id query and types.
Hydra Logging
go/pkg/hydra/step.go, go/pkg/hydra/worker.go, go/pkg/hydra/workflow.go
Adds per-execution logger and logs step failures; no control-flow changes.
Assert Utility
go/pkg/assert/not_equal.go, go/pkg/assert/not_equal_test.go
Adds generic NotEqual assertion with tests and fuzzing.
Kubernetes & Docker
go/k8s/manifests/ctrl.yaml, deployment/docker-compose.yaml
Changes ctrl ServiceAccount name; updates Minio image tag.
Internal DB (TS/Drizzle)
internal/db/src/schema/domains.ts, internal/db/src/schema/deployments.ts, internal/db/src/schema/projects.ts
Adds sticky, isRolledBack, unique and deployment indexes to domains; adds isRolledBack to deployments; adds activeDeployment relation for projects.
Buf/Proto JS outputs
go/buf.gen.yaml, internal/proto/package.json, internal/proto/src/index.ts
Replaces ES plugins with bufbuild/es v2; bumps @bufbuild/protobuf to ^2.8.0; updates exports path and comments out connect export.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant UI as Dashboard (RollbackDialog)
  participant TRPC as TRPC API (rollback)
  participant CTRL as DeploymentService (Ctrl)
  participant DB as DBs (Core + Partition)

  User->>UI: Click "Rollback" on target deployment
  UI->>TRPC: rollback({ targetDeploymentId })
  TRPC->>CTRL: Rollback(sourceLiveId, targetDeploymentId)
  Note over CTRL: Validate deployments, project,<br/>running VMs, sticky domains
  CTRL->>DB: Read live sticky domains & target gateway
  CTRL->>DB: Upsert gateway configs (DeploymentID)
  CTRL->>DB: Reassign domains to target,<br/>mark is_rolled_back
  CTRL->>DB: Update project.live_deployment_id
  CTRL->>DB: Update deployment rollback flags
  CTRL-->>TRPC: RollbackResponse { domains[] }
  TRPC-->>UI: { domains[] }
  UI-->>User: Success toast with target id and domains
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

Core Team, UI

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The PR description contains a brief narrative and a short local run note but does not follow the repository's required template: it lacks a "Fixes #" reference, does not mark the "Type of change" checkboxes, omits a structured, step-by-step "How should this be tested" section with exact commands and environment details, and leaves the Required checklist items unaddressed. Please revise the PR to use the repository template: add "Fixes #" (or create and reference an issue), select the correct "Type of change" checkbox, expand "How should this be tested" with explicit commands, environment and expected results, and complete the Required checklist (build, fmt, tests, removed console.logs, merged main, etc.). Also call out any DB migrations, generated-code impacts, or breaking API/schema changes and link any related issues or follow-up tasks.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title "feat: domain rollbacks" is a short, single-line summary that accurately reflects the primary change set (implementing domain rollback behavior and related DB/UI changes) and follows conventional commit style, so it is clear and appropriate for a reviewer scanning history.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch domain-rollbacks

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cfc8915 and 78feb9c.

⛔ Files ignored due to path filters (1)
  • deployment/data/metald/metald.db is excluded by !**/*.db
📒 Files selected for processing (3)
  • go/pkg/db/schema.sql (3 hunks)
  • go/pkg/partition/db/schema.sql (2 hunks)
  • internal/db/src/schema/domains.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • internal/db/src/schema/domains.ts
  • go/pkg/db/schema.sql
  • go/pkg/partition/db/schema.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Test Packages / Test
  • GitHub Check: Vade Review

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chronark
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 19, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@chronark chronark marked this pull request as ready for review September 19, 2025 07:26
@github-actions
Copy link
Contributor

github-actions bot commented Sep 19, 2025

Thank you for following the naming conventions for pull request titles! 🙏

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
deployment/docker-compose.yaml (1)

167-176: Fix MinIO console port mapping — expose 3903, not 2903

Ports map 2903:2903 while MINIO_CONSOLE_PORT_NUMBER=3903; change to 3903:3903 so the console is reachable. File: deployment/docker-compose.yaml (lines 167-176).

   s3:
     container_name: s3
     image: bitnamilegacy/minio:2025.4.3
     ports:
       - 3902:3902
-      - 2903:2903
+      - 3903:3903
     environment:
       MINIO_ROOT_USER: minio_root_user
       MINIO_ROOT_PASSWORD: minio_root_password
       MINIO_API_PORT_NUMBER: 3902
       MINIO_CONSOLE_PORT_NUMBER: 3903
go/proto/ctrl/v1/deployment.proto (1)

98-105: Reserve removed field number to maintain wire compatibility.

Field 19 appears removed (“email is PII”). In proto3, reserve the number to prevent reuse.

 message Deployment {
@@
-  string git_commit_author_username = 20;
+  // Field 19 previously used for email (removed)
+  reserved 19;
+  string git_commit_author_username = 20;
go/apps/ctrl/services/deployment/deploy_workflow.go (2)

182-186: Compile-time bug: invalid range over int.

for i := range 300 is not valid Go and will not compile.

Apply this fix:

-		for i := range 300 {
+		for i := 0; i < 300; i++ {

253-254: On readiness timeout, mark deployment as failed and log a step.

Currently returns an error without updating status/steps.

-		return nil, fmt.Errorf("deployment never became ready")
+		_ = db.Query.UpdateDeploymentStatus(stepCtx, w.db.RW(), db.UpdateDeploymentStatusParams{
+			ID:        req.DeploymentID,
+			Status:    db.DeploymentsStatusFailed,
+			UpdatedAt: sql.NullInt64{Valid: true, Int64: time.Now().UnixMilli()},
+		})
+		_ = db.Query.InsertDeploymentStep(stepCtx, w.db.RW(), db.InsertDeploymentStepParams{
+			WorkspaceID:  req.WorkspaceID,
+			ProjectID:    req.ProjectID,
+			DeploymentID: req.DeploymentID,
+			Status:       db.DeploymentStepsStatusFailed,
+			Message:      "Deployment timed out waiting for VMs to be ready",
+			CreatedAt:    time.Now().UnixMilli(),
+		})
+		return nil, fmt.Errorf("deployment never became ready")
🧹 Nitpick comments (39)
dummy (1)

1-1: Remove the dummy file before merge.

Non-functional file with no extension can confuse tooling and future readers. Prefer an empty commit or draft PR comment to trigger reviews.

Apply this diff (or delete the file outright):

-// I needed to trigger a review

Alternative: git commit --allow-empty -m "chore: trigger review"

go/pkg/hydra/worker.go (1)

449-460: Include trace correlation fields in the per‑execution logger.

Add trace_id/span_id so logs can be joined with traces without querying DB for execution metadata.

@@
-    wctx := &workflowContext{
+    spanCtx := span.SpanContext()
+    wctx := &workflowContext{
@@
-        logger:          w.engine.logger.With("execution_id", e.ID, "namespace", e.Namespace, "workflow_name", e.WorkflowName),
+        logger:          w.engine.logger.With(
+                           "execution_id", e.ID,
+                           "namespace", e.Namespace,
+                           "workflow_name", e.WorkflowName,
+                           "trace_id", spanCtx.TraceID().String(),
+                           "span_id", spanCtx.SpanID().String(),
+                         ),
go/pkg/hydra/step.go (1)

239-239: Add step_name to the log for faster incident triage.

Include the step identifier in the error log entry.

-        wctx.logger.Error("step execution failed", "error", err.Error())
+        wctx.logger.Error("step execution failed", "step_name", stepName, "error", err.Error())

Also, confirm logger is never nil in tests/runners that may construct Engine without a logger.

go/pkg/db/schema.sql (1)

373-375: Add indexes for new workflow queries (deployment_id, is_rolled_back).

Given new FindDomainsByDeploymentId and rollback flows, add indexes to avoid full scans.

@@
 CREATE INDEX `project_idx` ON `domains` (`project_id`);
+CREATE INDEX `deployment_idx` ON `domains` (`deployment_id`);
+CREATE INDEX `rolled_back_idx` ON `domains` (`is_rolled_back`);

If cardinality is high on deployment_id, this yields significant wins under rollback/sticky operations.

internal/db/src/schema/projects.ts (1)

39-42: Rename relation to liveDeployment

Use liveDeployment to align with projects.liveDeploymentId and the repo-wide “active” → “live” rename.

-  activeDeployment: one(deployments, {
+  liveDeployment: one(deployments, {
     fields: [projects.liveDeploymentId],
     references: [deployments.id],
   }),
internal/proto/package.json (1)

19-21: Prune local protoc-gen-es / protoc-gen-connect-es devDependencies

go/buf.gen.yaml uses the remote buf.build/bufbuild/es:v2.8.0 plugin to generate TypeScript into internal/proto/generated and no repo scripts invoke local protoc/protoc-gen-* — remove @bufbuild/protoc-gen-es and @connectrpc/protoc-gen-connect-es from internal/proto/package.json devDependencies to reduce toolchain drift.

go/apps/ctrl/services/deployment/domains.go (3)

27-33: Random suffix collision handling and formatting.

4-digit space (~9000) is OK but non-zero collision risk remains under bursts. Ensure the insert/upsert path detects conflicts and retries with a new suffix.

Consider a tiny tweak and consistent zero-padding:

- randomSuffix = fmt.Sprintf("-%d", 1000+rand.IntN(9000))
+ randomSuffix = fmt.Sprintf("-%04d", rand.IntN(10000))

Also confirm the persistence code retries on unique constraint violations for the gitSha domain when source is CLI upload.

Also applies to: 41-46


50-58: Avoid empty branch slug producing -git-- labels.

If sluggify(branchName) => "", we’ll emit a double-hyphen label. Guard and skip or fall back.

- if branchName != "" {
-   domains = append(
-     domains,
-       newDomain{
-         domain: fmt.Sprintf("%s-git-%s-%s.%s", projectSlug, sluggify(branchName), workspaceSlug, apex),
-         sticky: db.NullDomainsSticky{Valid: true, DomainsSticky: db.DomainsStickyBranch},
-       },
-   )
- }
+ if branchName != "" {
+   if bslug := sluggify(branchName); bslug != "" {
+     domains = append(domains, newDomain{
+       domain: fmt.Sprintf("%s-git-%s-%s.%s", projectSlug, bslug, workspaceSlug, apex),
+       sticky: db.NullDomainsSticky{Valid: true, DomainsSticky: db.DomainsStickyBranch},
+     })
+   }
+ }

71-101: Slug rules: simplify and enforce DNS label limits.

Current logic keeps 80 chars; single DNS label max is 63. Suggest a leaner slugify with label-safe clamp.

-var nonAlphanumericRegex = regexp.MustCompile(`[^a-zA-Z0-9\s]`)
-var multipleSpacesRegex = regexp.MustCompile(`\s+`)
+var nonAlnum = regexp.MustCompile(`[^a-z0-9]+`)
 
 func sluggify(s string) string {
-  // Trim whitespace
-  s = strings.TrimSpace(s)
-
-  // Remove all non-alphanumeric characters except spaces
-  s = nonAlphanumericRegex.ReplaceAllString(s, " ")
-  s = strings.ReplaceAll(s, "-", " ")
-  s = strings.ReplaceAll(s, "_", " ")
-
-  // Replace multiple spaces with single space
-  s = multipleSpacesRegex.ReplaceAllString(s, " ")
-
-  // Replace spaces with hyphens
-  s = strings.ReplaceAll(s, " ", "-")
-
-  // Convert to lowercase
-  s = strings.ToLower(s)
-
-  // Limit to 80 characters
-  if len(s) > 80 {
-    s = s[:80]
-  }
-
-  // Remove trailing hyphen if present
-  s = strings.TrimSuffix(s, "-")
-
-  return s
+  s = strings.ToLower(strings.TrimSpace(s))
+  s = nonAlnum.ReplaceAllString(s, "-")
+  s = strings.Trim(s, "-")
+  if len(s) > 63 {
+    s = strings.TrimRight(s[:63], "-")
+  }
+  return s
 }

Check if any callers rely on 80-char behavior.

apps/dashboard/lib/collections/deploy/deployments.ts (1)

51-52: Poll prudently to reduce load and jitter.

Polling every 5s is fine; consider disabling focus refetch and gating the interval to non-ready states to avoid double fetches.

   return createCollection<Deployment>(
     queryCollectionOptions({
       queryClient,
       queryKey: [projectId, "deployments"],
       retry: 3,
-      refetchInterval: 5000,
+      // Only poll when deployments are not all "ready"
+      refetchInterval: (query as any)?.state?.data?.some?.((d: Deployment) => d.status !== "ready") ? 5000 : false,
+      refetchOnWindowFocus: false,
       queryFn: () => trpcClient.deploy.deployment.list.query({ projectId }),
       getKey: (item) => item.id,
       id: `${projectId}-deployments`,
     }),
   );

If queryCollectionOptions does not forward function-valued refetchInterval, use a boolean flag in the hook to flip interval on/off.

go/pkg/db/queries/domain_find_by_deployment_id.sql (1)

8-9: Good: projecting sticky/is_rolled_back; add composite index for query shape

Consider INDEX(deployment_id, created_at) to satisfy WHERE + ORDER BY with index order.

go/pkg/partition/db/querier_generated.go (1)

20-27: New FindGatewayByDeploymentId API: confirm deployment_id uniqueness or expected multiplicity

If deployment_id is globally unique, add a DB UNIQUE constraint. If not, returning a single row may be surprising; rename to FindLatestGatewayByDeploymentId or switch to :many.

go/pkg/partition/db/queries/gateway_upsert.sql (1)

14-17: Optional: keep timestamps accurate on upsert.

If gateways has updated_at, set it in the UPDATE clause.

 ON DUPLICATE KEY UPDATE
     deployment_id = sqlc.arg(deployment_id),
-    config = sqlc.arg(config);
+    config = sqlc.arg(config)
+    -- , updated_at = NOW(3)
+    ;
apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts (1)

20-28: Skip unnecessary sort when querying by primary key.

When filtering by id, orderBy is redundant.

   const liveDeployment = useLiveQuery(
     (q) =>
       q
         .from({ deployment: collections.deployments })
         .where(({ deployment }) => eq(deployment.id, liveDeploymentId))
-        .orderBy(({ deployment }) => deployment.createdAt, "desc")
         .limit(1),
     [liveDeploymentId],
   ).data.at(0);
go/proto/ctrl/v1/deployment.proto (2)

133-141: Rollback API surface: consider richer per‑domain results or document idempotency.

A plain list of domains lacks success/failure detail. Either document idempotency and error semantics here or return a structured result in the future.


150-152: Nit: tighten RPC comment and scope.

Consider clarifying that only sticky domains of the current live deployment are reassigned to the target deployment; non‑sticky remain unchanged.

go/pkg/db/domain_insert.sql_generated.go (2)

39-40: Preserve sticky when param is NULL.

If NullDomainsSticky is invalid, this will overwrite sticky to NULL. Prefer COALESCE to keep the existing value when unspecified. Update the source SQL (domain_insert.sql) so generated code follows.

-    sticky = ?,
+    sticky = COALESCE(?, sticky),

Also applies to: 92-100


23-33: Optional: let DB set updated_at.

If you standardize on DB timestamps, set updated_at = CURRENT_TIMESTAMP(3) in both INSERT and UPDATE and drop the param.

-    created_at,
-    updated_at
+    created_at,
+    updated_at
 ) VALUES (
@@
-    ?,
-    ?,
+    ?,
+    CURRENT_TIMESTAMP(3),
@@
-    updated_at = ?
+    updated_at = CURRENT_TIMESTAMP(3)

Also applies to: 40-41

go/pkg/db/bulk_domain_insert.sql_generated.go (1)

36-37: Preallocate allArgs capacity to reduce allocations.

Small perf tidy-up proportional to rows.

-var allArgs []any
+allArgs := make([]any, 0, len(args)*9)
go/pkg/partition/db/gateway_find_by_deployment_id.sql_generated.go (1)

16-18: Ordering by id may not reflect “latest” config.

If “latest” means most recently updated, consider ORDER BY updated_at DESC (or created_at) rather than id.

go/pkg/db/querier_generated.go (1)

176-193: Parameter type for FindDomainsByDeploymentId.

Using sql.NullString for an equality predicate risks accidental “no rows” when Valid=false. Prefer string unless the SQL handles NULL explicitly.

Update the query to accept a non-null arg (or use (? IS NULL OR deployment_id = ?) if NULL filtering is intended).

go/pkg/db/domain_find_by_deployment_id.sql_generated.go (1)

26-26: Stabilize ordering when created_at ties.

Consider adding a secondary ORDER BY id in the source SQL (not here) to ensure deterministic results.

go/apps/ctrl/services/deployment/deploy_workflow.go (4)

318-321: Logging prints struct, not hostname.

Log uses the whole domain struct; prefer the hostname for readability.

-				w.logger.Error("failed to create gateway config for domain",
-					"domain", domain,
+				w.logger.Error("failed to create gateway config for domain",
+					"domain", domain.domain,
 					"error", err,
 					"deployment_id", req.DeploymentID)
...
-				w.logger.Error("failed to marshal gateway config", "error", err, "domain", domain)
+				w.logger.Error("failed to marshal gateway config", "error", err, "domain", domain.domain)

Also applies to: 329-331


304-339: Avoid recreating identical gateway config per domain.

The config does not depend on the domain; build and marshal it once, then reuse.

 	err = hydra.StepVoid(ctx, "create-gateway-configs-bulk", func(stepCtx context.Context) error {
 		// Prepare gateway configs for all non-local domains
 		var gatewayParams []partitiondb.UpsertGatewayParams
 		var skippedDomains []string
 
-		for _, domain := range allDomains {
+		// Build once
+		gatewayConfig, err := createGatewayConfig(req.DeploymentID, req.KeyspaceID, createdVMs)
+		if err != nil {
+			return fmt.Errorf("failed to create gateway config: %w", err)
+		}
+		configBytes, err := protojson.Marshal(gatewayConfig)
+		if err != nil {
+			return fmt.Errorf("failed to marshal gateway config: %w", err)
+		}
+
+		for _, domain := range allDomains {
 			if isLocalHostname(domain.domain, w.defaultDomain) {
 				skippedDomains = append(skippedDomains, domain.domain)
 				continue
 			}
 
-			// Create gateway config for this domain
-			gatewayConfig, err := createGatewayConfig(req.DeploymentID, req.KeyspaceID, createdVMs)
-			if err != nil {
-				w.logger.Error("failed to create gateway config for domain",
-					"domain", domain,
-					"error", err,
-					"deployment_id", req.DeploymentID)
-				// Continue with other domains rather than failing the entire deployment
-				continue
-			}
-
-			// Marshal protobuf to bytes
-			configBytes, err := protojson.Marshal(gatewayConfig)
-			if err != nil {
-				w.logger.Error("failed to marshal gateway config", "error", err, "domain", domain)
-				continue
-			}
-
 			gatewayParams = append(gatewayParams, partitiondb.UpsertGatewayParams{
 				WorkspaceID:  req.WorkspaceID,
 				DeploymentID: req.DeploymentID,
 				Hostname:     domain.domain,
 				Config:       configBytes,
 			})
 		}

265-266: Hardcoded SourceType.

Not blocking, but please track a TODO to derive SourceType from request/context instead of hardcoding CLI_UPLOAD.


596-613: Broaden local-host detection (.localhost, IPv6 loopback).

Include IPv6 loopback and the reserved .localhost TLD.

-	// Exact matches for common local hosts - these should be skipped
-	if hostname == "localhost" || hostname == "127.0.0.1" {
+	// Exact matches for common local hosts - these should be skipped
+	if hostname == "localhost" || hostname == "127.0.0.1" || hostname == "::1" {
 		return true
 	}
@@
-	localSuffixes := []string{
+	localSuffixes := []string{
 		".local",
 		".test",
+		".localhost",
 	}
apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (2)

24-28: Use TRPCError for config errors for consistent error handling.

Throwing a plain Error leads to inconsistent client behavior.

-    if (!ctrlUrl) {
-      throw new Error("ctrl service is not configured");
-    }
+    if (!ctrlUrl) {
+      throw new TRPCError({
+        code: "INTERNAL_SERVER_ERROR",
+        message: "CTRL_URL is not configured",
+      });
+    }

72-83: Remove inner .catch; let outer try/catch handle and preserve message.

Current code passes a non-string error as message.

-      const rolledBack = await ctrl
-        .rollback({
-          projectId: targetDeployment.project.id,
-          targetDeploymentId: targetDeployment.id,
-        })
-        .catch((err) => {
-          console.error(err);
-          throw new TRPCError({
-            code: "INTERNAL_SERVER_ERROR",
-            message: err,
-          });
-        });
+      const rolledBack = await ctrl.rollback({
+        projectId: targetDeployment.project.id,
+        targetDeploymentId: targetDeployment.id,
+      });
go/apps/ctrl/services/deployment/rollback.go (4)

45-55: Fix wrong ID in error/log when fetching live deployment

When the live deployment lookup fails, the error/log prints targetDeploymentID. Use the live ID to avoid misleading diagnostics.

- if db.IsNotFound(err) {
-   return nil, connect.NewError(connect.CodeNotFound, fmt.Errorf("deployment not found: %s", targetDeploymentID))
- }
- s.logger.Error("failed to get deployment",
-   "deployment_id", targetDeploymentID,
+ liveID := project.LiveDeploymentID.String
+ if db.IsNotFound(err) {
+   return nil, connect.NewError(connect.CodeNotFound, fmt.Errorf("deployment not found: %s", liveID))
+ }
+ s.logger.Error("failed to get deployment",
+   "deployment_id", liveID,

93-98: Clarify condition: no running VMs (not “no VMs”)

Message and error should reflect that we checked for running VMs.

- s.logger.Error("no VMs found",
+ s.logger.Error("no running VMs found",
    "deployment_id", targetDeployment.ID,
  )
- return nil, connect.NewError(connect.CodeNotFound, fmt.Errorf("no VMs found for deployment: %s", targetDeployment.ID))
+ return nil, connect.NewError(connect.CodeFailedPrecondition, fmt.Errorf("no running VMs for deployment: %s", targetDeployment.ID))

119-121: Pre‑allocate change slices

Minor perf/GC: reserve capacity based on domains length.

-domainChanges := []db.RollBackDomainParams{}
-gatewayChanges := []pdb.UpsertGatewayParams{}
+domainChanges := make([]db.RollBackDomainParams, 0, len(domains))
+gatewayChanges := make([]pdb.UpsertGatewayParams, 0, len(domains))

153-155: Misleading comment

We may roll back multiple sticky domains; the comment claims “only one domain”.

-// Not sure why there isn't a bulk query generated, but this will do for now
-// cause we're only rolling back one domain anyways
+// No bulk query for domains yet; iterate changes. Multiple sticky domains may be rolled back.
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (2)

61-99: Consider env fallback for unknown slugs

Rendering environment?.slug with capitalize may show undefined on preview rows; fallback to a label or omit.

-  {environment?.slug}
+  {environment?.slug ?? ""}

297-297: Tighten useMemo deps to avoid stale isLive

Depend on liveDeployment?.id rather than the whole object for stable memoization.

-}, [selectedDeployment?.deployment.id, isCompactView, liveDeployment]);
+}, [selectedDeployment?.deployment.id, isCompactView, liveDeployment?.id]);
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx (2)

49-54: Ensure boolean type for canRollback

Coerce to boolean to satisfy TS and avoid accidental truthy object typing.

-  const canRollback =
-    liveDeployment &&
+  const canRollback =
+    Boolean(liveDeployment) &&
     environment?.slug === "production" &&
     selectedDeployment.status === "ready" &&
     selectedDeployment.id !== liveDeployment.id;

49-54: Optional: centralize env check

Hardcoding "production" spreads policy in UI. Consider exporting a constant or helper.

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx (3)

55-63: Verify the refetch “hack”

collection.projects.utils.refetch() is fragile. Prefer TRPC utils invalidate on the specific queries or a typed collections refetch API.

I can help replace this with targeted utils.<router>.<procedure>.invalidate() calls if you share the query keys.


147-155: Copy tweak: “active” → “live”

Aligns with the new terminology.

- {deployment.gitCommitMessage || `${isLive ? "Current active" : "Target"} deployment`}
+ {deployment.gitCommitMessage || `${isLive ? "Current live" : "Target"} deployment`}

91-101: UX: disable button when no domains

If there are zero eligible domains, consider disabling the action and surfacing why (mirrors server “no domains to rollback”).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c87b938 and 0a0fc84.

⛔ Files ignored due to path filters (46)
  • go/gen/proto/ctrl/v1/ctrlv1connect/deployment.connect.go is excluded by !**/gen/**
  • go/gen/proto/ctrl/v1/ctrlv1connect/routing.connect.go is excluded by !**/gen/**
  • go/gen/proto/ctrl/v1/deployment.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • go/gen/proto/ctrl/v1/routing.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • internal/proto/generated/assetmanagerd/v1/asset_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/billaged/v1/billing_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/builderd/v1/builder_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/acme_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/build_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/deployment_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/openapi_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/service_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/deployment_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/metald_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/network_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/storage_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/vm_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/partition/v1/gateway_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/acme_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/acme_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/build_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/build_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/deployment_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/deployment_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/openapi_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/openapi_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/routing_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/routing_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/service_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/service_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/assetmanagerd/v1/asset_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/assetmanagerd/v1/asset_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/billaged/v1/billing_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/billaged/v1/billing_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/builderd/v1/builder_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/builderd/v1/builder_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/metal/vmprovisioner/v1/vmprovisioner_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/metal/vmprovisioner/v1/vmprovisioner_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/metal/vmprovisioner/v1/wip_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/partition/v1/gateway_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/vault/v1/object_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/vault/v1/service_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/vault/v1/service_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/vault/v1/object_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/vault/v1/service_pb.ts is excluded by !**/generated/**
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (47)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx (4 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx (1 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/env-status-badge.tsx (1 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (4 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts (2 hunks)
  • apps/dashboard/lib/collections/deploy/deployments.ts (1 hunks)
  • apps/dashboard/lib/collections/deploy/domains.ts (1 hunks)
  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (2 hunks)
  • apps/dashboard/lib/trpc/routers/deploy/domains/list.ts (1 hunks)
  • deployment/docker-compose.yaml (1 hunks)
  • dummy (1 hunks)
  • go/apps/ctrl/run.go (0 hunks)
  • go/apps/ctrl/services/deployment/deploy_workflow.go (6 hunks)
  • go/apps/ctrl/services/deployment/domains.go (1 hunks)
  • go/apps/ctrl/services/deployment/rollback.go (1 hunks)
  • go/apps/ctrl/services/routing/service.go (0 hunks)
  • go/buf.gen.yaml (1 hunks)
  • go/pkg/db/bulk_domain_insert.sql_generated.go (3 hunks)
  • go/pkg/db/domain_find_by_deployment_id.sql_generated.go (4 hunks)
  • go/pkg/db/domain_find_by_domain.sql_generated.go (2 hunks)
  • go/pkg/db/domain_insert.sql_generated.go (3 hunks)
  • go/pkg/db/domain_rollback.sql_generated.go (1 hunks)
  • go/pkg/db/models_generated.go (2 hunks)
  • go/pkg/db/project_find_by_id.sql_generated.go (4 hunks)
  • go/pkg/db/querier_generated.go (5 hunks)
  • go/pkg/db/queries/domain_find_by_deployment_id.sql (1 hunks)
  • go/pkg/db/queries/domain_insert.sql (1 hunks)
  • go/pkg/db/queries/domain_rollback.sql (1 hunks)
  • go/pkg/db/queries/project_find_by_id.sql (1 hunks)
  • go/pkg/db/schema.sql (1 hunks)
  • go/pkg/hydra/step.go (3 hunks)
  • go/pkg/hydra/worker.go (1 hunks)
  • go/pkg/hydra/workflow.go (3 hunks)
  • go/pkg/partition/db/bulk_gateway_upsert.sql_generated.go (3 hunks)
  • go/pkg/partition/db/gateway_find_by_deployment_id.sql_generated.go (1 hunks)
  • go/pkg/partition/db/gateway_upsert.sql_generated.go (1 hunks)
  • go/pkg/partition/db/models_generated.go (1 hunks)
  • go/pkg/partition/db/querier_generated.go (2 hunks)
  • go/pkg/partition/db/queries/gateway_find_by_deployment_id.sql (1 hunks)
  • go/pkg/partition/db/queries/gateway_upsert.sql (1 hunks)
  • go/pkg/partition/db/schema.sql (1 hunks)
  • go/proto/ctrl/v1/deployment.proto (1 hunks)
  • go/proto/ctrl/v1/routing.proto (0 hunks)
  • internal/db/src/schema/domains.ts (2 hunks)
  • internal/db/src/schema/projects.ts (1 hunks)
  • internal/proto/package.json (1 hunks)
  • internal/proto/src/index.ts (1 hunks)
💤 Files with no reviewable changes (3)
  • go/proto/ctrl/v1/routing.proto
  • go/apps/ctrl/run.go
  • go/apps/ctrl/services/routing/service.go
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx:70-79
Timestamp: 2025-09-12T17:57:18.337Z
Learning: In the deployment rollback functionality, self-rollback scenarios are prevented at the UI level in the actions menu through the `canRollback` condition which includes `!("active" in deployment && deployment.active)`, making additional checks in the dialog component redundant.
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx:70-79
Timestamp: 2025-09-12T17:57:18.337Z
Learning: In the rollback functionality, the actions menu prevents self-rollback scenarios by including `!deployment.active` in the `canRollback` condition, ensuring only inactive deployments show the rollback option.
Learnt from: mcstepp
PR: unkeyed/unkey#3662
File: apps/dashboard/lib/trpc/routers/deployment/list.ts:11-11
Timestamp: 2025-07-25T19:09:43.284Z
Learning: In apps/dashboard/lib/trpc/routers/deployment/list.ts, the listDeployments procedure intentionally queries the versions table rather than a deployments table. The user mcstepp indicated that renaming the table would require a database migration, which was deferred for the current PR focused on UI features.
📚 Learning: 2025-07-25T19:09:43.284Z
Learnt from: mcstepp
PR: unkeyed/unkey#3662
File: apps/dashboard/lib/trpc/routers/deployment/list.ts:11-11
Timestamp: 2025-07-25T19:09:43.284Z
Learning: In apps/dashboard/lib/trpc/routers/deployment/list.ts, the listDeployments procedure intentionally queries the versions table rather than a deployments table. The user mcstepp indicated that renaming the table would require a database migration, which was deferred for the current PR focused on UI features.

Applied to files:

  • apps/dashboard/lib/trpc/routers/deploy/domains/list.ts
  • apps/dashboard/lib/collections/deploy/deployments.ts
  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx
📚 Learning: 2025-08-08T14:55:11.981Z
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: deployment/docker-compose.yaml:179-184
Timestamp: 2025-08-08T14:55:11.981Z
Learning: In deployment/docker-compose.yaml (development only), MinIO is configured with API on 3902 and console on 3903; ports should map 3902:3902 and 3903:3903 to match MINIO_API_PORT_NUMBER and MINIO_CONSOLE_PORT_NUMBER.

Applied to files:

  • deployment/docker-compose.yaml
📚 Learning: 2025-09-18T17:31:16.158Z
Learnt from: Flo4604
PR: unkeyed/unkey#3994
File: go/pkg/partition/db/queries/gateway_find_config_by_hostname.sql:2-4
Timestamp: 2025-09-18T17:31:16.158Z
Learning: Gateway hostname lookups in Unkey are infrastructure-level operations that discover workspace_id by hostname, not operations that should be scoped by workspace_id. The FindGatewayByHostname query retrieves the workspace_id as part of the result to determine which workspace a hostname belongs to.

Applied to files:

  • go/pkg/partition/db/querier_generated.go
📚 Learning: 2025-09-01T01:57:42.227Z
Learnt from: imeyer
PR: unkeyed/unkey#3899
File: go/proto/metald/v1/metald.proto:5-9
Timestamp: 2025-09-01T01:57:42.227Z
Learning: In the unkeyed/unkey repository, buf is configured to properly resolve metald proto imports like "metald/v1/vm.proto" without needing the full "go/proto/" prefix. The buf lint command `buf lint --path proto/metald` passes successfully with these relative import paths.

Applied to files:

  • go/buf.gen.yaml
  • go/apps/ctrl/services/deployment/deploy_workflow.go
📚 Learning: 2025-09-01T02:33:43.791Z
Learnt from: imeyer
PR: unkeyed/unkey#3899
File: go/buf.gen.yaml:0-0
Timestamp: 2025-09-01T02:33:43.791Z
Learning: In the unkeyed/unkey repository, buf commands are executed from the `go/` directory where the `buf.yaml` file is located. This means the `out: gen` configuration in `go/buf.gen.yaml` generates files to `go/gen/` relative to the repository root, which aligns with the expected directory structure.

Applied to files:

  • go/buf.gen.yaml
📚 Learning: 2025-09-12T18:11:33.481Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/lib/trpc/routers/deploy/rollback.ts:23-24
Timestamp: 2025-09-12T18:11:33.481Z
Learning: In apps/dashboard/lib/trpc/routers/deploy/rollback.ts, the CTRL_URL environment variable should fail fast with a clear error message if missing in non-development environments, rather than defaulting to localhost which can mask production configuration issues.

Applied to files:

  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
📚 Learning: 2025-07-25T19:11:00.208Z
Learnt from: mcstepp
PR: unkeyed/unkey#3662
File: apps/dashboard/lib/trpc/routers/deployment/getOpenApiDiff.ts:110-147
Timestamp: 2025-07-25T19:11:00.208Z
Learning: In apps/dashboard/lib/trpc/routers/deployment/getOpenApiDiff.ts, the user mcstepp prefers to keep mock data fallbacks in POC/demonstration code for simplicity, even if it wouldn't be production-ready. This aligns with the PR being work-in-progress for demonstration purposes.

Applied to files:

  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
📚 Learning: 2025-09-12T17:57:18.337Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx:70-79
Timestamp: 2025-09-12T17:57:18.337Z
Learning: In the deployment rollback functionality, self-rollback scenarios are prevented at the UI level in the actions menu through the `canRollback` condition which includes `!("active" in deployment && deployment.active)`, making additional checks in the dialog component redundant.

Applied to files:

  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx
📚 Learning: 2025-09-12T17:57:18.337Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx:70-79
Timestamp: 2025-09-12T17:57:18.337Z
Learning: In the rollback functionality, the actions menu prevents self-rollback scenarios by including `!deployment.active` in the `canRollback` condition, ensuring only inactive deployments show the rollback option.

Applied to files:

  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx
📚 Learning: 2025-09-11T14:24:40.988Z
Learnt from: Flo4604
PR: unkeyed/unkey#3944
File: go/apps/ctrl/services/deployment/deploy_workflow.go:326-334
Timestamp: 2025-09-11T14:24:40.988Z
Learning: The InsertDomains method in the bulk queries uses ON DUPLICATE KEY UPDATE, making it an upsert operation that is idempotent and safe for retries, despite the "Insert" naming convention.

Applied to files:

  • go/pkg/db/bulk_domain_insert.sql_generated.go
📚 Learning: 2025-07-17T14:24:20.403Z
Learnt from: Flo4604
PR: unkeyed/unkey#3631
File: go/pkg/db/bulk_keyring_insert.sql.go:23-25
Timestamp: 2025-07-17T14:24:20.403Z
Learning: In go/pkg/db/bulk_keyring_insert.sql.go and similar bulk insert generated files, hardcoded zero values for fields like size_approx and size_last_updated_at are intentional and reflect the original SQL query structure, not missing parameters.

Applied to files:

  • go/pkg/db/bulk_domain_insert.sql_generated.go
🔇 Additional comments (30)
go/pkg/hydra/workflow.go (2)

10-10: Per‑execution logger plumbing looks good; confirm nil‑safety across engines.

Nice addition and consistent with usage in step.go/worker.go. Please confirm that all Engine instances provide a non‑nil logger so Step() calls don’t hit a nil receiver. Consider whether exposing Logger() on WorkflowContext is desirable later.

Also applies to: 107-107


253-253: No-op whitespace change.

go/pkg/hydra/step.go (1)

143-149: Lease-check SQL formatting change is non-functional. LGTM.

Also applies to: 185-190

deployment/docker-compose.yaml (1)

167-167: Image version downgrade: confirm compatibility/security.

2025.7.23 → 2025.4.3 is a rollback; please confirm no regressions and that the health endpoint and config flags are supported.

internal/proto/src/index.ts (1)

2-3: Public API change: dropped deployment_connect export — no in-repo callers found. Repo search for 'deployment_connect' and for '../generated/proto/ctrl/v1/deployment_pb' returned only internal/proto/src/index.ts (the commented export); no other references found.

go/buf.gen.yaml (1)

9-12: Approve — include_imports supported; no active generated/proto imports found. Buf v2's buf.gen.yaml supports the include_imports key (see Buf docs and the connect-es migration guide). (buf.build) Local scan shows no live imports from generated/proto/; only a commented export at internal/proto/src/index.ts:3 (//export * from "../generated/ctrl/v1/deployment_connect";) — remove if stale.

apps/dashboard/lib/trpc/routers/deploy/domains/list.ts (1)

17-23: Approve — deploymentId and sticky added; client usage verified.
apps/dashboard/lib/collections/deploy/domains.ts declares both fields; rollback-dialog.tsx and active-deployment-card/use-deployment-logs reference domain.deploymentId and domain.sticky — alignment confirmed.

go/pkg/partition/db/schema.sql (1)

13-18: Index deployment_id for lookup performance; add forward-only migration

FindGatewayByDeploymentId will scan without an index — add a single-column index or a composite (workspace_id, deployment_id) and ensure a forward-only migration updates existing partitions.

File: go/pkg/partition/db/schema.sql (lines 13-18)

 CREATE TABLE gateways (
   `id` bigint unsigned NOT NULL AUTO_INCREMENT,
   `workspace_id` varchar(255) NOT NULL,
   `deployment_id` varchar(255) NOT NULL,
   `hostname` varchar(255) NOT NULL,
   `config` blob NOT NULL,   -- Protobuf with all configuration including deployment_id, workspace_id
   PRIMARY KEY (`id`),
-  UNIQUE KEY `gateways_pk` (`hostname`)
+  UNIQUE KEY `gateways_pk` (`hostname`),
+  KEY `idx_deployment_id` (`deployment_id`),
+  KEY `idx_workspace_deployment` (`workspace_id`,`deployment_id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Search did not surface a migration that adds these indexes — confirm a forward-only migration exists for partitioned data or add one (avoid blocking ALTERs).

go/pkg/db/queries/project_find_by_id.sql (1)

10-10: Expose live_deployment_id — sqlc added nullable LiveDeploymentID; callers handle NULL.
sqlc generated LiveDeploymentID as sql.NullString (go/pkg/db/project_find_by_id.sql_generated.go, go/pkg/db/models_generated.go); rollback checks LiveDeploymentID.Valid before using .String (go/apps/ctrl/services/deployment/rollback.go); deploy_workflow sets sql.NullString{Valid:true,...} and updates pass arg.LiveDeploymentID (project_update_live_deployment_id.sql_generated.go).

go/apps/ctrl/services/deployment/domains.go (1)

5-16: OK to use math/rand/v2; newDomain can remain unexported.

  • go/go.mod sets Go 1.25 — math/rand/v2 (requires ≥ Go 1.22) is supported.
  • newDomain and buildDomains are package-local and only used in go/apps/ctrl/services/deployment (buildDomains is called from deploy_workflow.go) — keep unexported unless you need cross-package access (then export fields or add accessors).
go/pkg/db/project_find_by_id.sql_generated.go (1)

22-23: LiveDeploymentID wire-up looks correct

NullString fits optional live pointer semantics. No issues spotted in scan order.

Also applies to: 37-39, 68-69

go/pkg/db/domain_find_by_domain.sql_generated.go (1)

13-14: Extending projection with sticky/is_rolled_back is consistent with Domain

Scan order matches SELECT list; aligns with the updated Domain model.

Also applies to: 29-31

go/pkg/partition/db/queries/gateway_find_by_deployment_id.sql (1)

1-6: Clarify cardinality; add supporting index; make ordering deterministic

  • Found generated code uses :one (QueryRow) and is called at go/apps/ctrl/services/deployment/rollback.go:110 — callers expect a single row.
  • If at most one gateway per deployment: enforce UNIQUE(deployment_id) and drop ORDER BY/LIMIT (keep :one).
  • If multiple gateways allowed: either keep LIMIT 1 only if “latest by id” is the intended rule (prefer ORDER BY created_at DESC for determinism) or change to :many and update callers.
  • Add an index to avoid scans, e.g. CREATE INDEX idx_gateways_deployment_id_id ON gateways(deployment_id, id DESC) (or at minimum CREATE INDEX ON gateways(deployment_id)).
go/pkg/partition/db/models_generated.go (1)

115-119: Gateway now links to deployment_id — ensure constraints and indexing exist

  • If gateways are 1:1 with deployments, add UNIQUE(deployment_id) to enforce the invariant. If 1:many, keep this as-is but adjust query names to plural where applicable.
  • Add an index on deployment_id to support the new lookup path.

Run to verify indexes/migrations exist:

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/env-status-badge.tsx (1)

13-14: Don't add a temporary "current" alias — no callers use "current"

Repo check: env-status-badge.tsx defines only "live" and callers pass variant="live" (e.g. apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx:89); no occurrences of "current" found.

Likely an incorrect or invalid review comment.

apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts (1)

18-21: Resolved — UI gating prevents self-rollback; checks use liveDeployment.
getDeploymentListTableActionItems gates rollback by comparing selectedDeployment.id !== liveDeployment.id (apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx, ~lines 48–53). use-deployments returns liveDeployment (apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts, lines 117–118). .active hits are only in unrelated sidebar/nav code.

go/pkg/partition/db/queries/gateway_upsert.sql (1)

14-17: Avoid cross‑workspace clobber on upsert.

Updating workspace_id on conflict can silently “move” an existing hostname between workspaces. Unless that’s intended, drop workspace_id from the UPDATE set.

Proposed change:

 ON DUPLICATE KEY UPDATE
-    workspace_id = sqlc.arg(workspace_id),
-    deployment_id = sqlc.arg(deployment_id),
-    config = sqlc.arg(config);
+    deployment_id = sqlc.arg(deployment_id),
+    config = sqlc.arg(config);

Please confirm the gateways unique key is either (hostname) global or (workspace_id, hostname); if it’s global, this is especially risky. To verify:

go/pkg/db/domain_insert.sql_generated.go (1)

35-40: Avoid cross‑workspace transfer on upsert.
Updating workspace_id/project_id in the ON CONFLICT ... DO UPDATE clause can move a domain between workspaces. Ensure the conflict target is scoped to workspace_id (i.e., a composite unique key including workspace_id) or stop updating workspace_id/project_id on upsert.
Location: go/pkg/db/domain_insert.sql_generated.go (lines 35–40); verify UNIQUE/INDEX definitions in go/pkg/db/schema.sql.

go/pkg/db/querier_generated.go (1)

1526-1534: LGTM: RollBackDomain added to Querier.

Interface surface aligns with the generated exec method.

go/pkg/partition/db/gateway_find_by_deployment_id.sql_generated.go (1)

12-18: Add an index on gateways(deployment_id).

Repo search shows CREATE TABLE gateways in go/pkg/partition/db/schema.sql but no index on deployment_id; the query in go/pkg/partition/db/gateway_find_by_deployment_id.sql_generated.go will cause a full-table scan. Add an index or migration, e.g.:
CREATE INDEX idx_gateways_deployment_id ON gateways (deployment_id);

Affected files: go/pkg/partition/db/gateway_find_by_deployment_id.sql_generated.go, go/pkg/partition/db/schema.sql.

go/pkg/db/domain_rollback.sql_generated.go (1)

21-26: domains.updated_at is BIGINT — keep sql.NullInt64 and the current binding

Schema (go/pkg/db/schema.sql and internal/db/drizzle/0000_dazzling_colonel_america.sql) defines updated_at as BIGINT; the generated code (go/pkg/db/domain_rollback.sql_generated.go) correctly uses sql.NullInt64 and binds updated_at = ?. Switching to NOW() (DATETIME) would be incorrect unless you also change the column type or intentionally write UNIX_TIMESTAMP()*1000. No change required.

Likely an incorrect or invalid review comment.

go/pkg/db/domain_find_by_deployment_id.sql_generated.go (1)

29-38: Scan order matches SELECT — LGTM.

Fields Sticky and IsRolledBack align with the SELECT list and scan order. No issues.

Also applies to: 65-74

go/pkg/db/queries/domain_insert.sql (2)

9-29: Insert shape incl. sticky/updated_at — LGTM.

Columns and bindings look consistent with model changes.


22-29: Confirm upsert semantics for type/sticky.

ON DUPLICATE updates both type and sticky. If existing rows may be user-managed (e.g., custom domains), confirm this won’t overwrite intentional values.

go/pkg/db/models_generated.go (2)

239-281: DomainsSticky/NullDomainsSticky look correct.

Scanner/Valuer implementations align with other enums; no issues spotted.


599-609: Domain struct additions — LGTM.

Sticky and IsRolledBack are wired up with proper db tags.

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (1)

289-291: Prop pass looks good

selectedDeployment and liveDeployment match the new API.

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx (1)

9-18: Prop rename LGTM

liveDeployment/selectedDeployment read cleanly and align with the new model.

go/apps/ctrl/services/deployment/rollback.go (2)

147-161: Cross‑DB atomicity — risk of partial rollback

Upserting to partition DB before applying main‑DB domain/project updates (and updating domains outside a single transaction) can leave the system in an inconsistent state if any step fails.

  • Wrap all main‑DB mutations (all domainChanges + project live‑ID update) in a single transaction. (File: go/apps/ctrl/services/deployment/rollback.go, lines 147–161.)
  • Choose and document a consistent ordering: either commit main‑DB first then upsert partition DB with retries, or upsert partition DB first and implement compensating rollbacks for main‑DB on failure.
  • Replace the per‑domain loop with a transactional bulk update if supported; otherwise run the loop inside the same tx.

I can draft a db.Query Tx wrapper and a safe reorder/compensating strategy.


134-140: Confirmed — gateways are unique by hostname; Upsert conflicts on hostname

go/pkg/partition/db/schema.sql defines UNIQUE KEY gateways_pk (hostname) and the generated UpsertGateway uses ON DUPLICATE KEY UPDATE, so the conflict target is hostname (Upsert will overwrite workspace_id/deployment_id for an existing hostname).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
go/pkg/db/schema.sql (1)

366-375: Add index on domains.deployment_id to support new queries

New queries select by deployment_id; there’s no index, so this will full-scan domains at scale.

Add near the index section:

   CREATE INDEX `project_idx` ON `domains` (`project_id`);
+  CREATE INDEX `deployment_idx` ON `domains` (`deployment_id`);

Optionally, add a composite index if you frequently filter by workspace_id/project_id + sticky/is_rolled_back.

internal/proto/package.json (1)

11-21: Fix major-version skew: align @bufbuild/protobuf to v2 and upgrade protoc-gen-es to v2

internal/proto/package.json has @bufbuild/protobuf ^1.10.0 in devDependencies and ^2.8.0 in dependencies (majors 1 vs 2). Registry check shows @bufbuild/protobuf and @bufbuild/protoc-gen-es latest v2 is 2.9.0. Mixed majors will cause type/runtime conflicts with generated code — update dev deps to v2 and bump the generator.

File: internal/proto/package.json

  "devDependencies": {
    "@bufbuild/buf": "^1.47.2",
-    "@bufbuild/protobuf": "^1.10.0",
    "@connectrpc/protoc-gen-connect-es": "^1.6.1",
-    "@bufbuild/protoc-gen-es": "^1.10.0",
+    "@bufbuild/protoc-gen-es": "^2.9.0",
    "typescript": "^5.7.3"
  },
  "dependencies": {
-    "@bufbuild/protobuf": "^2.8.0",
+    "@bufbuild/protobuf": "^2.9.0",
    "@connectrpc/connect": "^1.6.1"
  }
🧹 Nitpick comments (31)
dummy (1)

1-1: Remove placeholder before merge

This file/comment is noise in main; drop it or squash it out.

apps/dashboard/lib/collections/deploy/deployments.ts (1)

51-53: Avoid unconditional polling; poll only while deployments are progressing

Constant 5s polling can be wasteful. Gate it on non-terminal statuses and avoid focus-triggered duplicate refetches.

Apply:

-      refetchInterval: 5000,
+      // Poll only while any deployment is not terminal
+      staleTime: 5_000,
+      refetchOnWindowFocus: false,
+      refetchInterval: (query) =>
+        query.state.data?.some((d) => d.status !== "ready" && d.status !== "failed")
+          ? 5_000
+          : false,

If your wrapper doesn’t accept a function for refetchInterval, we can move this logic into a post-fetch schedule instead. Please confirm.

deployment/docker-compose.yaml (2)

167-167: MinIO image tag downgrade — confirm rationale and compatibility

This moves from 2025.7.23 to 2025.4.3 (older). If intentional (compat fix), consider adding a comment and/or pinning by digest to avoid surprise updates.


168-176: Fix MinIO console port mapping

Env sets MINIO_CONSOLE_PORT_NUMBER=3903 but ports map 2903:2903. Prior guidance for this repo is 3902:3902 and 3903:3903.

       ports:
         - 3902:3902
-        - 2903:2903
+        - 3903:3903
apps/dashboard/lib/trpc/routers/deploy/domains/list.ts (1)

20-23: Surfacing deploymentId and sticky is correct

Matches frontend schema; no issues spotted.

If the UI needs rollback visibility, consider also projecting is_rolled_back to avoid an extra fetch.

apps/dashboard/lib/collections/deploy/domains.ts (1)

12-14: LGTM; align sticky union with a single source of truth

Schema matches backend. To avoid drift, centralize the sticky union in a shared const/type.

Example:

+export const DomainSticky = ["branch", "environment", "live"] as const;
+export type DomainSticky = (typeof DomainSticky)[number];
 ...
-  sticky: z.enum(["branch", "environment", "live"]).nullable(),
+  sticky: z.enum(DomainSticky).nullable(),
internal/db/src/schema/projects.ts (1)

39-42: Rename relation to liveDeployment for consistency

PR adopts the "live" nomenclature — rename this relation and update any generated schema/types. rg found only this definition in internal/db/src/schema/projects.ts, so update it there.

-  activeDeployment: one(deployments, {
+  liveDeployment: one(deployments, {
     fields: [projects.liveDeploymentId],
     references: [deployments.id],
   }),
internal/db/src/schema/domains.ts (2)

31-34: Add an index on deploymentId to support new read path.

FindDomainsByDeploymentId now filters by deployment_id; add an index to avoid full scans.

   (table) => ({
     workspaceIdx: index("workspace_idx").on(table.workspaceId),
     projectIdx: index("project_idx").on(table.projectId),
+    deploymentIdx: index("deployment_idx").on(table.deploymentId),
   }),

25-28: Consider recording when a domain was rolled back.

A boolean loses auditability. Adding rolled_back_at (and optionally rolled_back_by) greatly simplifies investigations and idempotent workflows.

go/pkg/partition/db/schema.sql (1)

10-18: Index gateways.deployment_id for new lookup path.

FindGatewayByDeploymentId was added; without an index, upserts/lookups will degrade as rows grow.

   PRIMARY KEY (`id`),
-  UNIQUE KEY `gateways_pk` (`hostname`)
+  UNIQUE KEY `gateways_pk` (`hostname`),
+  KEY `idx_deployment_id` (`deployment_id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
go/apps/ctrl/services/deployment/deploy_workflow.go (5)

265-266: Avoid hardcoding SourceType; plumb it from the request or caller.
Defaulting to CLI uploads can mis-classify domains and stickiness later.

Would you like me to thread a SourceType through DeployRequest and callers?


275-288: Domain insertion is non‑idempotent; consider upsert or dedup to prevent conflicts.
Repeated runs (retries) can collide on unique(domain). Also confirm sticky.Valid is set appropriately by buildDomains.

Here’s an upsert variant if the schema uses UNIQUE(domain):

- if err := db.BulkQuery.InsertDomains(stepCtx, w.db.RW(), domainParams); err != nil {
+ if err := db.BulkQuery.UpsertDomains(stepCtx, w.db.RW(), domainParams); err != nil {

If no bulk upsert exists, I can add a query that ignores duplicates and returns existing IDs.

Also applies to: 290-296


316-321: Log the hostname string, not the struct.
Current log prints the struct; use domain.domain for clarity.

- w.logger.Error("failed to create gateway config for domain",
-   "domain", domain,
+ w.logger.Error("failed to create gateway config for domain",
+   "domain", domain.domain,

And similarly in the marshal error log below.


563-587: Guard against empty VM lists when creating gateway config.
Avoid producing a config with zero VMs.

 func createGatewayConfig(deploymentID, keyspaceID string, vms []*metaldv1.GetDeploymentResponse_Vm) (*partitionv1.GatewayConfig, error) {
+ if len(vms) == 0 {
+   return nil, fmt.Errorf("no VMs for deployment %s", deploymentID)
+ }

591-621: Expand local hostname detection (.localhost TLD and IPv6 loopback).
Covers common dev setups like api.localhost and ::1.

- if hostname == "localhost" || hostname == "127.0.0.1" {
+ if hostname == "localhost" || hostname == "127.0.0.1" || hostname == "::1" || hostname == "localhost." {
   return true
 }
 ...
- localSuffixes := []string{
-   ".local",
-   ".test",
- }
+ localSuffixes := []string{
+   ".local",
+   ".test",
+   ".localhost",
+ }
go/pkg/db/domain_find_by_deployment_id.sql_generated.go (1)

56-57: Prefer a non‑NULL parameter type for deployment_id.
Using sql.NullString can lead to WHERE deployment_id = NULL (no hits) if arg.Valid is false. Make the query accept a plain string.

Update the source SQL (queries/domain_find_by_deployment_id.sql):

- WHERE deployment_id = ?
+ WHERE deployment_id = sqlc.arg(deployment_id)

And set the param to TEXT/STRING in sqlc, so generated Go uses string.

go/pkg/partition/db/queries/gateway_find_by_deployment_id.sql (1)

1-6: Order by created_at, not id, to get the latest row reliably.
IDs may not be time‑ordered; created_at is safer. Consider a supporting index.

-ORDER BY id DESC
+ORDER BY created_at DESC, id DESC

If not present, add an index: CREATE INDEX ix_gateways_deployment_id_created_at ON gateways(deployment_id, created_at DESC);

go/pkg/partition/db/bulk_gateway_upsert.sql_generated.go (1)

33-39: Tiny perf: preallocate allArgs.

Avoid repeated reallocations.

- var allArgs []any
+ allArgs := make([]any, 0, len(args)*4)
go/apps/ctrl/services/deployment/domains.go (1)

60-66: Guard against DNS label length overflows.

projectSlug/branch/env + hyphens can push a label over 63 chars. Consider truncating each label before formatting.

Example helper (place anywhere in this file):

func truncateLabel(s string) string {
  if len(s) > 50 { // leave headroom for prefixes/suffixes
    return s[:50]
  }
  return s
}

Then apply truncateLabel to projectSlug, sluggify(branchName), and environmentSlug before fmt.Sprintf calls.

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx (4)

47-50: Consider excluding rolled-back domains from the live list

If the intent is “domains currently routing to live,” filter out isRolledBack=true.

   q
     .from({ domain: domainCollection })
     .where(({ domain }) => inArray(domain.sticky, ["environment", "live"]))
-    .where(({ domain }) => eq(domain.deploymentId, liveDeployment.id)),
+    .where(({ domain }) => eq(domain.deploymentId, liveDeployment.id))
+    .where(({ domain }) => eq(domain.isRolledBack, false)),

91-101: Disable rollback when target equals live (defensive UI guard)

Menu gating should already prevent self‑rollback; add a defensive disable in the dialog button to avoid accidental calls.

         <Button
           variant="primary"
           size="xlg"
           onClick={handleRollback}
-          disabled={rollback.isLoading}
+          disabled={rollback.isLoading || targetDeployment.id === liveDeployment.id}
           loading={rollback.isLoading}
           className="w-full rounded-lg"
         >

55-63: Refetch hack: replace with explicit invalidation if possible

Relying on collection.projects.utils.refetch with @ts-expect-error is brittle. Prefer explicit TRPC/collection invalidations for affected queries.

Would you like help wiring the specific invalidate calls for deployments/domains?


147-151: Badge styling: avoid conflicting variant + text color

Badge variant="success" likely sets its own text color; overriding with text-successA-11 may fight the theme.

-            <Badge
-              variant={isLive ? "success" : "primary"}
-              className={`px-1.5 capitalize ${isLive ? "text-successA-11" : "text-grayA-11"}`}
-            >
+            <Badge variant={isLive ? "success" : "primary"} className="px-1.5 capitalize">
go/apps/ctrl/services/deployment/rollback.go (8)

20-27: Validate input IDs early (empty project_id/target_deployment_id).

Return InvalidArgument if either is empty to avoid confusing downstream errors.

Apply:

 func (s *Service) Rollback(ctx context.Context, req *connect.Request[ctrlv1.RollbackRequest]) (*connect.Response[ctrlv1.RollbackResponse], error) {
   targetDeploymentID := req.Msg.GetTargetDeploymentId()
   projectID := req.Msg.GetProjectId()

+  if projectID == "" || targetDeploymentID == "" {
+    return nil, connect.NewError(connect.CodeInvalidArgument, fmt.Errorf("project_id and target_deployment_id must be provided"))
+  }

110-117: Map gateway “not found” to CodeNotFound instead of Internal.

Use sql.ErrNoRows to distinguish missing config from real failures.

Apply:

   gatewayConfig, err := pdb.Query.FindGatewayByDeploymentId(ctx, s.partitionDB.RO(), targetDeployment.ID)
   if err != nil {
-    s.logger.Error("failed to get gateway config",
-      "deployment_id", targetDeployment.ID,
-      "error", err.Error(),
-    )
-    return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("failed to get gateway config: %w", err))
+    if errors.Is(err, sql.ErrNoRows) {
+      return nil, connect.NewError(connect.CodeNotFound, fmt.Errorf("gateway config not found for deployment: %s", targetDeployment.ID))
+    }
+    s.logger.Error("failed to get gateway config",
+      "deployment_id", targetDeployment.ID,
+      "error", err.Error(),
+    )
+    return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("failed to get gateway config: %w", err))
   }

119-121: Pre-size change slices.

Minor allocation win when many sticky domains exist.

Apply:

-  domainChanges := []db.RollBackDomainParams{}
-  gatewayChanges := []pdb.UpsertGatewayParams{}
+  domainChanges := make([]db.RollBackDomainParams, 0, len(domains))
+  gatewayChanges := make([]pdb.UpsertGatewayParams, 0, len(domains))

153-160: Comment is misleading and error return should be Connect error.

We may rollback multiple domains; also use typed error on failure.

Apply:

- // Not sure why there isn't a bulk query generated, but this will do for now
- // cause we're only rolling back one domain anyways
+ // No bulk RollBackDomain generated; iterate per domain (multiple sticky domains possible).
   for _, change := range domainChanges {
     err = db.Query.RollBackDomain(ctx, s.db.RW(), change)
     if err != nil {
-      s.logger.Error("failed to update domain", "error", err.Error())
-      return nil, fmt.Errorf("failed to update domain: %w", err)
+      s.logger.Error("failed to update domain", "domain_id", change.ID, "error", err.Error())
+      return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("failed to update domain %s: %w", change.ID, err))
     }
   }

70-77: Optional: also assert WorkspaceID matches.

Extra guardrail if cross-workspace IDs ever collide.

Apply:

   err = assert.All(
 
     assert.Equal(liveDeployment.ProjectID, targetDeployment.ProjectID),
     assert.Equal(liveDeployment.EnvironmentID, targetDeployment.EnvironmentID),
+    assert.Equal(liveDeployment.WorkspaceID, targetDeployment.WorkspaceID),
   )

87-98: Nit: include VM counts in log for operability.

Helps on-calls debug capacity vs. state.

Apply:

   if runningVms == 0 {
-    s.logger.Error("no VMs found",
-      "deployment_id", targetDeployment.ID,
-    )
+    s.logger.Error("no VMs found",
+      "deployment_id", targetDeployment.ID,
+      "total_vms", len(vms),
+      "running_vms", runningVms,
+    )

144-146: Edge case: no-op when target == live.

If targetDeploymentID equals live, return early with InvalidArgument or OK no-op to avoid unnecessary writes.

Would you like a small guard inserted after loading deployments?


155-161: Consider wrapping main-DB writes in a transaction.

Domains + project update could be atomic within s.db to reduce partial-control-plane state if a mid-loop failure occurs. Cross-DB (partition) can’t be transacted; still worth doing for s.db.

I can draft a tx wrapper using s.db.Tx if available.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c87b938 and 0a0fc84.

⛔ Files ignored due to path filters (46)
  • go/gen/proto/ctrl/v1/ctrlv1connect/deployment.connect.go is excluded by !**/gen/**
  • go/gen/proto/ctrl/v1/ctrlv1connect/routing.connect.go is excluded by !**/gen/**
  • go/gen/proto/ctrl/v1/deployment.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • go/gen/proto/ctrl/v1/routing.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • internal/proto/generated/assetmanagerd/v1/asset_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/billaged/v1/billing_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/builderd/v1/builder_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/acme_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/build_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/deployment_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/openapi_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/ctrl/v1/service_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/deployment_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/metald_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/network_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/storage_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/metald/v1/vm_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/partition/v1/gateway_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/acme_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/acme_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/build_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/build_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/deployment_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/deployment_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/openapi_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/openapi_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/routing_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/routing_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/service_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/ctrl/v1/service_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/assetmanagerd/v1/asset_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/assetmanagerd/v1/asset_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/billaged/v1/billing_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/billaged/v1/billing_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/builderd/v1/builder_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/deploy/builderd/v1/builder_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/metal/vmprovisioner/v1/vmprovisioner_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/metal/vmprovisioner/v1/vmprovisioner_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/metal/vmprovisioner/v1/wip_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/partition/v1/gateway_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/vault/v1/object_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/vault/v1/service_connect.ts is excluded by !**/generated/**
  • internal/proto/generated/proto/vault/v1/service_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/vault/v1/object_pb.ts is excluded by !**/generated/**
  • internal/proto/generated/vault/v1/service_pb.ts is excluded by !**/generated/**
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (47)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx (4 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx (1 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/env-status-badge.tsx (1 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (4 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts (2 hunks)
  • apps/dashboard/lib/collections/deploy/deployments.ts (1 hunks)
  • apps/dashboard/lib/collections/deploy/domains.ts (1 hunks)
  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (2 hunks)
  • apps/dashboard/lib/trpc/routers/deploy/domains/list.ts (1 hunks)
  • deployment/docker-compose.yaml (1 hunks)
  • dummy (1 hunks)
  • go/apps/ctrl/run.go (0 hunks)
  • go/apps/ctrl/services/deployment/deploy_workflow.go (6 hunks)
  • go/apps/ctrl/services/deployment/domains.go (1 hunks)
  • go/apps/ctrl/services/deployment/rollback.go (1 hunks)
  • go/apps/ctrl/services/routing/service.go (0 hunks)
  • go/buf.gen.yaml (1 hunks)
  • go/pkg/db/bulk_domain_insert.sql_generated.go (3 hunks)
  • go/pkg/db/domain_find_by_deployment_id.sql_generated.go (4 hunks)
  • go/pkg/db/domain_find_by_domain.sql_generated.go (2 hunks)
  • go/pkg/db/domain_insert.sql_generated.go (3 hunks)
  • go/pkg/db/domain_rollback.sql_generated.go (1 hunks)
  • go/pkg/db/models_generated.go (2 hunks)
  • go/pkg/db/project_find_by_id.sql_generated.go (4 hunks)
  • go/pkg/db/querier_generated.go (5 hunks)
  • go/pkg/db/queries/domain_find_by_deployment_id.sql (1 hunks)
  • go/pkg/db/queries/domain_insert.sql (1 hunks)
  • go/pkg/db/queries/domain_rollback.sql (1 hunks)
  • go/pkg/db/queries/project_find_by_id.sql (1 hunks)
  • go/pkg/db/schema.sql (1 hunks)
  • go/pkg/hydra/step.go (3 hunks)
  • go/pkg/hydra/worker.go (1 hunks)
  • go/pkg/hydra/workflow.go (3 hunks)
  • go/pkg/partition/db/bulk_gateway_upsert.sql_generated.go (3 hunks)
  • go/pkg/partition/db/gateway_find_by_deployment_id.sql_generated.go (1 hunks)
  • go/pkg/partition/db/gateway_upsert.sql_generated.go (1 hunks)
  • go/pkg/partition/db/models_generated.go (1 hunks)
  • go/pkg/partition/db/querier_generated.go (2 hunks)
  • go/pkg/partition/db/queries/gateway_find_by_deployment_id.sql (1 hunks)
  • go/pkg/partition/db/queries/gateway_upsert.sql (1 hunks)
  • go/pkg/partition/db/schema.sql (1 hunks)
  • go/proto/ctrl/v1/deployment.proto (1 hunks)
  • go/proto/ctrl/v1/routing.proto (0 hunks)
  • internal/db/src/schema/domains.ts (2 hunks)
  • internal/db/src/schema/projects.ts (1 hunks)
  • internal/proto/package.json (1 hunks)
  • internal/proto/src/index.ts (1 hunks)
💤 Files with no reviewable changes (3)
  • go/apps/ctrl/run.go
  • go/apps/ctrl/services/routing/service.go
  • go/proto/ctrl/v1/routing.proto
🧰 Additional context used
🧠 Learnings (12)
📚 Learning: 2025-07-25T19:09:43.284Z
Learnt from: mcstepp
PR: unkeyed/unkey#3662
File: apps/dashboard/lib/trpc/routers/deployment/list.ts:11-11
Timestamp: 2025-07-25T19:09:43.284Z
Learning: In apps/dashboard/lib/trpc/routers/deployment/list.ts, the listDeployments procedure intentionally queries the versions table rather than a deployments table. The user mcstepp indicated that renaming the table would require a database migration, which was deferred for the current PR focused on UI features.

Applied to files:

  • apps/dashboard/lib/collections/deploy/deployments.ts
  • apps/dashboard/lib/trpc/routers/deploy/domains/list.ts
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx
  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
📚 Learning: 2025-09-01T01:57:42.227Z
Learnt from: imeyer
PR: unkeyed/unkey#3899
File: go/proto/metald/v1/metald.proto:5-9
Timestamp: 2025-09-01T01:57:42.227Z
Learning: In the unkeyed/unkey repository, buf is configured to properly resolve metald proto imports like "metald/v1/vm.proto" without needing the full "go/proto/" prefix. The buf lint command `buf lint --path proto/metald` passes successfully with these relative import paths.

Applied to files:

  • go/buf.gen.yaml
  • go/apps/ctrl/services/deployment/deploy_workflow.go
📚 Learning: 2025-09-01T02:33:43.791Z
Learnt from: imeyer
PR: unkeyed/unkey#3899
File: go/buf.gen.yaml:0-0
Timestamp: 2025-09-01T02:33:43.791Z
Learning: In the unkeyed/unkey repository, buf commands are executed from the `go/` directory where the `buf.yaml` file is located. This means the `out: gen` configuration in `go/buf.gen.yaml` generates files to `go/gen/` relative to the repository root, which aligns with the expected directory structure.

Applied to files:

  • go/buf.gen.yaml
📚 Learning: 2025-09-18T17:31:16.158Z
Learnt from: Flo4604
PR: unkeyed/unkey#3994
File: go/pkg/partition/db/queries/gateway_find_config_by_hostname.sql:2-4
Timestamp: 2025-09-18T17:31:16.158Z
Learning: Gateway hostname lookups in Unkey are infrastructure-level operations that discover workspace_id by hostname, not operations that should be scoped by workspace_id. The FindGatewayByHostname query retrieves the workspace_id as part of the result to determine which workspace a hostname belongs to.

Applied to files:

  • go/pkg/partition/db/querier_generated.go
📚 Learning: 2025-08-08T14:55:11.981Z
Learnt from: imeyer
PR: unkeyed/unkey#3755
File: deployment/docker-compose.yaml:179-184
Timestamp: 2025-08-08T14:55:11.981Z
Learning: In deployment/docker-compose.yaml (development only), MinIO is configured with API on 3902 and console on 3903; ports should map 3902:3902 and 3903:3903 to match MINIO_API_PORT_NUMBER and MINIO_CONSOLE_PORT_NUMBER.

Applied to files:

  • deployment/docker-compose.yaml
📚 Learning: 2025-09-12T17:57:18.337Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx:70-79
Timestamp: 2025-09-12T17:57:18.337Z
Learning: In the deployment rollback functionality, self-rollback scenarios are prevented at the UI level in the actions menu through the `canRollback` condition which includes `!("active" in deployment && deployment.active)`, making additional checks in the dialog component redundant.

Applied to files:

  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx
📚 Learning: 2025-09-12T17:57:18.337Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx:70-79
Timestamp: 2025-09-12T17:57:18.337Z
Learning: In the rollback functionality, the actions menu prevents self-rollback scenarios by including `!deployment.active` in the `canRollback` condition, ensuring only inactive deployments show the rollback option.

Applied to files:

  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx
📚 Learning: 2025-09-12T08:01:20.792Z
Learnt from: Flo4604
PR: unkeyed/unkey#3944
File: go/pkg/db/acme_challenge_update_verified_with_expiry.sql_generated.go:31-39
Timestamp: 2025-09-12T08:01:20.792Z
Learning: Do not review or suggest changes to files with sql_generated.go suffix or other files marked as auto-generated (containing "Code generated by" comments), as these are generated by tools like sqlc and changes would be overwritten on regeneration.

Applied to files:

  • go/pkg/db/domain_rollback.sql_generated.go
📚 Learning: 2025-09-12T18:11:33.481Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/lib/trpc/routers/deploy/rollback.ts:23-24
Timestamp: 2025-09-12T18:11:33.481Z
Learning: In apps/dashboard/lib/trpc/routers/deploy/rollback.ts, the CTRL_URL environment variable should fail fast with a clear error message if missing in non-development environments, rather than defaulting to localhost which can mask production configuration issues.

Applied to files:

  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
📚 Learning: 2025-07-25T19:11:00.208Z
Learnt from: mcstepp
PR: unkeyed/unkey#3662
File: apps/dashboard/lib/trpc/routers/deployment/getOpenApiDiff.ts:110-147
Timestamp: 2025-07-25T19:11:00.208Z
Learning: In apps/dashboard/lib/trpc/routers/deployment/getOpenApiDiff.ts, the user mcstepp prefers to keep mock data fallbacks in POC/demonstration code for simplicity, even if it wouldn't be production-ready. This aligns with the PR being work-in-progress for demonstration purposes.

Applied to files:

  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
📚 Learning: 2025-07-17T14:24:20.403Z
Learnt from: Flo4604
PR: unkeyed/unkey#3631
File: go/pkg/db/bulk_keyring_insert.sql.go:23-25
Timestamp: 2025-07-17T14:24:20.403Z
Learning: In go/pkg/db/bulk_keyring_insert.sql.go and similar bulk insert generated files, hardcoded zero values for fields like size_approx and size_last_updated_at are intentional and reflect the original SQL query structure, not missing parameters.

Applied to files:

  • go/pkg/db/bulk_domain_insert.sql_generated.go
📚 Learning: 2025-09-11T14:24:40.988Z
Learnt from: Flo4604
PR: unkeyed/unkey#3944
File: go/apps/ctrl/services/deployment/deploy_workflow.go:326-334
Timestamp: 2025-09-11T14:24:40.988Z
Learning: The InsertDomains method in the bulk queries uses ON DUPLICATE KEY UPDATE, making it an upsert operation that is idempotent and safe for retries, despite the "Insert" naming convention.

Applied to files:

  • go/pkg/db/bulk_domain_insert.sql_generated.go
🧬 Code graph analysis (17)
go/pkg/db/domain_find_by_deployment_id.sql_generated.go (1)
go/pkg/db/models_generated.go (2)
  • Domain (598-609)
  • NullDomainsSticky (259-262)
internal/db/src/schema/projects.ts (1)
internal/db/src/schema/deployments.ts (1)
  • deployments (9-52)
go/pkg/partition/db/querier_generated.go (2)
go/pkg/partition/db/database.go (1)
  • DBTX (10-10)
go/pkg/partition/db/gateway_find_by_deployment_id.sql_generated.go (1)
  • FindGatewayByDeploymentIdRow (20-23)
go/pkg/db/project_find_by_id.sql_generated.go (1)
go/pkg/db/types/null_string.go (1)
  • NullString (10-10)
go/pkg/hydra/workflow.go (1)
go/pkg/otel/logging/interface.go (1)
  • Logger (11-116)
go/pkg/partition/db/gateway_upsert.sql_generated.go (2)
go/pkg/partition/db/queries.go (1)
  • Queries (3-3)
go/pkg/partition/db/database.go (1)
  • DBTX (10-10)
go/apps/ctrl/services/deployment/rollback.go (9)
internal/proto/generated/ctrl/v1/deployment_pb.ts (2)
  • RollbackRequest (424-434)
  • RollbackResponse (447-454)
go/gen/proto/ctrl/v1/deployment.pb.go (6)
  • RollbackRequest (848-854)
  • RollbackRequest (867-867)
  • RollbackRequest (882-884)
  • RollbackResponse (900-906)
  • RollbackResponse (919-919)
  • RollbackResponse (934-936)
go/pkg/db/handle_err_no_rows.go (1)
  • IsNotFound (8-10)
go/pkg/assert/all.go (1)
  • All (19-26)
go/pkg/partition/db/models_generated.go (1)
  • VmsStatusRunning (63-63)
go/pkg/db/domain_rollback.sql_generated.go (1)
  • RollBackDomainParams (21-26)
go/pkg/partition/db/gateway_upsert.sql_generated.go (1)
  • UpsertGatewayParams (31-36)
go/pkg/db/models_generated.go (2)
  • DomainsSticky (239-239)
  • Domain (598-609)
go/pkg/db/project_update_live_deployment_id.sql_generated.go (1)
  • UpdateProjectLiveDeploymentIdParams (19-23)
go/apps/ctrl/services/deployment/deploy_workflow.go (4)
go/gen/proto/ctrl/v1/deployment.pb.go (1)
  • SourceType_SOURCE_TYPE_CLI_UPLOAD (92-92)
go/pkg/db/models_generated.go (1)
  • Domain (598-609)
go/gen/proto/metald/v1/deployment.pb.go (3)
  • GetDeploymentResponse_Vm (452-460)
  • GetDeploymentResponse_Vm (473-473)
  • GetDeploymentResponse_Vm (488-490)
go/gen/proto/partition/v1/gateway.pb.go (3)
  • GatewayConfig (26-37)
  • GatewayConfig (50-50)
  • GatewayConfig (65-67)
go/pkg/db/domain_insert.sql_generated.go (2)
go/pkg/db/types/null_string.go (1)
  • NullString (10-10)
go/pkg/db/models_generated.go (3)
  • Domain (598-609)
  • DomainsType (282-282)
  • NullDomainsSticky (259-262)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx (3)
apps/dashboard/lib/collections/index.ts (2)
  • Deployment (10-10)
  • collectionManager (69-69)
internal/db/src/schema/domains.ts (1)
  • domains (4-35)
apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (1)
  • rollback (14-115)
go/pkg/db/domain_rollback.sql_generated.go (1)
go/pkg/db/types/null_string.go (1)
  • NullString (10-10)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx (3)
apps/dashboard/lib/collections/deploy/deployments.ts (1)
  • Deployment (39-39)
apps/dashboard/components/logs/table-action.popover.tsx (1)
  • MenuItem (12-22)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx (1)
  • RollbackDialog (35-128)
go/pkg/db/models_generated.go (1)
go/pkg/db/types/null_string.go (1)
  • NullString (10-10)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (2)
apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts (1)
  • useDeployments (8-119)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/env-status-badge.tsx (1)
  • EnvStatusBadge (28-41)
go/apps/ctrl/services/deployment/domains.go (2)
go/pkg/db/models_generated.go (4)
  • NullDomainsSticky (259-262)
  • DomainsSticky (239-239)
  • DomainsStickyBranch (242-242)
  • DomainsStickyEnvironment (243-243)
go/gen/proto/ctrl/v1/deployment.pb.go (1)
  • SourceType_SOURCE_TYPE_CLI_UPLOAD (92-92)
apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (3)
apps/dashboard/lib/env.ts (1)
  • env (3-51)
internal/proto/generated/ctrl/v1/deployment_pb.ts (1)
  • DeploymentService (545-576)
apps/dashboard/lib/audit.ts (1)
  • insertAuditLogs (103-158)
go/pkg/db/querier_generated.go (2)
go/pkg/partition/db/database.go (1)
  • DBTX (10-10)
go/pkg/db/domain_rollback.sql_generated.go (1)
  • RollBackDomainParams (21-26)
🔇 Additional comments (45)
go/pkg/hydra/worker.go (1)

457-457: Per-execution logger initialization looks good.

The logger is correctly initialized with contextual fields (execution_id, namespace, workflow_name) that will help with debugging and observability for workflow executions.

go/pkg/hydra/workflow.go (3)

10-10: Import addition looks good.

The new logging package import is correctly added to support the logger field in workflowContext.


107-107: Logger field addition is well-integrated.

The new logger field in workflowContext is properly typed with the logging.Logger interface and fits well with the existing structure.


253-253: Formatting adjustment looks fine.

The added newline after tracing.RecordError(span, err) improves code readability without affecting functionality.

go/pkg/hydra/step.go (4)

139-150: Minor formatting improvements in SQL statement.

The whitespace adjustments in the SQL statement improve readability while maintaining the same functionality.


185-192: SQL formatting consistency improvement.

The formatting adjustments align the SQL statement structure with the pattern used elsewhere in the file.


239-239: Good addition of structured logging for step failures.

Adding wctx.logger.Error("step execution failed", "error", err.Error()) provides valuable structured logging that complements the existing tracing. This will improve debugging capabilities by making step failures more visible in logs while maintaining the existing error handling flow.


239-239: Logging implementation follows Go slog best practices.

The addition of structured logging with wctx.logger.Error("step execution failed", "error", err.Error()) aligns well with modern Go logging practices. The slog package was designed with performance in mind and optimized for common logging patterns where calls with five or fewer attributes account for over 95% of use cases.

Key benefits of this implementation:

  • Uses key-value pairs for structured logs that can be parsed, filtered, searched, and analyzed quickly and reliably
  • Leverages the per-execution logger created with Logger.With() method which includes contextual attributes like execution_id, namespace, and workflow_name
  • Provides performance-optimized logging with easy contextual information for better correlation and troubleshooting

The error message format is concise and follows the simple key-value pattern that the Go team optimized for in their performance analysis.

internal/db/src/schema/domains.ts (1)

14-24: Confirm nullable sticky semantics and desired default.

Sticky is nullable; is the intended default “environment” for Unkey-managed domains? If null has legacy/“no stickiness” meaning, document it here and in API types to avoid ambiguity.

go/pkg/partition/db/models_generated.go (1)

114-120: LGTM: Gateway now carries DeploymentID.

Struct/tag alignment with schema.sql looks correct.

go/pkg/db/queries/domain_find_by_deployment_id.sql (1)

1-15: LGTM; consider workspace scoping if deployment_id is not globally unique.

If deployment_id uniqueness is workspace-scoped, add workspace_id filter. Otherwise, ensure a unique constraint on deployment_id exists on the domains table. Also ensure an index on domains.deployment_id in the primary DB.

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/env-status-badge.tsx (1)

10-18: No action required — no occurrences of variant="current" detected

Searched repository for variant='current' in .ts/.tsx/.js/.jsx files; no matches found. Mapping "current" → "live" is optional for extra backward-compatibility.

go/pkg/db/project_find_by_id.sql_generated.go (1)

22-25: LGTM — live_deployment_id added; align remaining 'active' naming to 'live'

Backend/models/queries/update use live_deployment_id; UI copy and a schema alias still use "Active"/activeDeployment — switch those to "Live" to avoid dual concepts.

  • apps/dashboard/app/(app)/projects/[projectId]/page.tsx — title "Active Deployment" / ActiveDeploymentCard.
  • internal/db/src/schema/projects.ts — relation named activeDeployment (fields: [projects.liveDeploymentId]).
go/pkg/db/domain_find_by_domain.sql_generated.go (1)

1-36: LGTM! sqlc-generated code correctly implements the new domain schema.

The generated code properly extends the FindDomainByDomain query to include the new sticky and is_rolled_back columns. The SQL query, scan parameters, and struct field mappings are all correctly aligned with the schema changes.

go/pkg/db/queries/project_find_by_id.sql (1)

1-15: LGTM! SQL query correctly adds live_deployment_id field.

The SQL query has been properly extended to select the live_deployment_id column, aligning with the broader PR changes that rename "active" to "live" deployment concepts throughout the codebase.

go/proto/ctrl/v1/deployment.proto (1)

133-152: Well-designed rollback API with clear semantics.

The new Rollback RPC and associated message types are well-designed:

  • Clear parameter naming (project_id, target_deployment_id)
  • Meaningful response with affected domains
  • Descriptive comment explaining the rollback behavior
  • Follows protobuf conventions
go/pkg/db/queries/domain_insert.sql (1)

1-29: LGTM! Domain upsert logic correctly updated with new sticky semantics.

The SQL has been properly updated to:

  • Include sticky and explicit timestamp columns in the insert
  • Use explicit sqlc.arg() parameters for better clarity
  • Update the ON DUPLICATE KEY clause to handle sticky field updates
  • Maintain proper upsert semantics for domain management

The explicit parameter naming improves readability and reduces the risk of parameter ordering mistakes.

apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts (2)

115-119: Return object correctly updated for live deployment semantics.

The return object properly reflects the rename from activeDeployment to liveDeployment and removes the now-redundant activeDeploymentId field, since the ID can be accessed directly from the liveDeployment object.


12-28: No breakage detected; no activeDeployment references remain. All consumers have been updated to use liveDeployment, and no usages of the old activeDeployment property were found.

go/pkg/partition/db/querier_generated.go (2)

20-27: LGTM! New gateway lookup by deployment ID properly added.

The FindGatewayByDeploymentId method addition is well-integrated:

  • Follows consistent naming conventions
  • Proper parameter typing with deploymentID string
  • Correct return type FindGatewayByDeploymentIdRow
  • Properly documented with SQL query comment

54-72: Gateway upsert documentation correctly reflects deployment_id addition.

The documentation block accurately reflects the updated UpsertGateway operation that now includes deployment_id in both the insert and update clauses. The documentation helps maintainers understand the SQL structure without examining the implementation files.

go/pkg/partition/db/gateway_find_by_deployment_id.sql_generated.go (1)

1-38: LGTM! New gateway lookup implementation is correct and follows patterns.

The generated code properly implements the gateway-by-deployment lookup:

  • Follows sqlc best practices for type-safe generated code
  • Uses appropriate ordering (ORDER BY id DESC) to get the most recent gateway
  • Proper field scanning and error handling
  • Consistent with other generated query methods in the codebase
internal/proto/src/index.ts (1)

1-4: Keep deployment_connect commented — DeploymentService already exported.
internal/proto/generated/ctrl/v1/deployment_pb.ts exports DeploymentService and consumers (e.g. apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts) import/use that; no references to deployment_connect were found, so do not uncomment.

go/apps/ctrl/services/deployment/deploy_workflow.go (2)

11-11: Import alias for ctrl/v1 looks good.
Enables explicit SourceType usage without collisions.


333-338: Gateway upsert params align with new schema.
Passing WorkspaceID, DeploymentID, Hostname, Config matches the updated query; good.

Please confirm the UNIQUE constraint (hostname) vs (workspace_id, hostname). If the latter, updating workspace_id on conflict in the generated SQL is safe only if the conflict key includes workspace_id.

go/pkg/db/domain_find_by_deployment_id.sql_generated.go (1)

20-21: Adding sticky and is_rolled_back to the projection/row is correct.
Matches models and enables downstream semantics.

Also applies to: 30-38, 49-50, 71-72

go/pkg/partition/db/gateway_upsert.sql_generated.go (1)

12-29: Upsert now binds deployment_id; arg order and placeholder count are consistent.
Semantics (don’t mutate hostname on conflict) look right for stickiness/rollback.

Confirm the gateways unique key: if it’s (hostname) only, updating workspace_id on conflict is fine; if it’s (workspace_id, hostname), the conflict path should not change workspace_id unless intended.

Also applies to: 31-36, 56-66

go/pkg/partition/db/queries/gateway_upsert.sql (1)

2-17: SQLC args + explicit update fields look good.
Matches generated code; hostname remains stable on conflict.

go/pkg/partition/db/bulk_gateway_upsert.sql_generated.go (1)

13-15: Confirm intent: updating workspace_id/deployment_id on conflict.

Changing identity/ownership columns on duplicates can be risky depending on your UNIQUE index and FKs. Verify this is desired for hostname collisions.

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (2)

61-63: LGTM: Live badge condition is correct.

The badge toggles purely on liveDeployment.id match; good decoupling from env. Confirm EnvStatusBadge supports variant="live".

Also applies to: 87-90


289-297: Props rename wired correctly to actions component.

selectedDeployment/liveDeployment usage matches the new API; memo deps updated appropriately.

go/pkg/db/domain_insert.sql_generated.go (1)

13-41: SQL/arg order looks consistent; sticky propagation is correct.

INSERT/UPDATE placeholders align with Exec args; NullDomainsSticky covers NULL vs set. Good.

Please confirm that:

  • It’s intentional to update workspace_id/project_id/deployment_id on conflict.
  • updated_at should be the provided value (not NOW()) on both insert and update.

Also applies to: 85-101

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/actions/deployment-list-table-action.popover.constants.tsx (1)

30-37: Rollback gating reads well and avoids self-rollback.

Dialog opens only with both live and selected deployments; canRollback requires production, ready, and id mismatch. Solid.

Is restricting rollback to environment.slug === "production" a product decision? If preview rollbacks should be allowed later, consider making the env check configurable.

Also applies to: 49-54

go/apps/ctrl/services/deployment/domains.go (2)

41-47: LGTM: per-source randomness prevents SHA collisions for CLI uploads.

Suffix range 1000–9999 keeps domains unique without bloating.


27-33: Require Go ≥ 1.22 (math/rand/v2 import)

go/apps/ctrl/services/deployment/domains.go imports "math/rand/v2" (added in Go 1.22). Ensure the module containing this package (e.g., go/go.mod) and your CI/toolchain use Go 1.22 or newer — update the module's go directive to 1.22 or upgrade the build toolchain.

go/pkg/db/domain_rollback.sql_generated.go (1)

1-44: Auto‑generated file – no manual changes

Looks consistent with domain_rollback.sql. Passing TargetDeploymentID -> deployment_id ordering is correct.

go/pkg/db/models_generated.go (2)

239-281: Sticky enum + nullable wrapper look correct

DomainsSticky and NullDomainsSticky implement Scan/Value properly for driver round‑trips.


598-609: Domain model extension aligns with schema

Added Sticky and IsRolledBack fields match schema and downstream usage.

apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (1)

33-37: Transport choice: confirm server runtime

createConnectTransport from "@connectrpc/connect-web" relies on fetch; ensure this code runs server-side with a supported fetch (Node 18+). If not, consider "@connectrpc/connect-node".

go/apps/ctrl/services/deployment/rollback.go (1)

176-181: Confirm response semantics: IDs vs hostnames.

Proto says “the rolled back domains”; should these be domain names (hostnames) rather than IDs? If UI expects hostnames, switch to collect domain.Domain.

I can patch to return hostnames if that’s the contract.

go/pkg/db/querier_generated.go (5)

176-178: Expose sticky/is_rolled_back on FindDomainByDomain.

Looks correct and matches the new schema; call sites gain visibility without behavior change.


186-194: FindDomainsByDeploymentId now returns sticky/is_rolled_back.

Good alignment with rollback logic that filters by Sticky. Ensure downstream structs reference the Row type consistently.

If any call sites still assume the older shape, compilation will fail; run your CI codegen check to confirm.


599-605: FindProjectById includes live_deployment_id.

Matches Rollback’s precondition check. No issues spotted.


902-929: InsertDomain adds sticky and upserts it.

Upsert of sticky is intentional; confirm this won’t clobber stickiness unexpectedly during unrelated domain updates.

I can help audit callers of InsertDomain to ensure sticky is set deliberately.


1526-1533: New RollBackDomain update method looks good.

Fields match the intended per-domain rollback (deployment_id, is_rolled_back, updated_at).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
go/apps/ctrl/services/deployment/deploy_workflow.go (1)

182-186: Compile error: invalid range over int

for i := range 300 is invalid Go. Use a counted loop.

Apply:

-		for i := range 300 {
+		for i := 0; i < 300; i++ {
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (1)

300-311: Row item vs. nested deployment misuse in keyExtractor/rowClassName

VirtualTable rows are { deployment, environment? }. You’re passing the row to functions expecting Deployment.

Apply:

-      keyExtractor={(deployment) => deployment.id}
-      rowClassName={(deployment) => getRowClassName(deployment, selectedDeployment?.deployment.id)}
+      keyExtractor={(row) => row.deployment.id}
+      rowClassName={(row) => getRowClassName(row.deployment, selectedDeployment?.deployment.id)}
go/pkg/db/schema.sql (1)

418-421: Add missing indexes for rollback hot paths (domains.deployment_id, domains.is_rolled_back, deployments.is_rolled_back)

go/pkg/db/schema.sql is missing these CREATE INDEX entries; internal/db/src/schema/routes.ts defines deployment_idx for routes (different table). Add:

 CREATE INDEX `workspace_idx` ON `domains` (`workspace_id`);
 CREATE INDEX `project_idx` ON `domains` (`project_id`);
+CREATE INDEX `deployment_idx` ON `domains` (`deployment_id`);
+CREATE INDEX `rolled_back_idx` ON `domains` (`is_rolled_back`);
+CREATE INDEX `deployments_is_rolled_back_idx` ON `deployments` (`is_rolled_back`);
 CREATE INDEX `workspace_idx` ON `acme_challenges` (`workspace_id`);
♻️ Duplicate comments (4)
internal/db/src/schema/domains.ts (1)

39-42: Index deploymentId and isRolledBack; global domain uniqueness OK?

  • Add indexes: deploymentId (used by FindDomainsByDeploymentId), isRolledBack (UI/server filters).
  • Verify that UNIQUE(domain) being global (not per-workspace) is intended.
   (table) => ({
     workspaceIdx: index("workspace_idx").on(table.workspaceId),
     projectIdx: index("project_idx").on(table.projectId),
+    deploymentIdx: index("deployment_idx").on(table.deploymentId),
+    rolledBackIdx: index("rolled_back_idx").on(table.isRolledBack),
     uniqueDomainIdx: uniqueIndex("unique_domain_idx").on(table.domain),
   }),
go/apps/ctrl/services/deployment/rollback.go (1)

168-179: Also wrap final project update in a Connect error.

 if err != nil {
   s.logger.Error("failed to update project active deployment ID",
     "project_id", project.ID,
     "error", err.Error(),
   )
-  return nil, err
+  return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("failed to update project's live deployment id: %w", err))
 }
apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (2)

25-28: Use TRPCError for missing CTRL_URL (per prior feedback and learnings)

Switch to TRPCError so error mapping works; fail fast. Based on earlier review and the retrieved learning from mcstepp (2025-09-12).

Apply:

-    if (!ctrlUrl) {
-      throw new Error("ctrl service is not configured");
-    }
+    if (!ctrlUrl) {
+      throw new TRPCError({
+        code: "FAILED_PRECONDITION",
+        message: "ctrl service is not configured",
+      });
+    }

79-90: Bug: TRPCError.message must be a string

Coerce error to string and preserve the cause.

-        .catch((err) => {
-          console.error(err);
-          throw new TRPCError({
-            code: "INTERNAL_SERVER_ERROR",
-            message: err,
-          });
-        });
+        .catch((err) => {
+          console.error(err);
+          throw new TRPCError({
+            code: "INTERNAL_SERVER_ERROR",
+            message: err instanceof Error ? err.message : String(err),
+            cause: err,
+          });
+        });
🧹 Nitpick comments (16)
go/k8s/manifests/ctrl.yaml (1)

55-56: Duplicate UNKEY_METALD_ADDRESS; remove the placeholder

Two entries exist; the later one wins. Drop the placeholder to avoid confusion.

Apply:

-            - name: UNKEY_METALD_ADDRESS
-              value: "http://metald-placeholder:8080"
...
             - name: UNKEY_METALD_ADDRESS
               value: "http://metald:8080"

Also applies to: 72-73

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/utils/get-row-class.ts (1)

38-47: Nit: dedupe shared badge styles

badge.default/selected are identical across variants; centralize to reduce drift.

go/apps/ctrl/services/deployment/deploy_workflow.go (1)

386-391: Minor: step message counts allDomains, not changedDomains

Use changedDomains to reflect actual gateway upserts in the message.

Apply:

-			Message:      fmt.Sprintf("Created %d gateway configs for %d domains (skipped %d local domains)", len(gatewayParams), len(allDomains), len(skippedDomains)),
+			Message:      fmt.Sprintf("Created %d gateway configs for %d domains (skipped %d local domains)", len(gatewayParams), len(changedDomains), len(skippedDomains)),
internal/db/src/schema/deployments.ts (1)

28-29: Approve — migration exists; optional index recommended

is_rolled_back is present in go/pkg/db/schema.sql (lines 330 & 375) and generated models/queries were updated; default false/not null is fine. If you expect to filter by this column, add an index on (project_id, environment_id, is_rolled_back).

go/pkg/db/queries/domain_reassign.sql (1)

1-8: Consider narrowing the UPDATE and dropping redundant column write.

  • workspace_id should already match; updating it adds churn. If not needed, remove to reduce write amplification.
  • Optionally guard with optimistic concurrency (updated_at precondition) to avoid last-writer-wins races.
 UPDATE domains
 SET
-  workspace_id = sqlc.arg(target_workspace_id),
   deployment_id = sqlc.arg(target_deployment_id),
   is_rolled_back = sqlc.arg(is_rolled_back),
   updated_at = sqlc.arg(updated_at)
 WHERE id = sqlc.arg(id);

If you keep workspace_id here intentionally (e.g., cross-partition repair), please confirm.

go/proto/ctrl/v1/deployment.proto (1)

131-150: Proto surface looks good; clarify response semantics.

Document that RollbackResponse.domains returns hostnames (not IDs) to guide clients and server impl.

Suggested comment tweak:

 message RollbackResponse {
-  // the rolled back domains
+  // the rolled back domain hostnames (not IDs)
   repeated string domains = 1;
 }
go/apps/ctrl/services/deployment/rollback.go (6)

104-112: Filter sticky at the DB or add supporting indexes to avoid scanning.

Current code fetches all domains for the deployment and filters in Go. Either:

  • add WHERE sticky IN ('live','environment') in the query used by FindDomainsByDeploymentId, or
  • ensure indexes on domains(deployment_id, sticky, is_rolled_back) exist (see schema comments).

This reduces rows materialized and improves rollback latency.

Also applies to: 126-147


152-156: Normalize to Connect errors for consistency.

Return typed errors like other branches.

 err = pdb.BulkQuery.UpsertGateway(ctx, s.partitionDB.RW(), gatewayChanges)
 if err != nil {
   s.logger.Error("failed to upsert gateway", "error", err.Error())
-  return nil, fmt.Errorf("failed to upsert gateway: %w", err)
+  return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("failed to upsert gateway: %w", err))
 }

158-166: Ditto: return Connect errors on domain updates.

   err = db.Query.ReassignDomain(ctx, s.db.RW(), change)
   if err != nil {
     s.logger.Error("failed to update domain", "error", err.Error())
-    return nil, fmt.Errorf("failed to update domain: %w", err)
+    return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("failed to update domain: %w", err))
   }

181-205: Consistently wrap deployment rollback status updates in Connect errors.

 if err != nil {
   s.logger.Error("failed to update deployment rollback status",
     "deployment_id", sourceDeployment.ID,
     "error", err.Error(),
   )
-  return nil, err
+  return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("failed to update deployment rollback status: %w", err))
 }
 ...
 if err != nil {
   s.logger.Error("failed to update deployment rollback status",
     "deployment_id", targetDeployment.ID,
     "error", err.Error(),
   )
-  return nil, err
+  return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("failed to update deployment rollback status: %w", err))
 }

158-160: Comment is misleading; multiple domains can be rolled back.

Update comment to reflect reality or add a TODO for a bulk domain update path.


152-205: Reliability: two-DB write order can leave routing/DB out of sync on partial failure.

Consider:

  • wrap all main-DB writes (domains, project live, deployment flags) in a single transaction;
  • perform partition DB gateway upserts last, and/or add compensating logic on failure.

At minimum, gate the response on both steps succeeding and emit an audit log with remediation instructions.

Would you like a concrete Tx-wrapped sketch for the main DB section?

apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (3)

30-37: Optional: hoist ctrl client creation

Avoid per-request client allocation; create once at module scope and reuse.

Example:

// top-level
const ctrlUrl = env().CTRL_URL;
export const ctrl = ctrlUrl
  ? createClient(DeploymentService, createConnectTransport({ baseUrl: ctrlUrl }))
  : undefined;

// in handler: if (!ctrl) throw TRPCError...

66-77: Preconditions good; add guard when target is already live

Prevent no-op rollbacks when target equals live.

       if (!targetDeployment.project.liveDeploymentId) {
         throw new TRPCError({
           code: "PRECONDITION_FAILED",
           message: `Project ${targetDeployment.project.name} doesn't have a live deployment to roll back.`,
         });
       }
+      if (targetDeployment.project.liveDeploymentId === targetDeployment.id) {
+        throw new TRPCError({
+          code: "PRECONDITION_FAILED",
+          message: `Target deployment ${targetDeployment.id} is already live.`,
+        });
+      }

116-121: Include cause on outer TRPCError

Keeps message concise while retaining detail.

-      throw new TRPCError({
+      throw new TRPCError({
         code: "INTERNAL_SERVER_ERROR",
-        message: "Failed to communicate with control service",
+        message: "Failed to communicate with control service",
+        cause: error,
       });
go/pkg/db/querier_generated.go (1)

1526-1535: ReassignDomain: ensure transactional rollback

This UPDATE will typically be paired with UpdateDeploymentRollback and gateway updates. Execute in a single DBTX transaction in the service to maintain atomicity.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a0fc84 and 824efd2.

⛔ Files ignored due to path filters (2)
  • go/gen/proto/ctrl/v1/deployment.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • internal/proto/generated/ctrl/v1/deployment_pb.ts is excluded by !**/generated/**
📒 Files selected for processing (21)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/env-status-badge.tsx (1 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (5 hunks)
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/utils/get-row-class.ts (1 hunks)
  • apps/dashboard/lib/collections/deploy/deployments.ts (2 hunks)
  • apps/dashboard/lib/trpc/routers/deploy/deployment/list.ts (1 hunks)
  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (2 hunks)
  • go/apps/ctrl/services/deployment/deploy_workflow.go (5 hunks)
  • go/apps/ctrl/services/deployment/rollback.go (1 hunks)
  • go/k8s/manifests/ctrl.yaml (1 hunks)
  • go/pkg/assert/not_equal.go (1 hunks)
  • go/pkg/assert/not_equal_test.go (1 hunks)
  • go/pkg/db/deployment_update_rollback.sql_generated.go (1 hunks)
  • go/pkg/db/domain_reassign.sql_generated.go (1 hunks)
  • go/pkg/db/models_generated.go (3 hunks)
  • go/pkg/db/querier_generated.go (6 hunks)
  • go/pkg/db/queries/deployment_update_rollback.sql (1 hunks)
  • go/pkg/db/queries/domain_reassign.sql (1 hunks)
  • go/pkg/db/schema.sql (2 hunks)
  • go/proto/ctrl/v1/deployment.proto (2 hunks)
  • internal/db/src/schema/deployments.ts (2 hunks)
  • internal/db/src/schema/domains.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/dashboard/lib/collections/deploy/deployments.ts
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: mcstepp
PR: unkeyed/unkey#3662
File: apps/dashboard/lib/trpc/routers/deployment/list.ts:11-11
Timestamp: 2025-07-25T19:09:43.284Z
Learning: In apps/dashboard/lib/trpc/routers/deployment/list.ts, the listDeployments procedure intentionally queries the versions table rather than a deployments table. The user mcstepp indicated that renaming the table would require a database migration, which was deferred for the current PR focused on UI features.
📚 Learning: 2025-09-12T18:11:33.481Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/lib/trpc/routers/deploy/rollback.ts:23-24
Timestamp: 2025-09-12T18:11:33.481Z
Learning: In apps/dashboard/lib/trpc/routers/deploy/rollback.ts, the CTRL_URL environment variable should fail fast with a clear error message if missing in non-development environments, rather than defaulting to localhost which can mask production configuration issues.

Applied to files:

  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
📚 Learning: 2025-07-25T19:09:43.284Z
Learnt from: mcstepp
PR: unkeyed/unkey#3662
File: apps/dashboard/lib/trpc/routers/deployment/list.ts:11-11
Timestamp: 2025-07-25T19:09:43.284Z
Learning: In apps/dashboard/lib/trpc/routers/deployment/list.ts, the listDeployments procedure intentionally queries the versions table rather than a deployments table. The user mcstepp indicated that renaming the table would require a database migration, which was deferred for the current PR focused on UI features.

Applied to files:

  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
  • apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx
  • apps/dashboard/lib/trpc/routers/deploy/deployment/list.ts
📚 Learning: 2025-07-25T19:11:00.208Z
Learnt from: mcstepp
PR: unkeyed/unkey#3662
File: apps/dashboard/lib/trpc/routers/deployment/getOpenApiDiff.ts:110-147
Timestamp: 2025-07-25T19:11:00.208Z
Learning: In apps/dashboard/lib/trpc/routers/deployment/getOpenApiDiff.ts, the user mcstepp prefers to keep mock data fallbacks in POC/demonstration code for simplicity, even if it wouldn't be production-ready. This aligns with the PR being work-in-progress for demonstration purposes.

Applied to files:

  • apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts
  • apps/dashboard/lib/trpc/routers/deploy/deployment/list.ts
📚 Learning: 2025-09-12T17:57:18.337Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx:70-79
Timestamp: 2025-09-12T17:57:18.337Z
Learning: In the deployment rollback functionality, self-rollback scenarios are prevented at the UI level in the actions menu through the `canRollback` condition which includes `!("active" in deployment && deployment.active)`, making additional checks in the dialog component redundant.

Applied to files:

  • go/apps/ctrl/services/deployment/rollback.go
📚 Learning: 2025-09-12T17:57:18.337Z
Learnt from: mcstepp
PR: unkeyed/unkey#3952
File: apps/dashboard/app/(app)/projects/[projectId]/deployments/components/rollback-dialog.tsx:70-79
Timestamp: 2025-09-12T17:57:18.337Z
Learning: In the rollback functionality, the actions menu prevents self-rollback scenarios by including `!deployment.active` in the `canRollback` condition, ensuring only inactive deployments show the rollback option.

Applied to files:

  • go/apps/ctrl/services/deployment/rollback.go
📚 Learning: 2025-09-01T01:57:42.227Z
Learnt from: imeyer
PR: unkeyed/unkey#3899
File: go/proto/metald/v1/metald.proto:5-9
Timestamp: 2025-09-01T01:57:42.227Z
Learning: In the unkeyed/unkey repository, buf is configured to properly resolve metald proto imports like "metald/v1/vm.proto" without needing the full "go/proto/" prefix. The buf lint command `buf lint --path proto/metald` passes successfully with these relative import paths.

Applied to files:

  • go/apps/ctrl/services/deployment/deploy_workflow.go
🧬 Code graph analysis (9)
go/pkg/assert/not_equal.go (1)
go/pkg/codes/unkey_application.go (1)
  • App (53-71)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/utils/get-row-class.ts (2)
apps/dashboard/lib/collections/deploy/deployments.ts (1)
  • Deployment (40-40)
go/pkg/db/models_generated.go (1)
  • Deployment (569-588)
go/pkg/assert/not_equal_test.go (1)
go/pkg/assert/not_equal.go (1)
  • NotEqual (17-26)
apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (3)
apps/dashboard/lib/env.ts (1)
  • env (3-51)
internal/proto/generated/ctrl/v1/deployment_pb.ts (1)
  • DeploymentService (545-576)
apps/dashboard/lib/audit.ts (1)
  • insertAuditLogs (103-158)
go/apps/ctrl/services/deployment/rollback.go (11)
internal/proto/generated/ctrl/v1/deployment_pb.ts (2)
  • RollbackRequest (424-434)
  • RollbackResponse (447-454)
go/gen/proto/ctrl/v1/deployment.pb.go (6)
  • RollbackRequest (848-854)
  • RollbackRequest (867-867)
  • RollbackRequest (882-884)
  • RollbackResponse (900-906)
  • RollbackResponse (919-919)
  • RollbackResponse (934-936)
go/pkg/assert/not_equal.go (1)
  • NotEqual (17-26)
go/pkg/db/handle_err_no_rows.go (1)
  • IsNotFound (8-10)
go/pkg/assert/all.go (1)
  • All (19-26)
go/pkg/partition/db/models_generated.go (1)
  • VmsStatusRunning (63-63)
go/pkg/db/domain_reassign.sql_generated.go (1)
  • ReassignDomainParams (23-29)
go/pkg/partition/db/gateway_upsert.sql_generated.go (1)
  • UpsertGatewayParams (31-36)
go/pkg/db/models_generated.go (4)
  • DomainsSticky (239-239)
  • DomainsStickyLive (244-244)
  • DomainsStickyEnvironment (243-243)
  • Domain (599-610)
go/pkg/db/project_update_live_deployment_id.sql_generated.go (1)
  • UpdateProjectLiveDeploymentIdParams (19-23)
go/pkg/db/deployment_update_rollback.sql_generated.go (1)
  • UpdateDeploymentRollbackParams (19-23)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (3)
apps/dashboard/app/(app)/projects/[projectId]/deployments/hooks/use-deployments.ts (1)
  • useDeployments (8-119)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/env-status-badge.tsx (1)
  • EnvStatusBadge (29-42)
apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/deployment-status-badge.tsx (1)
  • DeploymentStatusBadge (77-123)
go/pkg/db/domain_reassign.sql_generated.go (1)
go/pkg/db/types/null_string.go (1)
  • NullString (10-10)
go/pkg/db/querier_generated.go (3)
go/pkg/partition/db/database.go (1)
  • DBTX (10-10)
go/pkg/db/domain_reassign.sql_generated.go (1)
  • ReassignDomainParams (23-29)
go/pkg/db/deployment_update_rollback.sql_generated.go (1)
  • UpdateDeploymentRollbackParams (19-23)
go/apps/ctrl/services/deployment/deploy_workflow.go (8)
go/gen/proto/ctrl/v1/deployment.pb.go (1)
  • SourceType_SOURCE_TYPE_CLI_UPLOAD (92-92)
go/pkg/db/tx.go (1)
  • Tx (196-201)
go/pkg/db/handle_err_no_rows.go (1)
  • IsNotFound (8-10)
go/pkg/db/domain_insert.sql_generated.go (1)
  • InsertDomainParams (43-53)
go/pkg/db/models_generated.go (1)
  • Domain (599-610)
go/pkg/db/domain_reassign.sql_generated.go (1)
  • ReassignDomainParams (23-29)
go/gen/proto/metald/v1/deployment.pb.go (3)
  • GetDeploymentResponse_Vm (452-460)
  • GetDeploymentResponse_Vm (473-473)
  • GetDeploymentResponse_Vm (488-490)
go/gen/proto/partition/v1/gateway.pb.go (3)
  • GatewayConfig (26-37)
  • GatewayConfig (50-50)
  • GatewayConfig (65-67)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test Packages / Test
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Build / Build
  • GitHub Check: Vade Review
🔇 Additional comments (30)
go/pkg/assert/not_equal.go (3)

1-6: LGTM! Clean package structure and imports.

The package declaration and imports are well-organized, importing only the necessary dependencies for error handling.


8-16: Excellent documentation with clear usage example.

The function documentation is comprehensive and includes a practical example showing how to use the assertion in context with proper error wrapping.


17-26: Well-implemented generic assertion function.

The implementation is clean and follows Go best practices:

  • Uses generics appropriately with the comparable constraint
  • Handles optional custom messages elegantly with variadic parameters
  • Returns proper structured errors using the fault package
  • Uses the correct error code from the codes package

The logic is straightforward and correct: returns an error when values are equal (assertion fails), nil when they're not equal (assertion passes).

go/pkg/assert/not_equal_test.go (3)

1-8: LGTM! Proper test package structure.

The test package follows Go conventions with the _test suffix and imports the necessary testing dependencies.


10-49: Comprehensive test coverage across multiple types.

The test suite thoroughly covers the NotEqual function behavior:

  • Tests different comparable types (strings, integers, booleans)
  • Verifies both success and failure cases
  • Tests custom error message functionality
  • Uses proper assertions with require package

The test organization with subtests makes it easy to understand and maintain.


51-71: Excellent fuzz testing implementation.

The fuzz test adds valuable coverage:

  • Good seed values including edge cases (empty strings, identical values)
  • Proper fuzz function signature
  • Correct logic verification: asserts error when values are equal, no error when different
  • Clear error messages for debugging failures

This complements the unit tests well by testing with random inputs.

apps/dashboard/lib/trpc/routers/deploy/deployment/list.ts (1)

30-30: Expose isRolledBack: LGTM

Field surfaces rollback state to the UI. No issues spotted.

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/utils/get-row-class.ts (2)

38-47: Rolled‑back row styles: LGTM

Variants and tokens look consistent with FAILED/DEFAULT sets.


51-55: Precedence correct (failed > rolledBack > default)

Good call prioritizing failed over rolled-back.

go/apps/ctrl/services/deployment/deploy_workflow.go (1)

265-266: Hardcoded SourceType

CLI_UPLOAD is hardcoded. If multiple sources are expected, plumb this via request/config.

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/deployments-list.tsx (1)

61-96: Badge logic (Live/Rolled Back): LGTM

Clear precedence and readable UI.

go/pkg/db/deployment_update_rollback.sql_generated.go (1)

13-33: Generated UpdateDeploymentRollback: LGTM — wire-up verified
Signature and params match SQL; present in Querier (go/pkg/db/querier_generated.go:1626) and called from go/apps/ctrl/services/deployment/rollback.go (lines 181, 194).

apps/dashboard/app/(app)/projects/[projectId]/deployments/components/table/components/env-status-badge.tsx (1)

13-19: Variant rename to "live" + new "rolledBack": audit call sites

Renaming "current" → "live" can break callers — search and update any usage of variant="current" (JSX props and object keys). Automated sandbox search failed (rg parsing/ignore errors); run locally:

rg -n -uu --hidden -S -g '!node_modules/**' -e 'variant.*current' -e 'current:' || true

go/k8s/manifests/ctrl.yaml (1)

19-19: ServiceAccount present — no action required

go/k8s/manifests/rbac.yaml defines ServiceAccount "unkey-serviceaccount" (namespace: unkey) and a ClusterRoleBinding that references it; ctrl.yaml's serviceAccountName is valid.

go/pkg/db/queries/deployment_update_rollback.sql (1)

1-4: LGTM: query shape and placeholder order match generated code

No issues found.

go/pkg/db/domain_reassign.sql_generated.go (1)

23-29: Generated code matches SQL; no action.

go/apps/ctrl/services/deployment/rollback.go (2)

26-32: Good: explicit self‑rollback guard

Prevents target==source at API boundary.


181-198: Verify semantics: is_rolled_back on deployments (source=false, target=true).

Confirm intended meaning is “target is a rollback deployment.” If “rolled back from” is needed later, consider a from_deployment_id for traceability.

go/pkg/db/models_generated.go (3)

239-281: Enums and NULL wrapper for DomainsSticky look correct

Scan/Value mirror existing patterns; constants align with intended values.


574-574: Deployment: add is_rolled_back field—LGTM

Matches schema; no issues spotted.


600-610: Domain: new sticky + is_rolled_back fields

Looks good. If you’ll filter by these often, ensure the migration adds appropriate indexes (e.g., (deployment_id), (workspace_id, project_id, is_rolled_back)).

apps/dashboard/lib/trpc/routers/deploy/deployment/rollback.ts (5)

2-7: Connect RPC client imports—LGTM

Imports align with generated service.


20-21: Input shape change: callers must be updated

Ensure all UI/hooks now pass only targetDeploymentId.


41-57: DB shape query—LGTM

Selecting id/status and project.id/name/liveDeploymentId is sufficient for preconditions.


92-107: Audit log composition—LGTM

Good use of resources and ctx.audit context.


109-110: Return shape matches new RPC

Returning domains from RollbackResponse—LGTM.

go/pkg/db/querier_generated.go (4)

176-177: Domain reads now include sticky and is_rolled_back

Interface changes look correct. Ensure all scan targets/call sites are updated to read the extended rows.

Also applies to: 186-193


590-605: FindProjectById adds live_deployment_id—LGTM

Matches UI/backend needs for rollback source resolution.


910-929: InsertDomain now persists sticky

ON DUPLICATE KEY also updates sticky—good. Verify callers supply NullDomainsSticky correctly (Valid=false when unset) to avoid unexpected empty-string writes.


1621-1626: UpdateDeploymentRollback—LGTM

Field set and signature align with models.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"use client";
import {
  type MenuItem,
  TableActionPopover,
} from "@/components/logs/table-action.popover";
import type { Deployment, Environment } from "@/lib/collections";
import { ArrowDottedRotateAnticlockwise } from "@unkey/icons";
import { RollbackDialog } from "../../../rollback-dialog";

type DeploymentListTableActionsProps = {
  liveDeployment?: Deployment;
  selectedDeployment: Deployment;
  environment?: Environment;
};

export const DeploymentListTableActions = ({
  liveDeployment,
  selectedDeployment,
  environment,
}: DeploymentListTableActionsProps) => {
  const menuItems = getDeploymentListTableActionItems(
    selectedDeployment,
    liveDeployment,
    environment
  );
  return <TableActionPopover items={menuItems} />;
};

const getDeploymentListTableActionItems = (
  selectedDeployment: Deployment,
  liveDeployment: Deployment | undefined,
  environment: Environment | undefined
): MenuItem[] => {
  // Rollback is only enabled for production deployments that are ready and not currently active
  const canRollback =
    liveDeployment &&
    environment?.slug === "production" &&
    selectedDeployment.status === "ready" &&
    selectedDeployment.id !== liveDeployment.id;

  return [
    {
      id: "rollback",
      label: "Rollback",
      icon: <ArrowDottedRotateAnticlockwise size="md-regular" />,
      disabled: !canRollback,
      ActionComponent: canRollback
        ? (props) => (
            <RollbackDialog
              {...props}
              liveDeployment={liveDeployment}
              targetDeployment={selectedDeployment}
            />
          )
        : undefined,
    },
  ];
};

This can be simplified like this since TableActionPopover already supports rendering components


export const RollbackDialog = ({
isOpen,
onOpenChange,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be able to support ActionComponents
from this onOpenChange to this onClose: () => void;

Copy link
Contributor

@ogzhanolguncu ogzhanolguncu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UI flashes due to re-render shit makes it a bad UX. Feel free to create tickets for them I can take care of them later next week

Screen.Recording.2025-09-19.at.17.33.13.mov

Copy link
Member

@Flo4604 Flo4604 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked the go stuff and if the ui works and it works as expected for production environment, just those 2 db changes and the file cleanup lgtm otherwise.

Copy link
Contributor

@ogzhanolguncu ogzhanolguncu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rollback logic works great. Feel free to delegate UI changes to me if you want. I'll approve after Flo's requested changes

@chronark
Copy link
Collaborator Author

thank you, will do

@chronark
Copy link
Collaborator Author

UI flashes due to re-render shit makes it a bad UX. Feel free to create tickets for them I can take care of them later next week

Screen.Recording.2025-09-19.at.17.33.13.mov

@coderabbitai create an issue for this and assign @ogzhanolguncu

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 19, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@chronark I've created a GitHub issue for the UI flashing problem and assigned it to @ogzhanolguncu as requested. The issue includes context from this PR and the screen recording reference, so it can be addressed as follow-up work next week.


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for UI flashing/re-render problems
gh issue create \
  --title "Fix UI flashing during re-renders in domain rollbacks" \
  --body "## Problem

UI flashes due to re-renders are creating a poor user experience, particularly visible during domain rollback operations.

## Context

This issue was identified during review of the domain rollbacks feature implementation. The flashing appears to be related to component re-rendering behavior that causes visual disruption.

## References

- Original PR: https://github.com/unkeyed/unkey/pull/3999
- Identified by: @chronark
- Screen recording was provided showing the issue

## Acceptance Criteria

- [ ] Identify the root cause of UI flashing during re-renders
- [ ] Implement solution to eliminate or minimize visual disruption
- [ ] Test the fix across different deployment scenarios
- [ ] Ensure the solution doesn't negatively impact performance

## Priority

This can be addressed next week as follow-up work to the domain rollbacks feature." \
  --assignee ogzhanolguncu

Length of output: 44


@chronark Perfect! I've successfully created issue #4006 for the UI flashing problem and assigned it to @ogzhanolguncu.

Issue created: #4006

The issue includes all the necessary context from this PR and can be addressed as follow-up work next week.

@graphite-app
Copy link

graphite-app bot commented Sep 19, 2025

Video gif. A toddler sits at a table with a cracker in her hands. She looks at us with a big excited smile and then grins while giving a big thumbs up. Text, “Thank you!” (Added via Giphy)

@chronark chronark merged commit 11fbac5 into main Sep 19, 2025
19 checks passed
@chronark chronark deleted the domain-rollbacks branch September 19, 2025 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants