Conversation
|
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughThis PR removes tenant context, isolation, quotas, and related authentication from builderd across client, server, service, and proto layers. Requests and configs no longer carry tenant IDs; quotas and tenant storage/isolation subsystems are deleted. Observability metrics drop tenant attributes. CLI, docs, and assets handling are updated accordingly. Dependencies are bumped and local replaces added. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CLI as CLI
participant Client as Builderd Client
participant SVC as Builderd Service
participant Exec as Executor
participant AM as Asset Manager
Note over CLI,Client: New flow (no tenant context/auth)
CLI->>Client: CreateBuild(BuildConfig: source/target/strategy, labels)
Client->>SVC: CreateBuild(request without tenant_id)
SVC->>Exec: Start build (no tenant quotas/isolation)
Exec->>AM: Upload artifact (no X-Tenant-ID/X-Customer-ID)
Exec-->>SVC: Build result
SVC-->>Client: CreateBuildResponse
Client-->>CLI: Build ID
CLI->>Client: StreamBuildLogs(build_id, follow)
Client->>SVC: StreamBuildLogs(build_id)
SVC-->>Client: Log stream
Client-->>CLI: Logs
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120+ minutes Possibly related PRs
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
Thank you for following the naming conventions for pull request titles! 🙏 |
There was a problem hiding this comment.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (26)
go/deploy/builderd/README.md (3)
172-185: Purge “Multi-Tenancy” env vars from docs; feature removed in this PR.The section advertises config that no longer exists and will confuse operators.
Apply:
-### Multi-Tenancy -```bash -UNKEY_BUILDERD_TENANT_ISOLATION_ENABLED=true # Enable tenant isolation -UNKEY_BUILDERD_TENANT_DEFAULT_TIER=free # Default service tier -UNKEY_BUILDERD_TENANT_QUOTA_CHECK_INTERVAL=5m # Quota check frequency - -# Default resource limits -UNKEY_BUILDERD_TENANT_DEFAULT_MAX_MEMORY_BYTES=2147483648 # 2GB -UNKEY_BUILDERD_TENANT_DEFAULT_MAX_CPU_CORES=2 -UNKEY_BUILDERD_TENANT_DEFAULT_MAX_DISK_BYTES=10737418240 # 10GB -UNKEY_BUILDERD_TENANT_DEFAULT_TIMEOUT_SECONDS=900 # 15min -UNKEY_BUILDERD_TENANT_DEFAULT_MAX_CONCURRENT_BUILDS=3 -UNKEY_BUILDERD_TENANT_DEFAULT_MAX_DAILY_BUILDS=100 -```
264-268: Drop TenantId from StreamBuildLogs example.Aligns with StreamBuildLogsRequest now lacking tenant_id.
Apply:
stream, err := client.StreamBuildLogs(ctx, connect.NewRequest(&builderv1.StreamBuildLogsRequest{ BuildId: buildId, - TenantId: tenantId, Follow: true, }))
1-4: Update branding and links that still claim “Multi-Tenant”.The service is no longer multi-tenant. Please strip the term from header/overview and remove references to internal/tenant/* and quota features that were deleted.
Also applies to: 14-27, 93-103
go/deploy/builderd/go.mod (1)
3-3: Invalid go directive (patch versions not allowed).Use major.minor only; pin patch via toolchain.
Apply:
-go 1.24.4 +go 1.24 +toolchain go1.24.4go/deploy/builderd/client/types.go (1)
35-39: Cap and validate page size.Add a server-side guard and document defaults (e.g., max 1000).
go/proto/deploy/builderd/v1/builder.proto (2)
307-311: Remove tenant_id from GetBuildRequest but reserve field number.Client types no longer set it; keep wire-compat.
Apply:
message GetBuildRequest { string build_id = 1; - string tenant_id = 2; // For authorization + reserved 2; // tenant_id removed }
9-19: Service/docs still say “multi-tenant”.Update comments to reflect single-tenant/global service, or clearly state tenancy was removed.
go/deploy/builderd/cmd/builderd/main.go (5)
449-456: Update CLI help text: remove “Multi-Tenant” branding and tenant env var; keep TLS defaults consistent.Current help contradicts the code and the PR.
Apply:
-func printUsage() { - fmt.Printf("Builderd - Multi-Tenant Build Service\n\n") +func printUsage() { + fmt.Printf("Builderd - Build Service\n\n") @@ - fmt.Printf(" UNKEY_BUILDERD_TENANT_ISOLATION_ENABLED Enable tenant isolation (default: true)\n") @@ - fmt.Printf("\nTLS Configuration:\n") - fmt.Printf(" UNKEY_BUILDERD_TLS_MODE TLS mode: disabled, file, spiffe (default: disabled)\n") + fmt.Printf("\nTLS Configuration:\n") + fmt.Printf(" UNKEY_BUILDERD_TLS_MODE TLS mode: disabled, file, spiffe (default: disabled)\n") @@ - fmt.Printf(" It supports multi-tenant isolation, resource quotas, and comprehensive\n") - fmt.Printf(" observability with OpenTelemetry.\n\n") + fmt.Printf(" It provides comprehensive observability with OpenTelemetry.\n\n") @@ -func printVersion() { - fmt.Printf("Builderd - Multi-Tenant Build Service\n") +func printVersion() { + fmt.Printf("Builderd - Build Service\n")Also applies to: 461-476, 483-491, 500-503
353-359: Prometheus bind address: guard empty interface.If PrometheusInterface is empty, addr becomes “:port”. Either ensure config default or set a safe default here.
Apply:
- promAddr := fmt.Sprintf("%s:%s", cfg.OpenTelemetry.PrometheusInterface, cfg.OpenTelemetry.PrometheusPort) + iface := cfg.OpenTelemetry.PrometheusInterface + if iface == "" { iface = "127.0.0.1" } + promAddr := fmt.Sprintf("%s:%s", iface, cfg.OpenTelemetry.PrometheusPort)Also applies to: 368-396
535-574: Minor: validate dirs using stat first to avoid unnecessary chmod on existing dirs.Not critical, but avoids touching perms unnecessarily.
- if err := os.MkdirAll(dir, 0o755); err != nil { + if _, err := os.Stat(dir); os.IsNotExist(err) { + if err := os.MkdirAll(dir, 0o755); err != nil { + return fmt.Errorf("cannot create/access directory %s: %w", dir, err) + } + } else if err != nil { return fmt.Errorf("cannot create/access directory %s: %w", dir, err) - } + }
260-266: Consider basic rate limiting on main mux as well (not only /health on Prometheus server).Protects the RPC surface from bursts when TLS is off (h2c).
135-151: Align TLS default policy comments and docs
In go/deploy/builderd/cmd/builderd/main.go remove the “defaults to disabled” comment (lines 135–136) and replace it with “SPIFFE/mTLS is required by default (no disabled fallback)”, then update printUsage and any README references to eliminate “default: disabled” messaging so all docs and flag defaults match the enforced policy.go/deploy/builderd/internal/config/config.go (1)
79-89: Remove or integrate unusedResourceLimits
ResourceLimitsdefined at go/deploy/builderd/internal/config/config.go:79 is never referenced elsewhere; remove it or wire it into your config/validation to avoid dead code.go/deploy/builderd/internal/executor/docker.go (2)
554-568: Avoid shell invocation for file removals (pattern handling, injection, spaces)Using sh -c with an interpolated path is brittle (spaces, special chars) even after validation. Replace with Glob + os.RemoveAll.
- // Use rm command to remove files matching pattern - // Note: fullPattern is now validated and sanitized - //nolint:gosec // G204: Path is validated and sanitized above to prevent injection - cmd := exec.CommandContext(ctx, "sh", "-c", fmt.Sprintf("rm -rf %s", fullPattern)) - if err := cmd.Run(); err != nil { + // Expand pattern and remove each match safely without a shell + matches, globErr := filepath.Glob(fullPattern) + if globErr != nil { + logger.DebugContext(ctx, "glob failed", + slog.String("pattern", pattern), + slog.String("error", globErr.Error()), + ) + lastError = globErr + continue + } + if len(matches) == 0 { + continue + } + for _, m := range matches { + if err := os.RemoveAll(m); err != nil { + logger.DebugContext(ctx, "failed to remove path", + slog.String("path", m), + slog.String("error", err.Error()), + ) + lastError = err + } else { + removedPatterns++ + } + } - } else { - removedPatterns++ - }
205-208: Log the actual artifact pathYou log rootfs_dir while the artifact returned is the ext4 image. Log ext4Path to avoid confusion.
- logger.InfoContext(ctx, "Docker image extraction successful", - slog.String("rootfs_path", rootfsDir), + logger.InfoContext(ctx, "Docker image extraction successful", + slog.String("rootfs_path", ext4Path), slog.Duration("total_duration", time.Since(start)), )go/deploy/builderd/internal/service/builder.go (8)
183-191: Fix data race: BuildJob is mutated asynchronously while GetBuild returns a shared pointer
buildJobis updated in the background goroutine whileGetBuildreturns the same pointer without copying. This is a data race and can yield inconsistent responses. Deep-copy the job under lock before returning it (and do the same in list APIs later).Apply:
@@ - // Retrieve build from memory storage - s.buildsMutex.RLock() - build, exists := s.builds[req.Msg.GetBuildId()] - s.buildsMutex.RUnlock() - - if !exists { - return nil, connect.NewError(connect.CodeNotFound, - fmt.Errorf("build not found: %s", req.Msg.GetBuildId())) - } - - resp := &builderv1.GetBuildResponse{ - Build: build, - } + // Retrieve build from memory storage and deep-copy to avoid races + s.buildsMutex.RLock() + build, exists := s.builds[req.Msg.GetBuildId()] + if !exists { + s.buildsMutex.RUnlock() + return nil, connect.NewError(connect.CodeNotFound, + fmt.Errorf("build not found: %s", req.Msg.GetBuildId())) + } + buildCopy := proto.Clone(build).(*builderv1.BuildJob) + s.buildsMutex.RUnlock() + + resp := &builderv1.GetBuildResponse{ + Build: buildCopy, + }Add import:
import "google.golang.org/protobuf/proto"Also applies to: 213-220, 340-353
155-173: Implement real cancellation and merge request/shutdown contextsBuilds ignore request cancellations (
buildCtx := s.shutdownCtx) andCancelBuildreports success without actually canceling. Merge request ctx with service shutdown and store per-build cancel funcs; use them inCancelBuild.CreateBuild changes:
@@ - s.buildWg.Add(1) - go func() { + s.buildWg.Add(1) + // derive build context from request; also react to service shutdown + buildCtx, buildCancel := context.WithCancel(ctx) + // store cancel for CancelBuild + s.buildsMutex.Lock() + if s.buildCancels == nil { + s.buildCancels = make(map[string]context.CancelFunc) + } + s.buildCancels[buildJob.BuildId] = buildCancel + s.buildsMutex.Unlock() + + // cancel on service shutdown as well + go func() { + select { + case <-s.shutdownCtx.Done(): + buildCancel() + case <-buildCtx.Done(): + } + }() + + go func() { defer s.buildWg.Done() + defer func() { + s.buildsMutex.Lock() + delete(s.buildCancels, buildJob.BuildId) + s.buildsMutex.Unlock() + }() @@ - buildCtx := s.shutdownCtx + // use merged buildCtxCancelBuild changes:
@@ - // TODO: Cancel the running build process - // TODO: Update build state in database + // Attempt to cancel the running build + s.buildsMutex.Lock() + cancel, ok := s.buildCancels[req.Msg.GetBuildId()] + s.buildsMutex.Unlock() + if !ok { + return nil, connect.NewError(connect.CodeNotFound, + fmt.Errorf("build not running or not found: %s", req.Msg.GetBuildId())) + } + cancel()Add struct field (outside selected range):
type BuilderService struct { ... buildCancels map[string]context.CancelFunc }Also applies to: 392-401
63-66: Use collision-resistant build IDsTime-based IDs can collide across processes; prefer UUID.
-func generateBuildID() string { - return fmt.Sprintf("build-%d", time.Now().UnixNano()) -} +func generateBuildID() string { + return "build-" + uuid.NewString() +}Add import:
import "github.com/google/uuid"
320-326: Return the job’s CreatedAt for consistencyAvoid skew between the job and response timestamps.
resp := &builderv1.CreateBuildResponse{ BuildId: buildJob.BuildId, State: builderv1.BuildState_BUILD_STATE_BUILDING, - CreatedAt: timestamppb.Now(), + CreatedAt: buildJob.CreatedAt, RootfsPath: "", }
207-210: Avoid stringly-typed status comparisonsRelying on
"failed"is brittle. Expose a typed status from the executor or return a boolean/enum.
393-395: Improve cancellation metrics dimensionsHardcoding
"unknown"loses signal. Includebuild_idandsource_typewhen available.
147-151: In-memory build store: plan persistence and retentionMap-based storage will lose builds on restart and can leak memory. Persist jobs and add GC/retention.
70-75: Remove AIDEV- breadcrumbs before merge*Internal notes should not ship.
Also applies to: 153-161
go/deploy/builderd/cmd/builderd-cli/main.go (3)
18-20: Update CLI comment to remove “tenant isolation” wordingDocs still reference tenants; align with the PR’s removal.
262-266: Make force parsing more robustPositional
--forceis brittle. At minimum accept-forcetoo; ideally use a sub-FlagSet.-if flag.NArg() > 2 && flag.Arg(2) == "--force" { +if flag.NArg() > 2 && (flag.Arg(2) == "--force" || flag.Arg(2) == "-force") { force = true }
318-321: Treat EOF as a clean stream terminationAvoid fatal on normal stream close.
- if err := stream.Err(); err != nil { - log.Fatalf("Stream error: %v", err) - } + if err := stream.Err(); err != nil && !errors.Is(err, io.EOF) { + log.Fatalf("Stream error: %v", err) + }Add imports:
import ( "errors" "io" )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (5)
go/deploy/builderd/go.sumis excluded by!**/*.sumgo/gen/proto/deploy/builderd/v1/builder.pb.gois excluded by!**/*.pb.go,!**/gen/**go/gen/proto/deploy/builderd/v1/builderdv1connect/builder.connect.gois excluded by!**/gen/**go/gen/proto/metal/vmprovisioner/v1/vmprovisioner.pb.gois excluded by!**/*.pb.go,!**/gen/**go/gen/proto/metal/vmprovisioner/v1/vmprovisionerv1connect/vmprovisioner.connect.gois excluded by!**/gen/**
📒 Files selected for processing (19)
go/deploy/builderd/README.md(2 hunks)go/deploy/builderd/client/client.go(6 hunks)go/deploy/builderd/client/types.go(4 hunks)go/deploy/builderd/cmd/builderd-cli/main.go(6 hunks)go/deploy/builderd/cmd/builderd/main.go(1 hunks)go/deploy/builderd/go.mod(2 hunks)go/deploy/builderd/internal/assetmanager/client.go(0 hunks)go/deploy/builderd/internal/assets/base.go(0 hunks)go/deploy/builderd/internal/config/config.go(1 hunks)go/deploy/builderd/internal/executor/docker.go(1 hunks)go/deploy/builderd/internal/executor/docker_pipeline.go(1 hunks)go/deploy/builderd/internal/executor/types.go(0 hunks)go/deploy/builderd/internal/observability/metrics.go(1 hunks)go/deploy/builderd/internal/service/builder.go(3 hunks)go/deploy/builderd/internal/tenant/isolation.go(0 hunks)go/deploy/builderd/internal/tenant/manager.go(0 hunks)go/deploy/builderd/internal/tenant/storage.go(0 hunks)go/deploy/builderd/internal/tenant/types.go(0 hunks)go/proto/deploy/builderd/v1/builder.proto(11 hunks)
💤 Files with no reviewable changes (7)
- go/deploy/builderd/internal/assets/base.go
- go/deploy/builderd/internal/assetmanager/client.go
- go/deploy/builderd/internal/executor/types.go
- go/deploy/builderd/internal/tenant/isolation.go
- go/deploy/builderd/internal/tenant/storage.go
- go/deploy/builderd/internal/tenant/manager.go
- go/deploy/builderd/internal/tenant/types.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test Go API Local / Test
- GitHub Check: Test API / API Test Local
- GitHub Check: Test Packages / Test
- GitHub Check: Build / Build
🔇 Additional comments (6)
go/deploy/builderd/go.mod (1)
22-26: Adjust verification script to run in the module directory and rerun
Change intogo/deploy/builderdbefore running the checks so that the correctgo.modis found. For example:cd go/deploy/builderd go mod tidy go list ./... >/dev/null go list -m -u all | rg 'connectrpc|grpc|genproto|opentelemetry|golang.org/x/'Repeat the same steps for the dependency block at lines 48–55.
go/proto/deploy/builderd/v1/builder.proto (1)
352-356: GetBuildStatsRequest still has tenant_id; remove and reserve or keep it consistently.Client types dropped it. Make a call and align.
Suggested if removing:
message GetBuildStatsRequest { - string tenant_id = 1; - google.protobuf.Timestamp start_time = 2; - google.protobuf.Timestamp end_time = 3; + reserved 1; // tenant_id removed + google.protobuf.Timestamp start_time = 2; + google.protobuf.Timestamp end_time = 3; }go/deploy/builderd/cmd/builderd/main.go (1)
250-252: ACK: tenant auth interceptor removed.Interceptors now only metrics/logging — matches PR intent.
go/deploy/builderd/internal/observability/metrics.go (1)
295-305: RecordBuildStart API change verified
Only 2-arg calls remain (in docker_pipeline.go and docker.go) and there are no references to tenant_tier or similar attributes in the Go code.go/deploy/builderd/client/client.go (1)
124-125: Proto field usage LGTMBuildId/Follow/Force mappings align with tenant-free requests.
Also applies to: 160-161, 177-179, 194-196
go/deploy/builderd/cmd/builderd-cli/main.go (1)
183-184: API surface updates LGTMBuildID-only requests and removal of tenant args are correctly reflected across handlers.
Also applies to: 237-240, 269-271, 295-297
|
I'll clean up any issues here. |
|
@imeyer is it fine to merge? or do you want to do any changes in this branch |
|
nope go ahead and approve 😄 |
Graphite Automations"Post a GIF when PR approved" took an action on this PR • (09/08/25)1 gif was posted to this PR based on Andreas Thomas's automation. |


What does this PR do?
Fixes # (issue)
If there is not an issue for this, please create one first. This is used to tracking purposes and also helps use understand why this PR exists
Type of change
How should this be tested?
Checklist
Required
pnpm buildpnpm fmtconsole.logsgit pull origin mainAppreciated
Summary by CodeRabbit