
fix: more metald cleanup #3919

Merged
chronark merged 1 commit into main from push-yvnylzlopxok
Sep 5, 2025

Conversation

@imeyer
Contributor

@imeyer imeyer commented Sep 5, 2025

This PR is more cleanup to realign the proper package names and get the client back into working order. I've removed more of the superfluous Claude nonsense and simplified a lot of the interactions based on the previous proto changes too.

Summary by CodeRabbit

  • New Features

    • Added deployment management commands: create, update, delete, get.
    • Streamlined VM workflows (create/boot) with expanded CLI options: config file, template, CPU, memory, image, force build.
    • VM list/info now shows state, vCPU count, and memory; updated JSON output.
  • Refactor

    • Migrated to a new backend API and reorganized CLI around deployments.
    • Removed legacy config-gen/validate and older VM config types.
  • Chores

    • Upgraded Go toolchain and dependencies.
    • Simplified telemetry/logging interceptors and removed default tenant-context propagation.

@changeset-bot

changeset-bot bot commented Sep 5, 2025

⚠️ No Changeset found

Latest commit: d1b273d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Contributor

coderabbitai bot commented Sep 5, 2025

📝 Walkthrough

Walkthrough

Migrates metald client and CLI from metal/vmprovisioner/v1 to metald/v1 protos, replaces tenant/project/environment scoping with deployment-centric APIs, adds deployment RPCs, removes VM config file helpers, refactors CLI commands, and strips tenant-auth forwarding/interceptors from client/server and observability stacks. Updates Go toolchains and dependencies.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Client API migration to metald/v1: go/deploy/metald/client/client.go, go/deploy/metald/client/types.go, go/deploy/metald/client/config.go | Switches client to metald/v1 protos and connect client; replaces Tenant/Project/Environment with DeploymentID; rewrites VM methods to new request/response types; adds Create/Update/Delete/Get Deployment methods; deletes local VM wrapper/types and VM config file conversion utilities. |
| CLI overhaul to deployment-centric commands: go/deploy/metald/client/cmd/metald-cli/main.go | Updates imports/types to metald/v1; replaces config-gen/validate with deployment commands (create/update/delete/get); revises VM flows (create/boot/list/info) to new proto fields; introduces expanded VMConfigOptions. |
| Client module deps: go/deploy/metald/client/go.mod | Bumps indirect deps (x/crypto, x/net, x/sys, x/text, jose, genproto, grpc); no public API changes. |
| Server service changes: go/deploy/metald/internal/service/deployment.go, go/deploy/metald/internal/service/vm.go, go/deploy/metald/cmd/metald/main.go | Adds deployment handlers (CreateDeployment, GetDeployment) returning static data; removes CreateDeployment from vm.go; adjusts main to rely on default interceptors (drops explicit AuthenticationInterceptor append). |
| Auth and tenant context removal: go/deploy/metald/internal/service/auth.go, go/deploy/pkg/observability/interceptors/tenant.go, go/deploy/pkg/observability/interceptors/client.go, go/deploy/pkg/observability/interceptors/interceptors.go, go/deploy/pkg/observability/interceptors/logging.go, go/deploy/pkg/observability/interceptors/metrics.go | Removes tenant auth/context types and interceptors; strips tenant header forwarding and related metrics/logging enrichment; default interceptor chain drops TenantAuth; server-side auth code commented out. |
| Server module/toolchain updates: go/deploy/metald/go.mod | Upgrades to Go 1.25; refreshes OTEL/GRPC/Prometheus; removes Docker deps; adds local replaces for internal modules. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as metald-cli
  participant Client as Client (metald/v1)
  participant Server as metald VMService
  Note over CLI,Client: Deployment lifecycle (new APIs)
  User->>CLI: metald create-deployment --deployment-id X
  CLI->>Client: CreateDeployment(req: CreateDeploymentRequest)
  Client->>Server: CreateDeployment (connect RPC)
  Server-->>Client: CreateDeploymentResponse (vm_ids)
  Client-->>CLI: resp
  CLI-->>User: print vm_ids / JSON
sequenceDiagram
  autonumber
  participant Caller as Client
  participant Interceptors as Default Interceptors
  participant Server as metald VMService
  Note right of Interceptors: After changes<br/>- No TenantAuth<br/>- No Tenant forwarding
  Caller->>Interceptors: RPC (e.g., BootVm)
  Interceptors->>Server: Forward request (trace+metrics only)
  Server-->>Interceptors: Response
  Interceptors-->>Caller: Response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Possibly related PRs

Suggested reviewers

  • perkinsjr
  • chronark
  • ogzhanolguncu
  • Flo4604

@vercel

vercel bot commented Sep 5, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| dashboard | Ready | Preview | Comment | Sep 5, 2025 2:36pm |
| engineering | Ready | Preview | Comment | Sep 5, 2025 2:36pm |

@github-actions
Contributor

github-actions bot commented Sep 5, 2025

Thank you for following the naming conventions for pull request titles! 🙏

@imeyer imeyer added the "No Rabbit" label (Disables CodeRabbit auto reviews) Sep 5, 2025
@imeyer imeyer enabled auto-merge September 5, 2025 14:42
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
go/deploy/metald/client/go.mod (1)

3-3: Align Go toolchain versions across all modules to 1.25.0.
Several go.mod files still declare older Go versions (e.g., apps/agent at 1.23.0, apps/chproxy at 1.24.0, go/demo_api at 1.24). Update these—and ensure CI/build images and any toolchain directives—so every module uses Go 1.25.0 to prevent build drift.

go/deploy/metald/client/client.go (4)

40-48: Stale client fields; drop tenant/project/environment or set them.

tenantID, projectID, and environmentID are never set. Keeping them leads to empty headers and misleading getters.

Apply this diff to streamline the client state:

 type Client struct {
-	vmService     metaldv1connect.VmServiceClient
-	tlsProvider   tls.Provider
-	tenantID      string
-	projectID     string
-	environmentID string
-	serverAddr    string
+	vmService   metaldv1connect.VmServiceClient
+	tlsProvider tls.Provider
+	serverAddr  string
 }

240-271: Rename and simplify transport; avoid stale tenant headers.

Keep only Authorization and an optional X-Deployment-ID. Injecting empty X-Tenant/Project/Environment is misleading.

Apply:

-// tenantTransport adds authentication and tenant isolation headers to all requests
-type tenantTransport struct {
-	Base          http.RoundTripper
-	EnvironmentID string
-	ProjectID     string
-	TenantID      string
-}
+// tenantTransport attaches auth and deployment context headers.
+type tenantTransport struct {
+	Base         http.RoundTripper
+	UserID       string
+	DeploymentID string
+}
 
 func (t *tenantTransport) RoundTrip(req *http.Request) (*http.Response, error) {
   // Clone the request to avoid modifying the original
   req2 := req.Clone(req.Context())
   if req2.Header == nil {
     req2.Header = make(http.Header)
   }
 
-  // Set Authorization header with development token format
-  // AIDEV-BUSINESS_RULE: In development, use "dev_user_<id>" format
-  // TODO: Update to proper JWT tokens in production
-  req2.Header.Set("Authorization", fmt.Sprintf("Bearer dev_user_%s", t.TenantID))
-
-  // Also set X-Tenant-ID header for tenant identification
-  req2.Header.Set("X-Tenant-ID", t.TenantID)
-  req2.Header.Set("X-Project-ID", t.ProjectID)
-  req2.Header.Set("X-Environment-ID", t.EnvironmentID)
+  // Dev-only token; TODO: replace with real JWT in production
+  if t.UserID != "" {
+    req2.Header.Set("Authorization", fmt.Sprintf("Bearer dev_user_%s", t.UserID))
+  }
+  if t.DeploymentID != "" {
+    req2.Header.Set("X-Deployment-ID", t.DeploymentID)
+  }
 
   // Use the base transport, or default if nil
   base := t.Base
   if base == nil {
     base = http.DefaultTransport
   }
   return base.RoundTrip(req2)
 }
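
For anyone who wants to try the suggested transport outside the repository, here is a minimal, self-contained sketch. The header names and the dev token format are taken from the diff above; the names contextTransport, recordingBase, and headersSeen, and the metald.invalid URL, are hypothetical and exist only for this illustration. It is not the code that was merged.

```go
package main

import (
	"fmt"
	"net/http"
)

// contextTransport (hypothetical name) attaches the dev auth token and an
// optional deployment header, skipping any header whose value is empty.
type contextTransport struct {
	Base         http.RoundTripper
	UserID       string
	DeploymentID string
}

func (t *contextTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// Clone so the caller's request is never mutated.
	req2 := req.Clone(req.Context())
	if req2.Header == nil {
		req2.Header = make(http.Header)
	}
	if t.UserID != "" {
		req2.Header.Set("Authorization", "Bearer dev_user_"+t.UserID)
	}
	if t.DeploymentID != "" {
		req2.Header.Set("X-Deployment-ID", t.DeploymentID)
	}
	base := t.Base
	if base == nil {
		base = http.DefaultTransport
	}
	return base.RoundTrip(req2)
}

// recordingBase is a test double: it captures the outgoing headers
// instead of touching the network.
type recordingBase struct{ seen http.Header }

func (r *recordingBase) RoundTrip(req *http.Request) (*http.Response, error) {
	r.seen = req.Header.Clone()
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
}

// headersSeen sends one request through the transport and returns the
// headers that reached the base RoundTripper.
func headersSeen(userID, deploymentID string) (http.Header, error) {
	rec := &recordingBase{}
	client := &http.Client{Transport: &contextTransport{
		Base:         rec,
		UserID:       userID,
		DeploymentID: deploymentID,
	}}
	// Placeholder URL; nothing is actually dialed because rec answers.
	req, err := http.NewRequest(http.MethodPost, "http://metald.invalid/rpc", nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	resp.Body.Close()
	return rec.seen, nil
}

func main() {
	h, err := headersSeen("42", "")
	if err != nil {
		panic(err)
	}
	fmt.Println("Authorization:", h.Get("Authorization"))
	fmt.Println("deployment header omitted:", h.Get("X-Deployment-ID") == "")
}
```

Guarding each header on a non-empty value is the point of the suggestion: a client that never sets an ID sends no header at all rather than an empty one.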

25-38: Remove or wire Config.DeploymentID
Config.DeploymentID is declared in the client config but never stored or used in New or any RPC call. Either drop this field from the Config struct or assign it to Client and apply it to outgoing requests (e.g. set the RPC DeploymentId or include it in your transport).


194-202: Remove tenantID and its accessor from the Metald client
In go/deploy/metald/client/client.go, the tenantID field is never assigned in New and GetTenantID() will always return an empty string. Drop both the field and method. If you need to expose Config.DeploymentID, add a GetDeploymentID() accessor instead.
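
A rough sketch of the "wire it" option, assuming trimmed, hypothetical stand-ins for Config and Client (the real structs carry TLS settings and the connect client, which are omitted here). Only the DeploymentID plumbing and the GetDeploymentID accessor named in the comment above are shown.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a trimmed, hypothetical stand-in for the client config.
type Config struct {
	ServerAddress string
	DeploymentID  string
}

// Client is a trimmed, hypothetical stand-in for the metald client.
type Client struct {
	serverAddr   string
	deploymentID string
}

// New copies DeploymentID from the config so it is no longer a dead field.
func New(cfg Config) (*Client, error) {
	if cfg.ServerAddress == "" {
		return nil, errors.New("server address is required")
	}
	return &Client{
		serverAddr:   cfg.ServerAddress,
		deploymentID: cfg.DeploymentID,
	}, nil
}

// GetDeploymentID replaces the always-empty GetTenantID accessor.
func (c *Client) GetDeploymentID() string { return c.deploymentID }

func main() {
	c, err := New(Config{ServerAddress: "https://localhost:8080", DeploymentID: "dep_1"})
	if err != nil {
		panic(err)
	}
	fmt.Println(c.GetDeploymentID())
}
```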

go/deploy/metald/client/cmd/metald-cli/main.go (3)

51-61: Pass deployment-id into the client config.

You define the flag but don’t wire it. This blocks header propagation if the client uses it.

 config := client.Config{
   ServerAddress:    *serverAddr,
   UserID:           *userID,
+  DeploymentID:     *deploymentID,
   TLSMode:          *tlsMode,
   SPIFFESocketPath: *spiffeSocket,
   TLSCertFile:      *tlsCert,
   TLSKeyFile:       *tlsKey,
   TLSCAFile:        *tlsCA,
   Timeout:          *timeout,
 }

118-141: Update usage: remove config-gen and config-validate if they’re gone.

Help text still lists deprecated commands, which confuses users.

   reboot <vm-id> [force]      Reboot a running VM
   create-and-boot [vm-id]     Create and immediately boot a VM
-  config-gen                  Generate a VM configuration file
-  config-validate <file>      Validate a VM configuration file

429-434: Fix newline in debug log.

"/n" prints literally; use "\n". Also consider gating behind a -v flag.

-log.Printf("createReq: %+v/n", createReq)
+log.Printf("createReq: %+v\n", createReq)
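
A small sketch of the gating idea, assuming a hypothetical verbose switch and a helper named debugLine that do not exist in the CLI. Note that the standard log package appends a newline when the message does not already end in one, so the trailing "\n" can be dropped entirely.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// debugLine (hypothetical helper) returns what the logger emits for the
// request when verbose is true, and an empty string otherwise.
func debugLine(verbose bool, createReq any) string {
	var buf bytes.Buffer
	// Flags set to 0: no timestamp prefix, so the output is deterministic.
	logger := log.New(&buf, "", 0)
	if verbose {
		// No trailing "\n" needed; the log package adds one if missing.
		logger.Printf("createReq: %+v", createReq)
	}
	return buf.String()
}

func main() {
	req := struct{ VmID string }{VmID: "vm-1"}
	fmt.Print(debugLine(true, req))
}
```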
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f4183e2 and d1b273d.

⛔ Files ignored due to path filters (2)
  • go/deploy/metald/client/go.sum is excluded by !**/*.sum
  • go/deploy/metald/go.sum is excluded by !**/*.sum
📒 Files selected for processing (15)
  • go/deploy/metald/client/client.go (6 hunks)
  • go/deploy/metald/client/cmd/metald-cli/main.go (18 hunks)
  • go/deploy/metald/client/config.go (0 hunks)
  • go/deploy/metald/client/go.mod (1 hunks)
  • go/deploy/metald/client/types.go (0 hunks)
  • go/deploy/metald/cmd/metald/main.go (0 hunks)
  • go/deploy/metald/go.mod (4 hunks)
  • go/deploy/metald/internal/service/auth.go (1 hunks)
  • go/deploy/metald/internal/service/deployment.go (1 hunks)
  • go/deploy/metald/internal/service/vm.go (0 hunks)
  • go/deploy/pkg/observability/interceptors/client.go (0 hunks)
  • go/deploy/pkg/observability/interceptors/interceptors.go (0 hunks)
  • go/deploy/pkg/observability/interceptors/logging.go (0 hunks)
  • go/deploy/pkg/observability/interceptors/metrics.go (0 hunks)
  • go/deploy/pkg/observability/interceptors/tenant.go (0 hunks)
💤 Files with no reviewable changes (9)
  • go/deploy/pkg/observability/interceptors/metrics.go
  • go/deploy/metald/cmd/metald/main.go
  • go/deploy/pkg/observability/interceptors/logging.go
  • go/deploy/pkg/observability/interceptors/interceptors.go
  • go/deploy/metald/internal/service/vm.go
  • go/deploy/pkg/observability/interceptors/client.go
  • go/deploy/metald/client/config.go
  • go/deploy/pkg/observability/interceptors/tenant.go
  • go/deploy/metald/client/types.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-01T15:10:44.959Z
Learnt from: imeyer
PR: unkeyed/unkey#3899
File: go/proto/metald/v1/deployment.proto:7-15
Timestamp: 2025-09-01T15:10:44.959Z
Learning: In the unkey/unkey repository, the metald service receives deployment_id values from upstream services rather than generating them internally. The deployment_id field in DeploymentRequest is part of a service-to-service communication pattern.

Applied to files:

  • go/deploy/metald/internal/service/deployment.go
📚 Learning: 2025-07-15T14:59:30.212Z
Learnt from: chronark
PR: unkeyed/unkey#3560
File: go/deploy/metald/internal/database/repository.go:0-0
Timestamp: 2025-07-15T14:59:30.212Z
Learning: go/deploy/metald cannot currently import helpers from go/pkg/db because it is not yet part of the main Go module; avoid suggesting such imports until the modules are unified.

Applied to files:

  • go/deploy/metald/client/go.mod
  • go/deploy/metald/go.mod
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Test Packages / Test
  • GitHub Check: Build / Build
🔇 Additional comments (11)
go/deploy/metald/client/go.mod (1)

11-24: Adjust go mod tidy -compat flag to your Go toolchain
The -compat=1.25 flag isn’t supported (max on your setup is Go 1.24.1). Replace with:

# In go/deploy/metald/client
go mod tidy -compat=1.24.1   # or omit -compat if on Go ≤1.24
git diff -- go.mod go.sum

govulncheck ./... || true

go list -m -json \
  google.golang.org/grpc \
  google.golang.org/protobuf \
  google.golang.org/genproto/googleapis/rpc \
  | sed -n 's/^{\|"Path"\|"Version"\|"Time"\|}/&/p'

Confirm no CVEs and that these modules stay in sync with others.

⛔ Skipped due to learnings
Learnt from: chronark
PR: unkeyed/unkey#3560
File: go/deploy/metald/internal/database/repository.go:0-0
Timestamp: 2025-07-15T14:59:30.212Z
Learning: go/deploy/metald cannot currently import helpers from go/pkg/db because it is not yet part of the main Go module; avoid suggesting such imports until the modules are unified.
go/deploy/metald/internal/service/deployment.go (1)

17-46: The verification script is running and will fetch the proto definition and Go imports for inspection.

go/deploy/metald/internal/service/auth.go (1)

3-186: Verify default interceptors include authentication enforcement
I couldn’t locate the NewDefaultInterceptors definition in the repo—please confirm that it wires in tenant-auth and bearer-token checks. If it doesn’t, add a minimal, opt-out Bearer interceptor (gated by METALD_AUTH_DISABLED) as shown in the original suggestion.
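
To illustrate the shape of an opt-out check, here is a sketch written as plain net/http middleware rather than a connect interceptor, so that it runs with the standard library alone. The METALD_AUTH_DISABLED variable name comes from the comment above; treating the value "true" as the off switch, the names requireBearer and statusFor, and the /rpc path are assumptions. It only checks that a Bearer token is present and does not validate it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"strings"
)

// requireBearer rejects requests that lack a non-empty Bearer token unless
// disabled is true. Presence check only; real token validation is out of scope.
func requireBearer(disabled bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !disabled {
			auth := r.Header.Get("Authorization")
			token := strings.TrimPrefix(auth, "Bearer ")
			if !strings.HasPrefix(auth, "Bearer ") || strings.TrimSpace(token) == "" {
				http.Error(w, "missing bearer token", http.StatusUnauthorized)
				return
			}
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor runs one in-memory request through the middleware and returns
// the resulting HTTP status code.
func statusFor(disabled bool, authHeader string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/rpc", nil) // placeholder path
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	requireBearer(disabled, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	// Assumption: the string "true" disables the check.
	disabled := os.Getenv("METALD_AUTH_DISABLED") == "true"
	fmt.Println(statusFor(disabled, "Bearer dev_user_42"))
	fmt.Println(statusFor(disabled, ""))
}
```

Making the check opt-out means a missing or misspelled environment variable leaves authentication on, which is the safer default.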

go/deploy/metald/go.mod (2)

16-25: OTel/OTLP/grpc versions look consistent; nice bump.

The OTel stack is aligned at 1.38.0 and grpc at 1.75.0; good modernization.

Also applies to: 79-86


83-86: Ensure genproto pseudo-versions match grpc v1.75.0
Confirm that the two genproto entries in go/deploy/metald/go.mod (google.golang.org/genproto/googleapis/api and rpc at v0.0.0-20250826171959-ef028d996bc1) exactly align with the upstream grpc v1.75.0 go.mod pins; any drift can cause symbol mismatches in transitive deps.

go/deploy/metald/client/client.go (1)

112-192: RPC method rewrites to metald/v1 look good.

Thin wrappers around connect.NewRequest with proper error wrapping; returning resp.Msg keeps the API clean.

go/deploy/metald/client/cmd/metald-cli/main.go (5)

338-350: Good: consistent JSON/text outputs and state echoing across pause/resume/reboot.

User-facing UX is consistent and predictable.

Also applies to: 365-376, 407-410


462-499: Create-deployment validates inputs and surfaces IDs cleanly.

Solid UX, returns IDs and counts; good defaults for vm-count.


160-172: Verify and wire CLI options into VmConfig
CreateVmRequest.Config is empty—map your CLI flags to VmConfig (e.g. VcpuCount, MemorySizeMib, DockerImage, Template/Profile, ForceBuild) and confirm each field name in your generated metaldv1.pb.go.
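
A sketch of what that mapping could look like. The vmConfig struct is a hypothetical stand-in for the generated message, using the field names guessed in the comment above (they still need to be checked against the generated code), and the flag names and defaults are likewise assumptions.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// vmConfig is a hypothetical stand-in for the generated VmConfig message.
// Field names are unverified guesses, not the real proto fields.
type vmConfig struct {
	VcpuCount     uint32
	MemorySizeMib uint64
	DockerImage   string
	ForceBuild    bool
}

// parseVMConfig maps CLI flags onto the config so the create request does
// not go out with an empty Config.
func parseVMConfig(args []string) (*vmConfig, error) {
	fs := flag.NewFlagSet("create", flag.ContinueOnError)
	cpu := fs.Uint("cpu", 1, "number of vCPUs")
	memory := fs.Uint64("memory", 512, "memory in MiB")
	image := fs.String("image", "", "container image to boot")
	force := fs.Bool("force-build", false, "rebuild even if cached")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if *cpu == 0 || *memory == 0 {
		return nil, errors.New("cpu and memory must be greater than zero")
	}
	return &vmConfig{
		VcpuCount:     uint32(*cpu),
		MemorySizeMib: *memory,
		DockerImage:   *image,
		ForceBuild:    *force,
	}, nil
}

func main() {
	cfg, err := parseVMConfig([]string{"-cpu", "2", "-memory", "1024", "-image", "nginx:latest"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *cfg)
}
```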


470-499: Full deployment request correct; no FieldMask support
Verified that UpdateDeploymentRequest in deployment.proto (line 27) and the generated Go struct in deployment.pb.go (line 188) include no FieldMask field—partial updates aren’t supported, so sending all fields is appropriate.


321-327: No action required: handleList reads CPUs and memory from the top-level VmInfo fields (ListVmsResponse.Vms []*VmInfo), while handleInfo reads them from the nested VmConfig in GetVmInfoResponse.Config, exactly as defined in the proto.

Likely an incorrect or invalid review comment.

@chronark chronark disabled auto-merge September 5, 2025 15:46
@chronark chronark merged commit b348aec into main Sep 5, 2025
19 checks passed
@chronark chronark deleted the push-yvnylzlopxok branch September 5, 2025 15:47

Labels

No Rabbit Disables CodeRabbit auto reviews


2 participants