Skip to content

feat: vault as a service#4555

Merged
chronark merged 31 commits intomainfrom
vault
Jan 14, 2026
Merged

feat: vault as a service#4555
chronark merged 31 commits intomainfrom
vault

Conversation

@chronark
Copy link
Collaborator

@chronark chronark commented Jan 7, 2026

this takes our existing pk/vault and creates a new standalone service, so we don't share our master keys everywhere

it's just the new service created, nothing is using it yet

merge after #4540

@vercel
Copy link

vercel bot commented Jan 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
dashboard Ready Ready Preview, Comment Jan 14, 2026 9:36am
1 Skipped Deployment
Project Deployment Review Updated (UTC)
engineering Ignored Ignored Preview Jan 14, 2026 9:36am

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 7, 2026

📝 Walkthrough

Walkthrough

Introduces a Vault service with keyring-backed DEK lifecycle, S3 and in-memory storage implementations, CLI and run wiring, protobuf validation, extensive unit/integration/fuzz tests, and Bazel/CI/build tooling additions and updates.

Changes

Cohort / File(s) Summary
Proto & generation
\buf.yaml`, `svc/vault/proto/buf.gen.yaml`, `svc/vault/proto/vault/v1/service.proto`, `svc/vault/proto/generate.go``
New proto module entry and buf gen config; added protovalidate import and max_len rules; removed EncryptBulk/CreateDEK/ReEncryptDEKs messages/RPCs; go:generate for buf.
Service config & run
\svc/vault/config.go`, `svc/vault/run.go`, `.bazelignore`, `Dockerfile``
Added Config type with Validate; Run entrypoint that wires storage, service, HTTP server and graceful shutdown; .bazelignore updated; Dockerfile switched to Bazel build flow.
Service core & RPCs
\svc/vault/internal/vault/*.go` (service.go, auth.go, create_dek.go, rpc_encrypt.go, rpc_decrypt.go, rpc_reencrypt.go, rpc_liveness.go, roll_deks.go, tests)`
New Service constructor, master-key loading, bearer auth, DEK cache, RPC handlers Encrypt/Decrypt/ReEncrypt/Liveness, RollDeks, and tests.
Keyring
\svc/vault/internal/keyring/*``
New Keyring type and constructor; DEK generation, encrypt/encode, decode/decrypt, Get/GetLatest/GetOrCreate, RollKeys, storage persistence, tracing, and fuzz/unit tests.
Storage layer
\svc/vault/internal/storage/*` (interface.go, memory.go, s3.go, middleware/tracing.go`, tests, BUILD.bazel)
Added storage interface, in-memory implementation, S3-backed implementation with bucket management, tracing middleware, tests and Bazel targets.
Keys utilities
\svc/vault/internal/keys/*``
Utilities to generate random keys and master KEKs, protobuf marshal and base64 encoding.
CLI & build wiring
\cmd/vault/main.go`, `cmd/vault/BUILD.bazel`, `cmd/run/main.go`, `cmd/run/BUILD.bazel`, `Makefile`, `go.mod``
New vault CLI command and BUILD rule; integrated into run command; Makefile fuzz target; go.mod/toolchain and dependency updates.
Integration tests
\svc/vault/integration/*``
New S3-backed integration tests: coldstart, migrate_deks, reencryption, reusing_deks.
Unit tests & fuzzing
\svc/vault/internal/vault/*_test.go`, `svc/vault/internal/keyring/fuzz_test.go`, `svc/vault/internal/storage/memory_test.go``, fuzz tests
Many new unit and fuzz tests for auth, RPCs, keyring, storage, corruption detection and fuzz suites.
Encryption & proto helpers
\pkg/encryption/aes.go`, `pkg/proto/marshal.go``
Added AES-GCM Decrypt helper with argument validation; thin protobuf Marshal/Unmarshal wrappers.
Library & test helpers
\pkg/assert/not_empty.go`, `pkg/fuzz/*`, `pkg/dockertest/redis.go`, `pkg/fuzz/BUILD.bazel``
Broadened NotEmpty generic constraint to include []string; added fuzz helpers and BUILD targets; explicit dockertest container fields.
API surface & callsite updates
\pkg/vault/create_dek.go`, `pkg/vault/encrypt_bulk.go`, `pkg/vault/integration/`, `web/apps/agent/services/vault/``
CreateDEK API changed to accept keyring string and return keyID string; EncryptBulk removed; call sites and tests updated accordingly; some struct field accesses replaced with protobuf getters.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant VaultService
    participant Keyring
    participant Storage
    participant Encryption

    Client->>VaultService: Encrypt(keyring, plaintext)
    VaultService->>VaultService: Check DEK cache
    alt cache miss
        VaultService->>Keyring: GetLatestKey(keyring)
        Keyring->>Storage: GetObject("keyring/{id}/LATEST")
        Storage-->>Keyring: encrypted DEK bytes
        Keyring->>Encryption: Decrypt using KEK -> DEK
        Keyring-->>VaultService: DataEncryptionKey
        VaultService->>VaultService: Store DEK in cache
    end
    VaultService->>Encryption: Encrypt(DEK, plaintext)
    Encryption-->>VaultService: nonce,ciphertext
    VaultService-->>Client: EncryptResponse(encrypted, keyId)
Loading
sequenceDiagram
    participant Client
    participant VaultService
    participant Keyring
    participant Storage
    participant Encryption

    Client->>VaultService: Decrypt(keyring, encrypted)
    VaultService->>VaultService: Check DEK cache
    alt cache miss
        VaultService->>Keyring: GetKey(keyring, keyId)
        Keyring->>Storage: GetObject("keyring/{id}/{keyId}")
        Storage-->>Keyring: encrypted DEK bytes
        Keyring->>Encryption: Decrypt using KEK -> DEK
        Keyring-->>VaultService: DataEncryptionKey
        VaultService->>VaultService: Store DEK in cache
    end
    VaultService->>Encryption: Decrypt(DEK, nonce, ciphertext)
    Encryption-->>VaultService: plaintext
    VaultService-->>Client: DecryptResponse(plaintext)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is minimal and does not follow the provided template structure. It lacks sections for 'What does this PR do', 'Type of change', 'How should this be tested', and the required checklist items. Complete the PR description template including: detailed summary of changes, type of change selection, testing instructions, and verification of required checklist items before merging.
Docstring Coverage ⚠️ Warning Docstring coverage is 54.32% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: vault as a service' clearly and concisely summarizes the main change: extracting the vault into a new standalone service.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


🧹 Recent nitpick comments
Makefile (1)

115-125: Good addition for fuzz test discovery and execution.

The logic correctly discovers fuzz functions and runs them with appropriate flags. A few minor observations:

  1. Missing blank line: There's no blank line between this target and the next .PHONY: unkey (line 126), which is inconsistent with the rest of the file's formatting.

  2. Body length (optional): Static analysis flags that the body exceeds 5 lines. If this becomes a concern, consider extracting to a helper script (e.g., scripts/fuzz.sh).

♻️ Optional: Add missing blank line for consistency
 		done; \
 	done
+
 .PHONY: unkey

📜 Recent review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a688be2 and 05cad93.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • Makefile
  • dev/docker-compose.yaml
  • go.mod
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/actions/setup-node/action.yaml:0-0
Timestamp: 2025-08-08T15:10:46.436Z
Learning: Repo: unkeyed/unkey — Preference: If imeyer comments “issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/workflows/job_detect_changes.yaml:111-114
Timestamp: 2025-08-08T16:10:00.224Z
Learning: Repo: unkeyed/unkey — When imeyer writes “make me an issue” on a PR thread, automatically open a thorough GitHub issue (Summary, Impact, Where, Repro, Observed vs Expected, Acceptance Criteria, Validation Plan, Out of Scope, References), include backlinks to the PR and comment, and assign to imeyer.
📚 Learning: 2025-08-08T15:09:01.312Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3753
File: go/apps/api/openapi/config.yaml:9-10
Timestamp: 2025-08-08T15:09:01.312Z
Learning: Repo unkeyed/unkey: For go/apps/api/openapi, oapi-codegen is used and does not support OpenAPI 3.1 union types like [T, "null"]; an overlay step is required to downconvert to 3.0-style nullable before code generation.

Applied to files:

  • go.mod
📚 Learning: 2025-08-08T15:09:01.312Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3753
File: go/apps/api/openapi/config.yaml:9-10
Timestamp: 2025-08-08T15:09:01.312Z
Learning: Repo unkeyed/unkey: In go/apps/api/openapi, oapi-codegen doesn’t support OpenAPI 3.1 union nullability; overlay.yaml must be applied before codegen. The overlay key in oapi-codegen config isn’t supported—use a pre-step (programmatic or CLI) to merge overlay into the bundled spec, then run oapi-codegen.

Applied to files:

  • go.mod
📚 Learning: 2025-09-01T01:57:42.227Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/proto/metald/v1/metald.proto:5-9
Timestamp: 2025-09-01T01:57:42.227Z
Learning: In the unkeyed/unkey repository, buf is configured to properly resolve metald proto imports like "metald/v1/vm.proto" without needing the full "go/proto/" prefix. The buf lint command `buf lint --path proto/metald` passes successfully with these relative import paths.

Applied to files:

  • go.mod
📚 Learning: 2025-07-15T14:59:30.212Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3560
File: go/deploy/metald/internal/database/repository.go:0-0
Timestamp: 2025-07-15T14:59:30.212Z
Learning: go/deploy/metald cannot currently import helpers from go/pkg/db because it is not yet part of the main Go module; avoid suggesting such imports until the modules are unified.

Applied to files:

  • go.mod
📚 Learning: 2025-09-01T02:33:43.791Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/buf.gen.yaml:0-0
Timestamp: 2025-09-01T02:33:43.791Z
Learning: In the unkeyed/unkey repository, buf commands are executed from the `go/` directory where the `buf.yaml` file is located. This means the `out: gen` configuration in `go/buf.gen.yaml` generates files to `go/gen/` relative to the repository root, which aligns with the expected directory structure.

Applied to files:

  • go.mod
📚 Learning: 2025-09-12T08:01:20.792Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3944
File: go/pkg/db/acme_challenge_update_verified_with_expiry.sql_generated.go:31-39
Timestamp: 2025-09-12T08:01:20.792Z
Learning: Do not review or suggest changes to files with sql_generated.go suffix or other files marked as auto-generated (containing "Code generated by" comments), as these are generated by tools like sqlc and changes would be overwritten on regeneration.

Applied to files:

  • go.mod
📚 Learning: 2026-01-13T17:46:51.335Z
Learnt from: CR
Repo: unkeyed/unkey PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-13T17:46:51.335Z
Learning: Organize the Go project structure with dedicated directories: cmd/ for CLI entrypoints, svc/ for backend services, pkg/ for shared libraries, proto/ for protocol buffer definitions, and gen/ for generated code

Applied to files:

  • go.mod
📚 Learning: 2025-08-08T14:55:11.981Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: deployment/docker-compose.yaml:179-184
Timestamp: 2025-08-08T14:55:11.981Z
Learning: In deployment/docker-compose.yaml (development only), MinIO is configured with API on 3902 and console on 3903; ports should map 3902:3902 and 3903:3903 to match MINIO_API_PORT_NUMBER and MINIO_CONSOLE_PORT_NUMBER.

Applied to files:

  • dev/docker-compose.yaml
📚 Learning: 2025-06-02T11:08:56.397Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 3292
File: apps/dashboard/lib/vault.ts:80-97
Timestamp: 2025-06-02T11:08:56.397Z
Learning: The vault.ts file in apps/dashboard/lib/vault.ts is a duplicate of the vault package from the `api` directory and should be kept consistent with that original implementation.

Applied to files:

  • dev/docker-compose.yaml
📚 Learning: 2026-01-13T17:46:51.335Z
Learnt from: CR
Repo: unkeyed/unkey PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-13T17:46:51.335Z
Learning: Applies to **/*.go : Run `make fmt` before committing Go changes to format and lint code

Applied to files:

  • Makefile
📚 Learning: 2026-01-13T17:46:51.335Z
Learnt from: CR
Repo: unkeyed/unkey PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-13T17:46:51.335Z
Learning: Applies to **/*.go : Run `make bazel` after adding new Go files to sync BUILD.bazel files

Applied to files:

  • Makefile
📚 Learning: 2025-08-08T15:37:14.734Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3755
File: .github/workflows/build.yaml:14-17
Timestamp: 2025-08-08T15:37:14.734Z
Learning: Repo: unkeyed/unkey — CI behavior: We rely on CI=true to make pnpm install behave as --frozen-lockfile. Don’t suggest adding --frozen-lockfile in .github/actions/setup-node/action.yaml or workflows like .github/workflows/build.yaml.

Applied to files:

  • Makefile
🪛 checkmake (0.2.2)
Makefile

[warning] 116-116: Target body for "fuzz" exceeds allowed length of 5 (9).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test Packages / Test
  • GitHub Check: Bazel / Test
🔇 Additional comments (8)
dev/docker-compose.yaml (1)

22-22: LGTM!

Standard MySQL healthcheck command using mysqladmin ping. Appropriate for a development environment.

go.mod (7)

8-9: Buf-generated client updates look correct.

The updated Buf-generated protobuf and connectRPC clients follow the expected version format (semver + timestamp + commit hash). The versions align with the Vault service implementation in this PR.


130-131: AWS SDK internal dependency updates align with main SDK update.

The internal AWS SDK modules (configsources and endpoints) have been updated consistently with the main AWS SDK v2 update on line 13. These transitive dependencies are managed by the AWS SDK team.


144-144: Smithy-go update aligns with AWS SDK v2 update.

The smithy-go library update from v1.23.2 to v1.24.0 is consistent with the AWS SDK v2 update. This is the code generation framework underlying the AWS SDK.


516-522: Tool block is correctly formatted.

The tool block has been reorganized to the bottom of the file, which is a conventional placement. All listed tools (buf, golangci-lint, oapi-codegen, protoc-gen-go-restate, sqlc) are standard development and build tools for this Go project.


3-5: No issues found with Go version directives.

The go directive update to 1.25 with toolchain 1.25.1 is correct. Go 1.25 was released on August 12, 2025, and uses the modern module syntax (separate go and toolchain directives) properly.

Likely an incorrect or invalid review comment.


13-13: AWS SDK v2 v1.41.0 is valid with no known security advisories.

The AWS SDK for Go v2 has been bumped from v1.39.6 to v1.41.0. This is a standard minor version release with no published security advisories.


37-37: No breaking changes identified in v0.23.0 upgrade. The restatedev/sdk-go v0.23.0 release adds new APIs (WithValue and WrapContext) with no documented breaking changes. The update is safe to proceed.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@vercel vercel bot temporarily deployed to Preview – engineering January 7, 2026 12:49 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard January 7, 2026 12:52 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard January 8, 2026 08:59 Inactive
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
svc/vault/proto/vault/v1/service.proto (1)

53-54: Incomplete comment.

The comment appears to be truncated: // ReEncrypt rec. Consider completing it to describe the RPC's purpose.

-  // ReEncrypt rec
+  // ReEncrypt re-encrypts data with a new key, optionally specifying a key_id
   rpc ReEncrypt(ReEncryptRequest) returns (ReEncryptResponse) {}
🤖 Fix all issues with AI agents
In @cmd/vault/main.go:
- Around line 28-29: The CLI defines a "region" flag but it is never used when
building the config in action(); either add a Region field to the config struct
and populate it from cli.String("region") inside action(), or remove the
cli.String("region", ...) flag entirely. Locate the flag named "region" and the
function action() and update the config construction to include Config.Region
(or delete the flag if the value is not needed) so the flag is not unused.
- Around line 40-47: The S3 flags (cli.String for "s3-url", "s3-bucket",
"s3-access-key-id", "s3-access-key-secret") are being validated in action() via
RequireString but are not declared required; update each flag definition to
include cli.Required() so the CLI framework enforces their presence (or
alternatively remove the RequireString calls in action() if you intend them to
be optional); reference the flag names above and the action() RequireString
usage when making the change.

In @deployment/docker-compose.yaml:
- Around line 120-147: The vault service in the docker-compose snippet binds
host port 7070 ("ports: - '7070:7070'") which conflicts with the existing
apiv2_lb service that already uses host port 7070; change the vault service's
host-side port to an unused port (e.g., 7071 or another free port) in the vault
service definition and update any other configs or services that call vault on
host:7070 (healthcheck targets, other services or env vars referencing
UNKEY_HTTP_PORT/UNKEY_VAULT_* endpoints) to the new host port so Docker Compose
no longer fails with "port is already allocated."

In @k8s/manifests/vault.yaml:
- Around line 35-42: The manifest currently hardcodes secrets for
UNKEY_VAULT_S3_ACCESS_KEY_ID, UNKEY_VAULT_S3_ACCESS_KEY_SECRET,
UNKEY_VAULT_MASTER_KEYS, and UNKEY_VAULT_BEARER_TOKEN in vault.yaml; create a
Kubernetes Secret containing these values and update the Deployment/Pod spec to
inject them via env.valueFrom.secretKeyRef (or envFrom: secretRef) instead of
literal value fields, and remove the plaintext values from vault.yaml so the
container references the secret keys by name.

In @svc/vault/integration/reencryption_test.go:
- Around line 78-93: The test currently discards the result of v.ReEncrypt and
then calls v.Decrypt with the original encrypted blob (enc.Msg.GetEncrypted()),
so re-encryption is not validated; capture the response from v.ReEncrypt into a
variable (e.g., reEnc) inside the loop (or after the final iteration) and pass
reEnc.Msg.GetEncrypted() to the decrypt request (decReq) before calling
v.Decrypt, then assert dec.Msg.GetPlaintext() equals the original data; update
references to use reEnc instead of enc when building decReq so the test actually
verifies the re-encrypted ciphertext.

In @svc/vault/internal/keyring/get_latest_key.go:
- Around line 23-26: The error check in GetLatestKey currently does a direct
comparison (err != storage.ErrObjectNotFound); change it to use errors.Is(err,
storage.ErrObjectNotFound) so wrapped errors are handled like GetOrCreateKey
does—update the conditional in the GetLatestKey function to call errors.Is and
keep the same tracing.RecordError(span, err) and return fmt.Errorf("failed to
get key: %w", err) behavior.

In @svc/vault/internal/storage/memory.go:
- Around line 63-75: In ListObjectKeys (method on memory) the prefix check is
inverted causing all keys to be skipped when prefix is empty; change the
condition to only skip keys when a non-empty prefix is provided and the key does
not have that prefix (e.g., replace the current if condition with one that
checks prefix != "" && !strings.HasPrefix(key, prefix)), so that an empty prefix
returns all keys and matching prefixes are filtered out correctly.

In @svc/vault/internal/vault/rpc_reencrypt.go:
- Line 30: s.keyCache.Clear(ctx) clears the entire cache which is too broad for
re-encryption; instead invalidate only entries for the keyring in question (use
req.Msg.GetKeyring()). Replace the global clear with a targeted call (e.g.
s.keyCache.ClearPrefix(ctx, keyring) or s.keyCache.ClearForKeyring(ctx,
keyring)) and, if the cache interface lacks such a method, add a keyring-scoped
invalidation method to the cache implementation and interface and invoke it from
the re-encrypt handler where s.keyCache.Clear(ctx) currently sits.

In @Tiltfile:
- Around line 322-326: The docker_build call uses a mismatched build context
'../svc/vault' versus dockerfile './svc/vault/Dockerfile'; update the
docker_build invocation (function name: docker_build) so the context and
Dockerfile paths are consistent—use './svc/vault' (or '.' if you intend repo
root as context) as the first argument to match './svc/vault/Dockerfile' as the
dockerfile parameter so Tilt can locate the Dockerfile and context correctly.
- Around line 333-340: The vault k8s_resource is configured with
port_forwards='7070:7070' which conflicts with the API service also using host
port 7070; change the vault resource's port_forwards to a non-conflicting host
port (e.g., '7071:7070' or another unused host port) and update the
corresponding status/print statement that reports the Vault forward (the print
on line 426) to reflect the new host port so logs remain accurate; locate the
k8s_resource call for 'vault' and the print statement that references the Vault
port and make the coordinated updates.
- Around line 416-420: The Tiltfile contains a duplicated literal "Services
running: %s" in the multi-line string used for output; remove the redundant line
so the message appears only once and update any associated format string
arguments used when rendering that template to account for the removed
placeholder (search for the multi-line string or the variable that holds that
template and the code that calls its formatting to ensure argument count still
matches).
🧹 Nitpick comments (18)
svc/vault/internal/keys/key.go (1)

10-20: Consider documenting key lifecycle security requirements.

The key generation logic is correct and uses crypto/rand appropriately. However, since this generates sensitive cryptographic material for a vault service, consider adding documentation about the caller's responsibility for secure key handling:

  • Zeroing key material from memory after use
  • Avoiding key copies
  • Secure key storage and transmission
📝 Suggested documentation addition
+// GenerateKey generates a new cryptographically secure 32-byte key with a unique identifier.
+// The caller is responsible for:
+//   - Zeroing the key from memory after use
+//   - Avoiding unnecessary key copies
+//   - Ensuring secure storage and transmission
 func GenerateKey(prefix uid.Prefix) (id string, key []byte, err error) {
k8s/manifests/vault.yaml (2)

19-27: Add securityContext to harden the container.

Static analysis flagged missing security constraints. Adding a securityContext prevents privilege escalation and enforces non-root execution.

🔒 Proposed securityContext addition
     spec:
       containers:
       - name: vault
         image: unkey/vault:latest
         command: ["run", "vault"]
+        securityContext:
+          allowPrivilegeEscalation: false
+          runAsNonRoot: true
+          runAsUser: 1000
+          readOnlyRootFilesystem: true
+          capabilities:
+            drop:
+              - ALL
         ports:
         - name: http
           containerPort: 7070
           protocol: TCP

22-22: Avoid using :latest tag for production deployments.

Using :latest makes deployments non-reproducible and can cause unexpected behavior when images are updated. Consider using a specific version tag or digest.

svc/vault/proto/vault/v1/service.proto (1)

45-46: Orphaned empty message types.

ReEncryptDEKsRequest and ReEncryptDEKsResponse are defined but no longer used by any RPC in the service. If the ReEncryptDEKs RPC was intentionally removed, consider removing these unused message definitions as well.

svc/vault/internal/storage/interface.go (2)

11-13: GetObjectOptions appears unused.

GetObjectOptions is defined but not referenced by any method in the Storage interface. If this is for future use, consider removing it until needed to keep the interface minimal.


19-20: Clarify the boolean return value in GetObject.

The bool return value's meaning isn't immediately clear from the signature. Consider adding documentation (e.g., "returns true if the object exists") or using a more explicit return type.

-	// GetObject returns the object data for the given key
-	GetObject(ctx context.Context, key string) ([]byte, bool, error)
+	// GetObject returns the object data for the given key.
+	// The boolean indicates whether the object was found.
+	// Returns ErrObjectNotFound if the object does not exist.
+	GetObject(ctx context.Context, key string) (data []byte, found bool, err error)
svc/vault/internal/vault/roll_deks.go (1)

31-31: Prefer keyring.GetLatestKey() for consistency.

Line 31 compares kekID with s.encryptionKey.GetId(), while Line 37 uses s.keyring.EncryptAndEncodeKey() which internally uses the keyring's latest key. For consistency and to ensure both operations reference the same latest key, consider retrieving the latest key ID from the keyring.

♻️ Proposed refactor
+		latestKek, err := s.keyring.GetLatestKey(ctx)
+		if err != nil {
+			return fmt.Errorf("failed to get latest key: %w", err)
+		}
-		if kekID == s.encryptionKey.GetId() {
+		if kekID == latestKek.GetId() {
 			s.logger.Info("key already encrypted with latest kek",
 				"dekId", dek.GetId(),
 			)
svc/vault/internal/keyring/decode_and_decrypt_key.go (1)

13-42: Implementation is correct and well-instrumented with tracing.

The decryption flow properly handles:

  1. Unmarshaling the encrypted DEK envelope
  2. KEK lookup with appropriate error for missing keys
  3. Decryption and final unmarshaling

One minor observation: Line 17 uses proto.Unmarshal directly, while the codebase has a wrapper at pkg/proto/marshal.go. Consider using the wrapper for consistency if that's the project convention.

svc/vault/internal/keyring/create_key.go (1)

32-39: Consider atomicity of the dual storage writes.

The two PutObject calls (lines 32-35 and 36-39) are not atomic. If the first write succeeds but the second fails, the DEK is stored but the LATEST pointer is not updated, potentially causing inconsistency.

For now this is acceptable since:

  1. The DEK is still usable if referenced by ID
  2. A retry would create a new DEK and update LATEST

However, for increased reliability, consider implementing a transactional write or a recovery mechanism.

svc/vault/internal/keyring/keyring.go (1)

28-36: Consider validating required Config fields.

The constructor doesn't validate that essential fields like Store, Logger, or EncryptionKey are non-nil, which could lead to nil pointer dereferences in methods like EncryptAndEncodeKey or RollKeys.

Proposed validation
 func New(config Config) (*Keyring, error) {
+	if config.Store == nil {
+		return nil, fmt.Errorf("store is required")
+	}
+	if config.EncryptionKey == nil {
+		return nil, fmt.Errorf("encryption key is required")
+	}

 	return &Keyring{
 		store:          config.Store,
 		logger:         config.Logger,
 		encryptionKey:  config.EncryptionKey,
 		decryptionKeys: config.DecryptionKeys,
 	}, nil
 }
svc/vault/internal/vault/rpc_decrypt.go (1)

60-63: Add nil check before using DEK.

If the cache returns a hit with a nil or corrupted entry, dek.GetKey() could cause issues. Consider validating dek before use.

Suggested defensive check
+	if dek == nil {
+		return nil, fmt.Errorf("dek not found for keyring %s and key %s", req.GetKeyring(), encrypted.GetEncryptionKeyId())
+	}
+
 	plaintext, err := encryption.Decrypt(dek.GetKey(), encrypted.GetNonce(), encrypted.GetCiphertext())
svc/vault/internal/keyring/roll_keys.go (2)

19-26: Consider continuing on missing keys instead of failing.

If a key is deleted between ListObjectKeys and GetObject (race condition), returning ErrObjectNotFound aborts the entire roll operation. For resiliency, consider logging a warning and continuing to process remaining keys.

Proposed change
 		b, found, err := k.store.GetObject(ctx, objectKey)
 		if err != nil {
 			return fmt.Errorf("failed to get object: %w", err)
 		}
 		if !found {
-			return storage.ErrObjectNotFound
+			k.logger.Warn("key no longer exists, skipping",
+				"objectKey", objectKey,
+			)
+			continue
 		}

11-49: Partial failures leave keyring in inconsistent state.

If an error occurs mid-iteration (e.g., re-encryption fails for one key), some keys will be rolled to the new KEK while others remain on the old one. Consider:

  1. Collecting errors and continuing to process all keys, then returning a combined error
  2. Documenting that RollKeys is idempotent and can be retried on failure
svc/vault/integration/reusing_deks_test.go (1)

3-19: Consider grouping imports consistently.

Standard library imports (context, testing, fmt, time) should be grouped together, followed by external packages.

♻️ Suggested import grouping
 import (
 	"context"
+	"fmt"
 	"testing"
-
-	"fmt"
 	"time"
 
 	"connectrpc.com/connect"
svc/vault/internal/storage/s3.go (3)

63-65: String-based error matching is fragile for bucket existence check.

Using strings.Contains(err.Error(), "BucketAlreadyOwnedByYou") is fragile. Consider using AWS SDK's typed errors with errors.As for more robust error handling.

♻️ Suggested improvement using typed errors
+	"errors"
+	"github.com/aws/smithy-go"
 ...
-	if err != nil && !strings.Contains(err.Error(), "BucketAlreadyOwnedByYou") {
+	var apiErr smithy.APIError
+	if err != nil && !(errors.As(err, &apiErr) && apiErr.ErrorCode() == "BucketAlreadyOwnedByYou") {
 		return nil, fmt.Errorf("failed to create bucket: %w", err)
 	}

97-103: String-based 404 detection is fragile.

Prefer using AWS SDK v2's typed error handling with errors.As and *types.NoSuchKey instead of string matching against the error message.

♻️ Suggested improvement
+	"github.com/aws/aws-sdk-go-v2/service/s3/types"
 ...
 	if err != nil {
-		if strings.Contains(err.Error(), "StatusCode: 404") {
+		var noSuchKey *types.NoSuchKey
+		if errors.As(err, &noSuchKey) {
 			return nil, false, nil
 		}
 		return nil, false, fmt.Errorf("failed to get object: %w", err)
 	}

112-128: ListObjectKeys doesn't handle pagination.

ListObjectsV2 returns at most 1000 objects per call. If more objects exist, the response will be truncated. Consider using a paginator to handle large key sets.

♻️ Suggested pagination handling
 func (s *s3) ListObjectKeys(ctx context.Context, prefix string) ([]string, error) {
-	input := &awsS3.ListObjectsV2Input{
-		Bucket: aws.String(s.config.S3Bucket),
-	}
-	if prefix != "" {
-		input.Prefix = aws.String(prefix)
-	}
-
-	o, err := s.client.ListObjectsV2(ctx, input)
-	if err != nil {
-		return nil, fmt.Errorf("failed to list objects: %w", err)
-	}
-	keys := make([]string, len(o.Contents))
-	for i, obj := range o.Contents {
-		keys[i] = *obj.Key
+	var keys []string
+	paginator := awsS3.NewListObjectsV2Paginator(s.client, &awsS3.ListObjectsV2Input{
+		Bucket: aws.String(s.config.S3Bucket),
+		Prefix: aws.String(prefix),
+	})
+	for paginator.HasMorePages() {
+		page, err := paginator.NextPage(ctx)
+		if err != nil {
+			return nil, fmt.Errorf("failed to list objects: %w", err)
+		}
+		for _, obj := range page.Contents {
+			keys = append(keys, *obj.Key)
+		}
 	}
 	return keys, nil
 }
svc/vault/internal/vault/auth_test.go (1)

12-80: Excellent authentication test coverage!

The test suite comprehensively covers the key authentication scenarios:

  • Valid Bearer token (success case)
  • Missing Authorization header
  • Invalid authentication scheme
  • Empty token
  • Invalid/wrong token

Each test is well-structured, focused, and uses appropriate assertions.

♻️ Optional: Consider table-driven tests to reduce duplication

The current tests share similar structure. If you prefer, you could refactor to a table-driven approach:

+func TestAuthenticate(t *testing.T) {
+	tests := []struct {
+		name           string
+		setupAuth      func(service *Service, req *connect.Request[vaultv1.EncryptRequest])
+		expectError    bool
+		expectedCode   connect.Code
+	}{
+		{
+			name: "Success",
+			setupAuth: func(service *Service, req *connect.Request[vaultv1.EncryptRequest]) {
+				req.Header().Set("Authorization", fmt.Sprintf("Bearer %s", service.bearer))
+			},
+			expectError: false,
+		},
+		{
+			name: "MissingHeader",
+			setupAuth: func(service *Service, req *connect.Request[vaultv1.EncryptRequest]) {
+				// No header set
+			},
+			expectError:  true,
+			expectedCode: connect.CodeUnauthenticated,
+		},
+		{
+			name: "InvalidScheme",
+			setupAuth: func(service *Service, req *connect.Request[vaultv1.EncryptRequest]) {
+				req.Header().Set("Authorization", "Basic "+service.bearer)
+			},
+			expectError:  true,
+			expectedCode: connect.CodeUnauthenticated,
+		},
+		{
+			name: "EmptyToken",
+			setupAuth: func(service *Service, req *connect.Request[vaultv1.EncryptRequest]) {
+				req.Header().Set("Authorization", "Bearer ")
+			},
+			expectError:  true,
+			expectedCode: connect.CodeUnauthenticated,
+		},
+		{
+			name: "InvalidToken",
+			setupAuth: func(service *Service, req *connect.Request[vaultv1.EncryptRequest]) {
+				req.Header().Set("Authorization", "Bearer wrong-token")
+			},
+			expectError:  true,
+			expectedCode: connect.CodeUnauthenticated,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			service := setupTestService(t)
+			req := connect.NewRequest(&vaultv1.EncryptRequest{
+				Keyring: "test",
+				Data:    "test",
+			})
+			tt.setupAuth(service, req)
+
+			err := service.authenticate(req)
+			if tt.expectError {
+				require.Error(t, err)
+				require.Equal(t, tt.expectedCode, connect.CodeOf(err))
+			} else {
+				require.NoError(t, err)
+			}
+		})
+	}
+}

This is entirely optional—the current approach is clear and perfectly acceptable.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4982804 and 5be8ab1.

⛔ Files ignored due to path filters (4)
  • buf.lock is excluded by !**/*.lock
  • gen/proto/vault/v1/service.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • gen/proto/vault/v1/vaultv1connect/service.connect.go is excluded by !**/gen/**
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (51)
  • Tiltfile
  • buf.yaml
  • cmd/vault/main.go
  • deployment/docker-compose.yaml
  • go.mod
  • k8s/manifests/vault.yaml
  • pkg/assert/not_empty.go
  • pkg/proto/marshal.go
  • pkg/vault/create_dek.go
  • pkg/vault/encrypt_bulk.go
  • pkg/vault/integration/coldstart_test.go
  • pkg/vault/integration/migrate_deks_test.go
  • pkg/vault/integration/reencryption_test.go
  • svc/vault/config.go
  • svc/vault/integration/coldstart_test.go
  • svc/vault/integration/migrate_deks_test.go
  • svc/vault/integration/reencryption_test.go
  • svc/vault/integration/reusing_deks_test.go
  • svc/vault/internal/keyring/create_key.go
  • svc/vault/internal/keyring/decode_and_decrypt_key.go
  • svc/vault/internal/keyring/encrypt_and_encode_key.go
  • svc/vault/internal/keyring/get_key.go
  • svc/vault/internal/keyring/get_latest_key.go
  • svc/vault/internal/keyring/get_or_create_key.go
  • svc/vault/internal/keyring/keyring.go
  • svc/vault/internal/keyring/roll_keys.go
  • svc/vault/internal/keys/key.go
  • svc/vault/internal/keys/master_key.go
  • svc/vault/internal/storage/interface.go
  • svc/vault/internal/storage/memory.go
  • svc/vault/internal/storage/middleware/tracing.go
  • svc/vault/internal/storage/s3.go
  • svc/vault/internal/vault/auth.go
  • svc/vault/internal/vault/auth_test.go
  • svc/vault/internal/vault/create_dek.go
  • svc/vault/internal/vault/roll_deks.go
  • svc/vault/internal/vault/rpc_decrypt.go
  • svc/vault/internal/vault/rpc_decrypt_test.go
  • svc/vault/internal/vault/rpc_encrypt.go
  • svc/vault/internal/vault/rpc_encrypt_test.go
  • svc/vault/internal/vault/rpc_liveness.go
  • svc/vault/internal/vault/rpc_liveness_test.go
  • svc/vault/internal/vault/rpc_reencrypt.go
  • svc/vault/internal/vault/rpc_reencrypt_test.go
  • svc/vault/internal/vault/service.go
  • svc/vault/internal/vault/service_test.go
  • svc/vault/proto/buf.gen.yaml
  • svc/vault/proto/generate.go
  • svc/vault/proto/vault/v1/object.proto
  • svc/vault/proto/vault/v1/service.proto
  • svc/vault/run.go
💤 Files with no reviewable changes (1)
  • pkg/vault/encrypt_bulk.go
🧰 Additional context used
🧠 Learnings (9)
📚 Learning: 2025-09-01T02:33:43.791Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/buf.gen.yaml:0-0
Timestamp: 2025-09-01T02:33:43.791Z
Learning: In the unkeyed/unkey repository, buf commands are executed from the `go/` directory where the `buf.yaml` file is located. This means the `out: gen` configuration in `go/buf.gen.yaml` generates files to `go/gen/` relative to the repository root, which aligns with the expected directory structure.

Applied to files:

  • svc/vault/proto/buf.gen.yaml
📚 Learning: 2025-09-01T01:57:42.227Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/proto/metald/v1/metald.proto:5-9
Timestamp: 2025-09-01T01:57:42.227Z
Learning: In the unkeyed/unkey repository, buf is configured to properly resolve metald proto imports like "metald/v1/vm.proto" without needing the full "go/proto/" prefix. The buf lint command `buf lint --path proto/metald` passes successfully with these relative import paths.

Applied to files:

  • svc/vault/proto/buf.gen.yaml
  • svc/vault/proto/generate.go
  • svc/vault/proto/vault/v1/service.proto
  • buf.yaml
  • go.mod
📚 Learning: 2025-09-01T15:10:44.959Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/proto/metald/v1/deployment.proto:7-15
Timestamp: 2025-09-01T15:10:44.959Z
Learning: In the unkey/unkey repository, the metald service receives deployment_id values from upstream services rather than generating them internally. The deployment_id field in DeploymentRequest is part of a service-to-service communication pattern.

Applied to files:

  • k8s/manifests/vault.yaml
📚 Learning: 2025-09-01T16:08:10.327Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/proto/metald/v1/deployment.proto:45-52
Timestamp: 2025-09-01T16:08:10.327Z
Learning: In the unkey/unkey repository metald service, the host:port pairs in Instance messages within GetDeploymentResponse are consumed by upstream services in a tightly controlled manner, with the host:port being mapped from DNS entries rather than being used for arbitrary client connections.

Applied to files:

  • k8s/manifests/vault.yaml
📚 Learning: 2025-09-24T18:57:34.843Z
Learnt from: mcstepp
Repo: unkeyed/unkey PR: 4010
File: QUICKSTART-DEPLOY.md:17-17
Timestamp: 2025-09-24T18:57:34.843Z
Learning: In the Unkey deployment platform, API key environment variables use component-specific naming but share the same secret value: UNKEY_API_KEY for the ctrl service (validator), API_KEY for the CLI client, and CTRL_API_KEY for the dashboard client. The ctrl service acts as the source of truth for validation.

Applied to files:

  • k8s/manifests/vault.yaml
📚 Learning: 2025-09-11T14:12:30.570Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3944
File: go/apps/ctrl/run.go:292-299
Timestamp: 2025-09-11T14:12:30.570Z
Learning: In go/apps/ctrl/config.go, the Validate() method already includes validation for Cloudflare API token when ACME and Cloudflare are both enabled, checking that cfg.Acme.Cloudflare.ApiToken is not empty.

Applied to files:

  • svc/vault/config.go
📚 Learning: 2025-09-11T14:12:30.570Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3944
File: go/apps/ctrl/run.go:292-299
Timestamp: 2025-09-11T14:12:30.570Z
Learning: In go/apps/ctrl/config.go, the Validate() method already includes validation for Cloudflare API token using assert.NotEmpty() when both c.Acme.Enabled and c.Acme.Cloudflare.Enabled are true, with the error message "cloudflare API token is required when cloudflare is enabled".

Applied to files:

  • svc/vault/config.go
  • go.mod
📚 Learning: 2025-10-15T10:12:40.810Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4098
File: go/proto/ctrl/v1/deployment.proto:33-36
Timestamp: 2025-10-15T10:12:40.810Z
Learning: In the Unkey codebase proto files (ctrl/v1/build.proto, ctrl/v1/deployment.proto, hydra/v1/deployment.proto), use `dockerfile_path` (not `docker_file_path`) for consistency in generated Go field names.

Applied to files:

  • buf.yaml
📚 Learning: 2025-07-15T14:59:30.212Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3560
File: go/deploy/metald/internal/database/repository.go:0-0
Timestamp: 2025-07-15T14:59:30.212Z
Learning: go/deploy/metald cannot currently import helpers from go/pkg/db because it is not yet part of the main Go module; avoid suggesting such imports until the modules are unified.

Applied to files:

  • go.mod
🧬 Code graph analysis (28)
svc/vault/internal/vault/auth.go (1)
svc/vault/internal/vault/service.go (1)
  • Service (21-32)
svc/vault/internal/vault/create_dek.go (1)
svc/vault/internal/vault/service.go (1)
  • Service (21-32)
svc/vault/internal/vault/rpc_encrypt.go (2)
svc/vault/internal/vault/service.go (1)
  • LATEST (19-19)
gen/proto/vault/v1/service.pb.go (6)
  • EncryptRequest (105-111)
  • EncryptRequest (124-124)
  • EncryptRequest (139-141)
  • EncryptResponse (157-163)
  • EncryptResponse (176-176)
  • EncryptResponse (191-193)
svc/vault/internal/storage/interface.go (2)
svc/vault/internal/vault/service.go (1)
  • New (43-83)
svc/vault/internal/keyring/keyring.go (1)
  • New (28-36)
svc/vault/internal/keyring/get_key.go (2)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
svc/vault/internal/storage/interface.go (1)
  • ErrObjectNotFound (9-9)
svc/vault/integration/reusing_deks_test.go (7)
svc/vault/internal/storage/s3.go (2)
  • NewS3 (33-70)
  • S3Config (25-31)
svc/vault/internal/keys/master_key.go (1)
  • GenerateMasterKey (12-31)
svc/vault/internal/vault/service.go (2)
  • New (43-83)
  • Config (36-41)
svc/vault/internal/keyring/keyring.go (3)
  • New (28-36)
  • Config (20-26)
  • Keyring (11-18)
svc/vault/config.go (1)
  • Config (5-33)
svc/vault/internal/storage/interface.go (1)
  • Storage (15-30)
gen/proto/vault/v1/service.pb.go (3)
  • EncryptRequest (105-111)
  • EncryptRequest (124-124)
  • EncryptRequest (139-141)
svc/vault/internal/vault/roll_deks.go (2)
svc/vault/internal/vault/service.go (1)
  • Service (21-32)
svc/vault/internal/storage/interface.go (1)
  • ErrObjectNotFound (9-9)
svc/vault/internal/keyring/get_or_create_key.go (2)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
svc/vault/internal/storage/interface.go (1)
  • ErrObjectNotFound (9-9)
svc/vault/internal/vault/rpc_liveness_test.go (1)
gen/proto/vault/v1/service.pb.go (3)
  • LivenessRequest (25-29)
  • LivenessRequest (42-42)
  • LivenessRequest (57-59)
pkg/vault/create_dek.go (1)
svc/vault/internal/vault/service.go (1)
  • Service (21-32)
svc/vault/config.go (2)
svc/vault/internal/vault/service.go (1)
  • Config (36-41)
pkg/assert/not_empty.go (1)
  • NotEmpty (17-26)
svc/vault/internal/vault/rpc_decrypt.go (2)
svc/vault/internal/vault/service.go (1)
  • Service (21-32)
gen/proto/vault/v1/service.pb.go (6)
  • DecryptRequest (209-215)
  • DecryptRequest (228-228)
  • DecryptRequest (243-245)
  • DecryptResponse (261-266)
  • DecryptResponse (279-279)
  • DecryptResponse (294-296)
svc/vault/internal/vault/rpc_decrypt_test.go (3)
gen/proto/vault/v1/service.pb.go (6)
  • EncryptRequest (105-111)
  • EncryptRequest (124-124)
  • EncryptRequest (139-141)
  • DecryptRequest (209-215)
  • DecryptRequest (228-228)
  • DecryptRequest (243-245)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
cmd/unkey-env/main.go (1)
  • Encrypted (120-120)
svc/vault/internal/keyring/roll_keys.go (2)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
svc/vault/internal/storage/interface.go (1)
  • ErrObjectNotFound (9-9)
svc/vault/internal/keys/master_key.go (2)
svc/vault/internal/keys/key.go (1)
  • GenerateKey (10-20)
pkg/proto/marshal.go (1)
  • Marshal (7-9)
svc/vault/internal/keyring/encrypt_and_encode_key.go (3)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
pkg/proto/marshal.go (1)
  • Marshal (7-9)
cmd/unkey-env/main.go (1)
  • Encrypted (120-120)
svc/vault/internal/vault/rpc_reencrypt_test.go (1)
gen/proto/vault/v1/service.pb.go (6)
  • EncryptRequest (105-111)
  • EncryptRequest (124-124)
  • EncryptRequest (139-141)
  • ReEncryptRequest (305-313)
  • ReEncryptRequest (326-326)
  • ReEncryptRequest (341-343)
svc/vault/internal/keyring/get_latest_key.go (2)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
svc/vault/internal/storage/interface.go (1)
  • ErrObjectNotFound (9-9)
svc/vault/integration/migrate_deks_test.go (3)
svc/vault/internal/keys/master_key.go (1)
  • GenerateMasterKey (12-31)
svc/vault/internal/vault/service.go (2)
  • New (43-83)
  • Config (36-41)
gen/proto/vault/v1/service.pb.go (6)
  • EncryptRequest (105-111)
  • EncryptRequest (124-124)
  • EncryptRequest (139-141)
  • DecryptRequest (209-215)
  • DecryptRequest (228-228)
  • DecryptRequest (243-245)
svc/vault/internal/storage/memory.go (1)
svc/vault/internal/storage/interface.go (1)
  • Storage (15-30)
svc/vault/internal/vault/rpc_encrypt_test.go (1)
gen/proto/vault/v1/service.pb.go (3)
  • EncryptRequest (105-111)
  • EncryptRequest (124-124)
  • EncryptRequest (139-141)
svc/vault/run.go (5)
svc/vault/config.go (1)
  • Config (5-33)
svc/vault/internal/vault/service.go (2)
  • Config (36-41)
  • New (43-83)
svc/vault/internal/storage/s3.go (2)
  • NewS3 (33-70)
  • S3Config (25-31)
svc/vault/internal/storage/middleware/tracing.go (1)
  • WithTracing (18-23)
svc/vault/internal/storage/interface.go (1)
  • Storage (15-30)
svc/vault/internal/vault/rpc_liveness.go (2)
svc/vault/internal/vault/service.go (1)
  • Service (21-32)
gen/proto/vault/v1/service.pb.go (6)
  • LivenessRequest (25-29)
  • LivenessRequest (42-42)
  • LivenessRequest (57-59)
  • LivenessResponse (61-66)
  • LivenessResponse (79-79)
  • LivenessResponse (94-96)
svc/vault/internal/storage/s3.go (1)
svc/vault/internal/storage/interface.go (1)
  • Storage (15-30)
svc/vault/internal/storage/middleware/tracing.go (1)
svc/vault/internal/storage/interface.go (1)
  • Storage (15-30)
svc/vault/internal/vault/auth_test.go (2)
gen/proto/vault/v1/service.pb.go (3)
  • EncryptRequest (105-111)
  • EncryptRequest (124-124)
  • EncryptRequest (139-141)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
svc/vault/internal/keyring/keyring.go (2)
svc/vault/internal/storage/interface.go (1)
  • Storage (15-30)
svc/vault/internal/vault/service.go (2)
  • New (43-83)
  • Config (36-41)
svc/vault/internal/keyring/decode_and_decrypt_key.go (2)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
pkg/proto/marshal.go (1)
  • Unmarshal (11-13)
🪛 Checkov (3.2.334)
k8s/manifests/vault.yaml

[medium] 1-66: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 1-66: Minimize the admission of root containers

(CKV_K8S_23)

🪛 Gitleaks (8.30.0)
deployment/docker-compose.yaml

[high] 140-140: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test Packages / Test
  • GitHub Check: Lint Go Code / Lint
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: autofix
🔇 Additional comments (49)
buf.yaml (2)

6-6: LGTM - Vault proto module added.

The addition of the vault proto module path is consistent with extracting vault as a standalone service.


10-10: googleapis removal is safe—no proto files import googleapis types.

The protovalidate dependency addition is appropriate, and the googleapis removal does not present a risk. A search across all proto files in the repository shows zero imports of google/api, google/protobuf, or google/rpc, so removing the googleapis dependency will not break existing imports.

Likely an incorrect or invalid review comment.

pkg/assert/not_empty.go (1)

17-17: LGTM - String slice support added.

The addition of ~[]string to the generic constraint is correct and enables NotEmpty validation for string slices, which aligns with vault configuration validation needs mentioned in the summary.

svc/vault/proto/buf.gen.yaml (1)

1-15: LGTM - Standard buf v2 configuration.

The buf generation configuration follows standard patterns:

  • Managed mode is enabled
  • Plugin versions are pinned for reproducibility
  • Output paths and module configuration are consistent with the repository structure

Based on learnings, the buf tooling in this repository is properly configured to handle proto generation and imports.

svc/vault/proto/generate.go (1)

3-3: The buf invocation is correct for Go 1.24+.

The directive go tool buf generate is a valid modern Go pattern. Since Go 1.24, the tool directive in go.mod registers third-party tools to be invoked via go tool. The repository's tools/go.mod properly declares github.com/bufbuild/buf/cmd/buf in the tool block, and go 1.25.1 is configured, so this invocation method is standard and functional.

pkg/proto/marshal.go (1)

1-13: LGTM! Simple pass-through wrapper.

This provides a unified API for protobuf marshaling that can be extended with middleware (tracing, metrics) in the future. The implementation is correct.

svc/vault/proto/vault/v1/service.proto (1)

5-6: Good addition of field validations.

The buf.validate constraints provide server-side input validation, preventing oversized payloads from reaching the encryption logic.

Also applies to: 15-16, 25-26, 34-38

pkg/vault/create_dek.go (1)

9-17: LGTM! Clean API simplification.

The refactored method provides a simpler interface by accepting a direct keyring string parameter and returning the DEK ID directly. Tracing is correctly preserved.

svc/vault/internal/storage/interface.go (1)

15-29: Clean storage abstraction.

The interface provides a good abstraction for object storage with clear method contracts. The Key and Latest helper methods centralize key format logic.

go.mod (1)

8-12: LGTM! Dependencies align with protobuf validation features.

The addition of buf.build/.../protovalidate and connectrpc.com/validate supports the field validation annotations added to service.proto. The AWS SDK update is a minor version bump.

svc/vault/internal/vault/auth.go (1)

16-29: LGTM! Solid authentication implementation with timing-safe comparison.

The use of subtle.ConstantTimeCompare correctly prevents timing attacks on token validation. Error messages are appropriately generic to avoid leaking information. The BearerToken is already validated as non-empty during configuration validation (Config.Validate() in svc/vault/config.go), so the initialization concern is properly addressed.

svc/vault/internal/vault/create_dek.go (1)

7-13: LGTM!

The method cleanly delegates to the keyring for key creation and correctly handles errors while returning the generated key ID on success.

svc/vault/internal/vault/service_test.go (1)

13-37: LGTM!

The test helper is well-structured with proper error handling, in-memory storage for fast isolated tests, and a unique bearer token per test instance to prevent conflicts.

svc/vault/internal/vault/rpc_liveness.go (1)

10-16: LGTM!

The liveness endpoint implementation is appropriate for health checks—it returns a simple "ok" status without performing complex operations, which is the expected behavior for k8s/container liveness probes.

svc/vault/internal/keyring/get_or_create_key.go (1)

14-25: Review comment is incorrect: keyID parameter is properly handled.

CreateKey stores the newly created key at two locations (line 32 stores with auto-generated ID, line 36 stores at buildLookupKey(ringID, "LATEST")). All callers pass LATEST as the keyID parameter, which is properly handled via the second store operation. When GetKey(ctx, ringID, "LATEST") fails with ErrObjectNotFound, the subsequent CreateKey call stores the new key at that exact location, making it available for subsequent lookups. No logic issue exists.

Likely an incorrect or invalid review comment.

svc/vault/internal/vault/rpc_liveness_test.go (1)

12-21: LGTM! Clean liveness test.

The test appropriately validates that the liveness endpoint returns "ok" without requiring authentication, which is correct for a health check endpoint.

pkg/vault/integration/coldstart_test.go (1)

75-75: LGTM! API signature update is correct.

The change correctly updates to the new CreateDEK signature that accepts a keyring string directly.

pkg/vault/integration/migrate_deks_test.go (1)

53-53: LGTM! API signature update is correct.

The change correctly updates to the new CreateDEK signature that accepts a keyring string directly.

deployment/docker-compose.yaml (1)

140-140: Master key in plaintext is acceptable for local development.

Static analysis flagged the base64-encoded master key, but this is expected for a local development docker-compose configuration. The same test key is consistently used across services (apiv2, agent, ctrl) for local integration testing.

svc/vault/integration/migrate_deks_test.go (1)

20-95: Excellent integration test for master key rotation.

This test effectively validates the critical scenario of master key rotation with backward compatibility. The test:

  • Seeds DEKs with the old master key
  • Encrypts data using those DEKs
  • Simulates a restart with a new master key prepended to the list
  • Verifies all previously encrypted data remains accessible

The unused DEK ID on Line 53 is appropriate for this seeding scenario.

svc/vault/internal/keys/master_key.go (1)

12-31: Clean and well-structured master key generation.

The function properly:

  • Generates cryptographically secure random bytes via GenerateKey
  • Constructs the KEK with appropriate metadata
  • Marshals to protobuf for serialization
  • Base64-encodes for string representation
  • Wraps errors with helpful context at each step
svc/vault/internal/vault/rpc_reencrypt_test.go (1)

13-98: Comprehensive authentication test coverage for ReEncrypt.

The test suite effectively covers:

  • Valid authentication with full encrypt→re-encrypt flow
  • Missing Authorization header
  • Invalid bearer token
  • Empty bearer token
  • Invalid authentication scheme (Basic instead of Bearer)

All tests properly verify connect.CodeUnauthenticated error codes. The comment on Line 38 demonstrates good understanding that re-encryption may reuse the same key if it's already the latest.

svc/vault/config.go (2)

5-33: LGTM on the Config struct definition.

The configuration struct is well-documented and covers all necessary fields for the vault service. The separation of S3 credentials and the comment explaining the master key rotation behavior is helpful.


35-50: Validation logic is sound.

The use of assert.All() to aggregate validation errors is a clean approach. All required fields are validated appropriately.

Consider adding validation for individual master key format/length in a future iteration, as cryptographic keys typically have specific requirements (e.g., base64-encoded, specific byte length).

svc/vault/integration/coldstart_test.go (1)

21-105: Comprehensive cold-start integration test.

The test effectively validates:

  • S3 storage initialization
  • Multi-user keyring isolation (alice/bob)
  • Encrypt → Decrypt round-trip
  • DEK creation and re-encryption with key rotation verification

The assertion on line 103 (require.NotEqual for key IDs) properly verifies that re-encryption uses the newly created DEK.

svc/vault/internal/keyring/get_key.go (2)

13-36: Key retrieval logic is correct.

The method properly:

  1. Builds the lookup key and records it in the span
  2. Fetches from storage with found flag handling
  3. Returns the sentinel error for not-found cases
  4. Delegates decryption to DecodeAndDecryptKey

9-9: > Likely an incorrect or invalid review comment.

svc/vault/internal/keyring/create_key.go (1)

13-41: Key creation flow is well-structured.

The method correctly:

  1. Generates cryptographic key material via keys.GenerateKey
  2. Constructs the DEK with ID, key, and timestamp
  3. Encrypts before persisting (encryption at rest)
  4. Updates both the specific key path and LATEST pointer
svc/vault/run.go (1)

21-60: Service initialization is well-structured.

The startup sequence properly:

  1. Validates configuration upfront
  2. Initializes shutdown handling early
  3. Sets up S3 storage with tracing middleware
  4. Creates the vault service with all dependencies
  5. Wires up the Connect handler with validation interceptor

The separation of concerns between svc/vault/config.go (Config/Validate) and this Run function is clean.

svc/vault/internal/vault/rpc_encrypt.go (3)

18-30: Public RPC method properly delegates after authentication.

The separation between the public Encrypt method (authentication + error mapping) and the internal encrypt method (business logic) is a clean pattern. Authentication errors are passed through directly, while internal errors are wrapped with CodeInternal.


40-50: Cache integration with keyring is well-implemented.

The cache key format {keyring}-LATEST correctly ties the cached DEK to the latest key for a given keyring. On cache miss, the DEK is fetched/created and then cached. This avoids repeated storage lookups for hot paths.

One consideration: if GetOrCreateKey succeeds but Set fails silently, subsequent requests will still work (just with a cache miss). This is acceptable behavior.


52-73: Encryption and response construction is correct.

The implementation:

  1. Uses the DEK for AES-256-GCM encryption
  2. Constructs a complete Encrypted protobuf with algorithm, nonce, ciphertext, key ID, and timestamp
  3. Marshals and base64-encodes for transport
  4. Returns both the encrypted blob and the key ID for client reference
cmd/vault/main.go (1)

30-31: Instance ID is generated at definition time, not runtime.

uid.New(uid.InstancePrefix, 4) is evaluated once when the Cmd variable is initialized (program startup), not when each command invocation runs. If the binary is long-lived or used as a library, all invocations share the same default ID.

Consider generating the default inside action() if truly unique per-invocation IDs are needed:

-		cli.String("instance-id", "Unique identifier for this instance. Auto-generated if not provided.",
-			cli.Default(uid.New(uid.InstancePrefix, 4)), cli.EnvVar("UNKEY_INSTANCE_ID")),
+		cli.String("instance-id", "Unique identifier for this instance. Auto-generated if not provided.",
+			cli.EnvVar("UNKEY_INSTANCE_ID")),

Then in action():

instanceID := cmd.String("instance-id")
if instanceID == "" {
    instanceID = uid.New(uid.InstancePrefix, 4)
}
config := vault.Config{
    // ...
    InstanceID: instanceID,
    // ...
}
svc/vault/internal/vault/rpc_decrypt.go (1)

51-58: Consider handling stale cache entries.

The code only refreshes on cache.Miss. If the cache returns a stale entry, it's still used without refresh. Verify whether stale DEKs should trigger a background refresh or if this is intentional.

pkg/vault/integration/reencryption_test.go (2)

68-71: LGTM!

The updated code correctly uses the new CreateDEK API signature that returns a string dekID directly instead of a protobuf response.


72-84: Test decrypts original ciphertext, not re-encrypted value.

The test calls ReEncrypt (lines 72-76) but then decrypts using enc.GetEncrypted() (line 81), which is the original encrypted value from line 60. The re-encrypted result is discarded. If the intent is to verify re-encryption produces a valid, decryptable result, the test should decrypt the re-encrypted output instead.

If this is intentional (testing that original ciphertext remains decryptable after key rotation), consider adding a comment to clarify. Otherwise:

-			_, err = v.ReEncrypt(ctx, &vaultv1.ReEncryptRequest{
+			reencrypted, err := v.ReEncrypt(ctx, &vaultv1.ReEncryptRequest{
 				Keyring:   keyring,
 				Encrypted: enc.GetEncrypted(),
 			})
 			require.NoError(t, err)
+			enc = reencrypted // Use re-encrypted value for next iteration
 		}
svc/vault/internal/keyring/encrypt_and_encode_key.go (1)

14-44: LGTM!

The encryption and encoding flow is well-structured with proper error wrapping at each step. The use of AES-256-GCM and proper nonce handling via encryption.Encrypt follows best practices.

svc/vault/internal/storage/middleware/tracing.go (1)

13-65: LGTM!

Clean implementation of the decorator pattern for tracing. The middleware correctly:

  • Creates spans for I/O operations (PutObject, GetObject, ListObjectKeys)
  • Records relevant attributes (key, prefix, found)
  • Sets error status on failures
  • Passes through non-I/O methods without overhead
svc/vault/internal/keyring/keyring.go (1)

8-8: > Likely an incorrect or invalid review comment.

svc/vault/internal/vault/rpc_decrypt_test.go (2)

13-37: LGTM! Good round-trip test for decrypt with valid auth.

The test correctly validates the encrypt-then-decrypt flow, ensuring the plaintext is recovered accurately. The authorization header setup is appropriate.


39-96: Auth failure test coverage is comprehensive.

The four authentication failure scenarios (missing, invalid token, empty token, invalid scheme) provide good coverage of the auth layer. Since authentication is checked before attempting decryption, using placeholder encrypted data is acceptable.

svc/vault/internal/vault/service.go (1)

85-111: LGTM! Master key loading logic is correct.

The first key in the list is used for encryption while all keys are available for decryption - this correctly supports key rotation scenarios. Base64 decoding and protobuf unmarshaling are properly handled with appropriate error wrapping.

svc/vault/integration/reusing_deks_test.go (2)

22-67: LGTM! DEK reuse test validates correct caching behavior.

The test correctly verifies that encrypting multiple secrets with the same keyring reuses a single DEK, which is the expected behavior for performance and key management.


69-114: LGTM! Individual DEK test validates keyring isolation.

The test correctly verifies that each keyring gets its own DEK, ensuring proper key isolation between different keyrings.

svc/vault/internal/storage/s3.go (1)

72-78: LGTM! Key and Latest helper methods are correct.

The composite key format workspaceId/dekID and workspaceId/LATEST are clear and appropriate for the storage structure.

svc/vault/internal/storage/memory.go (2)

24-33: LGTM! Memory storage constructor is well-implemented.

The in-memory storage is properly initialized with a mutex and empty map.


43-61: LGTM! Put/Get operations are correctly synchronized.

Write operations use Lock() and read operations use RLock(), which is the correct pattern for RWMutex usage.

svc/vault/internal/vault/rpc_encrypt_test.go (2)

13-28: LGTM! Valid auth test correctly validates encrypt response.

The test verifies that a successful encryption returns both the encrypted data and the key ID, which are essential for later decryption.


30-88: Auth failure tests are comprehensive and consistent with decrypt tests.

The four authentication failure scenarios provide symmetric coverage with the decrypt tests, ensuring consistent auth behavior across RPC methods.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @cmd/vault/main.go:
- Around line 24-27: The CLI flags "platform" and "image" are defined without
cli.Required(), but the action uses cmd.RequireString("platform") and
cmd.RequireString("image"); update the flag definitions for "platform" and
"image" to include cli.Required() so they are validated at parse time; locate
the flag definitions in the CLI setup (the entries creating
cli.String("platform", ...) and cli.String("image", ...)) and add cli.Required()
to each so the action that calls cmd.RequireString(...) can assume presence.
🧹 Nitpick comments (2)
cmd/vault/main.go (2)

11-54: Add a description for better CLI UX.

The command's Description field is empty, which reduces discoverability and usability of the CLI help. Consider adding a brief description explaining the vault service's purpose.

📝 Suggested improvement
 var Cmd = &cli.Command{
 	Version:     "",
 	Commands:    []*cli.Command{},
 	Aliases:     []string{},
-	Description: "",
+	Description: "Standalone encryption and secrets management service with S3-backed storage",
 	Name:        "vault",
 	Usage:       "Run unkey's encryption service",

28-29: Note: The region flag is defined but not used.

The region flag is configured with a default value and environment variable fallbacks, but it's never mapped to the vault.Config in the action function. If regional routing or logging is planned for future use, consider adding a comment indicating this is intentional. Otherwise, this flag can be removed to avoid confusion.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5be8ab1 and 8ed95b0.

📒 Files selected for processing (1)
  • cmd/vault/main.go
🧰 Additional context used
🧬 Code graph analysis (1)
cmd/vault/main.go (4)
svc/vault/internal/vault/service.go (2)
  • New (43-83)
  • Config (36-41)
svc/vault/internal/keyring/keyring.go (2)
  • New (28-36)
  • Config (20-26)
svc/vault/config.go (1)
  • Config (5-33)
svc/vault/run.go (1)
  • Run (21-93)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages / Test
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Lint Go Code / Lint
  • GitHub Check: autofix
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
cmd/vault/main.go (2)

1-9: LGTM!

Package structure and imports are clean and appropriate for a CLI command that exports Cmd for composition.


56-78: Action function correctly validates config before starting the service.

The flow is clean: construct config from CLI args, validate, then run. Error propagation is handled properly.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @dev/k8s/manifests/vault.yaml:
- Around line 43-58: The current livenessProbe and readinessProbe use httpGet
against /vault.v1.VaultService/Liveness on port 8060 which fails because Connect
RPC expects POST; replace those httpGet probes with exec probes that perform a
POST-style health check (for example run grpcurl or a small script that POSTs to
the Connect RPC or invokes the service health method) or alternatively expose a
separate HTTP GET-compatible health endpoint and point
livenessProbe/readinessProbe to that URL; update the probes named livenessProbe
and readinessProbe to use either exec (running grpcurl or the script) or the new
GET path so the probe protocol matches the service endpoint.
🧹 Nitpick comments (4)
pkg/encryption/aes.go (2)

37-47: Good input validation; consider adding the same to Encrypt for consistency.

The validation logic is correct for AES-256-GCM. However, the Encrypt function (lines 12-35) doesn't validate key length upfront, which creates an asymmetry — invalid keys would fail with a less descriptive error from aes.NewCipher rather than a clear validation message.

♻️ Optional: Add similar validation to Encrypt
 func Encrypt(key []byte, plaintext []byte) (nonce []byte, ciphertext []byte, err error) {
+	if err := assert.Equal(len(key), 32, "key size must be 32 bytes"); err != nil {
+		return nil, nil, err
+	}
+
 	block, err := aes.NewCipher(key)

54-64: Good variable naming; avoids shadowing the aes import.

Using gcm for the GCM cipher instance is clearer and avoids the shadowing issue present in the Encrypt function (line 26 uses aes which shadows the imported package). Consider renaming aesgcm in Encrypt as well for consistency.

♻️ Optional: Fix shadowing in Encrypt
-	aes, err := cipher.NewGCM(block)
+	gcm, err := cipher.NewGCM(block)

 	if err != nil {
 		return nil, nil, fmt.Errorf("failed to create gcm: %w", err)
 	}
-	ciphertext = aes.Seal(nil, nonce, plaintext, nil)
+	ciphertext = gcm.Seal(nil, nonce, plaintext, nil)
dev/k8s/manifests/vault.yaml (2)

22-22: Avoid latest tag for container images.

Using latest is non-deterministic and can cause inconsistent deployments, even in development. Pin to a specific version or commit SHA for reproducibility.

Suggested fix
-          image: unkey/vault:latest
+          image: unkey/vault:v0.1.0  # or use a specific commit SHA

20-27: Add securityContext to harden the container.

The container lacks a security context, which static analysis flagged. Adding security constraints prevents privilege escalation and enforces non-root execution.

Suggested fix
      containers:
        - name: vault
          image: unkey/vault:latest
          command: ["run", "vault"]
+          securityContext:
+            runAsNonRoot: true
+            allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
+            capabilities:
+              drop:
+                - ALL
          ports:
            - name: http
              containerPort: 8060
              protocol: TCP
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a84479e and d092a2d.

📒 Files selected for processing (3)
  • Makefile
  • dev/k8s/manifests/vault.yaml
  • pkg/encryption/aes.go
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-15T10:12:40.810Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4098
File: go/proto/ctrl/v1/deployment.proto:33-36
Timestamp: 2025-10-15T10:12:40.810Z
Learning: In the Unkey codebase proto files (ctrl/v1/build.proto, ctrl/v1/deployment.proto, hydra/v1/deployment.proto), use `dockerfile_path` (not `docker_file_path`) for consistency in generated Go field names.

Applied to files:

  • dev/k8s/manifests/vault.yaml
📚 Learning: 2025-09-01T15:10:44.959Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/proto/metald/v1/deployment.proto:7-15
Timestamp: 2025-09-01T15:10:44.959Z
Learning: In the unkey/unkey repository, the metald service receives deployment_id values from upstream services rather than generating them internally. The deployment_id field in DeploymentRequest is part of a service-to-service communication pattern.

Applied to files:

  • dev/k8s/manifests/vault.yaml
📚 Learning: 2025-09-01T16:08:10.327Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/proto/metald/v1/deployment.proto:45-52
Timestamp: 2025-09-01T16:08:10.327Z
Learning: In the unkey/unkey repository metald service, the host:port pairs in Instance messages within GetDeploymentResponse are consumed by upstream services in a tightly controlled manner, with the host:port being mapped from DNS entries rather than being used for arbitrary client connections.

Applied to files:

  • dev/k8s/manifests/vault.yaml
📚 Learning: 2025-09-24T18:57:34.843Z
Learnt from: mcstepp
Repo: unkeyed/unkey PR: 4010
File: QUICKSTART-DEPLOY.md:17-17
Timestamp: 2025-09-24T18:57:34.843Z
Learning: In the Unkey deployment platform, API key environment variables use component-specific naming but share the same secret value: UNKEY_API_KEY for the ctrl service (validator), API_KEY for the CLI client, and CTRL_API_KEY for the dashboard client. The ctrl service acts as the source of truth for validation.

Applied to files:

  • dev/k8s/manifests/vault.yaml
🪛 checkmake (0.2.2)
Makefile

[warning] 116-116: Target body for "fuzz" exceeds allowed length of 5 (9).

(maxbodylength)


[warning] 115-115: Missing required phony target "all"

(minphony)

🪛 Checkov (3.2.334)
dev/k8s/manifests/vault.yaml

[medium] 1-66: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 1-66: Minimize the admission of root containers

(CKV_K8S_23)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages / Test
  • GitHub Check: Lint Go Code / Lint
  • GitHub Check: Bazel / Test
  • GitHub Check: Bazel / Build
  • GitHub Check: autofix
🔇 Additional comments (3)
Makefile (1)

115-125: LGTM — useful addition for automated fuzz testing.

The implementation correctly discovers and runs Go fuzz tests. The grep/sed pipeline properly extracts fuzz function names, and the -fuzz flag usage is correct.

Regarding the static analysis hint about body length exceeding 5 lines: this is a style guideline, not a functional issue. If the logic grows further, consider extracting it to a shell script (e.g., scripts/fuzz.sh) for easier maintenance and testing.

pkg/encryption/aes.go (1)

8-9: LGTM!

The import of the internal assert package for input validation is appropriate.

dev/k8s/manifests/vault.yaml (1)

67-83: LGTM!

The Service configuration correctly exposes port 8060 with ClusterIP type and the selector properly matches the Deployment's pod labels.

@vercel vercel bot temporarily deployed to Preview – dashboard January 12, 2026 20:37 Inactive
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In @Dockerfile:
- Around line 12-13: The RUN that copies the Bazel-built binary currently
silences errors by redirecting stderr; update the command invoking bazelisk
cquery (the RUN line using cp $(bazelisk cquery //:unkey --output=files
2>/dev/null) /unkey) to stop redirecting stderr and fail loudly if the query
fails, and add a simple existence/validation check for the returned path before
calling cp so you get a clear diagnostic when cquery fails (i.e., remove the
"2>/dev/null" suppression and ensure the path variable is non-empty/exists prior
to the cp).
- Around line 3-4: The Dockerfile currently installs Bazelisk with a floating
tag ("RUN go install github.com/bazelbuild/bazelisk@latest"); change this to a
pinned version by replacing the @latest tag with a specific release (e.g.,
@vX.Y.Z) or introduce a build ARG (e.g., BAZELISK_VERSION) and use that variable
in the go install invocation so builds are reproducible and you can update the
version intentionally.

In @pkg/fuzz/slice.go:
- Around line 10-12: The package comment in pkg/fuzz/slice.go lists time.Time
and time.Duration as supported types but the implementation's type switch in the
slice constructor/consumer does not handle them; either remove time.Time and
time.Duration from that doc comment or implement handling by adding cases for
time.Time and time.Duration in the type switch (in the slice creation/consume
function) to serialize/deserialize them consistently with other numeric/string
cases. Ensure the symbols referenced are the top-of-file doc comment and the
type switch inside the slice implementation so reviewers can locate and update
the code and docs together.

In @svc/vault/internal/storage/memory_test.go:
- Around line 293-320: Complete TestMemory_DataIsolation so it actually asserts
the storage's behavior: after calling store.PutObject and modifying the returned
slice retrieved1, fetch retrieved2 via store.GetObject and assert that
retrieved2 still equals the original data (for isolation) using require.Equal(t,
originalData, retrieved2); if the intended behavior is shared slices instead,
assert require.Equal(t, retrieved1, retrieved2) to document that behavior.
Ensure you reference TestMemory_DataIsolation, store.PutObject and
store.GetObject and use retrieved1/retrieved2 in the assertion so future
regressions are caught.

In @web/apps/agent/services/vault/keyring/decode_and_decrypt_key.go:
- Line 30: The code mixes direct field access with generated accessors; replace
encrypted.Encrypted.GetCiphertext() with
encrypted.GetEncrypted().GetCiphertext() in the call to encryption.Decrypt (the
line constructing plaintext), and also change any other direct usages like
encrypted.Encrypted.EncryptionKeyId to use
encrypted.GetEncrypted().GetEncryptionKeyId() so all reads use the nil-safe Get*
accessors on the encrypted variable.
🧹 Nitpick comments (2)
pkg/fuzz/slice.go (1)

19-52: Type assertion check is redundant.

The type switch on any(zero).(type) already guarantees that T matches the case type. For example, when the case is bool, T must be bool, so any(c.Bool()).(T) will always succeed. The ok variable will always be true, making the require.True check on line 51 effectively a no-op.

Consider simplifying if you want to remove the overhead, though the current code is still correct.

svc/vault/internal/keyring/fuzz_test.go (1)

285-327: Test name/description slightly misaligned with behavior.

The test creates a protobuf with EncryptionKeyId: safeKeyID(keyIDRaw) (line 315), which generates IDs like "dek-...". Since the test keyring only has "test-kek-id" in its decryption keys map, the failure will be due to key ID lookup failure, not "garbage encrypted content" decryption failure.

The test is still valuable for coverage, but consider either:

  1. Updating the comment to reflect it tests unknown key ID handling, or
  2. Setting EncryptionKeyId: "test-kek-id" to actually test garbage content decryption with a valid key ID
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d092a2d and 9434fb1.

📒 Files selected for processing (12)
  • .bazelrc
  • Dockerfile
  • pkg/dockertest/redis.go
  • pkg/fuzz/BUILD.bazel
  • pkg/fuzz/seed.go
  • pkg/fuzz/slice.go
  • svc/vault/internal/keyring/fuzz_test.go
  • svc/vault/internal/storage/memory_test.go
  • svc/vault/internal/vault/corruption_detection_test.go
  • svc/vault/internal/vault/fuzz_corruption_test.go
  • web/apps/agent/services/vault/decrypt.go
  • web/apps/agent/services/vault/keyring/decode_and_decrypt_key.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/dockertest/redis.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-14T16:25:48.167Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3785
File: go/apps/api/routes/v2_keys_reroll_key/401_test.go:52-61
Timestamp: 2025-08-14T16:25:48.167Z
Learning: User Flo4604 requested creation of a GitHub issue to track converting all test files to use table-driven test patterns as a broader codebase improvement, following the suggestion made during review of go/apps/api/routes/v2_keys_reroll_key/401_test.go.

Applied to files:

  • svc/vault/internal/keyring/fuzz_test.go
📚 Learning: 2025-10-15T10:12:40.810Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4098
File: go/proto/ctrl/v1/deployment.proto:33-36
Timestamp: 2025-10-15T10:12:40.810Z
Learning: In the Unkey codebase proto files (ctrl/v1/build.proto, ctrl/v1/deployment.proto, hydra/v1/deployment.proto), use `dockerfile_path` (not `docker_file_path`) for consistency in generated Go field names.

Applied to files:

  • Dockerfile
📚 Learning: 2025-08-08T15:09:01.312Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3753
File: go/apps/api/openapi/config.yaml:9-10
Timestamp: 2025-08-08T15:09:01.312Z
Learning: Repo unkeyed/unkey: In go/apps/api/openapi, oapi-codegen doesn’t support OpenAPI 3.1 union nullability; overlay.yaml must be applied before codegen. The overlay key in oapi-codegen config isn’t supported—use a pre-step (programmatic or CLI) to merge overlay into the bundled spec, then run oapi-codegen.

Applied to files:

  • Dockerfile
🧬 Code graph analysis (6)
web/apps/agent/services/vault/decrypt.go (1)
pkg/encryption/aes.go (1)
  • Decrypt (40-65)
web/apps/agent/services/vault/keyring/decode_and_decrypt_key.go (1)
pkg/encryption/aes.go (1)
  • Decrypt (40-65)
pkg/fuzz/seed.go (1)
pkg/fuzz/consumer.go (1)
  • New (22-28)
pkg/fuzz/slice.go (1)
pkg/fuzz/consumer.go (1)
  • Consumer (12-16)
svc/vault/internal/vault/corruption_detection_test.go (3)
gen/proto/vault/v1/service.pb.go (6)
  • EncryptRequest (104-110)
  • EncryptRequest (123-123)
  • EncryptRequest (138-140)
  • DecryptRequest (208-214)
  • DecryptRequest (227-227)
  • DecryptRequest (242-244)
gen/proto/vault/v1/object.pb.go (7)
  • Encrypted (254-268)
  • Encrypted (281-281)
  • Encrypted (296-298)
  • Algorithm (24-24)
  • Algorithm (50-52)
  • Algorithm (54-56)
  • Algorithm (63-65)
pkg/proto/marshal.go (2)
  • Unmarshal (11-13)
  • Marshal (7-9)
svc/vault/internal/storage/memory_test.go (3)
cmd/dev/seed/verifications.go (1)
  • Key (122-131)
svc/vault/internal/storage/interface.go (1)
  • Storage (15-30)
svc/vault/internal/storage/memory.go (2)
  • NewMemory (24-33)
  • MemoryConfig (20-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages / Test
  • GitHub Check: Lint Go Code / Lint
  • GitHub Check: Bazel / Build
  • GitHub Check: Bazel / Test
  • GitHub Check: autofix
🔇 Additional comments (26)
.bazelrc (1)

3-3: LGTM – async remote caching is a sensible optimization.

This enables non-blocking uploads to the remote cache, reducing build latency. Modern Bazel versions wait for async uploads to complete before exit, so cache consistency should be preserved.

Worth verifying in CI that jobs don't terminate abruptly (e.g., via SIGKILL or hard timeouts) before uploads finish, as that could result in incomplete cache entries.

web/apps/agent/services/vault/decrypt.go (1)

43-43: LGTM! Using protobuf getters is the correct pattern.

Switching to GetKey(), GetNonce(), and GetCiphertext() follows idiomatic protobuf usage in Go. These generated getters are nil-safe and consistent with the pattern used elsewhere in this PR.

Dockerfile (2)

15-21: LGTM on using distroless base image.

Using gcr.io/distroless/static-debian12 is a security best practice as it minimizes the attack surface. The ENTRYPOINT correctly points to the copied binary.


1-1: No action needed. Go 1.25 was released on August 12, 2025, and is currently in active support as of January 2026. The base image golang:1.25 is valid and the Docker build will succeed without modification.

Likely an incorrect or invalid review comment.

svc/vault/internal/storage/memory_test.go (4)

12-100: LGTM!

The basic Put/Get tests cover essential storage behaviors well: standard operations, non-existent key handling, overwrite semantics, and edge cases for empty and nil data. The assertions are appropriate.


102-184: LGTM!

Good coverage of binary data integrity, large payload handling, and prefix-based listing. The ListObjectKeys test verifies multiple prefix scenarios including non-matching prefixes.


186-226: LGTM!

Key helper tests validate the expected key format. Special character tests cover a good variety of edge cases using subtests for better failure isolation.


322-329: LGTM!

Clean helper function with proper t.Helper() annotation.

pkg/fuzz/seed.go (1)

32-45: LGTM!

The deterministic seeding implementation is well-designed. Using a fixed ChaCha8 seed ensures reproducible fuzz inputs across runs, and the quadratic length progression provides good coverage from empty slices up to ~65KB.

pkg/fuzz/BUILD.bazel (1)

1-26: LGTM!

The Bazel configuration correctly defines the library and test targets with appropriate dependencies and visibility.

svc/vault/internal/keyring/fuzz_test.go (7)

18-53: LGTM!

The helper functions are well-designed:

  • safeKeyID reliably produces valid UTF-8 strings via hex encoding
  • setupTestKeyring creates a consistent test environment with a deterministic KEK

55-102: LGTM!

The roundtrip test correctly validates the core keyring property: any DEK that is encrypted should decrypt back to identical key material. The precondition checks for 32-byte keys and valid UTF-8 are appropriate.


104-124: LGTM!

Good robustness test ensuring malformed input doesn't cause panics. The assumption that random bytes will always fail decryption is valid in practice since it would require matching a valid encrypted protobuf structure with correct authentication.


126-151: LGTM!

The test verifies essential properties of buildLookupKey: determinism for consistent storage paths and proper prefix formatting.


153-201: LGTM!

Excellent security test verifying nonce uniqueness. Nonce reuse in AES-GCM would be catastrophic, so confirming that encrypting the same DEK twice produces different ciphertexts is essential.


204-283: LGTM!

The test correctly verifies that encrypted data cannot be decrypted when the decrypting keyring doesn't have the encryption KEK in its map. This tests the key lookup path, which is an important security boundary.


329-351: LGTM with note on overlap.

This test is functionally similar to FuzzDecodeAndDecryptMalformedInput - both feed arbitrary bytes to DecodeAndDecryptKey and expect errors. The documentation distinction (UTF-8 focus vs general malformed input) is reasonable for clarity, though they could potentially be consolidated.

svc/vault/internal/vault/fuzz_corruption_test.go (2)

16-91: LGTM!

Well-designed fuzz test for detecting storage/tampering corruption. The approach of constraining the XOR offset to the actual protobuf message size (line 66) ensures meaningful corruption testing. The assertion logic correctly handles both failure modes: explicit error or changed plaintext.


93-179: LGTM!

Comprehensive protobuf field-level corruption testing. The defensive length checks before XOR operations (lines 147, 150) prevent panics on empty fields. Testing nonce, ciphertext, and key ID corruption covers the critical authenticated fields.

svc/vault/internal/vault/corruption_detection_test.go (7)

21-88: LGTM!

Thorough single-bit flip coverage testing. The systematic iteration through byte/bit positions with subtests provides excellent granularity for identifying any gaps in authentication. Limiting to 50 bytes balances coverage with test runtime.


95-136: LGTM!

Good truncation attack coverage with strategically chosen truncation points. Testing truncation of the serialized protobuf (rather than just the ciphertext field) also validates handling of malformed message structures.


138-204: Excellent documentation of known limitation.

The thorough explanation of protobuf's lenient parsing behavior (lines 141-151) is valuable. This documents rather than hides the behavior, and correctly explains why it's not a security issue since GCM authentication still protects the actual encrypted data.


210-283: LGTM!

Comprehensive nonce modification test suite. The variety of modifications (bit flips, zero patterns, truncation, extension, empty) provides excellent coverage. The table-driven approach keeps the test readable and extensible.


290-351: LGTM!

Good cross-encryption attack test. Swapping ciphertext components between messages tests that GCM authentication binds the nonce to its corresponding ciphertext, preventing component substitution attacks.


354-393: LGTM!

Important edge case test. AES-GCM ciphertext must contain at least the 16-byte authentication tag, so an empty ciphertext should always fail decryption.


400-456: LGTM!

Good coverage of key ID tampering scenarios. Testing empty, fake, and modified key IDs ensures the vault correctly rejects requests when the DEK lookup fails or returns the wrong key.

@vercel vercel bot temporarily deployed to Preview – engineering January 12, 2026 21:25 Inactive
chronark and others added 2 commits January 13, 2026 09:45
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @svc/vault/internal/storage/memory_test.go:
- Around line 3-10: The test TestMemory_ConcurrentAccess uses the testify assert
package but the import is missing; update the imports in memory_test.go to
include "github.com/stretchr/testify/assert" so the assertions at lines where
TestMemory_ConcurrentAccess calls assert.* compile and the test build succeeds.
- Around line 280-286: The test spawns a goroutine that uses require.* (which
calls t.FailNow and must not be used outside the main test goroutine); update
the assertions inside the goroutine to use the testify/assert variants instead
(replace require.NoError(t, err), require.True(t, found), require.Equal(t, data,
retrieved) with assert.NoError(t, err), assert.True(t, found), assert.Equal(t,
data, retrieved)) for the PutObject/GetObject checks (referencing the
store.PutObject and store.GetObject calls) so failures in the goroutine use
non-fatal assertions and match the pattern used in TestMemory_ConcurrentAccess.
🧹 Nitpick comments (2)
web/apps/agent/services/vault/keyring/decode_and_decrypt_key.go (1)

23-28: Inconsistent field access pattern with the changed line.

Line 30 now correctly uses protobuf getters (GetEncrypted().GetNonce()), but lines 23 and 42 still use direct field access (encrypted.Encrypted.EncryptionKeyId). This creates:

  1. Inconsistency: Mixed access patterns within the same function.
  2. Nil-safety gap: Protobuf getters are nil-safe, but direct access to encrypted.Encrypted will panic if nil.
♻️ Suggested fix for consistency
-	kek, ok := k.decryptionKeys[encrypted.Encrypted.EncryptionKeyId]
+	kek, ok := k.decryptionKeys[encrypted.GetEncrypted().GetEncryptionKeyId()]
 	if !ok {
-		err = fmt.Errorf("no kek found for key id: %s", encrypted.Encrypted.EncryptionKeyId)
+		err = fmt.Errorf("no kek found for key id: %s", encrypted.GetEncrypted().GetEncryptionKeyId())
 		tracing.RecordError(span, err)
 		return nil, "", err
 	}
-	return dek, encrypted.Encrypted.EncryptionKeyId, nil
+	return dek, encrypted.GetEncrypted().GetEncryptionKeyId(), nil

Also applies to: 42-42

svc/vault/internal/storage/memory_test.go (1)

309-319: Complete the isolation assertion.

The test modifies retrieved data but doesn't verify the outcome. Consider asserting the expected behavior to catch regressions.

♻️ Proposed fix - verify isolation is enforced
 	// Modify the retrieved slice
 	retrieved1[0] = 'X'

 	// Get again - should be unmodified
 	retrieved2, _, err := store.GetObject(ctx, key)
 	require.NoError(t, err)

-	// This test documents behavior - memory storage may or may not copy
-	// If this fails, it means the storage returns the same slice (not a copy)
-	// which could be a bug or intended behavior
-	_ = retrieved2
+	// Verify that stored data is isolated from modifications to retrieved slices
+	require.Equal(t, originalData, retrieved2, "storage should return isolated copies")

If the implementation intentionally shares slices (for performance), then document that by asserting the opposite, or remove this test if the behavior is undefined.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9434fb1 and c79f8f5.

📒 Files selected for processing (2)
  • svc/vault/internal/storage/memory_test.go
  • web/apps/agent/services/vault/keyring/decode_and_decrypt_key.go
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-27T11:02:17.155Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3860
File: go/pkg/zen/middleware_metrics_test.go:297-313
Timestamp: 2025-08-27T11:02:17.155Z
Learning: The redact function in go/pkg/zen/middleware_metrics.go is not expected to be called concurrently, so concurrency tests are unnecessary for this function.

Applied to files:

  • svc/vault/internal/storage/memory_test.go
📚 Learning: 2025-04-23T14:59:37.294Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3180
File: go/pkg/buffer/buffer.go:128-143
Timestamp: 2025-04-23T14:59:37.294Z
Learning: In Go buffer implementations using channels, the `Close()` method should be made thread-safe (using mechanisms like `sync.Once` and/or atomic flags) to prevent race conditions with concurrent `Buffer()` calls, as sending to a closed channel causes a panic.

Applied to files:

  • svc/vault/internal/storage/memory_test.go
📚 Learning: 2025-08-14T16:25:48.167Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3785
File: go/apps/api/routes/v2_keys_reroll_key/401_test.go:52-61
Timestamp: 2025-08-14T16:25:48.167Z
Learning: User Flo4604 requested creation of a GitHub issue to track converting all test files to use table-driven test patterns as a broader codebase improvement, following the suggestion made during review of go/apps/api/routes/v2_keys_reroll_key/401_test.go.

Applied to files:

  • svc/vault/internal/storage/memory_test.go
📚 Learning: 2025-10-21T09:45:47.560Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 4107
File: go/pkg/clickhouse/key_verifications_test.go:20-20
Timestamp: 2025-10-21T09:45:47.560Z
Learning: In ClickHouse tests (e.g., go/pkg/clickhouse/key_verifications_test.go), parallel execution with t.Parallel() is safe when tests use workspaceID-based isolation. Each test generates a unique workspaceID (via uid.New(uid.WorkspacePrefix)), and all database operations are scoped to that workspaceID, providing logical isolation even when tests share a single ClickHouse instance.

Applied to files:

  • svc/vault/internal/storage/memory_test.go
🧬 Code graph analysis (1)
web/apps/agent/services/vault/keyring/decode_and_decrypt_key.go (1)
pkg/encryption/aes.go (1)
  • Decrypt (40-65)
🪛 GitHub Actions: autofix.ci
svc/vault/internal/storage/memory_test.go

[error] 250-250: undefined: assert. Missing import for assertion package (e.g., github.com/stretchr/testify/assert).

🪛 GitHub Check: autofix
svc/vault/internal/storage/memory_test.go

[failure] 253-253:
undefined: assert (typecheck)


[failure] 250-250:
undefined: assert

🪛 GitHub Check: Lint Go Code / Lint
svc/vault/internal/storage/memory_test.go

[failure] 253-253:
undefined: assert (typecheck)


[failure] 250-250:
undefined: assert

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test Packages / Test
🔇 Additional comments (7)
web/apps/agent/services/vault/keyring/decode_and_decrypt_key.go (3)

13-21: LGTM!

The function setup, tracing span initialization, and initial unmarshalling are well-structured with proper error handling and tracing.


30-34: Good change to use protobuf getters.

Using GetKey(), GetEncrypted().GetNonce(), and GetEncrypted().GetCiphertext() is the correct approach for nil-safe access to protobuf fields.


36-41: LGTM!

The DEK unmarshalling and error handling are correct.

svc/vault/internal/storage/memory_test.go (4)

145-197: Well-structured tests for listing and helper functions.

Good coverage of prefix-based listing with various patterns and verification of the Key and Latest helper functions.


199-226: Good use of table-driven subtests.

Nice coverage of special characters in keys with properly structured subtests.


248-254: Correct use of assert in goroutines (once import is added).

Using assert instead of require inside goroutines is the right approach since require.NoError calls t.FailNow(), which panics when called from a goroutine other than the one running the test function.


322-329: LGTM!

Clean test helper with proper t.Helper() call and appropriate use of noop logger.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @svc/vault/internal/storage/memory_test.go:
- Around line 274-288: The test spawns goroutines that call require.* (which
calls t.FailNow()) — move to non-fatal assertions inside the goroutine by
replacing require.NoError/require.True/require.Equal with
assert.NoError/assert.True/assert.Equal (keeping the same t and variables), so
failures in worker goroutines are reported without calling FailNow; update the
anonymous function in the loop (the goroutine body that references
store.PutObject, store.GetObject and the three require calls) to use assert.*
instead.
- Around line 294-321: The test TestMemory_DataIsolation currently mutates
retrieved1 then discards retrieved2; either remove the test or assert expected
isolation: after mutating retrieved1, call store.GetObject again and assert that
the returned bytes equal the original data (e.g. use require.Equal/bytes.Equal
against originalData) to verify the stored value was not modified; optionally
also assert that retrieved1 does not equal retrieved2 to ensure a copy was
returned.
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c79f8f5 and 5a0cc20.

📒 Files selected for processing (1)
  • svc/vault/internal/storage/memory_test.go
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-27T11:02:17.155Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3860
File: go/pkg/zen/middleware_metrics_test.go:297-313
Timestamp: 2025-08-27T11:02:17.155Z
Learning: The redact function in go/pkg/zen/middleware_metrics.go is not expected to be called concurrently, so concurrency tests are unnecessary for this function.

Applied to files:

  • svc/vault/internal/storage/memory_test.go
📚 Learning: 2025-04-23T14:59:37.294Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3180
File: go/pkg/buffer/buffer.go:128-143
Timestamp: 2025-04-23T14:59:37.294Z
Learning: In Go buffer implementations using channels, the `Close()` method should be made thread-safe (using mechanisms like `sync.Once` and/or atomic flags) to prevent race conditions with concurrent `Buffer()` calls, as sending to a closed channel causes a panic.

Applied to files:

  • svc/vault/internal/storage/memory_test.go
📚 Learning: 2025-08-14T16:25:48.167Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3785
File: go/apps/api/routes/v2_keys_reroll_key/401_test.go:52-61
Timestamp: 2025-08-14T16:25:48.167Z
Learning: User Flo4604 requested creation of a GitHub issue to track converting all test files to use table-driven test patterns as a broader codebase improvement, following the suggestion made during review of go/apps/api/routes/v2_keys_reroll_key/401_test.go.

Applied to files:

  • svc/vault/internal/storage/memory_test.go
📚 Learning: 2025-10-21T09:45:47.560Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 4107
File: go/pkg/clickhouse/key_verifications_test.go:20-20
Timestamp: 2025-10-21T09:45:47.560Z
Learning: In ClickHouse tests (e.g., go/pkg/clickhouse/key_verifications_test.go), parallel execution with t.Parallel() is safe when tests use workspaceID-based isolation. Each test generates a unique workspaceID (via uid.New(uid.WorkspacePrefix)), and all database operations are scoped to that workspaceID, providing logical isolation even when tests share a single ClickHouse instance.

Applied to files:

  • svc/vault/internal/storage/memory_test.go
🧬 Code graph analysis (1)
svc/vault/internal/storage/memory_test.go (2)
svc/vault/internal/storage/interface.go (1)
  • Storage (15-30)
svc/vault/internal/storage/memory.go (2)
  • NewMemory (24-33)
  • MemoryConfig (20-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages / Test
  • GitHub Check: Lint Go Code / Lint
  • GitHub Check: Bazel / Test
  • GitHub Check: Bazel / Build
  • GitHub Check: autofix
🔇 Additional comments (5)
svc/vault/internal/storage/memory_test.go (5)

13-101: LGTM!

The basic storage tests are well-structured with clear documentation. Good coverage of edge cases including empty data, nil data, and non-existent keys.


103-144: LGTM!

Good test coverage for binary data preservation and large data handling.


146-185: LGTM!

Good coverage of prefix listing with multiple test cases. The tests verify count rather than exact keys, which is appropriate for testing prefix matching behavior.


187-227: LGTM!

Clean test structure with good use of subtests for the special characters case. The helper function tests verify the expected key format conventions.


323-330: LGTM!

Clean helper function that properly uses t.Helper() for accurate test failure line reporting.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @svc/vault/internal/storage/BUILD.bazel:
- Around line 22-35: The test file s3_test.go is being built into both go_test
targets storage_test and storage_integration_test, causing duplicate runs and
failures where Docker isn't available; fix by removing "s3_test.go" from the
srcs list of the storage_test go_test (leaving it only in
storage_integration_test), or alternatively add an integration build tag at the
top of s3_test.go (e.g., //go:build integration) and update the BUILD.bazel
targets to only include integration-tagged files in storage_integration_test so
Docker-dependent tests (which import dockertest) are only run by the integration
target.
🧹 Nitpick comments (1)
svc/vault/internal/storage/BUILD.bazel (1)

37-47: Consider adding tags to gate integration test execution.

The integration test lacks a tags attribute to exclude it from default bazel test //... runs. Docker-dependent tests typically use tags = ["integration"] or tags = ["manual", "external"] to prevent them from running in environments without Docker or during quick CI checks.

♻️ Suggested improvement
 go_test(
     name = "storage_integration_test",
     size = "large",
     srcs = ["s3_test.go"],
     embed = [":storage"],
+    tags = [
+        "integration",
+        "requires-docker",
+    ],
     deps = [
         "//pkg/dockertest",
         "//pkg/otel/logging",
         "@com_github_stretchr_testify//require",
     ],
 )
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a0cc20 and 3d806b6.

📒 Files selected for processing (1)
  • svc/vault/internal/storage/BUILD.bazel
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Lint Go Code / Lint
  • GitHub Check: Test Packages / Test
  • GitHub Check: Bazel / Build
  • GitHub Check: Bazel / Test
  • GitHub Check: autofix
🔇 Additional comments (1)
svc/vault/internal/storage/BUILD.bazel (1)

3-20: LGTM!

The library target is well-structured with appropriate visibility scoping to //svc/vault:__subpackages__ and correctly specified dependencies for S3 storage and observability.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
svc/vault/internal/vault/roll_deks.go (1)

19-45: Consider concurrency and operational improvements.

A few observations:

  1. Error consistency (line 25): ErrObjectNotFound is returned unwrapped, unlike other errors in this function. Consider wrapping for consistency:

    -		return storage.ErrObjectNotFound
    +		return fmt.Errorf("object disappeared during roll: %w", storage.ErrObjectNotFound)
  2. Concurrent execution: If two RollDeks calls run simultaneously or keys are modified during iteration, there's potential for redundant work or race conditions. For a batch operation like this, consider adding a distributed lock or documenting that it should only run as a single scheduled job.

  3. Progress tracking: For large key sets, adding a counter or periodic log of progress would aid operational visibility.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3d806b6 and a688be2.

📒 Files selected for processing (13)
  • svc/vault/internal/keyring/BUILD.bazel
  • svc/vault/internal/keyring/create_key.go
  • svc/vault/internal/keyring/fuzz_test.go
  • svc/vault/internal/keyring/get_key.go
  • svc/vault/internal/keyring/get_latest_key.go
  • svc/vault/internal/keyring/get_or_create_key.go
  • svc/vault/internal/keyring/keyring.go
  • svc/vault/internal/keyring/roll_keys.go
  • svc/vault/internal/storage/middleware/BUILD.bazel
  • svc/vault/internal/storage/middleware/tracing.go
  • svc/vault/internal/vault/BUILD.bazel
  • svc/vault/internal/vault/roll_deks.go
  • svc/vault/internal/vault/service.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • svc/vault/internal/keyring/create_key.go
  • svc/vault/internal/keyring/keyring.go
  • svc/vault/internal/vault/service.go
  • svc/vault/internal/keyring/get_latest_key.go
  • svc/vault/internal/keyring/roll_keys.go
  • svc/vault/internal/storage/middleware/tracing.go
🧰 Additional context used
🧬 Code graph analysis (4)
svc/vault/internal/vault/roll_deks.go (2)
svc/vault/internal/vault/service.go (1)
  • Service (21-32)
svc/vault/internal/storage/interface.go (1)
  • ErrObjectNotFound (9-9)
svc/vault/internal/keyring/get_key.go (2)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
svc/vault/internal/storage/interface.go (1)
  • ErrObjectNotFound (9-9)
svc/vault/internal/keyring/fuzz_test.go (2)
svc/vault/internal/keyring/keyring.go (3)
  • Keyring (11-18)
  • New (28-36)
  • Config (20-26)
gen/proto/vault/v1/object.pb.go (16)
  • KeyEncryptionKey (192-200)
  • KeyEncryptionKey (213-213)
  • KeyEncryptionKey (228-230)
  • DataEncryptionKey (67-76)
  • DataEncryptionKey (89-89)
  • DataEncryptionKey (104-106)
  • EncryptedDataEncryptionKey (130-138)
  • EncryptedDataEncryptionKey (151-151)
  • EncryptedDataEncryptionKey (166-168)
  • Encrypted (254-268)
  • Encrypted (281-281)
  • Encrypted (296-298)
  • Algorithm (24-24)
  • Algorithm (50-52)
  • Algorithm (54-56)
  • Algorithm (63-65)
svc/vault/internal/keyring/get_or_create_key.go (2)
svc/vault/internal/keyring/keyring.go (1)
  • Keyring (11-18)
svc/vault/internal/storage/interface.go (1)
  • ErrObjectNotFound (9-9)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Lint Go Code / Lint
  • GitHub Check: Test Packages / Test
  • GitHub Check: Bazel / Test
  • GitHub Check: Bazel / Build
  • GitHub Check: autofix
🔇 Additional comments (11)
svc/vault/internal/storage/middleware/BUILD.bazel (1)

1-14: LGTM!

The BUILD file is well-structured with appropriate visibility scoping to vault subpackages. Dependencies are correctly declared for a tracing middleware that wraps storage operations.

svc/vault/internal/vault/BUILD.bazel (3)

35-71: LGTM!

The test target is well-structured:

  • Comprehensive test coverage including unit tests, fuzz tests, and integration-style tests
  • Correctly embeds the vault library
  • Appropriate use of glob with allow_empty = True for testdata
  • Test dependencies include //svc/vault/internal/keys separately from the library (appropriate for test-only helpers)

16-16: Good visibility scoping.

Restricting visibility to //svc/vault:__subpackages__ is appropriate for an internal package, ensuring this implementation detail isn't accidentally imported by other services.


5-14: No issues found. The srcs list in the BUILD.bazel file is complete and accurate—config.go does not exist in the vault directory, so it should not be included.

svc/vault/internal/keyring/get_key.go (1)

13-37: LGTM!

The implementation is clean with proper tracing, error handling, and error wrapping. The kekID discard on line 31 is appropriate since callers of GetKey only need the DEK itself.

svc/vault/internal/keyring/fuzz_test.go (3)

23-53: Well-designed test setup.

The setupTestKeyring helper provides good isolation with in-memory storage and a deterministic test KEK. The pattern of registering the same KEK for both encryption and decryption ensures the roundtrip tests work correctly.


55-102: Excellent roundtrip fuzz test.

This test correctly validates the core invariant that encrypted DEKs decrypt back to identical values. The preconditions (32-byte key, valid UTF-8 ID) appropriately filter invalid inputs.


204-283: Good coverage for KEK isolation.

FuzzDecodeWithWrongKEK correctly verifies that data encrypted with one KEK cannot be decrypted with a different KEK. This is an important security property to test.

svc/vault/internal/vault/roll_deks.go (1)

11-17: LGTM on the core approach.

The overall pattern of listing keys, checking KEK freshness, and re-encrypting stale keys is sound. Good use of tracing at the operation level.

svc/vault/internal/keyring/BUILD.bazel (1)

29-42: No action needed. The testdata directory exists and contains valid fuzz corpus files that are referenced by the test.

Likely an incorrect or invalid review comment.

svc/vault/internal/keyring/get_or_create_key.go (1)

14-25: The code is correct as written. The keyID parameter is used to look up an existing key via GetKey() on line 18. CreateKey() intentionally generates its own key ID (via keys.GenerateKey()) when the requested key is not found; it does not accept a keyID parameter by design. This is the intended behavior for the "get or create" pattern here.

Likely an incorrect or invalid review comment.

@vercel vercel bot temporarily deployed to Preview – engineering January 13, 2026 11:20 Inactive
Copy link
Member

@Flo4604 Flo4604 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some cleanup

@chronark chronark requested a review from Flo4604 January 14, 2026 08:59
@vercel vercel bot temporarily deployed to Preview – engineering January 14, 2026 09:01 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard January 14, 2026 09:09 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard January 14, 2026 09:36 Inactive
@chronark chronark merged commit 64c849f into main Jan 14, 2026
16 checks passed
@chronark chronark deleted the vault branch January 14, 2026 09:44
@coderabbitai coderabbitai bot mentioned this pull request Feb 12, 2026
19 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants