Skip to content

[client,management] Feature/client service expose#5411

Merged
mlsmaycon merged 36 commits intomainfrom
feature/client-service-expose
Feb 24, 2026
Merged

[client,management] Feature/client service expose#5411
mlsmaycon merged 36 commits intomainfrom
feature/client-service-expose

Conversation

@mlsmaycon
Copy link
Copy Markdown
Collaborator

@mlsmaycon mlsmaycon commented Feb 21, 2026

Describe your changes

  • Add netbird expose CLI command allowing peers to expose local HTTP services via the NetBird reverse proxy with TTL-based session management
  • Implement 3 unary management RPCs (CreateExpose, RenewExpose, StopExpose) replacing the initial bidirectional streaming design
  • Add permission validation, per-peer rate limiting, and expiration handling for peer-exposed services
  • Introduce PeerExposeEnabled and PeerExposeGroups account settings to control which peers can expose services

Architecture

CLI (expose cmd) ──server-stream──► Daemon ──3 unary RPCs (encrypted)──► Management
  Ctrl+C cancels ctx                 CreateExpose → send Ready event       creates ephemeral service
                                     30s ticker → RenewExpose              updates lastRenewed in-memory
                                     defer → StopExpose (bg ctx)           deletes service, cleans up
                                                                           TTL reaper (30s interval, 90s TTL)

Key changes

CLI (client/cmd/expose.go)

  • New netbird expose <port> command with flags: --with-pin, --with-password, --with-user-groups, --with-custom-domain, --with-name-prefix, --protocol
  • Streams events from daemon, displays service URL on ready, handles Ctrl+C gracefully

Daemon (client/server/server.go)

  • Server-streaming ExposeService handler: calls CreateExpose, starts 30s renewal ticker, defers StopExpose for cleanup on all exit paths
  • Uses background context for management client so StopExpose works after CLI disconnect

Management gRPC (expose_service.go)

  • Three unary encrypted handlers: CreateExpose, RenewExpose, StopExpose
  • In-memory activeExposes sync.Map with mutex-protected lastRenewed per session
  • TTL reaper goroutine (30s interval, 90s TTL) expires stale sessions
  • Per-peer rate limit (max 10 concurrent exposes)
  • Delegates to reverseproxy manager for all business logic

Reverse proxy manager (manager.go)

  • ValidateExposePermission: checks PeerExposeEnabled setting and peer group membership
  • CreateServiceFromPeer: creates ephemeral service, resolves peer IP for targets, stores activity event
  • DeleteServiceFromPeer / ExpireServiceFromPeer: validates source peer ownership, deletes service, stores activity event
  • FromExposeRequest: constructs Service from proto request (mirrors FromAPIRequest pattern)
  • Source types renamed from api/peer to permanent/ephemeral

Account settings

  • PeerExposeEnabled (bool) and PeerExposeGroups ([]string) added to ExtraSettings
  • OpenAPI spec updated with new fields
  • Account handler passes settings through to API responses

Activity codes

  • PeerServiceExposed (111), PeerServiceUnexposed (112), PeerServiceExposeExpired (113)

Test plan

  • go test ./management/internals/modules/reverseproxy/... — 17 tests including TestGenerateExposeName, TestFromExposeRequest, TestValidateExposePermission, TestCreateServiceFromPeer, TestGetGroupIDsFromNames
  • go test ./management/internals/shared/grpc/ — expose tests including TestReapExpiredExposes, TestCountPeerExposes, TestCleanupExpose, concurrent safety tests
  • go build ./... and go vet ./... pass clean
  • Manual: netbird expose 8080 creates service, shows URL, Ctrl+C cleans up
  • Manual: service expires after 90s without renewal
  • Manual: disable peer expose in settings, verify CLI gets permission denied

Issue ticket number and link

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

netbirdio/docs#633

Summary by CodeRabbit

  • New Features

    • CLI: new expose command to publish a local port with flags for PIN, password, user groups, custom domain, name prefix and protocol (HTTP default).
    • Management/API: create/renew/stop expose sessions (streamed status), automatic naming/domain, TTL renewals, background expiration, new management RPCs and client methods.
    • UI/API: account settings now include peer_expose_enabled and peer_expose_groups; new activity codes for peer expose events.
  • Tests

    • Extensive tests covering lifecycle, naming, permissions, reaper/expiration and concurrency.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 21, 2026

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. By default, CodeRabbit skips reviewing PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)
📝 Walkthrough

Walkthrough

Adds end-to-end "expose" functionality: client CLI and daemon streaming RPC, management encrypted Create/Renew/Stop expose APIs with active-expose lifecycle and reaper, reverse-proxy manager peer APIs and name/domain handling, account settings to control peer exposes, and tests and wiring across services.

Changes

Cohort / File(s) Summary
Client CLI & daemon
client/cmd/expose.go, client/cmd/root.go, client/proto/daemon.proto, client/server/server.go, client/internal/expose/*, client/internal/engine.go
Adds expose Cobra command, daemon ExposeService streaming RPC handler, client expose manager, request/response bridges, engine wiring, and CLI stream lifecycle with signal handling.
Management encrypted RPCs & server-side expose lifecycle
shared/management/proto/management.proto, management/internals/shared/grpc/expose_service.go, management/internals/shared/grpc/expose_service_test.go, management/internals/shared/grpc/server.go, management/internals/server/boot.go
Adds encrypted CreateExpose/RenewExpose/StopExpose handlers, activeExpose tracking with per-expose locking, reaper for TTL expiry, and wiring of reverse-proxy manager into the gRPC server.
Reverse-proxy manager API & implementation
management/internals/modules/reverseproxy/interface.go, management/internals/modules/reverseproxy/manager/manager.go, management/internals/modules/reverseproxy/manager/manager_test.go, management/internals/modules/reverseproxy/domain/...
Adds peer-focused APIs (ValidateExposePermission, CreateServiceFromPeer, DeleteServiceFromPeer, ExpireServiceFromPeer, GetClusterDomains), group resolution, random domain selection, settings wiring, and tests; constructor signature updated to include settingsManager.
Reverse-proxy types & helpers
management/internals/modules/reverseproxy/reverseproxy.go, management/internals/modules/reverseproxy/reverseproxy_test.go
Extends Service model (Source, SourcePeer, pointer timestamps), adds FromExposeRequest and GenerateExposeName, auth helpers, and unit tests for name generation and request conversion.
Management wiring & call sites
management/internals/server/modules.go, many tests and test tooling (management/server/*, management/internals/shared/grpc/*, proxy/*, management/server/http/testing/*)
Updates NewManager constructor usage to accept settingsManager across initialization and tests; updates many test doubles and stubs to satisfy new manager interface.
Account settings & API schema
management/server/types/settings.go, management/server/http/handlers/accounts/accounts_handler.go, shared/management/http/api/openapi.yml, shared/management/http/api/types.gen.go
Adds PeerExposeEnabled and PeerExposeGroups to Settings, validates updates, propagates to API schema and generated types.
Management events & activity codes
management/server/activity/codes.go, management/server/account.go
Adds activity codes for peer expose/unexpose/expire and emits account-level enable/disable events when peer-expose toggles.
Management client & mocks
shared/management/client/client.go, shared/management/client/grpc.go, shared/management/client/mock.go
Adds CreateExpose/RenewExpose/StopExpose client methods/types (encrypted flows) and mock hooks.
Server state additions
management/internals/shared/grpc/server.go
Adds server fields: activeExposes sync.Map, exposeCreateMu sync.Mutex, reverseProxyManager reverseproxy.Manager, and reverseProxyMu sync.RWMutex.
Expose manager (client) & tests
client/internal/expose/manager.go, client/internal/expose/request.go, client/internal/expose/*_test.go
Adds client-side expose Manager implementing Expose and KeepAlive (renew) with timeouts and stop behavior; request/response bridging and tests.
Tests & test doubles updates
many management/..., proxy/..., client/internal/*_test.go
Extensive tests for expose flows, reaper, reverse-proxy manager; many test doubles extended/stubbed for new manager methods and constructor changes.
Minor adjustments
management/server/store/sql_store.go, management/server/mock_server/account_mock.go
CertificateIssuedAt now a pointer; mock delegation fix for GetGroupByName.

Sequence Diagram(s)

sequenceDiagram
    participant Peer as Peer/Client
    participant Daemon as Daemon
    participant Mgmt as Management Server
    participant ReverseProxy as Reverse Proxy Manager
    participant Store as Store

    Peer->>Daemon: ExposeServiceRequest(port, protocol, auth...)
    Daemon->>Daemon: setup context & signal handling
    Daemon->>Mgmt: CreateExpose (encrypted)
    rect rgba(100,150,200,0.5)
      Mgmt->>ReverseProxy: ValidateExposePermission(accountID, peerID)
      ReverseProxy-->>Mgmt: ok
      Mgmt->>ReverseProxy: CreateServiceFromPeer(accountID, peerID, service)
      ReverseProxy->>Store: persist service & record event
      ReverseProxy-->>Mgmt: service created
    end
    Mgmt->>Mgmt: track activeExpose with TTL
    Mgmt-->>Daemon: ExposeServiceEvent(Ready: service_name, service_url, domain)
    Daemon->>Peer: Ready event printed
    loop periodic renew
      Daemon->>Mgmt: RenewExpose(domain)
      Mgmt-->>Daemon: success
    end
    Peer->>Daemon: Ctrl+C
    Daemon->>Mgmt: StopExpose(domain)
    Mgmt->>ReverseProxy: DeleteServiceFromPeer(accountID, peerID, serviceID)
    ReverseProxy->>Store: delete & record event
    ReverseProxy-->>Mgmt: ok
    Mgmt-->>Daemon: success
    Daemon->>Peer: Stopped event
Loading
sequenceDiagram
    participant Reaper as Expose Reaper
    participant Mgmt as Management Server
    participant ReverseProxy as Reverse Proxy Manager
    participant Store as Store

    rect rgba(200,150,100,0.5)
      loop periodic scan
        Reaper->>Mgmt: scan activeExposes
        alt expired
          Reaper->>ReverseProxy: ExpireServiceFromPeer(accountID, peerID, serviceID)
          ReverseProxy->>Store: delete & record expire event
          ReverseProxy-->>Reaper: ok
          Reaper->>Mgmt: remove activeExpose
        else still valid
          Reaper-->>Mgmt: keep
        end
      end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • pascal-fischer
  • crn4
  • bcmmbaga

Poem

🐰 I hop where ports and pathways meet,

I craft a name both short and neat.
I whisper to daemon, bind the stream,
a reaper trims what time may deem.
Hooray — your expose hops free.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.07% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title '[client,management] Feature/client service expose' clearly describes the main feature added—a new 'expose' capability spanning both client and management components.
Description check ✅ Passed The pull request description comprehensively covers all required template sections: changes are detailed, feature enhancement checkbox is marked, tests are listed, and documentation PR link is provided.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
shared/management/http/api/types.gen.go (1)

36-148: ⚠️ Potential issue | 🟡 Minor

Add missing service.peer.* activity code constants to OpenAPI enum

The PR introduces activity codes 111–113 for peer service exposure events in management/server/activity/codes.go, with string values "service.peer.expose", "service.peer.unexpose", and "service.peer.expose.expire". These codes are actively used by the reverse proxy manager but do not appear in the activity_code enum in shared/management/http/api/openapi.yml (lines 2261–2330).

Since shared/management/http/api/types.gen.go is code-generated from the OpenAPI spec, the three missing codes must be added to the activity_code enum in openapi.yml, then codegen must be re-run to generate the corresponding EventActivityCode* constants. Without them, API consumers cannot reference these events by name.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shared/management/http/api/types.gen.go` around lines 36 - 148, Add the three
missing activity codes "service.peer.expose", "service.peer.unexpose", and
"service.peer.expose.expire" to the activity_code enum in the OpenAPI spec and
then re-run the OpenAPI->Go code generator so types.gen.go includes the
corresponding constants (e.g. EventActivityCodeServicePeerExpose,
EventActivityCodeServicePeerUnexpose, EventActivityCodeServicePeerExposeExpire)
used alongside the new codes in management/server/activity/codes.go and
referenced by the reverse proxy manager.
🧹 Nitpick comments (7)
management/internals/modules/reverseproxy/reverseproxy.go (1)

460-466: isAuthEnabled() checks pointer existence, not the Enabled flag.

Auth configs created via FromAPIRequest can have Enabled: false, so a non-nil pointer doesn't necessarily mean auth is active. Since this is only used for event metadata, the impact is cosmetic — but it could mislead audit logs.

Suggested fix
 func (s *Service) isAuthEnabled() bool {
-	return s.Auth.PasswordAuth != nil || s.Auth.PinAuth != nil || s.Auth.BearerAuth != nil
+	return (s.Auth.PasswordAuth != nil && s.Auth.PasswordAuth.Enabled) ||
+		(s.Auth.PinAuth != nil && s.Auth.PinAuth.Enabled) ||
+		(s.Auth.BearerAuth != nil && s.Auth.BearerAuth.Enabled)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/reverseproxy.go` around lines 460 -
466, The isAuthEnabled() helper used by Service.EventMeta currently treats
non-nil Auth pointers as enabled; change it to check each auth object's Enabled
boolean (e.g., Auth.PasswordAuth.Enabled, Auth.PinAuth.Enabled,
Auth.BearerAuth.Enabled) and return true only if any of those exist and have
Enabled == true; update the isAuthEnabled() implementation referenced by
Service.EventMeta to handle nil pointers safely (nil -> false) and read the
Enabled field when present so event metadata accurately reflects active auth.
management/internals/server/boot.go (1)

156-156: StartExposeReaper goroutine is not stoppable on graceful shutdown

Passing context.Background() means the reaper goroutine will run for the entire process lifetime with no cancellation path. While harmless in normal server operation, it becomes a goroutine leak in tests or scenarios where the gRPC server is stopped and restarted (e.g. integration tests that spin up/tear down a BaseServer). Consider threading a cancellable context through the BaseServer lifecycle (or using the existing shutdown mechanism) so the reaper can be stopped cleanly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/server/boot.go` at line 156, The call to
srv.StartExposeReaper(context.Background()) starts a reaper goroutine with no
cancellation; change it to use a cancellable context tied to the BaseServer
lifecycle (e.g. a server-level context or shutdown context) so the reaper can be
stopped on graceful shutdown: modify the BaseServer to create/store a context
with cancel (or reuse its existing shutdown context) and pass that stored
context to StartExposeReaper instead of context.Background(), and ensure the
cancel is invoked when the server Stop/Shutdown path runs.
management/server/account_test.go (1)

3125-3125: Wire the settings manager into reverse proxy manager creation.

createManager already has settingsMockManager; passing nil here can mask settings-dependent behavior or trigger a nil deref if expose permission checks get exercised in tests. Prefer wiring the mock to mirror production.

♻️ Suggested update
- manager.SetServiceManager(reverseproxymanager.NewManager(store, manager, permissionsManager, nil, nil, nil))
+ manager.SetServiceManager(reverseproxymanager.NewManager(store, manager, permissionsManager, settingsMockManager, nil, nil))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/server/account_test.go` at line 3125, The reverse proxy manager is
being created with nil for the settings manager which can hide
settings-dependent behavior or cause nil derefs; locate the call to
reverseproxymanager.NewManager inside the createManager test setup (the line
with manager.SetServiceManager(reverseproxymanager.NewManager(...))) and replace
the nil placeholder with the existing settingsMockManager used in createManager
so the mock settings manager is wired into reverseproxymanager.NewManager
alongside store, manager, and permissionsManager.
management/internals/modules/reverseproxy/reverseproxy_test.go (1)

461-548: Add HTTPS protocol coverage to guard protocol mapping.

Protocol is user-facing; adding a non-HTTP case helps ensure FromExposeRequest honors requested protocol values.

➕ Add HTTPS case
 func TestFromExposeRequest(t *testing.T) {
 	t.Run("basic HTTP service", func(t *testing.T) {
 		req := &proto.ExposeServiceRequest{
 			Port:     8080,
 			Protocol: proto.ExposeProtocol_EXPOSE_HTTP,
 		}
@@
 		assert.Equal(t, "account-1", target.AccountID)
 	})
+
+	t.Run("HTTPS service", func(t *testing.T) {
+		req := &proto.ExposeServiceRequest{
+			Port:     443,
+			Protocol: proto.ExposeProtocol_EXPOSE_HTTPS,
+		}
+
+		service := FromExposeRequest(req, "acc", "peer", "svc")
+		require.Len(t, service.Targets, 1)
+		assert.Equal(t, "https", service.Targets[0].Protocol)
+	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/reverseproxy_test.go` around lines
461 - 548, Add a new subtest in TestFromExposeRequest to cover the HTTPS
protocol: call FromExposeRequest with Protocol set to
proto.ExposeProtocol_EXPOSE_HTTPS (similar to the "basic HTTP service" test),
then assert that the resulting service.Targets[0].Protocol equals "https" (and
that Port, TargetId, TargetType, Enabled, AccountID remain correct); this
ensures FromExposeRequest correctly maps EXPOSE_HTTPS to the "https" target
protocol.
client/cmd/expose.go (1)

60-68: Consider deferring signal.Stop to clean up the notification registration.

Without signal.Stop, the notification channel remains registered for the lifetime of the process. This is minor since the command exits after exposeFn returns, but it's good hygiene.

Proposed fix
 	sigCh := make(chan os.Signal, 1)
 	signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
+	defer signal.Stop(sigCh)
 	go func() {
 		<-sigCh
 		cancel()
 	}()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/cmd/expose.go` around lines 60 - 68, The signal notification channel
(sigCh) is never unregistered; after calling signal.Notify(sigCh,
syscall.SIGINT, syscall.SIGTERM) defer signal.Stop(sigCh) to unregister the
channel when exposeFn exits—place the defer immediately after the signal.Notify
call (alongside the existing ctx, cancel and defer cancel) so the registration
is cleaned up for sigCh used in the goroutine that calls cancel().
shared/management/client/grpc.go (1)

724-768: RenewExpose and StopExpose are consistent with other one-shot RPCs.

Both follow the encrypt-call pattern. Note that RenewExpose will be called every 30s by the daemon's renewal ticker, and each call fetches the server public key (an extra RPC round-trip). This is fine for now since it's consistent with how the client handles all unary RPCs, but if renewal frequency ever increases, consider caching the server key.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shared/management/client/grpc.go` around lines 724 - 768, RenewExpose (and
StopExpose) call GetServerPublicKey on every invocation, causing an extra RPC
round-trip per call which can become expensive when RenewExpose is invoked
frequently; to fix, add a cached server public key in GrpcClient (e.g.,
serverPubKeyCache field) and modify RenewExpose and StopExpose to use the cached
key if present and valid, falling back to GetServerPublicKey only when the cache
is empty or stale, and ensure cache population/refresh logic (and any necessary
locking) lives in GetServerPublicKey or a new helper method to keep these
methods' flow unchanged.
management/internals/modules/reverseproxy/manager/manager.go (1)

585-636: Custom domains from peer requests bypass cluster domain validation.

When service.Domain != "" (peer supplies a custom domain via --with-custom-domain), buildRandomDomain is skipped but there's no validation that the domain belongs to a known cluster. initializeServiceForCreateDeriveClusterFromDomain will error if the domain can't be mapped, which fails the request — but the error message surfaces as a PreconditionFailed status (line 173) rather than an InvalidArgument. Consider adding an explicit domain allowlist check before creating the service, or at minimum clarifying in the error that the custom domain is not within an allowed cluster.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager.go` around lines
585 - 636, CreateServiceFromPeer allows peers to supply a custom domain without
explicit cluster-allow validation, causing DeriveClusterFromDomain to later fail
and surface as the wrong error; update CreateServiceFromPeer to explicitly
validate service.Domain when non-empty by invoking the
cluster-derivation/allowlist logic (e.g., call the same DeriveClusterFromDomain
or an allowlist check used by initializeServiceForCreate) before persisting, and
if the domain cannot be mapped return an InvalidArgument-style error with a
clear message that the provided custom domain is not within a known/allowed
cluster (reference CreateServiceFromPeer, initializeServiceForCreate,
DeriveClusterFromDomain).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@client/cmd/expose.go`:
- Around line 51-54: After parsing args[0] into port (using strconv.ParseUint as
shown), add a range check to ensure the numeric port value is between 1 and
65535 inclusive; if port == 0 or port > 65535 return an error like "invalid port
number: must be between 1 and 65535" so callers of the routine (the code that
sets variable port after ParseUint) cannot provide 0 or >65535 values.
- Around line 102-116: The switch on event.Event (handling
*proto.ExposeServiceEvent_Ready, _Error, _Stopped) lacks a default branch so
unexpected or nil variants silently fall through; add a default case in the
switch (or an explicit nil check before switching) that returns or logs an
informative error (e.g., "unexpected expose event: %T" or "received nil event")
so the caller is notified and the loop doesn't continue silently—update the
switch around event.Event in expose.go to include that default handling.

In `@client/proto/daemon.proto`:
- Around line 808-823: FromExposeRequest in reverseproxy.go currently ignores
req.Protocol and always uses "http", causing ExposeProtocol enum values
(EXPOSE_HTTPS, EXPOSE_TCP, EXPOSE_UDP) to be unused; update FromExposeRequest to
map ExposeProtocol -> target scheme/protocol (e.g., EXPOSE_HTTP->"http",
EXPOSE_HTTPS->"https", EXPOSE_TCP->"tcp", EXPOSE_UDP->"udp") and use that mapped
value when building the target (host/URL/scheme) and any transport setup (enable
TLS for "https", use raw TCP/UDP dialing for "tcp"/"udp" or return a clear error
if not yet supported). Alternatively, if only HTTP is supported, remove the
extra enum values from ExposeProtocol or add a comment/documentation and
validate req.Protocol to reject unsupported values; ensure references to
ExposeProtocol and req.Protocol are updated accordingly.

In `@client/server/server.go`:
- Around line 1316-1402: The RenewExpose loop uses mgmCtx (a background context)
so mgmClient.RenewExpose may block indefinitely; wrap each renewal call in a
short per-call context with timeout (e.g., ctx, cancel :=
context.WithTimeout(mgmCtx, X*time.Second); defer cancel()) and call
mgmClient.RenewExpose(ctx, resp.Domain) instead of using mgmCtx directly; ensure
cancel is called after each attempt so the goroutine can exit and the deferred
StopExpose cleanup will run when the stream is cancelled.

In `@management/internals/modules/reverseproxy/manager/manager_test.go`:
- Around line 771-783: Update the misleading inline comment in the test case
inside the t.Run labeled "disallowed when no groups configured" to accurately
state the test intent: that expose is enabled but no peer expose groups are
configured (empty slice) and therefore ValidateExposePermission (method) should
return an error; locate the setup using setupIntegrationTest, the account
settings variable s and its s.Extra.PeerExposeGroups assignment, and replace the
comment on that block to reflect that empty groups means disallow, not "allow
all".

In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 653-661: The buildRandomDomain method dereferences
m.clusterDeriver without a nil check causing a panic; update
managerImpl.buildRandomDomain to first check if m.clusterDeriver == nil and
return a clear error (e.g., "clusterDeriver is nil for service %s") instead of
calling GetClusterDomains, or alternatively ensure initializeServiceForCreate is
invoked before buildRandomDomain is ever called from CreateServiceFromPeer—pick
the safer fix by adding the nil check in buildRandomDomain and return an error
when clusterDeriver is nil so callers can handle it gracefully.

In `@management/internals/shared/grpc/expose_service.go`:
- Around line 72-74: The countPeerExposes check in CreateExpose is racy with the
subsequent LoadOrStore insert and can allow concurrent CreateExpose calls to
exceed maxExposesPerPeer; fix this by serializing reservations per peer—either
acquire a per-peer lock around the check+reservation or perform an atomic
reservation in the activeExpose map before persisting to the DB (i.e., try to
insert a placeholder into activeExpose using LoadOrStore and only proceed to
create/persist the service if the reservation succeeds), and release the
reservation on failure so countPeerExposes and LoadOrStore are no longer
race-prone.
- Around line 89-99: When LoadOrStore on s.activeExposes finds an existing key
(the loaded branch after calling exposeKey and LoadOrStore), the newly created
service (created) is currently cleaned up via deleteExposeService(ctx,
accountID, peer.ID, created) but any error is only debug-logged and can leave
orphaned DB records; update this branch in expose_service.go so that you call
deleteExposeService, capture its returned error, and either return that error
(e.g., wrap and return a status.Errorf/WithDetails) or at minimum log it at
warn/error level using the same logger so failures are visible; reference the
LoadOrStore check, the activeExpose struct, and deleteExposeService(...) call
when making the change.
- Around line 213-228: authenticateExposePeer currently returns raw errors from
GetAccountIDForPeerKey (which can leak internals) and uses inconsistent gRPC
codes for the same "peer not registered" condition; change the non-NotFound
error return from GetAccountIDForPeerKey to a gRPC status error (e.g., return
status.Errorf(codes.Internal, "failed to lookup account for peer")) instead of
returning err directly, and make the not-found handling consistent by using the
same code for both lookups (pick one — e.g., use codes.PermissionDenied for the
NotFound case returned from GetAccountIDForPeerKey and change the
GetPeerByPeerPubKey branch to also return status.Errorf(codes.PermissionDenied,
"peer is not registered")). Ensure you update the returns in
authenticateExposePeer accordingly.
- Line 8: Replace the deprecated import pb "github.com/golang/protobuf/proto"
with the modern module pb "google.golang.org/protobuf/proto" in
expose_service.go; update the import line (keeping the pb alias and // nolint if
desired), then run go mod tidy and build to catch any minor API differences and
update any usages of pb.* to the google.golang.org/protobuf equivalents if the
compiler reports changes.

In `@management/internals/shared/grpc/validate_session_test.go`:
- Around line 321-322: Remove the suspicious inline comment
"//nbx_V0roAjwD71s3SMxY1C1qaZTmxhsXtn3V7r3Z" found in validate_session_test.go
and replace it with either nothing or a benign explanatory comment (e.g., "//
test artifact removed"); also scan the surrounding test (e.g.,
TestValidateSession or any functions in validate_session_test.go) for other
similar leaked tokens and remove or sanitize them to avoid secret leakage and
secret-scanner alerts.

In `@shared/management/http/api/openapi.yml`:
- Around line 381-398: The schema marks peer_expose_enabled and
peer_expose_groups as required which can break clients that reuse
AccountExtraSettings for requests; remove these two from the required list (or
create a separate response schema like AccountExtraSettingsResponse) and instead
declare sensible defaults (peer_expose_enabled: false, peer_expose_groups: [] )
in the schema so omitted fields are safe for partial updates; update any request
payload usage of AccountExtraSettings (e.g., account update endpoints) to rely
on the optional fields or switch request bodies to the request-specific schema
while keeping the response schema with required fields if truly necessary.

---

Outside diff comments:
In `@shared/management/http/api/types.gen.go`:
- Around line 36-148: Add the three missing activity codes
"service.peer.expose", "service.peer.unexpose", and "service.peer.expose.expire"
to the activity_code enum in the OpenAPI spec and then re-run the OpenAPI->Go
code generator so types.gen.go includes the corresponding constants (e.g.
EventActivityCodeServicePeerExpose, EventActivityCodeServicePeerUnexpose,
EventActivityCodeServicePeerExposeExpire) used alongside the new codes in
management/server/activity/codes.go and referenced by the reverse proxy manager.

---

Nitpick comments:
In `@client/cmd/expose.go`:
- Around line 60-68: The signal notification channel (sigCh) is never
unregistered; after calling signal.Notify(sigCh, syscall.SIGINT,
syscall.SIGTERM) defer signal.Stop(sigCh) to unregister the channel when
exposeFn exits—place the defer immediately after the signal.Notify call
(alongside the existing ctx, cancel and defer cancel) so the registration is
cleaned up for sigCh used in the goroutine that calls cancel().

In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 585-636: CreateServiceFromPeer allows peers to supply a custom
domain without explicit cluster-allow validation, causing
DeriveClusterFromDomain to later fail and surface as the wrong error; update
CreateServiceFromPeer to explicitly validate service.Domain when non-empty by
invoking the cluster-derivation/allowlist logic (e.g., call the same
DeriveClusterFromDomain or an allowlist check used by
initializeServiceForCreate) before persisting, and if the domain cannot be
mapped return an InvalidArgument-style error with a clear message that the
provided custom domain is not within a known/allowed cluster (reference
CreateServiceFromPeer, initializeServiceForCreate, DeriveClusterFromDomain).

In `@management/internals/modules/reverseproxy/reverseproxy_test.go`:
- Around line 461-548: Add a new subtest in TestFromExposeRequest to cover the
HTTPS protocol: call FromExposeRequest with Protocol set to
proto.ExposeProtocol_EXPOSE_HTTPS (similar to the "basic HTTP service" test),
then assert that the resulting service.Targets[0].Protocol equals "https" (and
that Port, TargetId, TargetType, Enabled, AccountID remain correct); this
ensures FromExposeRequest correctly maps EXPOSE_HTTPS to the "https" target
protocol.

In `@management/internals/modules/reverseproxy/reverseproxy.go`:
- Around line 460-466: The isAuthEnabled() helper used by Service.EventMeta
currently treats non-nil Auth pointers as enabled; change it to check each auth
object's Enabled boolean (e.g., Auth.PasswordAuth.Enabled, Auth.PinAuth.Enabled,
Auth.BearerAuth.Enabled) and return true only if any of those exist and have
Enabled == true; update the isAuthEnabled() implementation referenced by
Service.EventMeta to handle nil pointers safely (nil -> false) and read the
Enabled field when present so event metadata accurately reflects active auth.

In `@management/internals/server/boot.go`:
- Line 156: The call to srv.StartExposeReaper(context.Background()) starts a
reaper goroutine with no cancellation; change it to use a cancellable context
tied to the BaseServer lifecycle (e.g. a server-level context or shutdown
context) so the reaper can be stopped on graceful shutdown: modify the
BaseServer to create/store a context with cancel (or reuse its existing shutdown
context) and pass that stored context to StartExposeReaper instead of
context.Background(), and ensure the cancel is invoked when the server
Stop/Shutdown path runs.

In `@management/server/account_test.go`:
- Line 3125: The reverse proxy manager is being created with nil for the
settings manager which can hide settings-dependent behavior or cause nil derefs;
locate the call to reverseproxymanager.NewManager inside the createManager test
setup (the line with
manager.SetServiceManager(reverseproxymanager.NewManager(...))) and replace the
nil placeholder with the existing settingsMockManager used in createManager so
the mock settings manager is wired into reverseproxymanager.NewManager alongside
store, manager, and permissionsManager.

In `@shared/management/client/grpc.go`:
- Around line 724-768: RenewExpose (and StopExpose) call GetServerPublicKey on
every invocation, causing an extra RPC round-trip per call which can become
expensive when RenewExpose is invoked frequently; to fix, add a cached server
public key in GrpcClient (e.g., serverPubKeyCache field) and modify RenewExpose
and StopExpose to use the cached key if present and valid, falling back to
GetServerPublicKey only when the cache is empty or stale, and ensure cache
population/refresh logic (and any necessary locking) lives in GetServerPublicKey
or a new helper method to keep these methods' flow unchanged.

Comment thread client/cmd/expose.go Outdated
Comment thread client/cmd/expose.go
Comment thread client/proto/daemon.proto
Comment thread client/server/server.go
Comment thread management/internals/modules/reverseproxy/manager/manager_test.go
Comment thread management/internals/shared/grpc/expose_service.go Outdated
Comment thread management/internals/shared/grpc/expose_service.go
Comment thread management/internals/shared/grpc/expose_service.go
Comment thread management/internals/shared/grpc/validate_session_test.go Outdated
Comment thread shared/management/http/api/openapi.yml Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
management/internals/modules/reverseproxy/manager/manager_test.go (1)

421-427: newProxyServer duplicates the inline creation logic in setupIntegrationTest.

The same three-line pattern (NewOneTimeTokenStoreNewProxyServiceServert.Cleanup(srv.Close)) appears both here (as a closure) and inlined in setupIntegrationTest (lines 707–709). Promoting this to a package-level helper would consolidate both sites.

♻️ Proposed refactor
+func newTestProxyServer(t *testing.T) *nbgrpc.ProxyServiceServer {
+	t.Helper()
+	tokenStore := nbgrpc.NewOneTimeTokenStore(1 * time.Hour)
+	srv := nbgrpc.NewProxyServiceServer(nil, tokenStore, nbgrpc.ProxyOIDCConfig{}, nil, nil)
+	t.Cleanup(srv.Close)
+	return srv
+}

Then replace both call sites:

-	newProxyServer := func(t *testing.T) *nbgrpc.ProxyServiceServer {
-		t.Helper()
-		tokenStore := nbgrpc.NewOneTimeTokenStore(1 * time.Hour)
-		srv := nbgrpc.NewProxyServiceServer(nil, tokenStore, nbgrpc.ProxyOIDCConfig{}, nil, nil)
-		t.Cleanup(srv.Close)
-		return srv
-	}

and

-	tokenStore := nbgrpc.NewOneTimeTokenStore(1 * time.Hour)
-	proxySrv := nbgrpc.NewProxyServiceServer(nil, tokenStore, nbgrpc.ProxyOIDCConfig{}, nil, nil)
-	t.Cleanup(proxySrv.Close)
+	proxySrv := newTestProxyServer(t)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager_test.go` around
lines 421 - 427, The closure newProxyServer duplicates the inline proxy server
setup in setupIntegrationTest; extract this logic into a package-level helper
(e.g., createProxyServer or newProxyServer) that constructs a
nbgrpc.NewOneTimeTokenStore, calls nbgrpc.NewProxyServiceServer with the same
args, registers t.Cleanup(srv.Close) and returns the server; then replace the
inline creation in setupIntegrationTest and the existing closure with calls to
that new helper so both sites share the single implementation.
client/cmd/expose.go (1)

66-71: signal.Stop not called — goroutine and channel are leaked until process exit.

When exposeFn returns normally (stream ended before any signal), the goroutine is permanently blocked on <-sigCh and signal.Notify retains a reference to the channel, preventing both from being GC'd. In practice this is benign for a CLI (the OS reclaims everything on exit), but it's still a latent goroutine leak that could matter if exposeFn is ever invoked from within a long-running process or test.

♻️ Suggested fix
 	sigCh := make(chan os.Signal, 1)
 	signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
+	defer signal.Stop(sigCh)
 	go func() {
-		<-sigCh
-		cancel()
+		select {
+		case <-sigCh:
+			cancel()
+		case <-ctx.Done():
+		}
 	}()

Adding defer signal.Stop(sigCh) deregisters the channel on return, and the ctx.Done() arm lets the goroutine exit cleanly when the context is cancelled through any path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/cmd/expose.go` around lines 66 - 71, The signal channel registered by
signal.Notify is never deregistered and the goroutine blocks on <-sigCh, so add
cleanup and a ctx-aware exit: after creating sigCh call defer signal.Stop(sigCh)
to deregister it, and change the goroutine body that currently does <-sigCh;
cancel() to a select that listens on sigCh and ctx.Done() so the goroutine exits
if the context is cancelled; keep calling cancel() when a real signal is
received. Reference sigCh, signal.Notify, signal.Stop, cancel(), and ctx.Done()
in the changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@management/internals/modules/reverseproxy/manager/manager_test.go`:
- Around line 699-704: Fix the mocks so each function slot dispatches
independently and the MockStore signature matches SqlStore: update
MockAccountManager.GetGroupByName to check and call GetGroupByNameFunc directly
(do not gate on GetGroupFunc) and ensure MockAccountManager.GetGroup checks
GetGroupFunc separately; change MockStore.GetGroupByName parameter order to
(lockStrength, accountID, groupName) to match SqlStore and update any internal
calls accordingly; finally, stop returning an unconditional &types.Group{} from
GetGroupFunc (return nil/error when no mock is set) so unexpected calls surface
during tests.

---

Duplicate comments:
In `@client/cmd/expose.go`:
- Around line 51-57: The port range validation around strconv.ParseUint(args[0],
10, 32) correctly checks parse errors and enforces 1–65535 via the port
variable; no code change required—leave the ParseUint call, the err check, and
the subsequent port == 0 || port > 65535 validation as-is in expose.go.
- Around line 105-121: Keep the existing default case in the switch over
event.Event — it's correct and should remain to catch unexpected event types;
ensure the switch still covers the cases *proto.ExposeServiceEvent_Ready,
*proto.ExposeServiceEvent_Error, *proto.ExposeServiceEvent_Stopped and the
default path that returns fmt.Errorf("unexpected expose event: %T", event.Event)
so unexpected events are surfaced rather than ignored.

In `@management/internals/modules/reverseproxy/manager/manager_test.go`:
- Around line 771-783: The test's inline comment is misleading; update the
comment in the "disallowed when no groups configured" subtest (around the setup
using setupIntegrationTest, testStore.GetAccountSettings, SaveAccountSettings
and calling mgr.ValidateExposePermission with testAccountID and testPeerID) to
clearly state: "Enable expose with empty groups — no groups configured means no
peer is allowed" so it matches the test behavior and intent.

---

Nitpick comments:
In `@client/cmd/expose.go`:
- Around line 66-71: The signal channel registered by signal.Notify is never
deregistered and the goroutine blocks on <-sigCh, so add cleanup and a ctx-aware
exit: after creating sigCh call defer signal.Stop(sigCh) to deregister it, and
change the goroutine body that currently does <-sigCh; cancel() to a select that
listens on sigCh and ctx.Done() so the goroutine exits if the context is
cancelled; keep calling cancel() when a real signal is received. Reference
sigCh, signal.Notify, signal.Stop, cancel(), and ctx.Done() in the changes.

In `@management/internals/modules/reverseproxy/manager/manager_test.go`:
- Around line 421-427: The closure newProxyServer duplicates the inline proxy
server setup in setupIntegrationTest; extract this logic into a package-level
helper (e.g., createProxyServer or newProxyServer) that constructs a
nbgrpc.NewOneTimeTokenStore, calls nbgrpc.NewProxyServiceServer with the same
args, registers t.Cleanup(srv.Close) and returns the server; then replace the
inline creation in setupIntegrationTest and the existing closure with calls to
that new helper so both sites share the single implementation.

Comment thread management/internals/modules/reverseproxy/manager/manager_test.go Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
management/internals/modules/reverseproxy/manager/manager_test.go (2)

692-695: storedEvents is written but never read.

storedEvents is appended to inside StoreEventFunc but no test using setupIntegrationTest ever asserts on it. This is dead code that can mislead future readers into thinking events are being verified at the integration level.

Consider either removing it (letting individual tests supply their own StoreEventFunc as they already do in TestDeletePeerService_SourcePeerValidation) or adding explicit event assertions in the integration tests that care about activity recording.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager_test.go` around
lines 692 - 695, Remove the unused dead state by deleting the storedEvents slice
and its append closure inside the mock_server.MockAccountManager StoreEventFunc
in setupIntegrationTest (or stop setting StoreEventFunc there); if integration
tests need to assert events, instead let individual tests provide their own
StoreEventFunc (as in TestDeletePeerService_SourcePeerValidation) and perform
explicit assertions on captured activity.Activity values. Ensure no other code
references storedEvents before deleting it.

793-876: Good integration tests for CreateServiceFromPeer.

Verifying domain derivation, custom domain passthrough, peer-IP host resolution, and store persistence covers the critical paths. One small observation: the "creates service with custom domain" sub-test (line 829) doesn't verify that the service was persisted in the store, unlike the "random domain" sub-test at line 822. If persistence is important for all paths, consider adding that assertion for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager_test.go` around
lines 793 - 876, The "creates service with custom domain" sub-test should also
verify persistence: change the setup call to capture the store (use mgr,
testStore := setupIntegrationTest(t)), then after calling
mgr.CreateServiceFromPeer(ctx, testAccountID, testPeerID, service) fetch the
persisted record via testStore.GetServiceByID(ctx, store.LockingStrengthNone,
testAccountID, created.ID) and add assertions mirroring the first sub-test
(assert.Equal for created.ID == persisted.ID and created.Domain ==
persisted.Domain) to ensure the service was stored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@management/internals/modules/reverseproxy/manager/manager_test.go`:
- Around line 698-704: The test uses a workaround because MockAccountManager's
GetGroupByName dispatch expects a different signature, causing GetGroupFunc vs
GetGroupByNameFunc to be mis-routed and parameters to be swapped; fix by making
the mock signature and dispatch consistent: update the mock implementation for
MockAccountManager (or MockStore) so GetGroupByNameFunc has the correct
parameter order and GetGroupFunc is not required as a shim, or add a small
adapter function (inside the test) that calls testStore.GetGroupByName(ctx,
store.LockingStrengthNone, accountID, groupName) with the correct ordering and
then set GetGroupByNameFunc to that adapter; ensure references include
MockAccountManager.GetGroupByName, GetGroupFunc, GetGroupByNameFunc,
MockStore.GetGroupByName and testStore.GetGroupByName so the mock and the test
use the same parameter order.

---

Nitpick comments:
In `@management/internals/modules/reverseproxy/manager/manager_test.go`:
- Around line 692-695: Remove the unused dead state by deleting the storedEvents
slice and its append closure inside the mock_server.MockAccountManager
StoreEventFunc in setupIntegrationTest (or stop setting StoreEventFunc there);
if integration tests need to assert events, instead let individual tests provide
their own StoreEventFunc (as in TestDeletePeerService_SourcePeerValidation) and
perform explicit assertions on captured activity.Activity values. Ensure no
other code references storedEvents before deleting it.
- Around line 793-876: The "creates service with custom domain" sub-test should
also verify persistence: change the setup call to capture the store (use mgr,
testStore := setupIntegrationTest(t)), then after calling
mgr.CreateServiceFromPeer(ctx, testAccountID, testPeerID, service) fetch the
persisted record via testStore.GetServiceByID(ctx, store.LockingStrengthNone,
testAccountID, created.ID) and add assertions mirroring the first sub-test
(assert.Equal for created.ID == persisted.ID and created.Domain ==
persisted.Domain) to ensure the service was stored.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
management/server/mock_server/account_mock.go (1)

871-876: ⚠️ Potential issue | 🔴 Critical

Pre-existing NPE: GetExternalCacheManager calls GetExternalCacheManagerFunc() unconditionally.

The guard condition invokes the function pointer rather than testing it for nil. If GetExternalCacheManagerFunc is never set, this panics instead of returning nil. Every other method in this file follows the if am.XFunc != nil { return am.XFunc(...) } pattern.

🐛 Proposed fix
 func (am *MockAccountManager) GetExternalCacheManager() account.ExternalCacheManager {
-	if am.GetExternalCacheManagerFunc() != nil {
-		return am.GetExternalCacheManagerFunc()
-	}
-	return nil
+	if am.GetExternalCacheManagerFunc != nil {
+		return am.GetExternalCacheManagerFunc()
+	}
+	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/server/mock_server/account_mock.go` around lines 871 - 876, The
method MockAccountManager.GetExternalCacheManager currently calls
am.GetExternalCacheManagerFunc() inside the if condition which can panic when
the function pointer is nil; change the guard to check the function pointer
itself (if am.GetExternalCacheManagerFunc != nil) and only then invoke it,
returning its result, otherwise return nil — follow the same pattern used by
other methods in the file and adjust MockAccountManager.GetExternalCacheManager
to reference the GetExternalCacheManagerFunc field rather than calling it in the
conditional.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@management/server/mock_server/account_mock.go`:
- Around line 871-876: The method MockAccountManager.GetExternalCacheManager
currently calls am.GetExternalCacheManagerFunc() inside the if condition which
can panic when the function pointer is nil; change the guard to check the
function pointer itself (if am.GetExternalCacheManagerFunc != nil) and only then
invoke it, returning its result, otherwise return nil — follow the same pattern
used by other methods in the file and adjust
MockAccountManager.GetExternalCacheManager to reference the
GetExternalCacheManagerFunc field rather than calling it in the conditional.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
management/internals/modules/reverseproxy/reverseproxy.go (1)

464-466: isAuthEnabled() ignores the Enabled field — may produce misleading activity metadata

The method returns true whenever any auth config struct is non-nil, even if Enabled == false. Because this feeds EventMeta() → activity records, a service with auth disabled but structurally present would be logged as "auth": true.

♻️ Proposed fix
 func (s *Service) isAuthEnabled() bool {
-    return s.Auth.PasswordAuth != nil || s.Auth.PinAuth != nil || s.Auth.BearerAuth != nil
+    return (s.Auth.PasswordAuth != nil && s.Auth.PasswordAuth.Enabled) ||
+        (s.Auth.PinAuth != nil && s.Auth.PinAuth.Enabled) ||
+        (s.Auth.BearerAuth != nil && s.Auth.BearerAuth.Enabled)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/reverseproxy.go` around lines 464 -
466, isAuthEnabled() currently returns true if any auth config pointer is
non-nil, which ignores each config's Enabled flag and can misreport auth in
EventMeta; update Service.isAuthEnabled to return true only when the config
pointer is non-nil AND its Enabled field is true (e.g., replace the current OR
of s.Auth.PasswordAuth != nil || ... with checks like s.Auth.PasswordAuth != nil
&& s.Auth.PasswordAuth.Enabled, and the same for PinAuth and BearerAuth) so
EventMeta and activity records reflect the actual enabled state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 719-723: addPeerInfoToEventMeta currently dereferences peer
without nil checks, which will panic if GetPeerByID returns (nil, nil); update
addPeerInfoToEventMeta to guard against a nil peer (and nil peer.IP) before
accessing peer.Name or calling peer.IP.String(): if peer is nil, return meta
unchanged (or set default values), otherwise set meta["peer_name"]=peer.Name and
meta["peer_ip"]=peer.IP.String() only when peer.IP is non-nil; also review
callers like CreateServiceFromPeer to ensure they can handle the
unchanged/default meta.
- Around line 591-596: The error message inside the block that generates a
random domain refers to service.ID which is not yet assigned; update the error
return in the if service.Domain == "" branch (the call to m.buildRandomDomain in
manager.go) to reference service.Name (or otherwise available identifier)
instead of service.ID, or move the InitNewRecord/initializeServiceForCreate call
to run before this domain-generation block; specifically change the fmt.Errorf
call in the domain creation branch to include service.Name so the logged
identifier is valid.

In `@management/internals/modules/reverseproxy/reverseproxy.go`:
- Around line 321-365: The FromExposeRequest function currently hardcodes
Target.Protocol to "http" instead of using req.Protocol from the
ExposeServiceRequest; change the code in FromExposeRequest to convert
req.Protocol (the ExposeProtocol enum) to a lowercase string and assign that to
the Target.Protocol field (e.g., strings.ToLower(req.Protocol.String())),
ensuring to import the strings package if missing; keep existing behavior for
other fields (Domain, Auth) unchanged and handle empty/unknown enum values
defensively (fall back to "http" if req.Protocol is unset or unknown).

---

Duplicate comments:
In `@management/internals/modules/reverseproxy/manager/manager_test.go`:
- Around line 698-700: The mock GetGroupByNameFunc has its parameter names
swapped (currently func(ctx context.Context, accountID, groupName string)),
causing values to be passed in the wrong order to testStore.GetGroupByName;
change the literal signature to func(ctx context.Context, groupName, accountID
string) so parameters match the actual Manager.GetGroupByName dispatch order,
leaving the inner call testStore.GetGroupByName(ctx, store.LockingStrengthNone,
groupName, accountID) unchanged; verify MockAccountManager.GetGroupByName and
store.Store.GetGroupByName signatures after the change to ensure ordering is
consistent.

In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 654-662: The buildRandomDomain method calls
m.clusterDeriver.GetClusterDomains() without checking if m.clusterDeriver is nil
and thus can panic; modify managerImpl.buildRandomDomain to first check if
m.clusterDeriver == nil and return a descriptive error (e.g., "no clusterDeriver
configured for service %s") or handle the nil by treating cluster domains as
empty, then proceed to call GetClusterDomains only when clusterDeriver is
non-nil and keep the existing empty-slice length check and error path; reference
the buildRandomDomain method, the managerImpl type and the
clusterDeriver/GetClusterDomains call when making the change.

---

Nitpick comments:
In `@management/internals/modules/reverseproxy/reverseproxy.go`:
- Around line 464-466: isAuthEnabled() currently returns true if any auth config
pointer is non-nil, which ignores each config's Enabled flag and can misreport
auth in EventMeta; update Service.isAuthEnabled to return true only when the
config pointer is non-nil AND its Enabled field is true (e.g., replace the
current OR of s.Auth.PasswordAuth != nil || ... with checks like
s.Auth.PasswordAuth != nil && s.Auth.PasswordAuth.Enabled, and the same for
PinAuth and BearerAuth) so EventMeta and activity records reflect the actual
enabled state.

Comment thread management/internals/modules/reverseproxy/manager/manager.go
Comment thread management/internals/modules/reverseproxy/manager/manager.go
Comment thread management/internals/modules/reverseproxy/reverseproxy.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
management/server/http/handlers/accounts/accounts_handler.go (1)

369-370: Potential nil serialization for PeerExposeGroups.

If settings.Extra.PeerExposeGroups is nil, it will serialize to JSON null rather than an empty array []. This is consistent with how FlowGroups (line 367) is handled but differs from JWTAllowGroups which is explicitly normalized to an empty slice (lines 329-331).

If the API schema expects a non-null array for peer_expose_groups, consider normalizing nil to an empty slice for consistency:

♻️ Optional: Normalize nil to empty slice
+		peerExposeGroups := settings.Extra.PeerExposeGroups
+		if peerExposeGroups == nil {
+			peerExposeGroups = []string{}
+		}
 		apiSettings.Extra = &api.AccountExtraSettings{
 			PeerApprovalEnabled:                settings.Extra.PeerApprovalEnabled,
 			UserApprovalRequired:               settings.Extra.UserApprovalRequired,
 			NetworkTrafficLogsEnabled:          settings.Extra.FlowEnabled,
 			NetworkTrafficLogsGroups:           settings.Extra.FlowGroups,
 			NetworkTrafficPacketCounterEnabled: settings.Extra.FlowPacketCounterEnabled,
 			PeerExposeEnabled:                  settings.Extra.PeerExposeEnabled,
-			PeerExposeGroups:                   settings.Extra.PeerExposeGroups,
+			PeerExposeGroups:                   peerExposeGroups,
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/server/http/handlers/accounts/accounts_handler.go` around lines
369 - 370, Normalize PeerExposeGroups the same way JWTAllowGroups is normalized:
when building the response in accounts_handler (the struct fields including
PeerExposeEnabled and PeerExposeGroups), replace direct use of
settings.Extra.PeerExposeGroups with a nil-safe value (e.g., if
settings.Extra.PeerExposeGroups == nil then use an empty []string{}), so
peer_expose_groups serializes as [] instead of null; follow the same pattern
used for JWTAllowGroups to keep behavior consistent.
management/internals/modules/reverseproxy/reverseproxy.go (2)

475-477: isAuthEnabled checks for config existence, not enabled state.

The function returns true if any auth config struct is non-nil, but doesn't check the Enabled flag within each config. This means a service with PasswordAuth{Enabled: false} would still report auth as enabled.

♻️ Proposed fix to check Enabled flag
 func (s *Service) isAuthEnabled() bool {
-	return s.Auth.PasswordAuth != nil || s.Auth.PinAuth != nil || s.Auth.BearerAuth != nil
+	return (s.Auth.PasswordAuth != nil && s.Auth.PasswordAuth.Enabled) ||
+		(s.Auth.PinAuth != nil && s.Auth.PinAuth.Enabled) ||
+		(s.Auth.BearerAuth != nil && s.Auth.BearerAuth.Enabled)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/reverseproxy.go` around lines 475 -
477, The isAuthEnabled method on Service incorrectly treats any non-nil auth
config as enabled; update Service.isAuthEnabled to check each auth config's
Enabled boolean (e.g., s.Auth.PasswordAuth != nil &&
s.Auth.PasswordAuth.Enabled, likewise for s.Auth.PinAuth and s.Auth.BearerAuth)
and return true only if at least one of those configs exists AND has Enabled ==
true; keep short-circuit logic so it still returns false when Auth itself or all
Enabled flags are false or nil.

367-376: Consider handling TCP/UDP protocols explicitly or returning an error.

The ExposeProtocol enum includes EXPOSE_TCP and EXPOSE_UDP values, but these fall through to the default case returning "http". If TCP/UDP are not yet supported, consider either:

  1. Logging a warning when an unsupported protocol is used
  2. Adding explicit cases that return an error or "tcp"/"udp" strings

This would prevent silent protocol substitution that could cause confusing behavior.

♻️ Proposed fix (option 1: explicit unsupported handling)
 func exposeProtocolToString(p proto.ExposeProtocol) string {
 	switch p {
 	case proto.ExposeProtocol_EXPOSE_HTTP:
 		return "http"
 	case proto.ExposeProtocol_EXPOSE_HTTPS:
 		return "https"
+	case proto.ExposeProtocol_EXPOSE_TCP, proto.ExposeProtocol_EXPOSE_UDP:
+		// TCP/UDP not yet supported, fall back to HTTP
+		log.Warnf("unsupported expose protocol %v, defaulting to http", p)
+		return "http"
 	default:
 		return "http"
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/reverseproxy.go` around lines 367 -
376, The exposeProtocolToString function currently maps EXPOSE_TCP and
EXPOSE_UDP to the default "http" silently; update exposeProtocolToString (which
takes proto.ExposeProtocol) to handle EXPOSE_TCP and EXPOSE_UDP explicitly
instead of returning "http" by default—either return "tcp"/"udp" strings for
those cases or return an error/empty string and log a warning via the existing
logger when an unsupported protocol is encountered; ensure the switch includes
cases for proto.ExposeProtocol_EXPOSE_TCP and proto.ExposeProtocol_EXPOSE_UDP
and that callers handle the new error/unsupported return appropriately.
management/internals/modules/reverseproxy/manager/manager.go (1)

639-652: Return nil instead of empty slice when returning an error.

At line 641, the function returns []string{}, fmt.Errorf(...). When returning an error, conventionally the value should be nil to clearly indicate failure rather than an empty but valid slice.

♻️ Proposed fix
 func (m *managerImpl) getGroupIDsFromNames(ctx context.Context, accountID string, groupNames []string) ([]string, error) {
 	if len(groupNames) == 0 {
-		return []string{}, fmt.Errorf("no group names provided")
+		return nil, fmt.Errorf("no group names provided")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager.go` around lines
639 - 652, In getGroupIDsFromNames (method on managerImpl) change the error
return for the empty-input check to return a nil slice instead of an empty
slice; specifically replace the current return of []string{} with nil when
returning the fmt.Errorf("no group names provided") error so callers can
distinguish failure (nil) from a successful empty result, leaving the loop error
returns (which already return nil) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 599-605: The error formatting references service.ID before it’s
set; either move the initialization so service.ID is populated before the
getGroupIDsFromNames call or change the error to not use service.ID. Concretely,
call initializeServiceForCreate (or service.InitNewRecord()) prior to the block
that invokes m.getGroupIDsFromNames with
service.Auth.BearerAuth.DistributionGroups so service.ID is assigned, or modify
the fmt.Errorf call around m.getGroupIDsFromNames to use a stable identifier
(e.g., service.Name or accountID) instead of service.ID.

---

Duplicate comments:
In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 654-662: buildRandomDomain currently calls
m.clusterDeriver.GetClusterDomains() without checking m.clusterDeriver for nil;
add a nil-check at the start of buildRandomDomain and return a clear error
(e.g., "no cluster deriver configured for service <name>") if m.clusterDeriver
is nil to avoid a nil-pointer panic. Update buildRandomDomain to validate
m.clusterDeriver before calling GetClusterDomains, keep the existing
empty-clusterDomains check, and ensure callers like
CreateServiceFromPeer/initializeServiceForCreate propagate the error rather than
assuming a domain was returned.

---

Nitpick comments:
In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 639-652: In getGroupIDsFromNames (method on managerImpl) change
the error return for the empty-input check to return a nil slice instead of an
empty slice; specifically replace the current return of []string{} with nil when
returning the fmt.Errorf("no group names provided") error so callers can
distinguish failure (nil) from a successful empty result, leaving the loop error
returns (which already return nil) unchanged.

In `@management/internals/modules/reverseproxy/reverseproxy.go`:
- Around line 475-477: The isAuthEnabled method on Service incorrectly treats
any non-nil auth config as enabled; update Service.isAuthEnabled to check each
auth config's Enabled boolean (e.g., s.Auth.PasswordAuth != nil &&
s.Auth.PasswordAuth.Enabled, likewise for s.Auth.PinAuth and s.Auth.BearerAuth)
and return true only if at least one of those configs exists AND has Enabled ==
true; keep short-circuit logic so it still returns false when Auth itself or all
Enabled flags are false or nil.
- Around line 367-376: The exposeProtocolToString function currently maps
EXPOSE_TCP and EXPOSE_UDP to the default "http" silently; update
exposeProtocolToString (which takes proto.ExposeProtocol) to handle EXPOSE_TCP
and EXPOSE_UDP explicitly instead of returning "http" by default—either return
"tcp"/"udp" strings for those cases or return an error/empty string and log a
warning via the existing logger when an unsupported protocol is encountered;
ensure the switch includes cases for proto.ExposeProtocol_EXPOSE_TCP and
proto.ExposeProtocol_EXPOSE_UDP and that callers handle the new
error/unsupported return appropriately.

In `@management/server/http/handlers/accounts/accounts_handler.go`:
- Around line 369-370: Normalize PeerExposeGroups the same way JWTAllowGroups is
normalized: when building the response in accounts_handler (the struct fields
including PeerExposeEnabled and PeerExposeGroups), replace direct use of
settings.Extra.PeerExposeGroups with a nil-safe value (e.g., if
settings.Extra.PeerExposeGroups == nil then use an empty []string{}), so
peer_expose_groups serializes as [] instead of null; follow the same pattern
used for JWTAllowGroups to keep behavior consistent.

Comment thread management/internals/modules/reverseproxy/manager/manager.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
management/internals/shared/grpc/expose_service.go (3)

9-9: Deprecated github.com/golang/protobuf/proto import.

This was flagged in a prior review. The github.com/golang/protobuf module is deprecated in favor of google.golang.org/protobuf. If the encryption.EncryptMessage signature accepts google.golang.org/protobuf/proto.Message, you can switch; otherwise, the constraint comes from upstream and should be noted.

#!/bin/bash
# Check what type encryption.EncryptMessage expects
ast-grep --pattern 'func EncryptMessage($_, $_, $_ $_) ($_, $_) {
  $$$
}'
# Also check the import used in encryption package
rg -n 'import' encryption/message.go -A 10
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/shared/grpc/expose_service.go` at line 9, Replace the
deprecated import pb "github.com/golang/protobuf/proto" in expose_service.go
with the modern google.golang.org/protobuf/proto package (e.g., import proto
"google.golang.org/protobuf/proto"), then update any uses that reference
pb.Message to proto.Message so they match the new API; if
encryption.EncryptMessage currently expects the old golang/protobuf types, open
the encryption package (function EncryptMessage) and change its parameter types
to google.golang.org/protobuf/proto.Message (or add a conversion wrapper) so
callers in expose_service.go compile, otherwise document that the upstream
encryption API must be updated and keep the old import until that upstream
change is made.

92-118: Race window between count check and LoadOrStore — concurrent creates can exceed maxExposesPerPeer.

The exposeCreateMu lock (Lines 95-100) is released before CreateServiceFromPeer (Line 102) and LoadOrStore (Line 109). Two concurrent requests for the same peer with different domains can both pass the count check and insert, exceeding the limit.

This was flagged in a prior review. The current approach is a reasonable trade-off (avoids holding a global lock during I/O), but the limit enforcement is best-effort rather than strict. If strict enforcement is needed in the future, consider a per-peer lock or a two-phase reservation pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/shared/grpc/expose_service.go` around lines 92 - 118,
There is a race between the count check and LoadOrStore that can let concurrent
CreateExpose calls exceed maxExposesPerPeer; fix by making a short reservation
under the existing exposeCreateMu: after verifying countPeerExposes(peer.ID) <
maxExposesPerPeer, insert a temporary reservation entry into s.activeExposes
(use exposeKey(peer.ID, "<reserved-UUID>") or a tombstone/placeholder) while
still holding exposeCreateMu, then release the lock and call
reverseProxyMgr.CreateServiceFromPeer; on success replace the reservation with
the real activeExpose (use created.Domain for the final key) and on failure
remove the reservation (or call s.deleteExposeService) to free the slot;
reference functions/fields: exposeCreateMu, countPeerExposes,
reverseProxyMgr.CreateServiceFromPeer, s.activeExposes, exposeKey,
deleteExposeService and ensure cleanup on any error to keep counts correct.

249-257: ⚠️ Potential issue | 🟡 Minor

Cleanup failure logged at debug level — can silently leave orphaned services.

deleteExposeService (Line 255) logs at Debugf on failure. This is called on the duplicate-detection path (Line 116), where a service was already persisted to the DB. If deletion fails, the orphaned service is invisible unless debug logging is enabled. Consider logging at Warn level for this path.

Proposed fix
 func (s *Server) deleteExposeService(ctx context.Context, accountID, peerID string, service *reverseproxy.Service) {
 	reverseProxyMgr := s.getReverseProxyManager()
 	if reverseProxyMgr == nil {
 		return
 	}
 	if err := reverseProxyMgr.DeleteServiceFromPeer(ctx, accountID, peerID, service.ID); err != nil {
-		log.WithContext(ctx).Debugf("failed to delete expose service %s: %v", service.ID, err)
+		log.WithContext(ctx).Warnf("failed to delete expose service %s (potential orphan): %v", service.ID, err)
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/shared/grpc/expose_service.go` around lines 249 - 257,
The deleteExposeService function currently logs failures at Debugf which can
hide orphaned services; update deleteExposeService to log deletion errors at
Warn level (e.g., replace the log.WithContext(ctx).Debugf call with a Warnf) and
include the service.ID and error from reverseProxyMgr.DeleteServiceFromPeer so
failed deletions are visible in normal logs; reference deleteExposeService,
reverseProxyMgr.DeleteServiceFromPeer, and the existing
log.WithContext(ctx).Debugf location when making the change.
🧹 Nitpick comments (6)
shared/management/client/client.go (1)

29-30: Parameter name domain shadows the imported domain package.

The parameter identifier domain in RenewExpose and StopExpose clashes with the package-level import "github.com/netbirdio/netbird/shared/management/domain" (line 10). Interface declarations have no body, so this compiles fine here. However, any concrete implementor that also needs the domain package inside these method bodies will hit a name collision and must work around it.

Consider a less ambiguous parameter name:

✏️ Suggested rename
-	RenewExpose(ctx context.Context, domain string) error
-	StopExpose(ctx context.Context, domain string) error
+	RenewExpose(ctx context.Context, serviceDomain string) error
+	StopExpose(ctx context.Context, serviceDomain string) error
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shared/management/client/client.go` around lines 29 - 30, The parameter name
`domain` in the interface methods RenewExpose and StopExpose shadows the
imported package "github.com/netbirdio/netbird/shared/management/domain"; rename
the parameter (for example to `domainName` or `exposeDomain`) in the RenewExpose
and StopExpose signatures so implementations can refer to the imported domain
package without a name collision, and update any callers accordingly.
client/internal/expose/manager.go (2)

36-44: Storing context.Context in a struct — consider documenting the rationale.

Holding ctx in the Manager struct is unconventional in Go (per the context package docs: "Do not store Contexts inside a struct type"). Here it's intentional so stop() can outlive the caller's context. A brief comment on Line 39 explaining why ctx is stored (e.g., "used as parent for StopExpose so cleanup survives caller cancellation") would help future readers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/internal/expose/manager.go` around lines 36 - 44, Add a brief
explanatory comment for the ctx field on the Manager struct to justify storing a
context in the struct (contrary to the usual guidance); mention that ctx is
intentionally kept as the parent context for cleanup/stop operations (e.g., used
by stop() / StopExpose) so those shutdown actions can complete even if the
caller's context is canceled, and place the comment adjacent to the ctx field in
the Manager declaration (affecting Manager and NewManager usage).

60-78: KeepAlive silently succeeds on context cancellation but returns error on renewal failure — consider logging consistency.

When ctx is canceled (Line 67-70), KeepAlive logs and returns nil. When renew fails (Line 72-74), it logs and returns the error. In both cases stop() runs via defer, which is correct. However, a single failed renewal immediately terminates the session — there's no retry/grace period. If the management server has a brief hiccup, the user's expose session is torn down.

Consider whether a short retry or tolerance for transient failures would be appropriate (the server-side TTL is 90s, so a couple of failed 10s renewals could be absorbed).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/internal/expose/manager.go` around lines 60 - 78, KeepAlive currently
returns immediately on the first renewal error from m.renew and logs it, which
tears down the expose session on transient server hiccups; change KeepAlive
(method of Manager) to tolerate a small number of consecutive renew failures
(e.g., 2–3 retries given the 90s TTL and 10s ticker) by counting consecutive
errors, logging them at Warn level, and only calling return err and stop after
the threshold is exceeded; reset the consecutive-failure counter to zero on a
successful m.renew, and ensure m.stop(domain) still runs via the existing defer.
client/cmd/expose.go (1)

179-193: waitForExposeEvents discards all received events — no handling of server-side stop/error signals.

The loop ignores the event content (_, err := stream.Recv()). If the daemon were to send a Stopped or Error event in the future, it would be silently consumed. Currently the proto ExposeServiceEvent oneof only contains ready, so this is technically correct today, but it makes the loop fragile against future proto extensions.

Consider at least logging unexpected events at debug level for observability.

Proposed improvement
 func waitForExposeEvents(cmd *cobra.Command, ctx context.Context, stream proto.DaemonService_ExposeServiceClient) error {
 	for {
-		_, err := stream.Recv()
+		event, err := stream.Recv()
 		if err != nil {
 			if ctx.Err() != nil {
 				cmd.Println("\nService stopped.")
 				//nolint:nilerr
 				return nil
 			}
 			if errors.Is(err, io.EOF) {
 				return fmt.Errorf("connection to daemon closed unexpectedly")
 			}
 			return fmt.Errorf("stream error: %w", err)
 		}
+		log.Debugf("received unexpected expose event: %T", event.Event)
 	}
 }
management/internals/shared/grpc/expose_service.go (1)

215-230: encryptResponse swallows the underlying error — consider logging it server-side.

Line 218 returns a generic "internal error" to the client (good for security), but doesn't log the actual error from GetWGKey(). Same for the encryption failure on Line 223. Adding a server-side log before returning would aid debugging.

Proposed fix
 func (s *Server) encryptResponse(peerKey wgtypes.Key, msg pb.Message) (*proto.EncryptedMessage, error) {
 	wgKey, err := s.secretsManager.GetWGKey()
 	if err != nil {
+		log.Errorf("failed to get WG key for response encryption: %v", err)
 		return nil, status.Errorf(codes.Internal, "internal error")
 	}
 
 	encryptedResp, err := encryption.EncryptMessage(peerKey, wgKey, msg)
 	if err != nil {
+		log.Errorf("failed to encrypt response for peer %s: %v", peerKey.String(), err)
 		return nil, status.Errorf(codes.Internal, "encrypt response")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/shared/grpc/expose_service.go` around lines 215 - 230,
encryptResponse currently returns generic internal errors but drops the
underlying errors from s.secretsManager.GetWGKey() and
encryption.EncryptMessage(), making debugging hard; update encryptResponse to
log the actual error details to the server-side logger before returning the
generic status.Errorf, e.g. capture err after GetWGKey() and after
encryption.EncryptMessage() and call the Server's logger (use the existing
logging field on Server, e.g. s.logger or s.log) to log a descriptive message
with err, then return the same status.Errorf to the client; keep the returned
messages unchanged for security.
shared/management/client/grpc.go (1)

745-789: RenewExpose and StopExpose each call GetServerPublicKey() — consider caching or passing the key.

Each renewal (every 10s) makes a separate GetServerKey RPC to the management server before the actual RenewExpose RPC. That doubles the request rate during keep-alive. This follows the existing pattern in the codebase, so it's not a regression, but for a high-frequency call like renewal it's worth considering caching the server key (it rarely changes).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shared/management/client/grpc.go` around lines 745 - 789, RenewExpose and
StopExpose call GetServerPublicKey() on every invocation causing extra RPCs; to
fix, add a cached server public key on GrpcClient (e.g., a serverPubKey
*wgkey.PublicKey field plus a mutex or atomic) and modify GetServerPublicKey (or
add ensureServerPublicKey()) to return the cached key if present, otherwise
fetch, cache, and return it; update RenewExpose and StopExpose to use the cached
key instead of calling GetServerPublicKey() unconditionally and ensure you
invalidate or refresh the cache on errors or when the server indicates key
rotation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@client/cmd/expose.go`:
- Around line 146-155: The error string in toExposeProtocol uses inconsistent
quoting: update the fmt.Errorf in toExposeProtocol to use matching single quotes
around both options (e.g., "unsupported protocol %q: only 'http' or 'https' are
supported") so the message is consistent and includes quotes around 'https';
modify the return error construction in function toExposeProtocol accordingly.

In `@client/proto/daemon.proto`:
- Around line 825-843: ExposeServiceStopped and ExposeServiceError are defined
but not referenced in the ExposeServiceEvent oneof and are unused; either remove
these unused messages or wire them up: add ExposeServiceStopped and
ExposeServiceError to the ExposeServiceEvent oneof alongside ExposeServiceReady,
then update server code that currently emits ExposeServiceReady to also emit the
stopped/error variants where appropriate (look for the logic or function that
constructs/sends ExposeServiceReady) and update the CLI/event consumer to handle
ExposeServiceStopped and ExposeServiceError cases when parsing
ExposeServiceEvent so stop/error conditions are surfaced to users.

---

Duplicate comments:
In `@management/internals/shared/grpc/expose_service.go`:
- Line 9: Replace the deprecated import pb "github.com/golang/protobuf/proto" in
expose_service.go with the modern google.golang.org/protobuf/proto package
(e.g., import proto "google.golang.org/protobuf/proto"), then update any uses
that reference pb.Message to proto.Message so they match the new API; if
encryption.EncryptMessage currently expects the old golang/protobuf types, open
the encryption package (function EncryptMessage) and change its parameter types
to google.golang.org/protobuf/proto.Message (or add a conversion wrapper) so
callers in expose_service.go compile, otherwise document that the upstream
encryption API must be updated and keep the old import until that upstream
change is made.
- Around line 92-118: There is a race between the count check and LoadOrStore
that can let concurrent CreateExpose calls exceed maxExposesPerPeer; fix by
making a short reservation under the existing exposeCreateMu: after verifying
countPeerExposes(peer.ID) < maxExposesPerPeer, insert a temporary reservation
entry into s.activeExposes (use exposeKey(peer.ID, "<reserved-UUID>") or a
tombstone/placeholder) while still holding exposeCreateMu, then release the lock
and call reverseProxyMgr.CreateServiceFromPeer; on success replace the
reservation with the real activeExpose (use created.Domain for the final key)
and on failure remove the reservation (or call s.deleteExposeService) to free
the slot; reference functions/fields: exposeCreateMu, countPeerExposes,
reverseProxyMgr.CreateServiceFromPeer, s.activeExposes, exposeKey,
deleteExposeService and ensure cleanup on any error to keep counts correct.
- Around line 249-257: The deleteExposeService function currently logs failures
at Debugf which can hide orphaned services; update deleteExposeService to log
deletion errors at Warn level (e.g., replace the log.WithContext(ctx).Debugf
call with a Warnf) and include the service.ID and error from
reverseProxyMgr.DeleteServiceFromPeer so failed deletions are visible in normal
logs; reference deleteExposeService, reverseProxyMgr.DeleteServiceFromPeer, and
the existing log.WithContext(ctx).Debugf location when making the change.

---

Nitpick comments:
In `@client/internal/expose/manager.go`:
- Around line 36-44: Add a brief explanatory comment for the ctx field on the
Manager struct to justify storing a context in the struct (contrary to the usual
guidance); mention that ctx is intentionally kept as the parent context for
cleanup/stop operations (e.g., used by stop() / StopExpose) so those shutdown
actions can complete even if the caller's context is canceled, and place the
comment adjacent to the ctx field in the Manager declaration (affecting Manager
and NewManager usage).
- Around line 60-78: KeepAlive currently returns immediately on the first
renewal error from m.renew and logs it, which tears down the expose session on
transient server hiccups; change KeepAlive (method of Manager) to tolerate a
small number of consecutive renew failures (e.g., 2–3 retries given the 90s TTL
and 10s ticker) by counting consecutive errors, logging them at Warn level, and
only calling return err and stop after the threshold is exceeded; reset the
consecutive-failure counter to zero on a successful m.renew, and ensure
m.stop(domain) still runs via the existing defer.

In `@management/internals/shared/grpc/expose_service.go`:
- Around line 215-230: encryptResponse currently returns generic internal errors
but drops the underlying errors from s.secretsManager.GetWGKey() and
encryption.EncryptMessage(), making debugging hard; update encryptResponse to
log the actual error details to the server-side logger before returning the
generic status.Errorf, e.g. capture err after GetWGKey() and after
encryption.EncryptMessage() and call the Server's logger (use the existing
logging field on Server, e.g. s.logger or s.log) to log a descriptive message
with err, then return the same status.Errorf to the client; keep the returned
messages unchanged for security.

In `@shared/management/client/client.go`:
- Around line 29-30: The parameter name `domain` in the interface methods
RenewExpose and StopExpose shadows the imported package
"github.com/netbirdio/netbird/shared/management/domain"; rename the parameter
(for example to `domainName` or `exposeDomain`) in the RenewExpose and
StopExpose signatures so implementations can refer to the imported domain
package without a name collision, and update any callers accordingly.

In `@shared/management/client/grpc.go`:
- Around line 745-789: RenewExpose and StopExpose call GetServerPublicKey() on
every invocation causing extra RPCs; to fix, add a cached server public key on
GrpcClient (e.g., a serverPubKey *wgkey.PublicKey field plus a mutex or atomic)
and modify GetServerPublicKey (or add ensureServerPublicKey()) to return the
cached key if present, otherwise fetch, cache, and return it; update RenewExpose
and StopExpose to use the cached key instead of calling GetServerPublicKey()
unconditionally and ensure you invalidate or refresh the cache on errors or when
the server indicates key rotation.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9621c10 and 0494f39.

⛔ Files ignored due to path filters (1)
  • client/proto/daemon.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (11)
  • client/cmd/expose.go
  • client/internal/engine.go
  • client/internal/expose/manager.go
  • client/internal/expose/manager_test.go
  • client/internal/expose/request.go
  • client/proto/daemon.proto
  • client/server/server.go
  • management/internals/shared/grpc/expose_service.go
  • shared/management/client/client.go
  • shared/management/client/grpc.go
  • shared/management/client/mock.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • client/internal/expose/manager_test.go
  • client/internal/engine.go
  • client/server/server.go

Comment thread client/cmd/expose.go
Comment thread client/proto/daemon.proto Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
client/proto/daemon.proto (1)

808-813: EXPOSE_TCP and EXPOSE_UDP remain unreachable from the CLI.

toExposeProtocol in expose.go only maps "http" and "https", so both enum values are currently inaccessible and the underlying reverseproxy.go path is also hardcoded to HTTP. This was flagged in a previous review; the maintainer's call to defer multi-protocol support is acknowledged — just a reminder to revisit when TCP/UDP land.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/proto/daemon.proto` around lines 808 - 813, The ExposeProtocol enum
defines EXPOSE_TCP and EXPOSE_UDP but they are unreachable because
toExposeProtocol in expose.go only maps "http" and "https" and reverseproxy.go
hardcodes an HTTP path; update toExposeProtocol to accept and return the TCP and
UDP enum variants for the corresponding CLI strings (e.g., "tcp", "udp") and
update the reverseproxy logic in reverseproxy.go (or add guards where
reverseproxy is invoked) to either handle non-HTTP protocols or return a clear
unsupported-protocol error until TCP/UDP implementation is added; reference
ExposeProtocol, toExposeProtocol, and reverseproxy.go when making these changes.
management/internals/modules/reverseproxy/manager/manager.go (2)

665-670: ⚠️ Potential issue | 🟡 Minor

service.ID is empty in the error message at line 668 — initializeServiceForCreate hasn't run yet.

service.ID is assigned inside initializeServiceForCreate (line 673). The error at line 668 always prints an empty ID. Use service.Name instead, consistent with how the domain-build error at line 660 was fixed.

🐛 Proposed fix
-			return nil, fmt.Errorf("get group ids for service %s: %w", service.ID, err)
+			return nil, fmt.Errorf("get group ids for service %q: %w", service.Name, err)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager.go` around lines
665 - 670, The error message when getGroupIDsFromNames fails uses service.ID
which is still empty because initializeServiceForCreate hasn't run; update the
fmt.Errorf call in the block that checks service.Auth.BearerAuth (the code
calling m.getGroupIDsFromNames) to reference service.Name instead of service.ID
so the log shows a useful identifier; keep the rest of the error wrap intact and
ensure no other code paths around initializeServiceForCreate rely on service.ID
being present.

720-728: ⚠️ Potential issue | 🔴 Critical

buildRandomDomain dereferences m.clusterDeriver without a nil guard — will panic if clusterDeriver is nil.

CreateServiceFromPeer calls buildRandomDomain at line 658 when service.Domain == "", but initializeServiceForCreate (the only place that previously guarded against nil clusterDeriver, line 170) is not called until line 673 — after buildRandomDomain. If the manager is constructed without a clusterDeriver and a peer expose request arrives without a custom domain, line 721 panics.

🛡️ Proposed fix
 func (m *managerImpl) buildRandomDomain(name string) (string, error) {
+	if m.clusterDeriver == nil {
+		return "", fmt.Errorf("cluster deriver is not configured")
+	}
 	clusterDomains := m.clusterDeriver.GetClusterDomains()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager.go` around lines
720 - 728, buildRandomDomain currently dereferences m.clusterDeriver and will
panic if it's nil; add a nil guard in buildRandomDomain (check m.clusterDeriver
== nil) and return a descriptive error (e.g., "clusterDeriver is nil, cannot
build domain for service <name>") instead of dereferencing, so callers like
CreateServiceFromPeer can handle the error; update callers if needed
(CreateServiceFromPeer) to propagate or handle this error rather than assuming a
domain, and keep initializeServiceForCreate as-is.
management/internals/modules/reverseproxy/manager/manager_test.go (1)

696-698: Verify GetGroupByNameFunc parameter swap is intentional and correct.

The mock parameter names (accountID, groupName) are inverted relative to the values actually received: getGroupIDsFromNames calls m.accountManager.GetGroupByName(ctx, groupName, accountID), so the mock receives groupName as its first parameter (locally named accountID) and accountID as its second (locally named groupName). The body then passes them re-swapped to testStore.GetGroupByName. This is only correct if store.Store.GetGroupByName has signature (ctx, lockStrength, accountID, groupName) — please verify.

#!/bin/bash
# Verify the store.Store.GetGroupByName interface signature
rg -n "GetGroupByName" --type go -A 2 management/server/store/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager_test.go` around
lines 696 - 698, The mock GetGroupByNameFunc has its parameter names swapped
relative to the caller (getGroupIDsFromNames calls
m.accountManager.GetGroupByName(ctx, groupName, accountID)) which risks passing
arguments in the wrong order to testStore.GetGroupByName; verify the actual
signature of store.Store.GetGroupByName (look for
store.Store.GetGroupByName(ctx, lockStrength, accountID, groupName) vs (ctx,
lockStrength, groupName, accountID)), and then either (A) correct the mock’s
parameter order and names so they match the caller (change the func signature to
func(ctx context.Context, groupName, accountID string) and pass them into
testStore.GetGroupByName in the correct order) or (B) if the store signature
expects (accountID, groupName) keep order but rename local parameters to avoid
confusion; update GetGroupByNameFunc, the call sites
(m.accountManager.GetGroupByName), and any tests to use the verified ordering.
🧹 Nitpick comments (3)
client/cmd/expose.go (2)

78-80: strings.ToLower called twice on the same value.

♻️ Suggested refactor
 func isProtocolValid(exposeProtocol string) bool {
-	return strings.ToLower(exposeProtocol) == "http" || strings.ToLower(exposeProtocol) == "https"
+	p := strings.ToLower(exposeProtocol)
+	return p == "http" || p == "https"
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/cmd/expose.go` around lines 78 - 80, The isProtocolValid function
redundantly calls strings.ToLower twice on exposeProtocol; change it to compute
the lowercased value once (e.g., v := strings.ToLower(exposeProtocol)) and
compare v == "http" || v == "https" (or alternatively use strings.EqualFold for
case-insensitive comparisons) to avoid duplicate work and clarify intent.

99-107: Signal goroutine outlives the function when no Ctrl+C is received.

defer cancel() on line 100 cancels the context on return, but the goroutine on lines 104-107 is blocked on <-sigCh with no way to observe context cancellation. It leaks until an OS signal arrives or the process exits. For this CLI use-case the process exits right after, making the leak benign — but select with ctx.Done() is the idiomatic fix.

♻️ Suggested refactor
 sigCh := make(chan os.Signal, 1)
 signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
+defer signal.Stop(sigCh)
 go func() {
-	<-sigCh
-	cancel()
+	select {
+	case <-sigCh:
+		cancel()
+	case <-ctx.Done():
+	}
 }()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/cmd/expose.go` around lines 99 - 107, The spawned goroutine reading
from sigCh can block and leak because it doesn't observe context cancellation;
change the goroutine started after signal.Notify to use a select that listens on
both sigCh and ctx.Done(), calling cancel() only when a signal is received and
returning the goroutine when ctx.Done() is closed; also call signal.Stop(sigCh)
(or close the channel) as part of shutdown to stop further notifications. Target
symbols: ctx, cancel, sigCh, signal.Notify and the anonymous goroutine.
proxy/management_integration_test.go (1)

254-256: CreateServiceFromPeer return value is inconsistent with CreateService.

CreateService (line 206) returns nil, errors.New("not implemented") to signal the method is a deliberate no-op, but CreateServiceFromPeer returns a success value (&reverseproxy.Service{}, nil). This asymmetry could mislead future test authors into thinking expose-create is functional in this test double. Consider aligning them:

♻️ Proposed consistency fix
-func (m *storeBackedServiceManager) CreateServiceFromPeer(_ context.Context, _, _ string, _ *reverseproxy.Service) (*reverseproxy.Service, error) {
-	return &reverseproxy.Service{}, nil
-}
+func (m *storeBackedServiceManager) CreateServiceFromPeer(_ context.Context, _, _ string, _ *reverseproxy.Service) (*reverseproxy.Service, error) {
+	return nil, errors.New("not implemented")
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/management_integration_test.go` around lines 254 - 256, The test
double's CreateServiceFromPeer currently returns a successful empty Service
(&reverseproxy.Service{}, nil) which is inconsistent with CreateService (which
returns nil, errors.New("not implemented")); update CreateServiceFromPeer in
storeBackedServiceManager to mirror CreateService by returning nil and the same
not-implemented error (use the same error value/text) so both methods clearly
signal the deliberate no-op; adjust the return in CreateServiceFromPeer
accordingly and reuse the same error where possible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 629-649: The three plain fmt.Errorf returns in
ValidateExposePermission (the checks against settings.PeerExposeEnabled,
settings.PeerExposeGroups and the final "peer is not in an allowed expose
group") must be converted to gRPC status errors so callers receive
PermissionDenied: replace the fmt.Errorf returns with
status.Errorf(codes.PermissionDenied, "...") using the same human messages, and
keep the store error return as a wrapped gRPC error (e.g.,
status.Errorf(codes.Internal, "get peer groups: %v", err)) so
m.store.GetPeerGroupIDs errors map to an appropriate gRPC code; update imports
if needed and preserve existing log.WithContext(ctx).Errorf call and the
slices.Contains check/loop logic.

In `@management/server/http/handlers/accounts/accounts_handler.go`:
- Around line 346-347: The PeerExposeGroups field is being assigned directly
from settings.PeerExposeGroups which may be nil and will marshal to JSON null;
update the struct assignment where PeerExposeGroups is set (the same block that
sets JwtAllowGroups) to nil-coalesce it to an empty slice (e.g., use the same
pattern used for JwtAllowGroups or a conditional that sets PeerExposeGroups =
[]string{} when settings.PeerExposeGroups == nil) so the JSON output is an empty
array instead of null.

---

Duplicate comments:
In `@client/proto/daemon.proto`:
- Around line 808-813: The ExposeProtocol enum defines EXPOSE_TCP and EXPOSE_UDP
but they are unreachable because toExposeProtocol in expose.go only maps "http"
and "https" and reverseproxy.go hardcodes an HTTP path; update toExposeProtocol
to accept and return the TCP and UDP enum variants for the corresponding CLI
strings (e.g., "tcp", "udp") and update the reverseproxy logic in
reverseproxy.go (or add guards where reverseproxy is invoked) to either handle
non-HTTP protocols or return a clear unsupported-protocol error until TCP/UDP
implementation is added; reference ExposeProtocol, toExposeProtocol, and
reverseproxy.go when making these changes.

In `@management/internals/modules/reverseproxy/manager/manager_test.go`:
- Around line 696-698: The mock GetGroupByNameFunc has its parameter names
swapped relative to the caller (getGroupIDsFromNames calls
m.accountManager.GetGroupByName(ctx, groupName, accountID)) which risks passing
arguments in the wrong order to testStore.GetGroupByName; verify the actual
signature of store.Store.GetGroupByName (look for
store.Store.GetGroupByName(ctx, lockStrength, accountID, groupName) vs (ctx,
lockStrength, groupName, accountID)), and then either (A) correct the mock’s
parameter order and names so they match the caller (change the func signature to
func(ctx context.Context, groupName, accountID string) and pass them into
testStore.GetGroupByName in the correct order) or (B) if the store signature
expects (accountID, groupName) keep order but rename local parameters to avoid
confusion; update GetGroupByNameFunc, the call sites
(m.accountManager.GetGroupByName), and any tests to use the verified ordering.

In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 665-670: The error message when getGroupIDsFromNames fails uses
service.ID which is still empty because initializeServiceForCreate hasn't run;
update the fmt.Errorf call in the block that checks service.Auth.BearerAuth (the
code calling m.getGroupIDsFromNames) to reference service.Name instead of
service.ID so the log shows a useful identifier; keep the rest of the error wrap
intact and ensure no other code paths around initializeServiceForCreate rely on
service.ID being present.
- Around line 720-728: buildRandomDomain currently dereferences m.clusterDeriver
and will panic if it's nil; add a nil guard in buildRandomDomain (check
m.clusterDeriver == nil) and return a descriptive error (e.g., "clusterDeriver
is nil, cannot build domain for service <name>") instead of dereferencing, so
callers like CreateServiceFromPeer can handle the error; update callers if
needed (CreateServiceFromPeer) to propagate or handle this error rather than
assuming a domain, and keep initializeServiceForCreate as-is.

---

Nitpick comments:
In `@client/cmd/expose.go`:
- Around line 78-80: The isProtocolValid function redundantly calls
strings.ToLower twice on exposeProtocol; change it to compute the lowercased
value once (e.g., v := strings.ToLower(exposeProtocol)) and compare v == "http"
|| v == "https" (or alternatively use strings.EqualFold for case-insensitive
comparisons) to avoid duplicate work and clarify intent.
- Around line 99-107: The spawned goroutine reading from sigCh can block and
leak because it doesn't observe context cancellation; change the goroutine
started after signal.Notify to use a select that listens on both sigCh and
ctx.Done(), calling cancel() only when a signal is received and returning the
goroutine when ctx.Done() is closed; also call signal.Stop(sigCh) (or close the
channel) as part of shutdown to stop further notifications. Target symbols: ctx,
cancel, sigCh, signal.Notify and the anonymous goroutine.

In `@proxy/management_integration_test.go`:
- Around line 254-256: The test double's CreateServiceFromPeer currently returns
a successful empty Service (&reverseproxy.Service{}, nil) which is inconsistent
with CreateService (which returns nil, errors.New("not implemented")); update
CreateServiceFromPeer in storeBackedServiceManager to mirror CreateService by
returning nil and the same not-implemented error (use the same error value/text)
so both methods clearly signal the deliberate no-op; adjust the return in
CreateServiceFromPeer accordingly and reuse the same error where possible.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0494f39 and b0b4d52.

⛔ Files ignored due to path filters (1)
  • client/proto/daemon.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (17)
  • client/cmd/expose.go
  • client/proto/daemon.proto
  • management/internals/modules/reverseproxy/interface.go
  • management/internals/modules/reverseproxy/interface_mock.go
  • management/internals/modules/reverseproxy/manager/manager.go
  • management/internals/modules/reverseproxy/manager/manager_test.go
  • management/internals/server/boot.go
  • management/internals/shared/grpc/proxy_group_access_test.go
  • management/server/account.go
  • management/server/account_test.go
  • management/server/http/handlers/accounts/accounts_handler.go
  • management/server/http/handlers/proxy/auth_callback_integration_test.go
  • management/server/store/sql_store.go
  • management/server/types/settings.go
  • proxy/management_integration_test.go
  • shared/management/http/api/openapi.yml
  • shared/management/http/api/types.gen.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • management/server/account.go
  • management/server/store/sql_store.go
  • management/internals/server/boot.go
  • shared/management/http/api/types.gen.go
  • management/server/account_test.go
  • management/internals/shared/grpc/proxy_group_access_test.go
  • management/server/http/handlers/proxy/auth_callback_integration_test.go
  • management/internals/modules/reverseproxy/interface_mock.go

Comment thread management/internals/modules/reverseproxy/manager/manager.go Outdated
Comment thread management/server/http/handlers/accounts/accounts_handler.go
@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
management/internals/modules/reverseproxy/manager/manager.go (2)

665-669: ⚠️ Potential issue | 🟡 Minor

Avoid referencing service.ID before it’s initialized.

service.ID is still empty here; the error message loses context. Use service.Name (or initialize first) so logs are actionable.

🐛 Suggested fix
-		return nil, fmt.Errorf("get group ids for service %s: %w", service.ID, err)
+		return nil, fmt.Errorf("get group ids for service %q: %w", service.Name, err)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager.go` around lines
665 - 669, The error message references service.ID which is not yet initialized,
so replace service.ID with a stable identifier (e.g., service.Name) or ensure
the ID is set before this block; in the block that calls m.getGroupIDsFromNames
(inside manager.go where service.Auth.BearerAuth is checked), update the
fmt.Errorf call to include service.Name (or initialize service.ID earlier) so
the returned error contains actionable context.

720-727: ⚠️ Potential issue | 🟠 Major

Guard against nil clusterDeriver to prevent panic.

CreateServiceFromPeer can call buildRandomDomain before initializeServiceForCreate, so a nil clusterDeriver will panic here.

🛡️ Suggested fix
 func (m *managerImpl) buildRandomDomain(name string) (string, error) {
+	if m.clusterDeriver == nil {
+		return "", fmt.Errorf("cluster deriver not available for service %s", name)
+	}
 	clusterDomains := m.clusterDeriver.GetClusterDomains()
 	if len(clusterDomains) == 0 {
 		return "", fmt.Errorf("no cluster domains found for service %s", name)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@management/internals/modules/reverseproxy/manager/manager.go` around lines
720 - 727, The buildRandomDomain method can panic if m.clusterDeriver is nil;
add a nil-check at the start of managerImpl.buildRandomDomain to return a
descriptive error (e.g., "clusterDeriver is nil for service <name>") instead of
dereferencing it, and keep the existing check for empty clusterDomains after
calling m.clusterDeriver.GetClusterDomains(); ensure callers such as
CreateServiceFromPeer/initializeServiceForCreate handle the returned error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@management/internals/modules/reverseproxy/manager/manager.go`:
- Around line 665-669: The error message references service.ID which is not yet
initialized, so replace service.ID with a stable identifier (e.g., service.Name)
or ensure the ID is set before this block; in the block that calls
m.getGroupIDsFromNames (inside manager.go where service.Auth.BearerAuth is
checked), update the fmt.Errorf call to include service.Name (or initialize
service.ID earlier) so the returned error contains actionable context.
- Around line 720-727: The buildRandomDomain method can panic if
m.clusterDeriver is nil; add a nil-check at the start of
managerImpl.buildRandomDomain to return a descriptive error (e.g.,
"clusterDeriver is nil for service <name>") instead of dereferencing it, and
keep the existing check for empty clusterDomains after calling
m.clusterDeriver.GetClusterDomains(); ensure callers such as
CreateServiceFromPeer/initializeServiceForCreate handle the returned error.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0b4d52 and c5c46bf.

📒 Files selected for processing (1)
  • management/internals/modules/reverseproxy/manager/manager.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants