
[management] Fix race condition in experimental network map when deleting account #5064

Merged
bcmmbaga merged 2 commits into main from fix/holder
Jan 8, 2026

Conversation

@bcmmbaga

@bcmmbaga bcmmbaga commented Jan 8, 2026

Describe your changes

Issue ticket number and link

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

https://github.com/netbirdio/docs/pull/__

Summary by CodeRabbit

  • Refactor
    • Updated internal method signatures to propagate context through account loading operations.

Note: These are internal code optimizations with no user-facing changes or new functionality.


Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>
@coderabbitai

coderabbitai Bot commented Jan 8, 2026

📝 Walkthrough

Context parameters are propagated through the method call chain. The getAccountFromHolderOrInit method in the controller and the LoadOrStoreFunc method in holder.go are updated to accept context.Context as the first parameter, replacing prior usage of context.Background().

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Context Propagation: management/internals/controllers/network_map/controller/controller.go | Method signature updated: getAccountFromHolderOrInit now accepts ctx context.Context as first parameter; call sites updated to pass context accordingly. |
| Context Propagation: management/server/types/holder.go | Method signature updated: LoadOrStoreFunc now accepts ctx context.Context as first parameter and passes it to accGetter instead of context.Background(). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Context flows like morning dew,
Through method calls, both old and new,
No more background, just the right way—
Hop by hop, the code's at play! 🌱

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is largely incomplete: critical sections like 'Describe your changes' and 'Issue ticket number and link' are empty, providing no explanation of the fix or context beyond the template structure. | Add a detailed description of the race condition being fixed and why propagating context resolves it. Include the issue ticket number/link and explain the behavioral changes to the method signatures. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main change: fixing a race condition in the experimental network map when deleting an account, which aligns with the signature updates to propagate context through account loading. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24df442 and e708886.

📒 Files selected for processing (2)
  • management/internals/controllers/network_map/controller/controller.go
  • management/server/types/holder.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-28T12:20:47.254Z
Learnt from: bcmmbaga
Repo: netbirdio/netbird PR: 4849
File: management/internals/modules/zones/manager/manager.go:55-86
Timestamp: 2025-11-28T12:20:47.254Z
Learning: In the NetBird management server, DNS zones without records are automatically filtered out in network map generation (filterPeerAppliedZones in management/internals/controllers/network_map/controller/controller.go checks `len(zone.Records) == 0`). Therefore, CreateZone operations don't need to call UpdateAccountPeers since empty zones don't affect the network map.

Applied to files:

  • management/internals/controllers/network_map/controller/controller.go
🧬 Code graph analysis (1)
management/internals/controllers/network_map/controller/controller.go (1)
management/server/types/account.go (1)
  • Account (74-109)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Android / Build
  • GitHub Check: iOS / Build
  • GitHub Check: release_ui_darwin
  • GitHub Check: release_ui
  • GitHub Check: release
  • GitHub Check: Client / Unit
  • GitHub Check: Darwin
  • GitHub Check: Linux
  • GitHub Check: Windows
  • GitHub Check: Build Cache
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: JS / Lint
🔇 Additional comments (3)
management/server/types/holder.go (1)

35-47: LGTM! Context propagation correctly implemented.

The change from context.Background() to the caller-provided ctx enables proper cancellation propagation. This is essential for fixing the race condition during account deletion—operations can now be cancelled when the parent context is cancelled, rather than running to completion with an uncancellable background context.

management/internals/controllers/network_map/controller/controller.go (2)

550-560: LGTM! Context propagation correctly implemented.

The method signature update and context propagation to LoadOrStoreFunc are correct. This enables proper cancellation handling when loading accounts from the holder.


478-486: LGTM! Proper nil handling demonstrated.

This call site correctly checks for nil account and handles it gracefully with a warning log and safe fallback. This is the pattern that should be followed at lines 145 and 417.


Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>
@sonarqubecloud

sonarqubecloud Bot commented Jan 8, 2026


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
management/internals/controllers/network_map/controller/controller.go (2)

144-156: Critical: Nil pointer dereference risk after context cancellation.

With context propagation enabled by this PR, getAccountFromHolderOrInit can now return nil when the context is cancelled (e.g., during account deletion). However, line 156 accesses account.Peers without checking if account is nil, which will cause a panic.

The non-experimental path (lines 147-150) correctly handles errors, but the experimental path lacks this check.

🐛 Proposed fix to add nil check
 if c.experimentalNetworkMap(accountID) {
     account = c.getAccountFromHolderOrInit(ctx, accountID)
+    if account == nil {
+        return fmt.Errorf("failed to get account %s", accountID)
+    }
 } else {
     account, err = c.requestBuffer.GetAccountWithBackpressure(ctx, accountID)
     if err != nil {
         return fmt.Errorf("failed to get account: %v", err)
     }
 }

416-425: Critical: Nil pointer dereference risk after context cancellation.

Similar to the issue at line 145, getAccountFromHolderOrInit can return nil when context is cancelled, but line 425 accesses account.Id without a nil check, causing a potential panic.

🐛 Proposed fix to add nil check
 if c.experimentalNetworkMap(accountID) {
     account = c.getAccountFromHolderOrInit(ctx, accountID)
+    if account == nil {
+        return nil, nil, nil, 0, fmt.Errorf("failed to get account %s", accountID)
+    }
 } else {
     account, err = c.requestBuffer.GetAccountWithBackpressure(ctx, accountID)
     if err != nil {
         return nil, nil, nil, 0, err
     }
 }

@bcmmbaga bcmmbaga merged commit 00e2689 into main Jan 8, 2026
57 of 59 checks passed
@bcmmbaga bcmmbaga deleted the fix/holder branch January 8, 2026 11:10