[management] Fix race condition in experimental network map when deleting account#5064
[management] Fix race condition in experimental network map when deleting account#5064
Conversation
Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>
📝 WalkthroughWalkthroughContext parameters are propagated through the method call chain. The Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🧰 Additional context used🧠 Learnings (1)📚 Learning: 2025-11-28T12:20:47.254ZApplied to files:
🧬 Code graph analysis (1)management/internals/controllers/network_map/controller/controller.go (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
🔇 Additional comments (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
management/internals/controllers/network_map/controller/controller.go (2)
144-156: Critical: Nil pointer dereference risk after context cancellation.With context propagation enabled by this PR,
getAccountFromHolderOrInitcan now returnnilwhen the context is cancelled (e.g., during account deletion). However, line 156 accessesaccount.Peerswithout checking ifaccountisnil, which will cause a panic.The non-experimental path (lines 147-150) correctly handles errors, but the experimental path lacks this check.
🐛 Proposed fix to add nil check
if c.experimentalNetworkMap(accountID) { account = c.getAccountFromHolderOrInit(ctx, accountID) + if account == nil { + return fmt.Errorf("failed to get account %s", accountID) + } } else { account, err = c.requestBuffer.GetAccountWithBackpressure(ctx, accountID) if err != nil { return fmt.Errorf("failed to get account: %v", err) } }
416-425: Critical: Nil pointer dereference risk after context cancellation.Similar to the issue at line 145,
getAccountFromHolderOrInitcan returnnilwhen context is cancelled, but line 425 accessesaccount.Idwithout a nil check, causing a potential panic.🐛 Proposed fix to add nil check
if c.experimentalNetworkMap(accountID) { account = c.getAccountFromHolderOrInit(ctx, accountID) + if account == nil { + return nil, nil, nil, 0, fmt.Errorf("failed to get account %s", accountID) + } } else { account, err = c.requestBuffer.GetAccountWithBackpressure(ctx, accountID) if err != nil { return nil, nil, nil, 0, err } }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
management/internals/controllers/network_map/controller/controller.gomanagement/server/types/holder.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-28T12:20:47.254Z
Learnt from: bcmmbaga
Repo: netbirdio/netbird PR: 4849
File: management/internals/modules/zones/manager/manager.go:55-86
Timestamp: 2025-11-28T12:20:47.254Z
Learning: In the NetBird management server, DNS zones without records are automatically filtered out in network map generation (filterPeerAppliedZones in management/internals/controllers/network_map/controller/controller.go checks `len(zone.Records) == 0`). Therefore, CreateZone operations don't need to call UpdateAccountPeers since empty zones don't affect the network map.
Applied to files:
management/internals/controllers/network_map/controller/controller.go
🧬 Code graph analysis (1)
management/internals/controllers/network_map/controller/controller.go (1)
management/server/types/account.go (1)
Account(74-109)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Android / Build
- GitHub Check: iOS / Build
- GitHub Check: release_ui_darwin
- GitHub Check: release_ui
- GitHub Check: release
- GitHub Check: Client / Unit
- GitHub Check: Darwin
- GitHub Check: Linux
- GitHub Check: Windows
- GitHub Check: Build Cache
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: JS / Lint
🔇 Additional comments (3)
management/server/types/holder.go (1)
35-47: LGTM! Context propagation correctly implemented.The change from
context.Background()to the caller-providedctxenables proper cancellation propagation. This is essential for fixing the race condition during account deletion—operations can now be cancelled when the parent context is cancelled, rather than running to completion with an uncancellable background context.management/internals/controllers/network_map/controller/controller.go (2)
550-560: LGTM! Context propagation correctly implemented.The method signature update and context propagation to
LoadOrStoreFuncare correct. This enables proper cancellation handling when loading accounts from the holder.
478-486: LGTM! Proper nil handling demonstrated.This call site correctly checks for
nilaccount and handles it gracefully with a warning log and safe fallback. This is the pattern that should be followed at lines 145 and 417.



Describe your changes
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
Note: These are internal code optimizations with no user-facing changes or new functionality.
✏️ Tip: You can customize this high-level summary in your review settings.