[client] add reset for management backoff #4935
Merged
🧩 Analysis chain (verification scripts executed against netbirdio/netbird, including: sed -n '130,170p' shared/management/client/grpc.go)
Do not reset backoff from within the operation function.

Calling backOff.Reset() inside handleStream (line 163) while backoff.Retry is managing the same instance corrupts retry state. The ExponentialBackOff implementation is not thread-safe. Use a separate backoff instance for the stream handler, or avoid resetting within the operation.
In this specific case, handleStream is only ever called from Sync, and the entire call chain (backoff.Retry → operation → handleStream) runs within a single goroutine. There is no concurrent access to the backOff instance, so calling Reset() here does not introduce a race condition.
This pattern is also already used in the signal client with the same retry/backoff approach: netbird/shared/signal/client/grpc.go, lines 165 to 178 in b7e98ac.
I think the backoff.Retry API is primarily designed for short-lived, one-shot operations (e.g. HTTP requests), where the lifecycle of the backoff instance clearly maps to a single attempt sequence. In our case, however, we are using it to manage a long-lived streaming connection, where a single "attempt" may last minutes or hours, and a successful stream session is logically equivalent to a completed retry cycle. If we want to completely avoid any semantic ambiguity or incorrect assumptions about backoff ownership, the clean solution would be a structural change, but that would also require a broader refactor rather than a localized fix.

Would it be acceptable to keep the current PR as-is and revisit the retry/backoff structure in the NetBird client in a future PR?
@gamerslouis, you're absolutely right. I verified the signal client implementation and it uses the exact same pattern with backOff.Reset() at lines 175-177, including a helpful comment explaining the rationale.

Your points are well-taken, in particular the use of backoff.Retry to manage long-lived streaming connections.

Suggestion: Consider adding a similar explanatory comment at line 163 in handleStream to document why the reset is needed, making the intent explicit for future maintainers.

The PR is acceptable as-is. If you'd like to track the broader structural refactor you mentioned (potentially moving away from backoff.Retry for streaming connections), I can help create a GitHub issue to document the discussion and proposed improvements for future work. Would that be helpful?

🧬 Code Graph Analysis Results
[shared/management/client/grpc.go:37-60]
Summary: NewClient uses a short-lived retry loop to establish a gRPC connection to the Management Service.
Rationale: This shows how a single connection establishment workflow uses the project's backoff/backoff.Retry mechanism to manage transient failures during setup, aligning with the user comment about Retry semantics and backoff ownership.
[shared/management/client/grpc.go:65-85]
Summary: defaultBackoff constructs a backoff strategy used for general retry/backoff logic.
Rationale: This captures how the code defines a long-lived, context-aware backoff strategy used by Retry calls elsewhere (e.g., in Sync and Login flows). It’s central to understanding how backoff ownership and lifecycle are intended to map to the operation lifecycle.
[shared/management/client/grpc.go:102-140]
Summary: handleStream uses a streaming RPC with a backoff object and explicitly resets the backoff after a stream attempt completes.
Rationale: This directly relates to the user’s point about Reset() on backOff within a single-goroutine flow, the absence of races for this usage, and the broader discussion about how backoff/backoff.Retry is intended to be used with streaming connections. It also documents the exact control flow around stream handling, error categorization, and retry triggering.
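A minimal sketch of the control flow this entry describes: connect, reset the backoff once the stream is established, and return any error to the surrounding retry loop. The names below (handleStream, fakeBackoff, resetter) are invented for illustration and are not the real grpc.go symbols; the explanatory comment on the reset mirrors the documentation suggested earlier in the thread.

```go
package main

import (
	"errors"
	"fmt"
)

var errConnect = errors.New("connect failed")

// resetter is the only capability the stream handler needs from the
// shared backoff: the ability to mark a successful session.
type resetter interface{ Reset() }

// fakeBackoff counts resets so the behavior is observable.
type fakeBackoff struct{ resets int }

func (b *fakeBackoff) Reset() { b.resets++ }

// handleStream is an illustrative analogue of the described pattern.
func handleStream(connect func() error, b resetter) error {
	if err := connect(); err != nil {
		return err // retried by the outer backoff.Retry-style loop
	}
	// Stream established: reset the backoff so that a later failure
	// starts the wait sequence from the beginning. Safe here because
	// the retry loop and this reset run in a single goroutine.
	b.Reset()
	// ... serve the stream until it ends ...
	return nil
}

func main() {
	b := &fakeBackoff{}
	_ = handleStream(func() error { return nil }, b)
	_ = handleStream(func() error { return errConnect }, b)
	fmt.Println("resets:", b.resets) // resets: 1
}
```

Note the reset happens only after a successful connect, so failed attempts keep growing the wait, matching the error-categorization-then-retry flow summarized above.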
[shared/management/client/grpc.go: additional note]
Context: The user comment references a pattern used in the signal client with the same retry/backoff approach (and a link to a specific lines range). This file’s own usage of backoff and the handleStream pattern is the most directly relevant to that discussion and to understanding how backoff is intended to be owned and Reset in this codebase. If you want to compare, inspect shared/signal/client/grpc.go around the analogous connection/setup code (not provided here).