Add connectionbroker package #1850
Conversation
- Would writing unit tests for this package be worthwhile?
Are you wanting to merge this PR before the work to integrate it is done, or is the PR just opened to get review? I see the original PR now.
```go
// Remotes returns the remotes interface used by the broker, so the caller
// can make observations or see weights directly.
func (b *Broker) Remotes() remotes.Remotes {
```
What's the use case for this method? It seems to me that this breaks separation of concerns.
The agent needs access to `Remotes` to update the set of managers based on the list it gets from the dispatcher. The alternative to exposing `Remotes` through this method is to implement `Observe`, `Weights`, and `Remove` as connectionbroker methods, and that feels like it would pollute the connectionbroker method set, instead of keeping that functionality specific to `Remotes`.

Or we could pass the underlying `Remotes` into the agent separately, but I don't like that much either.
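As a rough illustration of this design choice, here is a minimal sketch. These are simplified stand-in types, not the actual swarmkit `remotes` package: exposing the underlying set through an accessor lets the agent call `Observe`, `Remove`, and `Weights` without duplicating them on the broker itself.

```go
package main

import "fmt"

// Remotes is a simplified stand-in for the interface discussed above:
// it tracks candidate managers and their weights.
type Remotes interface {
	Observe(peer string, weight int) // adjust a peer's weight
	Remove(peers ...string)          // drop peers no longer in the manager list
	Weights() map[string]int         // snapshot for inspection
}

type remotes struct{ weights map[string]int }

func newRemotes() *remotes { return &remotes{weights: map[string]int{}} }

func (r *remotes) Observe(peer string, weight int) { r.weights[peer] += weight }

func (r *remotes) Remove(peers ...string) {
	for _, p := range peers {
		delete(r.weights, p)
	}
}

func (r *remotes) Weights() map[string]int { return r.weights }

// Broker wraps a Remotes. The Remotes() accessor exposes the underlying
// set so callers like the agent can update it directly, instead of the
// broker mirroring Observe/Remove/Weights in its own method set.
type Broker struct{ remotes *remotes }

func (b *Broker) Remotes() Remotes { return b.remotes }

func main() {
	b := &Broker{remotes: newRemotes()}
	// The agent learned a new manager list from the dispatcher:
	b.Remotes().Observe("10.0.0.1:2377", 1)
	b.Remotes().Observe("10.0.0.2:2377", 1)
	b.Remotes().Remove("10.0.0.2:2377") // dropped from the list
	fmt.Println(b.Remotes().Weights())  // map[10.0.0.1:2377:1]
}
```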
Sounds good to me.
Looks good to me.
```go
// records a positive experience with the remote peer if success is true,
// otherwise it records a negative experience. If a local connection is in use,
// Close is a noop.
func (c *Conn) Close(success bool) error {
```
It's a little counter-intuitive to close an "unsuccessful" connection. Would it be better to move the `if success` logic to `SelectRemote`?
`Close` won't be called when `SelectRemote` returns an error. The `success` parameter is determined later on. For example, if something tries an RPC and that RPC fails, `success` will be set to false when closing the connection, so that a negative observation is made and future attempts will prefer a different remote.

I think `Dial` generally doesn't fail immediately (since `Dial` is an async call), but it's probably a good idea to automatically perform a negative observation if `Dial` returns an error. I'll add that.
Added an observation if `Dial` fails.
Can `success` be internal to `Conn`, like derived from operation failure on `Conn`?
Unfortunately not, because `Conn` doesn't know whether RPCs that use it fail or not.

We could split it out into a separate `Observe` call on `Conn`. But I think it's nice to have it built into `Close`. That way, the use of a connection has to be either successful or unsuccessful. You can't forget to call `Observe` (we've had bugs like this before).

If it's weird for `Close` to take a parameter, we could rename this to something else like `CloseAndObserve`.
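The "can't forget to observe" point can be illustrated with a toy version of the call pattern. The types and the `doRPC`/`useConn` helpers here are hypothetical, not swarmkit code:

```go
package main

import (
	"errors"
	"fmt"
)

// Conn is a toy stand-in that tallies observations per peer.
type Conn struct {
	peer     string
	observed map[string]int
}

// Close always records exactly one observation, so every code path that
// closes the connection also weighs the peer. A separate Observe call
// could be skipped on an early return; this cannot be, short of leaking
// the connection.
func (c *Conn) Close(success bool) error {
	if success {
		c.observed[c.peer]++
	} else {
		c.observed[c.peer]--
	}
	return nil
}

func doRPC(fail bool) error {
	if fail {
		return errors.New("rpc failed")
	}
	return nil
}

// useConn shows the pattern: the outcome of the whole operation decides
// the observation at close time.
func useConn(conn *Conn, fail bool) error {
	err := doRPC(fail)
	conn.Close(err == nil)
	return err
}

func main() {
	obs := map[string]int{}
	useConn(&Conn{peer: "mgr1", observed: obs}, true)  // one failure
	useConn(&Conn{peer: "mgr1", observed: obs}, false) // two successes
	useConn(&Conn{peer: "mgr1", observed: obs}, false)
	fmt.Println(obs["mgr1"]) // 1
}
```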
> because Conn doesn't know whether RPCs that use it fail or not.

Could this be moved to where failure is observed? If `Conn` cannot observe failures, maybe the weight handling shouldn't be done here. How about doing it at the RPC calls? When an RPC fails, decrease the node's weight. On RPC success, adjust the weight according to the previous state (there's no need to keep increasing the weight on continuously successful operations).
> Could this be moved to where failure is observed?

We can do that, but as I mentioned above, I think it's better this way. We had several bugs before where certain code paths returned early and skipped adjusting the weight. If weight adjustment is part of closing the connection, it can't be skipped accidentally without leaking a connection.
Makes sense. We have a similar problem in the Swarm class, where operation failures should adjust node preference. It's hard to do that at every call site.
This package is a small abstraction on top of Remotes that provides a gRPC connection to a manager. If running on a manager, it uses the unix socket, otherwise it will pick a remote manager using Remotes. This will allow things like agent and certificate renewal to work even on a single node with no TCP port bound.

Signed-off-by: Aaron Lehmann <[email protected]>
Force-pushed from 6d7ff1f to e32ca49.
LGTM
This package is a small abstraction on top of `Remotes` that provides a gRPC connection to a manager. If running on a manager, it uses the unix socket, otherwise it will pick a remote manager using `Remotes`. This will allow things like agent and certificate renewal to work even on a single node with no TCP port bound.
This is a piece of #1826. The next PR will convert the CA and agent to
use connectionbroker.
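The behavior described above could be sketched as follows. This is a simplified illustration: `localSocket`, `selectAddr`, and picking the highest weight are assumptions made for the sketch, while the real package selects via `Remotes`:

```go
package main

import "fmt"

// Broker hands out an address to connect to: the local unix socket when
// this node is itself a manager, otherwise a remote manager chosen from
// the weighted set.
type Broker struct {
	localSocket string         // e.g. a unix socket path; empty on workers
	weights     map[string]int // remote manager weights
}

// selectAddr mirrors the behavior described above: prefer the local
// socket, so even a single node with no TCP port bound can be reached.
// For simplicity this sketch just takes the highest-weighted remote.
func (b *Broker) selectAddr() string {
	if b.localSocket != "" {
		return b.localSocket
	}
	best, bestW := "", 0
	first := true
	for peer, w := range b.weights {
		if first || w > bestW {
			best, bestW, first = peer, w, false
		}
	}
	return best
}

func main() {
	mgr := &Broker{localSocket: "unix:///var/run/swarmd.sock"}
	worker := &Broker{weights: map[string]int{"10.0.0.1:2377": 5, "10.0.0.2:2377": 8}}
	fmt.Println(mgr.selectAddr())    // unix:///var/run/swarmd.sock
	fmt.Println(worker.selectAddr()) // 10.0.0.2:2377
}
```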