Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle various edge cases around cluster memberships. #2791

Merged
merged 2 commits into from
Nov 29, 2018

Conversation

manishrjain
Copy link
Contributor

@manishrjain manishrjain commented Nov 29, 2018

This change is Reviewable

errUpdatedMember = errors.New("Cluster member has updated credentials.")
errServerShutDown = errors.New("Server is being shut down.")
// errUnknownMember = errors.New("Unknown cluster member")
errUpdatedMember = errors.New("Cluster member has updated credentials.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errUpdatedMember is unused (from varcheck)

// removing a freshly added node.
for _, member := range g.state.Removed {
if member.GroupId == g.Node.gid && g.Node.AmLeader() {
go g.Node.ProposePeerRemoval(context.Background(), member.Id)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error return value of g.Node.ProposePeerRemoval is not checked (from errcheck)

@manishrjain manishrjain merged commit 9e1ce3d into master Nov 29, 2018
@manishrjain manishrjain deleted the mrjn/handle-membership-issues branch November 29, 2018 01:11
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
)

- We had a bug where we'll infinite remove and connect to an address which is present in both active nodes and removed nodes. This PR checks for active nodes first, before removing a connection.
- Make the errors a lot more descriptive. Use a unique string to determine when a node must crash, due to permanent failures. Refactor those conditions out in `x.ShouldCrash`.
- Removing a live node using `/removeNode` should now automatically crash that node, and cause others to remove the connection to that node, which is convenient.
- Trying to connect using the same ID or the same Address (while prev conn is valid), would also crash the new process (wanted behavior).
- If Zero already has a member set for a given Address (and ID=0), it would return the member info directly, instead of proposing first.

In general, this PR handles a bunch of edge cases and bugs around membership.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants