Error while proposing node removal #4056

Closed
danielmai opened this issue Sep 25, 2019 · 2 comments · Fixed by #4254
Assignees
martinmr

Labels
area/operations: Related to operational aspects of the DB, including signals, flags, env vars, etc.
area/usability: Issues with usability and error messages
kind/bug: Something is broken.
priority/P1: Serious issue that requires eventual attention (can wait a bit)
status/accepted: We accept to investigate/work on it.

Comments

@danielmai (Contributor)

What version of Dgraph are you using?

v1.1.0

Have you tried reproducing the issue with the latest release?

Yes

What is the hardware spec (RAM, OS)?

Ubuntu Linux

Steps to reproduce the issue (command/config used to run Dgraph).

  1. Run a Dgraph cluster with multiple Alpha replicas in a group.

     # From dgraph-io/dgraph repo root directory
     cd ./compose
     ./run.sh

  2. Remove an Alpha. (Quote the URL so the shell does not treat & as a background operator.)

     curl "localhost:6180/removeNode?id=1&group=1"

  3. The Alpha is removed successfully according to the logs and /state, but the new Alpha leader continues to print the following log every second:
E0925 00:17:12.688470       1 groups.go:322] Error while proposing node removal: Node 0x1 not part of group
github.com/dgraph-io/dgraph/conn.(*Node).ProposePeerRemoval
	/tmp/go/src/github.com/dgraph-io/dgraph/conn/node.go:594
github.com/dgraph-io/dgraph/worker.(*groupi).applyState.func1
	/tmp/go/src/github.com/dgraph-io/dgraph/worker/groups.go:320
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1337

Expected behaviour and actual result.

A successful node removal should not cause an error to be logged repeatedly.

@danielmai danielmai added the kind/bug, priority/P1, status/accepted, area/usability, and area/operations labels on Sep 25, 2019
@raasss commented Sep 26, 2019

I'm having the same problem, but I think I rebuilt the Alpha node the wrong way:

  1. I stopped 1 of 3 Alpha nodes.
  2. Wiped the node's database data and started the node again.
  3. The node started to panic because I forgot to make the REST API call to remove the node from the cluster.
  4. Stopped the Alpha node again.
  5. Wiped the node's database data.
  6. Made the API call and removed the node with http://:6080/removeNode?id=3&group=1
  7. Started the node and it joined the cluster, but I started getting the same error log.

@martinmr martinmr self-assigned this Nov 5, 2019
@martinmr (Contributor) commented Nov 7, 2019

There doesn't appear to be an actual issue, but I agree that the logs should be cleaned up. Here's what's happening:

  1. Node is removed via the zero endpoint.
  2. A new leader is elected.
  3. The leader receives an update with the other node marked as removed. Node is removed successfully.
  4. Subsequent updates try to remove the node again. It's already removed and not a peer of the new leader so an error is returned.

The solution is to make the new leader aware that it has already removed the node, so that it doesn't try to remove it again.
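The fix described above amounts to a small dedup guard: the leader records which node IDs it has already proposed for removal and skips re-proposing them on later state updates. The sketch below is a hypothetical illustration of that idea (the names removalTracker and shouldPropose are invented here), not Dgraph's actual implementation:

```go
package main

import "fmt"

// removalTracker remembers which peers this leader has already proposed
// to remove, so repeated state updates don't re-propose the removal.
// Hypothetical sketch; not Dgraph's real code.
type removalTracker struct {
	removed map[uint64]bool
}

func newRemovalTracker() *removalTracker {
	return &removalTracker{removed: make(map[uint64]bool)}
}

// shouldPropose reports whether a removal proposal is still needed for id.
// It returns true exactly once per node ID.
func (t *removalTracker) shouldPropose(id uint64) bool {
	if t.removed[id] {
		return false
	}
	t.removed[id] = true
	return true
}

func main() {
	t := newRemovalTracker()
	// First update marking node 0x1 as removed: propose removal once.
	fmt.Println(t.shouldPropose(1)) // true
	// Subsequent updates: skip, avoiding the repeated error log.
	fmt.Println(t.shouldPropose(1)) // false
}
```

With a guard like this, the "Node 0x1 not part of group" error would not recur every second after a successful removal.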


3 participants