-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple fixes to improve cluster ops #3337
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @manishrjain and @martinmr)
conn/node.go, line 380 at r1 (raw file):
// Exit after this deadline. Let BatchAndSendMessages create another goroutine, if needed. deadline := time.Now().Add(60 * time.Second) ticker := time.NewTicker(time.Second)
PR says the ticker is 500ms but this says 1 second?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 7 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain)
edgraph/config.go, line 20 at r1 (raw file):
import ( "expvar"
why are the changes in this file part of this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @danielmai and @martinmr)
conn/node.go, line 380 at r1 (raw file):
Previously, danielmai (Daniel Mai) wrote…
PR says the ticker is 500ms but this says 1 second?
Will update the PR description.
edgraph/config.go, line 20 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
why are the changes in this file part of this PR?
No longer need to set the conf var, I just log all the options. It was broken to begin with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @danielmai and @martinmr)
conn/node.go, line 380 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Will update the PR description.
Done.
edgraph/config.go, line 20 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
No longer need to set the conf var, I just log all the options. It was broken to begin with.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @danielmai and @martinmr)
…timeout error during conn creation on a stopped node.
… for Delta field during proposal.
Multiple fixes to improve cluster ops: 1. A delta proposal should bypass the throttle pending proposal mechanism. This is important to ensure that the latest updates from Zero are always proposed. 2. Do not retry connection to a dead server in a busy-wait loop. Instead, use a ticker (1s) to pause before a retry. On error, only log once. Return after 60s. 3. Output all the configs on start to log file. Remove a broken dgraph_conf var. Changes: * Fix the flakiness of blockade test. Was happening because of the 10s timeout error during conn creation on a stopped node. * Simplify the change by removing the Important bit. Instead just check for Delta field during proposal.
Multiple fixes to improve cluster ops: 1. A delta proposal should bypass the throttle pending proposal mechanism. This is important to ensure that the latest updates from Zero are always proposed. 2. Do not retry connection to a dead server in a busy-wait loop. Instead, use a ticker (1s) to pause before a retry. On error, only log once. Return after 60s. 3. Output all the configs on start to log file. Remove a broken dgraph_conf var. Changes: * Fix the flakiness of blockade test. Was happening because of the 10s timeout error during conn creation on a stopped node. * Simplify the change by removing the Important bit. Instead just check for Delta field during proposal.
Multiple fixes to improve cluster ops:
This change is