-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guard against multiple consuming segments for same partition #6483
Conversation
Codecov Report
@@ Coverage Diff @@
## master #6483 +/- ##
==========================================
+ Coverage 66.44% 73.10% +6.65%
==========================================
Files 1075 1334 +259
Lines 54773 65595 +10822
Branches 8168 9567 +1399
==========================================
+ Hits 36396 47952 +11556
+ Misses 15700 14499 -1201
- Partials 2677 3144 +467
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
// These conditions can happen again due to manual operations considered as fixes in Issues #5559 and #5263 | ||
// The following check prevents the table from going into such a state (but does not prevent the root cause | ||
// of attempting such a zk update). | ||
if (instanceStatesMap != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instanceStatesMap
won't be null
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed null check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instanceStateMap
could be null if the table is newly created, i.e. this method gets invoked by setUpNewTable
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Helix returns an empty TreeMap() when we create new table
@@ -793,6 +793,27 @@ void updateInstanceStatesForNewConsumingSegment(Map<String, Map<String, String>> | |||
LOGGER.info("Updating segment: {} to ONLINE state", committingSegmentName); | |||
} | |||
|
|||
// There used to be a race condition in pinot (caused by heavy GC on the controller during segment commit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this caused by controller running into GC and leader changed, and 2 controllers trying to create the segment with the same seq num? If so, this change won't really work because the instanceStatesMap
won't be up-to-date
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last time this happened, it was caused by gc on the controller, and the same controller waking up to complete the operation it is scheduled to do.
I could not dig up the PR since it happened (I think) when the source code was under com.linkedin domain, but it was fixed by @jackjlli by adding a Stat check when we update metadata for the committing segment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it can only happen in the following orders:
- The current leader controller start committing a segment
- The current leader controller runs into GC and not responding, leadership changes to another controller
- The new leader controller committed the segment
- The previous leader controller wakes up and commit the segment again
If this is the order, the fix won't work because the instanceStatesMap
is not up-to-date if it is already calculated before the controller runs into GC. The right fix might be checking the leadership again before actually committing the segment.
@jackjlli What was the fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix is the usage of Stat
class. This is the PR I made(#2361). Now it's in persistSegmentZKMetadata
method.
If the former lead controller tried to persist its change, it'd fail since the stat version has already been changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, it picks up the latest Idealstate and checks on that znode. If a different controller added a segment, then even if the old controller gets back control, it will not be able to add the new segment
Fixes being considered for issues apache#5559 and apache#5263 may end up resurfacing some race conditions that we have strived to avoid via automated mechanisms. The race conditions may lead to multiple consuming segments in the same partition. This commit is a catch-all safeguard in case some automaticl operation interferes with other manual primitives provided, so that the table never reaches such a state.
4e8644f
to
03f5924
Compare
@Jackie-Jiang a couple of files did not have license headers, so my commit was failing. I added them now . can you review please? |
03f5924
to
0469e82
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline. The purpose of this PR is for protection instead of fixing the race condition. LGTM
Fixes being considered for issues #5559 and #5263 may end up
resurfacing some race conditions that we have strived to avoid
via automated mechanisms. The race conditions may lead to multiple
consuming segments in the same partition.
This commit is a catch-all safeguard in case some automaticl operation
interferes with other manual primitives provided, so that the table
never reaches such a state.
Description
Add a description of your PR here.
A good description should include pointers to an issue or design document, etc.
Upgrade Notes
Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)
backward-incompat
, and complete the section below on Release Notes)Does this PR fix a zero-downtime upgrade introduced earlier?
backward-incompat
, and complete the section below on Release Notes)Does this PR otherwise need attention when creating release notes? Things to consider:
release-notes
and complete the section on Release Notes)Release Notes
If you have tagged this as either backward-incompat or release-notes,
you MUST add text here that you would like to see appear in release notes of the
next release.
If you have a series of commits adding or enabling a feature, then
add this section only in final commit that marks the feature completed.
Refer to earlier release notes to see examples of text
Documentation
If you have introduced a new feature or configuration, please add it to the documentation as well.
See https://docs.pinot.apache.org/developers/developers-and-contributors/update-document