Skip to content
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
Show all changes
63 commits
Select commit Hold shift + click to select a range
d687f51
Introduce primary context
jasontedor Jun 2, 2017
8d871cf
Fix test
jasontedor Jun 8, 2017
a197a64
Add assertion messages
jasontedor Jun 8, 2017
1e4255c
Javadocs
jasontedor Jun 8, 2017
c1588bb
Barrier between marking a shard in sync and relocating
jasontedor Jun 9, 2017
0b47386
Fix misplaced call
jasontedor Jun 9, 2017
73bc0d1
Paranoia
jasontedor Jun 9, 2017
76162e4
Better latch countdown
jasontedor Jun 9, 2017
08bb6fa
Merge branch 'master' into primary-context
jasontedor Jun 10, 2017
a799e69
Catch any exception
jasontedor Jun 10, 2017
58adfa0
Fix comment
jasontedor Jun 10, 2017
03b863b
Fix wait for cluster state relocation test
jasontedor Jun 10, 2017
40a3fcc
Merge branch 'master' into primary-context
jasontedor Jun 12, 2017
8504b82
Update knowledge via upate local checkpoint API
jasontedor Jun 13, 2017
5f05d92
toString
jasontedor Jun 14, 2017
a35e497
Visibility
jasontedor Jun 14, 2017
b54a8d6
Refactor permit
jasontedor Jun 14, 2017
48157cd
Push down
jasontedor Jun 14, 2017
9a58ff4
Imports
jasontedor Jun 14, 2017
62966c5
Docs
jasontedor Jun 14, 2017
da4c9aa
Merge branch 'master' into primary-context
jasontedor Jun 14, 2017
a4dda93
Fix compilation
jasontedor Jun 14, 2017
ef1d345
Remove assertion
jasontedor Jun 14, 2017
51ad65b
Fix compilation
jasontedor Jun 14, 2017
e3f1886
Merge branch 'master' into primary-context
jasontedor Jun 15, 2017
4ba8d5c
Remove context wrapper
jasontedor Jun 15, 2017
48572a6
Move PrimaryContext to new package
jasontedor Jun 15, 2017
c9bd0d7
Piping for cluster state version
jasontedor Jun 15, 2017
acca222
Remove unused import
jasontedor Jun 15, 2017
d459f16
Implement versioning in tracker
jasontedor Jun 15, 2017
4471dc2
Fix test
jasontedor Jun 15, 2017
56f3c17
Unneeded public
jasontedor Jun 15, 2017
f97fd92
Imports
jasontedor Jun 15, 2017
29ca82a
Promote on our own
jasontedor Jun 15, 2017
d7a8021
Add tests
jasontedor Jun 16, 2017
9683757
Import
jasontedor Jun 16, 2017
d9dda28
Merge branch 'master' of github.com:elastic/elasticsearch into primar…
jasontedor Jun 16, 2017
5422f6e
Newline
jasontedor Jun 16, 2017
d04f14a
Update comment
jasontedor Jun 16, 2017
99597d1
Merge branch 'master' into primary-context
jasontedor Jun 22, 2017
52af334
Serialization
jasontedor Jun 22, 2017
e1de296
Assertion message
jasontedor Jun 22, 2017
643fa8a
Update stale comment
jasontedor Jun 22, 2017
4e1fa9c
Remove newline
jasontedor Jun 22, 2017
a275167
Less verbose
jasontedor Jun 22, 2017
00e4083
Remove redundant assertion
jasontedor Jun 22, 2017
5f16884
Tracking -> in-sync
jasontedor Jun 22, 2017
3a53fd3
Assertions
jasontedor Jun 22, 2017
ccde798
Just say no
jasontedor Jun 23, 2017
e6bbe8b
Extra newline
jasontedor Jun 23, 2017
ff54eec
Add allocation ID to assertion
jasontedor Jun 23, 2017
6ea61ef
Rename method
jasontedor Jun 23, 2017
f6f6acb
Another rename
jasontedor Jun 23, 2017
de862cd
Introduce sealing
jasontedor Jun 23, 2017
dc377fa
Merge branch 'master' into primary-context
jasontedor Jun 23, 2017
077b3f4
Sealing tests
jasontedor Jun 23, 2017
4e05714
One more assertion
jasontedor Jun 23, 2017
b30dbcc
Merge branch 'master' into primary-context
jasontedor Jun 26, 2017
744b03a
Fix imports
jasontedor Jun 26, 2017
3c2731f
Safer sealing
jasontedor Jun 26, 2017
e19c8b3
Remove check
jasontedor Jun 26, 2017
0d33a88
Remove another sealed check
jasontedor Jun 26, 2017
7bd3c1b
Merge branch 'master' into primary-context
jasontedor Jun 26, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -23,10 +23,12 @@
import com.carrotsearch.hppc.ObjectLongMap;
import com.carrotsearch.hppc.cursors.ObjectLongCursor;
import org.elasticsearch.common.SuppressForbidden;
import org.elasticsearch.common.collect.HppcMaps;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unused

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

import org.elasticsearch.index.IndexSettings;
import org.elasticsearch.index.shard.AbstractIndexShardComponent;
import org.elasticsearch.index.shard.ShardId;

import java.util.Arrays;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As does this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
Expand Down Expand Up @@ -248,6 +250,79 @@ public synchronized void updateAllocationIdsFromMaster(
updateGlobalCheckpointOnPrimary();
}

/**
* Updates the known allocation IDs and the local checkpoints for the corresponding allocations from a primary relocation source.
*
* @param seqNoPrimaryContext the sequence number context
*/
synchronized void updateAllocationIdsFromPrimaryContext(final SeqNoPrimaryContext seqNoPrimaryContext) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is tricky. The master is the one that drives the set of shards that are allocated. If the copy was removed by the master we shouldn't re-add it because of a primary handoff that happens concurrently. I think we should make the primary context be a recovery level thing that uses existing shard API (updateLocalCheckPoint/markAllocationIdAsInSync)

/*
* We are gathered here today to witness the relocation handoff transferring knowledge from the relocation source to the relocation
* target. We need to consider the possibility that the version of the cluster state on the relocation source when the primary
* context was sampled is different than the version of the cluster state on the relocation target at this exact moment. We define
* the following values:
* - version(source) = the cluster state version on the relocation source used to ensure a minimum cluster state version on the
* relocation target
* - version(context) = the cluster state version on the relocation source when the primary context was sampled
* - version(target) = the current cluster state version on the relocation target
*
* We know that version(source) <= version(target) and version(context) < version(target), version(context) = version(target), and
* version(target) < version(context) are all possibilities.
*
* The case of version(context) = version(target) causes no issues as in this case the knowledge of the in-sync and initializing
* shards the target receives from the master will be equal to the knowledge of the in-sync and initializing shards the target
* receives from the relocation source via the primary context.
*
* In the case when version(context) < version(target) or version(target) < version(context), we first consider shards that could be
* contained in the primary context but not contained in the cluster state applied on the target.
*
* Suppose there is such a shard and that it is an in-sync shard. However, marking a shard as in-sync requires an operation permit
* on the primary shard. Such a permit can not be obtained after the relocation handoff has started as the relocation handoff blocks
* all operations. Therefore, there can not be such a shard that is marked in-sync.
*
* Now consider the case of an initializing shard that is contained in the primary context but not contained in the cluster state
* applied on the target.
*
* If version(context) < version(target) it means that the shard has been removed by a later cluster state update that is already
* applied on the target and we only need to ensure that we do not add it to the tracking map on the target. The call to
* GlobalCheckpointTracker#updateLocalCheckpoint(String, long) is a no-op for such shards and this is safe.
*
* If version(target) < version(context) it means that the shard has started initializing by a later cluster state update has not
* yet arrived on the target. However, there is a delay on recoveries before we ensure that version(source) <= version(target).
* Therefore, such a shard can never initialize from the relocation source and will have to await the handoff completing. As such,
* these shards are not problematic.
*
* Now we consider shards that are contained in the cluster state applied on the target but not contained in the primary context.
*
* If version(context) < version(target) it means that the target has learned of an initializing shard that the source is not aware
* of. As explained above, this initialization can only succeed after the relocation is complete, and only with the target as the
* source of the recovery.
*
* Otherwise, if version(target) < version(context) it only means that the global checkpoint on the target will be held back until a
* later cluster state update arrives because the target will not learn of the removal until later.
*
* In both cases, no calls to update the local checkpoint for such shards will be made. This case is safe too.
*/
for (final ObjectLongCursor<String> cursor : seqNoPrimaryContext.inSyncLocalCheckpoints()) {
updateLocalCheckpoint(cursor.key, cursor.value);
assert cursor.value >= globalCheckpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also assert that the key is not found in the in sync map? in fact, the sync map should be empty no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more, why would the in-sync map be empty? The target could have applied a cluster state containing the source as an active allocation ID? I don't think we can make any assertion here at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. What I was thinking of that any "promotion" to in sync should go through the primary but this currently not the case. I do think that this is a better & simpler model that we can switch to as a follow up (the current model is based on the think we had back when we built around a background sync and didn't the locked down clean hand offs we have now). For now, I think we can assert that all the values in the in sync map are unknown. Not sure how much it's worth though. Up to you.

: "local checkpoint [" + cursor.value + "] violates being at least the global checkpoint [" + globalCheckpoint + "]";
try {
markAllocationIdAsInSync(cursor.key, cursor.value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works for started shards because we have a check in markAllocationIdAsInSync that ignores this call if it can't find the aId in the tracking map.

Given that the insync map is empty, I wonder if this will be simpler to work with if change seqNoPrimaryContext to be a map of aid->local checkpoint + a set of in sync aids. We will then loop on the map and call updateLocalCheckpoint for everything in it. Then we can do manual promotion, instead of calling markAllocationIdAsInSync and suffer everything it brings with it. Concretely we'll just do:

if (trackingLocalCheckpoints.containsKey(allocationId)) {
    long current = trackingLocalCheckpoints.remove(allocationId);
    inSyncLocalCheckpoints.put(allocationId, current);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer the sequence number related state of the primary context look like the state of the global checkpoint tracker. It's easier to think about.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer the sequence number related state of the primary context look like the state of the global checkpoint tracker

Ok. I would still prefer not using markAllocationIdAsInSync as is and all the complexity it brings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

} catch (final InterruptedException e) {
/*
* Since the local checkpoint already exceeds the global checkpoint here, we never blocking waiting for advancement. This
* means that we can never be interrupted. If we are bail, something is catastrophically wrong.
*/
throw new AssertionError(e);
}
}

for (final ObjectLongCursor<String> cursor : seqNoPrimaryContext.trackingLocalCheckpoints()) {
updateLocalCheckpoint(cursor.key, cursor.value);
}
}

/**
* Marks the shard with the provided allocation ID as in-sync with the primary shard. This method will block until the local checkpoint
* on the specified shard advances above the current global checkpoint.
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.elasticsearch.index.seqno;

import com.carrotsearch.hppc.ObjectLongHashMap;
import com.carrotsearch.hppc.ObjectLongMap;
import com.carrotsearch.hppc.cursors.ObjectLongCursor;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.io.stream.Writeable;

import java.io.IOException;

/**
* Represents the sequence number component of the primary context. This is the knowledge on the primary of the in-sync and initializing
* shards and their local checkpoints.
*/
public class SeqNoPrimaryContext implements Writeable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

java docs pls?


private ObjectLongMap<String> inSyncLocalCheckpoints;

public ObjectLongMap<String> inSyncLocalCheckpoints() {
return inSyncLocalCheckpoints;
}

private ObjectLongMap<String> trackingLocalCheckpoints;

public ObjectLongMap<String> trackingLocalCheckpoints() {
return trackingLocalCheckpoints;
}

public SeqNoPrimaryContext(final ObjectLongMap<String> inSyncLocalCheckpoints, final ObjectLongMap<String> trackingLocalCheckpoints) {
this.inSyncLocalCheckpoints = inSyncLocalCheckpoints;
this.trackingLocalCheckpoints = trackingLocalCheckpoints;
}

public SeqNoPrimaryContext(StreamInput in) throws IOException {
inSyncLocalCheckpoints = readMap(in);
trackingLocalCheckpoints = readMap(in);
}

private static ObjectLongMap<String> readMap(final StreamInput in) throws IOException {
final int length = in.readInt();
final ObjectLongMap<String> map = new ObjectLongHashMap<>(length);
for (int i = 0; i < length; i++) {
final String key = in.readString();
final long value = in.readZLong();
map.addTo(key, value);
}
return map;
}

@Override
public void writeTo(final StreamOutput out) throws IOException {
writeMap(out, inSyncLocalCheckpoints);
writeMap(out, trackingLocalCheckpoints);
}

private static void writeMap(final StreamOutput out, final ObjectLongMap<String> map) throws IOException {
out.writeInt(map.size());
for (ObjectLongCursor<String> cursor : map) {
out.writeString(cursor.key);
out.writeZLong(cursor.value);
}
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@

package org.elasticsearch.index.seqno;

import com.carrotsearch.hppc.ObjectLongHashMap;
import com.carrotsearch.hppc.ObjectLongMap;
import org.elasticsearch.index.IndexSettings;
import org.elasticsearch.index.shard.AbstractIndexShardComponent;
import org.elasticsearch.index.shard.ShardId;
Expand Down Expand Up @@ -174,6 +176,15 @@ public void updateAllocationIdsFromMaster(final Set<String> activeAllocationIds,
globalCheckpointTracker.updateAllocationIdsFromMaster(activeAllocationIds, initializingAllocationIds);
}

/**
* Updates the known allocation IDs and the local checkpoints for the corresponding allocations from a primary relocation source.
*
* @param seqNoPrimaryContext the sequence number context
*/
public void updateAllocationIdsFromPrimaryContext(final SeqNoPrimaryContext seqNoPrimaryContext) {
globalCheckpointTracker.updateAllocationIdsFromPrimaryContext(seqNoPrimaryContext);
}

/**
* Check if there are any recoveries pending in-sync.
*
Expand All @@ -183,4 +194,18 @@ public boolean pendingInSync() {
return globalCheckpointTracker.pendingInSync();
}

/**
* Get the sequence number primary context for the shard. This includes the state of the global checkpoint tracker.
*
* @return the sequence number primary context
*/
public SeqNoPrimaryContext seqNoPrimaryContext() {
synchronized (globalCheckpointTracker) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should push this down to the tracker. Then construction and application are the same. Also the external lock is ugly :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I figured you'd say that.

final ObjectLongMap<String> inSyncLocalCheckpoints = new ObjectLongHashMap<>(globalCheckpointTracker.inSyncLocalCheckpoints);
final ObjectLongMap<String> trackingLocalCheckpoints =
new ObjectLongHashMap<>(globalCheckpointTracker.trackingLocalCheckpoints);
return new SeqNoPrimaryContext(inSyncLocalCheckpoints, trackingLocalCheckpoints);
}
}

}
97 changes: 84 additions & 13 deletions core/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@
import org.elasticsearch.common.util.BigArrays;
import org.elasticsearch.common.util.concurrent.AbstractRunnable;
import org.elasticsearch.common.util.concurrent.AsyncIOProcessor;
import org.elasticsearch.common.util.concurrent.CountDown;
import org.elasticsearch.index.Index;
import org.elasticsearch.index.IndexModule;
import org.elasticsearch.index.IndexNotFoundException;
Expand Down Expand Up @@ -95,6 +96,7 @@
import org.elasticsearch.index.refresh.RefreshStats;
import org.elasticsearch.index.search.stats.SearchStats;
import org.elasticsearch.index.search.stats.ShardSearchStats;
import org.elasticsearch.index.seqno.SeqNoPrimaryContext;
import org.elasticsearch.index.seqno.SeqNoStats;
import org.elasticsearch.index.seqno.SequenceNumbersService;
import org.elasticsearch.index.similarity.SimilarityService;
Expand Down Expand Up @@ -478,7 +480,7 @@ public IndexShardState markAsRecovering(String reason, RecoveryState recoverySta
}
}

public void relocated(String reason) throws IllegalIndexShardStateException, InterruptedException {
public void relocated(final String reason, final Runnable onBlocked) throws IllegalIndexShardStateException, InterruptedException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

java doc the onBlock pls? it's a vague name ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

assert shardRouting.primary() : "only primaries can be marked as relocated: " + shardRouting;
try {
indexShardOperationPermits.blockOperations(30, TimeUnit.MINUTES, () -> {
Expand All @@ -498,6 +500,7 @@ public void relocated(String reason) throws IllegalIndexShardStateException, Int
throw new IllegalIndexShardStateException(shardId, IndexShardState.STARTED,
": shard is no longer relocating " + shardRouting);
}
onBlocked.run();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we now have network operations under mutex :-/
I would prefer to call this before the mutex section to change the state. We can still repeat the state checks if (state != IndexShardState.STARTED) { ... before that etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good observation, and I agree we should not do this. It can implicitly block the cluster state update thread on this network operation. I pushed ccde798.

changeState(IndexShardState.RELOCATED, reason);
}
});
Expand All @@ -510,6 +513,17 @@ public void relocated(String reason) throws IllegalIndexShardStateException, Int
}
}

/**
* Obtain the primary context for the shard. The shard must be serving as the relocation source for a primary shard.
*
* @return the primary for the shard
*/
public PrimaryContext primaryContext() {
verifyPrimary();
assert shardRouting.relocating() : "primary context can only be obtained from a relocating primary but was " + shardRouting;
assert !shardRouting.isRelocationTarget() : "primary context can only be obtained from relocation source but was " + shardRouting;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This follows from the assertion in the previous line (hence is a noop). A relocating shard is always a source shard per definition of ShardRouting.relocating

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 00e4083.

return new PrimaryContext(getEngine().seqNoService().seqNoPrimaryContext());
}

public IndexShardState state() {
return state;
Expand Down Expand Up @@ -1249,16 +1263,16 @@ private void ensureWriteAllowed(Engine.Operation op) throws IllegalIndexShardSta

private void verifyPrimary() {
if (shardRouting.primary() == false) {
throw new IllegalStateException("shard is not a primary " + shardRouting);
throw new IllegalStateException("shard " + shardRouting + " is not a primary");
}
}

private void verifyReplicationTarget() {
final IndexShardState state = state();
if (shardRouting.primary() && shardRouting.active() && state != IndexShardState.RELOCATED) {
// must use exception that is not ignored by replication logic. See TransportActions.isShardNotAvailableException
throw new IllegalStateException("active primary shard cannot be a replication target before " +
" relocation hand off " + shardRouting + ", state is [" + state + "]");
throw new IllegalStateException("active primary shard " + shardRouting + " cannot be a replication target before " +
"relocation hand off, state is [" + state + "]");
}
}

Expand Down Expand Up @@ -1524,19 +1538,62 @@ public void waitForOpsToComplete(final long seqNo) throws InterruptedException {
/**
* Marks the shard with the provided allocation ID as in-sync with the primary shard. See
* {@link org.elasticsearch.index.seqno.GlobalCheckpointTracker#markAllocationIdAsInSync(String, long)}
* for additional details.
* for additional details. Because this operation could be completed asynchronously on another thread, the caller must provide a latch;
* this latch will be counted down after the operation completes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outdated comment? where is the latch?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 643fa8a.

*
* @param allocationId the allocation ID of the shard to mark as in-sync
* @param localCheckpoint the current local checkpoint on the shard
* @param allocationId the allocation ID of the shard to mark as in-sync
* @param localCheckpoint the current local checkpoint on the shard
* @param latch a latch that is counted down after the operation completes
* @param onFailureConsumer a callback if an exception occurs completing the operation
*/
public void markAllocationIdAsInSync(final String allocationId, final long localCheckpoint) throws InterruptedException {
verifyPrimary();
getEngine().seqNoService().markAllocationIdAsInSync(allocationId, localCheckpoint);
public void markAllocationIdAsInSync(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this api change. All other api does acquire a permit inline. How about using this code in RecoverySourceHandler#finalizeRecovery:

PlainActionFuture<Releasable> permitFuture = new PlainActionFuture<>();
shard.acquirePrimaryOperationPermit(permitFuture, ThreadPool.Names.SAME);
try (Releasable permit = permitFuture.actionGet()) {
    if (shard.state() == IndexShardState.RELOCATED) {
        throw new DelayRecoveryException();
    }
    shard.markAllocationIdAsInSync(request.targetAllocationId(), targetLocalCheckpoint);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

final String allocationId,
final long localCheckpoint,
final CountDownLatch latch,
final Consumer<Exception> onFailureConsumer) {
/*
* We could have blocked waiting for the replica to catch up that we fell idle and there will not be a background sync to the
* replica; mark our self as active to force a future background sync.
* Before marking the shard as in-sync we acquire an operation permit. We do this so that there is a barrier between marking a shard
* as in-sync and relocating a shard. If we acquire the permit then no relocation handoff can complete before we are done marking
* the shard as in-sync. If the relocation handoff holds all the permits then after the handoff completes and we acquire the permit
* then the state of the shard will be relocated and this recovery will fail.
*/
active.compareAndSet(false, true);
acquirePrimaryOperationPermit(
new ActionListener<Releasable>() {
@Override
public void onResponse(final Releasable releasable) {
// we could have been relocated while waiting for a permit
if (state() == IndexShardState.RELOCATED) {
onFailure(new IndexShardRelocatedException(shardId));
}
boolean success = false;
try {
getEngine().seqNoService().markAllocationIdAsInSync(allocationId, localCheckpoint);
/*
* We could have blocked so long waiting for the replica to catch up that we fell idle and there will not be a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain we this is needed? so we don't have the background sync? what happens?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is only moved, I'm not addressing it in this PR.

* background sync to the replica; mark our self as active to force a future background sync.
*/
active.compareAndSet(false, true);
success = true;
} catch (final Exception e) {
onFailure(e);
} finally {
releasable.close();
if (success) {
latch.countDown();
}
}
}

@Override
public void onFailure(final Exception e) {
try {
onFailureConsumer.accept(e);
} finally {
latch.countDown();
}
}
},
ThreadPool.Names.GENERIC);
}

/**
Expand Down Expand Up @@ -1599,6 +1656,20 @@ public void updateAllocationIdsFromMaster(final Set<String> activeAllocationIds,
}
}

/**
* Updates the known allocation IDs and the local checkpoints for the corresponding allocations from a primary relocation source.
*
* @param seqNoPrimaryContext the sequence number context
*/
public void updateAllocationIdsFromPrimaryContext(final SeqNoPrimaryContext seqNoPrimaryContext) {
verifyPrimary();
assert shardRouting.isRelocationTarget();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add shardrouting to message?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed e1de296.

final Engine engine = getEngineOrNull();
if (engine != null) {
engine.seqNoService().updateAllocationIdsFromPrimaryContext(seqNoPrimaryContext);
}
}

/**
* Check if there are any recoveries pending in-sync.
*
Expand Down
Loading