@@ -300,15 +300,18 @@ synchronized void updateAllocationIdsFromPrimaryContext(final PrimaryContext pri
300300 * shards the target receives from the master will be equal to the knowledge of the in-sync and initializing shards the target
301301 * receives from the relocation source via the primary context.
302302 *
303- * In the case when version(context) < version(target) or version(target) < version(context), we first consider shards that could be
304- * contained in the primary context but not contained in the cluster state applied on the target.
303+ * Let us now consider the case that version(context) < version(target). In this case, the active allocation IDs in the primary
304+ * context can be a superset of the active allocation IDs contained in the applied cluster state. This is because no new shards can
305+ * have been started as marking a shard as in-sync is blocked during relocation handoff. Note however that the relocation target
306+ * itself will have been marked in-sync during recovery and therefore is an active allocation ID from the perspective of the primary
307+ * context.
305308 *
306- * Suppose there is such a shard and that it is an in-sync shard. However, marking a shard as in-sync requires an operation permit
307- * on the primary shard. Such a permit can not be obtained after the relocation handoff has started as the relocation handoff blocks
308- * all operations. Therefore, there can not be such a shard that is marked in-sync.
309+ * Finally, we consider the case that version(target) < version(context). In this case, the active allocation IDs in the primary
310+ * context can be a subset of the active allocation IDs contained in the applied cluster state. This is again because no new shards can
311+ * have been started. Moreover, existing active allocation IDs could have been removed from the cluster state.
309312 *
310- * Now consider the case of an initializing shard that is contained in the primary context but not contained in the cluster state
311- * applied on the target.
313+ * In each of these latter two cases, consider initializing shards that are contained in the primary context but not contained in
314+ * the cluster state applied on the target.
312315 *
313316 * If version(context) < version(target) it means that the shard has been removed by a later cluster state update that is already
314317 * applied on the target and we only need to ensure that we do not add it to the tracking map on the target. The call to
@@ -319,16 +322,16 @@ synchronized void updateAllocationIdsFromPrimaryContext(final PrimaryContext pri
319322 * Therefore, such a shard can never initialize from the relocation source and will have to await the handoff completing. As such,
320323 * these shards are not problematic.
321324 *
322- * Now we consider shards that are contained in the cluster state applied on the target but not contained in the primary context.
325+ * Lastly, again in these two cases, what about initializing shards that are contained in the cluster state applied on the target but
326+ * not contained in the primary context?
323327 *
324- * If version(context) < version(target) it means that the target has learned of an initializing shard that the source is not aware
325- * of. As explained above, this initialization can only succeed after the relocation is complete, and only with the target as the
326- * source of the recovery.
328+ * If version(context) < version(target) it means that a shard has started initializing by a later cluster state that is applied on
329+ * the target but not yet known to what would be the relocation source. As recoveries are delayed at this time, these shards can not
330+ * cause a problem and we do not remove these shards from the tracking map, so we are safe here.
327331 *
328- * Otherwise, if version(target) < version(context) it only means that the global checkpoint on the target will be held back until a
329- * later cluster state update arrives because the target will not learn of the removal until later.
330- *
331- * In both cases, no calls to update the local checkpoint for such shards will be made. This case is safe too.
332+ * If version(target) < version(context) it means that a shard has started initializing but was removed by a later cluster state. In
333+ * this case, as the cluster state version on the primary context exceeds the applied cluster state version, we replace the tracking
334+ * map and are safe here too.
332335 */
333336
334337 if (primaryContext.clusterStateVersion() > appliedClusterStateVersion) {
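The version comparison that the comment reasons about can be illustrated with a minimal sketch. This is not the actual tracker implementation: `TrackingSketch`, `applyPrimaryContext`, and the use of a plain set of allocation IDs in place of the real tracking map are all hypothetical simplifications, chosen only to show the two cases (context newer than the applied cluster state, and context older).

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified stand-in for the tracker; not the real Elasticsearch class. */
final class TrackingSketch {
    private long appliedClusterStateVersion;
    private Set<String> trackedAllocationIds;

    TrackingSketch(long appliedClusterStateVersion, Set<String> trackedAllocationIds) {
        this.appliedClusterStateVersion = appliedClusterStateVersion;
        this.trackedAllocationIds = new HashSet<>(trackedAllocationIds);
    }

    /** Assumed handoff step: reconcile the target's view with the primary context. */
    synchronized void applyPrimaryContext(long contextVersion, Set<String> contextAllocationIds) {
        if (contextVersion > appliedClusterStateVersion) {
            // version(target) < version(context): the context is newer, so its
            // view replaces the tracked set and the applied version advances.
            trackedAllocationIds = new HashSet<>(contextAllocationIds);
            appliedClusterStateVersion = contextVersion;
        }
        // version(context) <= version(target): the target's view is at least as
        // new, so it is kept as-is. IDs that appear only in the context were
        // removed by a later cluster state and are deliberately not added.
    }

    synchronized Set<String> trackedAllocationIds() {
        return new HashSet<>(trackedAllocationIds);
    }

    synchronized long appliedClusterStateVersion() {
        return appliedClusterStateVersion;
    }
}
```

In this sketch a stale context never adds or removes tracked IDs, which mirrors the argument in the comment that shards known only to an older primary context must not enter the tracking map on the target.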