Skip to content
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
91819a0
Add a setting to control dangling index allocation
pugnascotia Oct 28, 2019
fedf771
Merge remote-tracking branch 'upstream/master' into allocate-dangling…
pugnascotia Nov 10, 2019
982be6f
Merge remote-tracking branch 'upstream/master' into allocate-dangling…
pugnascotia Nov 11, 2019
865831e
Fix allocate setting and implement ITs
pugnascotia Nov 14, 2019
32d3646
Merge remote-tracking branch 'upstream/master' into allocate-dangling…
pugnascotia Nov 15, 2019
e377772
Finish fixing unit tests
pugnascotia Nov 15, 2019
6393357
Rename the new setting
pugnascotia Nov 15, 2019
7950e86
WIP - trying to make new setting static
pugnascotia Nov 15, 2019
ce3b5f4
Add docs for gateway.auto_import_dangling_indices
pugnascotia Nov 15, 2019
77b4f09
Fix ITs
pugnascotia Nov 15, 2019
3a453e9
Add missing license
pugnascotia Nov 15, 2019
54db504
Fix test
pugnascotia Nov 19, 2019
44834b8
Merge remote-tracking branch 'upstream/master' into allocate-dangling…
pugnascotia Nov 19, 2019
6bf3e9b
Merge remote-tracking branch 'upstream/master' into allocate-dangling…
pugnascotia Nov 21, 2019
71bcbde
Address review feedback
pugnascotia Nov 21, 2019
d156385
Address review feedback
pugnascotia Nov 27, 2019
f4f3340
Address review comments
pugnascotia Nov 27, 2019
7cd372a
Merge remote-tracking branch 'upstream/master' into allocate-dangling…
pugnascotia Nov 27, 2019
a9dfc48
Checkstyle
pugnascotia Nov 27, 2019
85f7f57
Address review comments
pugnascotia Nov 28, 2019
2a40971
Address review feedback
pugnascotia Nov 29, 2019
77a7649
Merge remote-tracking branch 'upstream/master' into allocate-dangling…
pugnascotia Nov 29, 2019
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 19 additions & 6 deletions docs/reference/modules/gateway.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -52,9 +52,22 @@ NOTE: These settings only take effect on a full cluster restart.
[[modules-gateway-dangling-indices]]
=== Dangling indices

When a node joins the cluster, any shards stored in its local data
directory which do not already exist in the cluster will be imported into the
cluster. This functionality is intended as a best effort to help users who
lose all master nodes. If a new master node is started which is unaware of
the other indices in the cluster, adding the old nodes will cause the old
indices to be imported, instead of being deleted.
When a node joins the cluster, it will search for any shards that are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given my previous comment about preserving existing behaviour, I think we should leave the docs alone for now. I don't think we should put much effort into improving the docs for today's behaviour, and we'll need to make more changes here when we add these APIs.

stored in its local data directory and do not already exist in the
cluster. If the static setting `gateway.auto_import_dangling_indices` is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps call this node setting

`true` (the default is `false`), then those shards will be imported into
the cluster. This functionality is intended as a best effort to help users
who lose all master nodes. If a new master node is started which is unaware
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should explain this in terms of unsafe cluster bootstrapping. As far as I'm aware the only remaining use of dangling indices in 7.x is for the situation where a cluster has lost all master-eligible nodes and where the unsafe-bootstrap and/or detach-cluster commands have been used.

of the other indices in the cluster, adding the old nodes will cause the
old indices to be imported, instead of being deleted.

Enabling `gateway.auto_import_dangling_indices` should only be done if
absolutely necessary, after understanding the possible consequences (this is not an exhaustive list):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should strongly recommend against using auto import of dangling indices as it can cause inconsistencies in a cluster (e.g. automatically necroing deleted indices) and instead refer to our tooling (to be created in follow-ups).

Given that we don't provide the tooling at this time, I wonder if we should leave this setting undocumented for the time being and only write the docs once we can present the full story on this.


* A deleted index might suddenly reappear when a node joins the cluster.
* You might delete an index and see the immediate creation of another index
with the same name, containing stale mappings and old data.
* New documents could be written to the index before anyone realises that
it has been recovered
* {es} may not be able to find copies of all of the shards of the index,
resulting in a red cluster state.
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,7 @@
import org.elasticsearch.discovery.SettingsBasedSeedHostsProvider;
import org.elasticsearch.env.Environment;
import org.elasticsearch.env.NodeEnvironment;
import org.elasticsearch.gateway.DanglingIndicesState;
import org.elasticsearch.gateway.GatewayService;
import org.elasticsearch.gateway.IncrementalClusterStateWriter;
import org.elasticsearch.http.HttpTransportSettings;
Expand Down Expand Up @@ -184,6 +185,7 @@ public void apply(Settings value, Settings current, Settings previous) {
BalancedShardsAllocator.THRESHOLD_SETTING,
ClusterRebalanceAllocationDecider.CLUSTER_ROUTING_ALLOCATION_ALLOW_REBALANCE_SETTING,
ConcurrentRebalanceAllocationDecider.CLUSTER_ROUTING_ALLOCATION_CLUSTER_CONCURRENT_REBALANCE_SETTING,
DanglingIndicesState.AUTO_IMPORT_DANGLING_INDICES_SETTING,
EnableAllocationDecider.CLUSTER_ROUTING_ALLOCATION_ENABLE_SETTING,
EnableAllocationDecider.CLUSTER_ROUTING_REBALANCE_ENABLE_SETTING,
FilterAllocationDecider.CLUSTER_ROUTING_INCLUDE_GROUP_SETTING,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
import org.elasticsearch.cluster.metadata.MetaData;
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.common.inject.Inject;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.util.concurrent.ConcurrentCollections;
import org.elasticsearch.env.NodeEnvironment;
import org.elasticsearch.index.Index;
Expand All @@ -55,19 +56,33 @@ public class DanglingIndicesState implements ClusterStateListener {

private static final Logger logger = LogManager.getLogger(DanglingIndicesState.class);

public static final Setting<Boolean> AUTO_IMPORT_DANGLING_INDICES_SETTING = Setting.boolSetting(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's deprecate this setting right away, so that we can remove it in 8.0 if we want

"gateway.auto_import_dangling_indices",
false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this PR we should preserve the existing behaviour and default this to true, since we do not have any other alternatives for finding or cleaning up dangling indices. This means we can use this setting as a feature flag for testing the new behaviour while it's still in flight. Once we've added the APIs we have planned we can adjust the default to false.

Setting.Property.NodeScope
);

private final NodeEnvironment nodeEnv;
private final MetaStateService metaStateService;
private final LocalAllocateDangledIndices allocateDangledIndices;

private final Map<Index, IndexMetaData> danglingIndices = ConcurrentCollections.newConcurrentMap();

private boolean allocateDanglingIndices;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this field is superfluous now?


@Inject
public DanglingIndicesState(NodeEnvironment nodeEnv, MetaStateService metaStateService,
LocalAllocateDangledIndices allocateDangledIndices, ClusterService clusterService) {
this.nodeEnv = nodeEnv;
this.metaStateService = metaStateService;
this.allocateDangledIndices = allocateDangledIndices;
clusterService.addListener(this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simpler might be to only conditionally add the listener based on AUTO_IMPORT_DANGLING_INDICES_SETTING


this.allocateDanglingIndices = AUTO_IMPORT_DANGLING_INDICES_SETTING.get(clusterService.getSettings());
}

public void setAllocateDanglingIndicesSetting(boolean allocateDanglingIndices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method is superfluous now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yes, good catch.

this.allocateDanglingIndices = allocateDanglingIndices;
}

/**
Expand Down Expand Up @@ -171,10 +186,15 @@ private IndexMetaData stripAliases(IndexMetaData indexMetaData) {
* Allocates the provided list of the dangled indices by sending them to the master node
* for allocation.
*/
private void allocateDanglingIndices() {
void allocateDanglingIndices() {
if (this.allocateDanglingIndices == false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this condition is superfluous now?

return;
}

if (danglingIndices.isEmpty()) {
return;
}

try {
allocateDangledIndices.allocateDangled(Collections.unmodifiableCollection(new ArrayList<>(danglingIndices.values())),
new ActionListener<>() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@
import java.util.Locale;

import static org.elasticsearch.action.support.WriteRequest.RefreshPolicy.IMMEDIATE;
import static org.elasticsearch.gateway.DanglingIndicesState.AUTO_IMPORT_DANGLING_INDICES_SETTING;
import static org.elasticsearch.index.query.QueryBuilders.matchAllQuery;
import static org.elasticsearch.indices.recovery.RecoverySettings.INDICES_RECOVERY_MAX_BYTES_PER_SEC_SETTING;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertHitCount;
Expand Down Expand Up @@ -396,8 +397,13 @@ public boolean clearData(String nodeName) {
}
});

final Settings settingsWithAutoImport = Settings.builder()
.put(dataNodeDataPathSettings)
.put(AUTO_IMPORT_DANGLING_INDICES_SETTING.getKey(), true)
.build();

logger.info("--> start data-only only node and ensure 2 nodes stable cluster");
internalCluster().startDataOnlyNode(dataNodeDataPathSettings);
internalCluster().startDataOnlyNode(settingsWithAutoImport);
ensureStableCluster(2);

logger.info("--> verify that the dangling index exists and has green status");
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
import org.elasticsearch.cluster.metadata.IndexMetaData;
import org.elasticsearch.cluster.metadata.MetaData;
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.common.settings.ClusterSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.env.NodeEnvironment;
import org.elasticsearch.index.Index;
Expand All @@ -35,51 +36,73 @@
import java.nio.file.StandardCopyOption;
import java.util.Map;

import static org.elasticsearch.gateway.DanglingIndicesState.AUTO_IMPORT_DANGLING_INDICES_SETTING;
import static org.hamcrest.Matchers.equalTo;
import static org.mockito.Matchers.any;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.never;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

public class DanglingIndicesStateTests extends ESTestCase {

private static Settings indexSettings = Settings.builder()
.put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, 1)
.put(IndexMetaData.SETTING_NUMBER_OF_REPLICAS, 0)
.put(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT)
.build();
.put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid reformatting code unrelated to the PR.

.put(IndexMetaData.SETTING_NUMBER_OF_REPLICAS, 0)
.put(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT)
.build();

public void testCleanupWhenEmpty() throws Exception {
try (NodeEnvironment env = newNodeEnvironment()) {
MetaStateService metaStateService = new MetaStateService(env, xContentRegistry());
DanglingIndicesState danglingState = createDanglingIndicesState(env, metaStateService);

// Given an empty state
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am, personally, not a big fan of having these comments everywhere. The given/when/then phases of these tests seem pretty clear to me as written, and far too often do comments like these fall out of sync with the code and end up being misleading to readers.

There's a certain amount of judgement needed here to identify cases where there's something surprising that needs the reader's attention, because all readers are different. But in this specific test case for me they add, rather than remove, confusion.

assertTrue(danglingState.getDanglingIndices().isEmpty());

// When passed an empty metadata
MetaData metaData = MetaData.builder().build();
danglingState.cleanupAllocatedDangledIndices(metaData);

// Then the state remains empty
assertTrue(danglingState.getDanglingIndices().isEmpty());
}
}

public void testDanglingIndicesDiscovery() throws Exception {
try (NodeEnvironment env = newNodeEnvironment()) {
MetaStateService metaStateService = new MetaStateService(env, xContentRegistry());
DanglingIndicesState danglingState = createDanglingIndicesState(env, metaStateService);

// Given an empty state
assertTrue(danglingState.getDanglingIndices().isEmpty());

// When passed a metdata with an unknown index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metadata we pass in is empty, but we've created a dangling index on disk.

MetaData metaData = MetaData.builder().build();
final Settings.Builder settings = Settings.builder().put(indexSettings).put(IndexMetaData.SETTING_INDEX_UUID, "test1UUID");
IndexMetaData dangledIndex = IndexMetaData.builder("test1").settings(settings).build();
metaStateService.writeIndex("test_write", dangledIndex);
Map<Index, IndexMetaData> newDanglingIndices = danglingState.findNewDanglingIndices(metaData);

// Then that index is considered dangling
assertTrue(newDanglingIndices.containsKey(dangledIndex.getIndex()));

// And when passed another metadata with that index
metaData = MetaData.builder().put(dangledIndex, false).build();
newDanglingIndices = danglingState.findNewDanglingIndices(metaData);

// Then then index is not considered to be a new dangling index for a second time
assertFalse(newDanglingIndices.containsKey(dangledIndex.getIndex()));
}
}

public void testInvalidIndexFolder() throws Exception {
try (NodeEnvironment env = newNodeEnvironment()) {
// Given an empty state
MetaStateService metaStateService = new MetaStateService(env, xContentRegistry());
DanglingIndicesState danglingState = createDanglingIndicesState(env, metaStateService);

// When passed settings for an index whose folder does not exist
MetaData metaData = MetaData.builder().build();
final String uuid = "test1UUID";
final Settings.Builder settings = Settings.builder().put(indexSettings).put(IndexMetaData.SETTING_INDEX_UUID, uuid);
Expand All @@ -90,6 +113,8 @@ public void testInvalidIndexFolder() throws Exception {
Files.move(path, path.resolveSibling("invalidUUID"), StandardCopyOption.ATOMIC_MOVE);
}
}

// Then an exception is thrown describing the problem
try {
danglingState.findNewDanglingIndices(metaData);
fail("no exception thrown for invalid folder name");
Expand Down Expand Up @@ -145,25 +170,31 @@ public void testDanglingProcessing() throws Exception {

public void testDanglingIndicesNotImportedWhenTombstonePresent() throws Exception {
try (NodeEnvironment env = newNodeEnvironment()) {
// Given an empty state
MetaStateService metaStateService = new MetaStateService(env, xContentRegistry());
DanglingIndicesState danglingState = createDanglingIndicesState(env, metaStateService);

// When passed a dangling index
final Settings.Builder settings = Settings.builder().put(indexSettings).put(IndexMetaData.SETTING_INDEX_UUID, "test1UUID");
IndexMetaData dangledIndex = IndexMetaData.builder("test1").settings(settings).build();
metaStateService.writeIndex("test_write", dangledIndex);

// And there is a tombstone for that index
final IndexGraveyard graveyard = IndexGraveyard.builder().addTombstone(dangledIndex.getIndex()).build();
final MetaData metaData = MetaData.builder().indexGraveyard(graveyard).build();
assertThat(danglingState.findNewDanglingIndices(metaData).size(), equalTo(0));

// Then that index is not imported
assertThat(danglingState.findNewDanglingIndices(metaData).size(), equalTo(0));
}
}

public void testDanglingIndicesStripAliases() throws Exception {
try (NodeEnvironment env = newNodeEnvironment()) {
// Given an empty state
MetaStateService metaStateService = new MetaStateService(env, xContentRegistry());
DanglingIndicesState danglingState = createDanglingIndicesState(env, metaStateService);

// When passed an index that has an alias
final Settings.Builder settings = Settings.builder().put(indexSettings).put(IndexMetaData.SETTING_INDEX_UUID, "test1UUID");
IndexMetaData dangledIndex = IndexMetaData.builder("test1")
.settings(settings)
Expand All @@ -174,14 +205,86 @@ public void testDanglingIndicesStripAliases() throws Exception {

final MetaData metaData = MetaData.builder().build();
Map<Index, IndexMetaData> newDanglingIndices = danglingState.findNewDanglingIndices(metaData);

// Then the index is identifying as dangling
assertThat(newDanglingIndices.size(), equalTo(1));
Map.Entry<Index, IndexMetaData> entry = newDanglingIndices.entrySet().iterator().next();
assertThat(entry.getKey().getName(), equalTo("test1"));

// And the alias is removed
assertThat(entry.getValue().getAliases().size(), equalTo(0));
}
}

public void testDanglingIndicesAreNotAllocatedWhenDisabled() throws Exception {
try (NodeEnvironment env = newNodeEnvironment()) {
MetaStateService metaStateService = new MetaStateService(env, xContentRegistry());
LocalAllocateDangledIndices localAllocateDangledIndices = mock(LocalAllocateDangledIndices.class);
DanglingIndicesState danglingState = createDanglingIndicesState(env, metaStateService, localAllocateDangledIndices);

assertTrue(danglingState.getDanglingIndices().isEmpty());

// Given a metadata that does not enable allocation of dangling indices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether dangling indices importing is enabled is not a property of the metadata as suggested by this comment. Could we make the enabled/disabled property an explicit argument to createDanglingIndicesState() instead?

MetaData metaData = MetaData.builder().build();

final Settings.Builder settings = Settings.builder().put(indexSettings).put(IndexMetaData.SETTING_INDEX_UUID, "test1UUID");
IndexMetaData dangledIndex = IndexMetaData.builder("test1").settings(settings).build();
metaStateService.writeIndex("test_write", dangledIndex);

danglingState.findNewAndAddDanglingIndices(metaData);

// When calling the allocate method
danglingState.allocateDanglingIndices();

// Then allocation is not attempted
verify(localAllocateDangledIndices, never()).allocateDangled(any(), any());
}
}

public void testDanglingIndicesAreAllocatedWhenEnabled() throws Exception {
try (NodeEnvironment env = newNodeEnvironment()) {
MetaStateService metaStateService = new MetaStateService(env, xContentRegistry());
LocalAllocateDangledIndices localAllocateDangledIndices = mock(LocalAllocateDangledIndices.class);
DanglingIndicesState danglingState = createDanglingIndicesState(env, metaStateService, localAllocateDangledIndices);

assertTrue(danglingState.getDanglingIndices().isEmpty());

// Given a state where automatic allocation is enabled
danglingState.setAllocateDanglingIndicesSetting(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make the enabled/disabled property an explicit argument to createDanglingIndicesState() instead?


MetaData metaData = MetaData.builder().build();

final Settings.Builder settings = Settings.builder().put(indexSettings).put(IndexMetaData.SETTING_INDEX_UUID, "test1UUID");
IndexMetaData dangledIndex = IndexMetaData.builder("test1").settings(settings).build();
metaStateService.writeIndex("test_write", dangledIndex);

danglingState.findNewAndAddDanglingIndices(metaData);

// When calling the allocate method
danglingState.allocateDanglingIndices();

// Ensure that allocation is attempted
verify(localAllocateDangledIndices).allocateDangled(any(), any());
}
}

private DanglingIndicesState createDanglingIndicesState(NodeEnvironment env, MetaStateService metaStateService) {
return new DanglingIndicesState(env, metaStateService, null, mock(ClusterService.class));
return createDanglingIndicesState(env, metaStateService, null);
}

private DanglingIndicesState createDanglingIndicesState(
NodeEnvironment env,
MetaStateService metaStateService,
LocalAllocateDangledIndices indexAllocator
) {
final Settings allocateSettings = Settings.builder().put(AUTO_IMPORT_DANGLING_INDICES_SETTING.getKey(), false).build();

final ClusterService clusterServiceMock = mock(ClusterService.class);
when(clusterServiceMock.getClusterSettings()).thenReturn(
new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS)
);
when(clusterServiceMock.getSettings()).thenReturn(allocateSettings);

return new DanglingIndicesState(env, metaStateService, indexAllocator, clusterServiceMock);
}
}
Loading