@andrershov andrershov commented Nov 16, 2018

Today we have a way to atomically persist global MetaData and
IndexMetaData to disk when a new ClusterState is received. No other
ClusterState fields are persisted.
However, other parts of ClusterState should also be persisted,
namely:

  1. version
  2. term
  3. lastCommittedConfiguration
  4. lastAcceptedConfiguration
  5. votingTombstones

version changes frequently; the other fields do not. We decided
to group term, lastCommittedConfiguration,
lastAcceptedConfiguration and votingTombstones into a
CoordinationMetaData class and make CoordinationMetaData a field
inside MetaData.
MetaData.toXContent and MetaData.fromXContent should take care of
CoordinationMetaData.
version stays as a top-level field in ClusterState and will be
persisted as part of Manifest in a follow-up PR.
Also, MetaData.isGlobalStateEquals should be extended to include
coordinationMetaData in the comparison.

This PR favors exposing getters, such as getTerm, directly in
ClusterState to avoid massive code changes.

An example of CoordinationMetaData.toXContent output:

{
  "term": 1,
  "last_committed_config": [
    "TiIuBcbBtpuXyDDVHXeD",
    "ZIAoVbkjjLPLUuYLaTkw"
  ],
  "last_accepted_config": [
    "OwkXbXZNOZPJqccdFHdz",
    "LouzsGYwmQzpeQMrboZe",
    "fCKGRZdjLTqzXAqPUtGL",
    "pLoxshjpJXwDhbgjfYJy",
    "SjINLwFIlIEFZCbjrSFo",
    "MDkVncJEVyZLJktopWje"
  ]
}
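
As a rough illustration of how the grouped fields map onto the JSON shape above, here is a minimal sketch. It is a simplified stand-in, not the Elasticsearch class: the name CoordinationMetaDataSketch is hypothetical, node IDs are plain strings, and the output is built by hand rather than through XContentBuilder.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.StringJoiner;

/** Simplified stand-in for the grouped coordination fields (not the real Elasticsearch class). */
final class CoordinationMetaDataSketch {
    private final long term;
    private final Set<String> lastCommittedConfig;
    private final Set<String> lastAcceptedConfig;

    CoordinationMetaDataSketch(long term, Set<String> lastCommittedConfig, Set<String> lastAcceptedConfig) {
        this.term = term;
        // LinkedHashSet preserves the iteration order of the input, so rendering is deterministic.
        this.lastCommittedConfig = Collections.unmodifiableSet(new LinkedHashSet<>(lastCommittedConfig));
        this.lastAcceptedConfig = Collections.unmodifiableSet(new LinkedHashSet<>(lastAcceptedConfig));
    }

    /** Renders node IDs as a JSON array; assumes the IDs need no JSON escaping. */
    private static String toJsonArray(Set<String> nodeIds) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (String nodeId : nodeIds) {
            joiner.add("\"" + nodeId + "\"");
        }
        return joiner.toString();
    }

    /** Hand-rolled rendering that uses the same field names as the example above. */
    String toJson() {
        return "{\"term\":" + term
            + ",\"last_committed_config\":" + toJsonArray(lastCommittedConfig)
            + ",\"last_accepted_config\":" + toJsonArray(lastAcceptedConfig)
            + "}";
    }

    public static void main(String[] args) {
        Set<String> committed = new LinkedHashSet<>(Arrays.asList("nodeA", "nodeB"));
        Set<String> accepted = new LinkedHashSet<>(Arrays.asList("nodeA", "nodeB", "nodeC"));
        System.out.println(new CoordinationMetaDataSketch(1, committed, accepted).toJson());
    }
}
```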

@andrershov andrershov added >enhancement :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. labels Nov 16, 2018
@andrershov andrershov requested a review from ywelsch November 16, 2018 09:56
@elasticmachine

Pinging @elastic/es-distributed

@DaveCTurner DaveCTurner left a comment

Looks good. It is a little sad that building these cluster states gets so much more verbose in places, and there are a few places where I'm not sure we're using the builders quite right yet. I left a selection of small comments/requests.

resolvedNodes = resolveNodesAndCheckMaximum(request, currentState);

final Builder builder = ClusterState.builder(currentState);
CoordinationMetaData.Builder builder = CoordinationMetaData.builder(currentState.coordinationMetaData());

For consistency, could this be final please?

public VotingConfiguration getLastAcceptedConfiguration() {
return lastAcceptedConfiguration;
public CoordinationMetaData.VotingConfiguration getLastAcceptedConfiguration() {

I think we could just import VotingConfiguration rather than qualifying it like this.

setLastAcceptedState(ClusterState.builder(lastAcceptedState)
.lastCommittedConfiguration(lastAcceptedState.getLastAcceptedConfiguration())
.build());
final CoordinationMetaData cmd = CoordinationMetaData.builder(lastAcceptedState.coordinationMetaData())

Nit: cmd reads as "command" to me. Could we just give it the full type name coordinationMetaData?

builder.lastAcceptedConfiguration(votingConfiguration);
builder.lastCommittedConfiguration(votingConfiguration);
final CoordinationMetaData coordinationMetaData = CoordinationMetaData.builder()
.term(currentState.term())

currentState.term() is 0 here, so I think this isn't required. But I also wonder if it'd be better to say CoordinationMetaData.builder(currentState.coordinationMetaData()) in the line above to be clear that we're updating the existing metadata rather than starting afresh.

@andrershov andrershov Nov 19, 2018

return builder(currentState)
.lastAcceptedConfiguration(votingConfiguration)
.lastCommittedConfiguration(votingConfiguration).build();
.metaData(MetaData.builder()

This discards the metadata in the cluster state (except it copies the coordination metadata over)

new ClusterState.VotingConfiguration(Sets.newHashSet(generateRandomStringArray(10, 10, false))));
metaBuilder.lastAcceptedConfiguration(
new CoordinationMetaData.VotingConfiguration(Sets.newHashSet(generateRandomStringArray(10, 10, false))));
}

Oops I missed the voting tombstones here. Could you add one?

private ClusterState.Builder randomVotingConfiguration(ClusterState clusterState) {
private ClusterState.Builder randomCoordinationMetaData(ClusterState clusterState) {
ClusterState.Builder builder = ClusterState.builder(clusterState);
CoordinationMetaData.Builder metaBuilder = CoordinationMetaData.builder();

You need to put this builder into the other builder, otherwise it's just discarded.
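
To make the point concrete, here is a minimal sketch with simplified stand-in classes (Coordination and State are hypothetical, not the Elasticsearch builders). Values set on an inner builder only take effect once its built result is handed to the outer builder.

```java
final class NestedBuilderSketch {
    /** Stand-in for the inner metadata object. */
    static final class Coordination {
        final long term;

        private Coordination(long term) {
            this.term = term;
        }

        static Builder builder() {
            return new Builder();
        }

        static final class Builder {
            private long term;

            Builder term(long term) {
                this.term = term;
                return this;
            }

            Coordination build() {
                return new Coordination(term);
            }
        }
    }

    /** Stand-in for the outer state object that owns the inner one. */
    static final class State {
        final Coordination coordination;

        private State(Coordination coordination) {
            this.coordination = coordination;
        }

        static Builder builder() {
            return new Builder();
        }

        static final class Builder {
            // Defaults to an empty inner object with term 0.
            private Coordination coordination = Coordination.builder().build();

            Builder coordination(Coordination coordination) {
                this.coordination = coordination;
                return this;
            }

            State build() {
                return new State(coordination);
            }
        }
    }

    static State withoutWiring(long term) {
        State.Builder outer = State.builder();
        Coordination.Builder inner = Coordination.builder().term(term);
        // Bug: inner is never handed to outer, so the term set above is lost.
        return outer.build();
    }

    static State withWiring(long term) {
        State.Builder outer = State.builder();
        Coordination.Builder inner = Coordination.builder().term(term);
        // The inner result is put into the outer builder before building.
        outer.coordination(inner.build());
        return outer.build();
    }
}
```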

}
}
clusterState = builder.incrementVersion().term(randomLong()).build();
clusterState = builder.incrementVersion().build();

This test asserts that various parts of the coordination metadata are equal. I think it should now just assert that the whole thing is equal.

if (in.getVersion().onOrAfter(Version.V_7_0_0)) {
coordinationMetaData = new CoordinationMetaData(in);
} else {
coordinationMetaData = CoordinationMetaData.EMPTY_META_DATA;

Depending on how we plan to do rolling upgrades, this may or may not be safe, and might need backporting. We're not sure yet. Could you leave a // TODO check that this is safe here so we know to revisit it later on?
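
For readers unfamiliar with the pattern, here is a minimal sketch of a version-gated read. It assumes a plain DataInputStream and an integer wire version instead of StreamInput and Version; FIRST_VERSION_WITH_TERM, EMPTY_TERM and the class name are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

final class VersionGatedReadSketch {
    /** Hypothetical first wire version whose senders write the new field. */
    static final int FIRST_VERSION_WITH_TERM = 7;
    /** Default used when the sender is too old to have written the field. */
    static final long EMPTY_TERM = 0L;

    static long readTerm(DataInputStream in, int senderVersion) throws IOException {
        if (senderVersion >= FIRST_VERSION_WITH_TERM) {
            return in.readLong();
        }
        // TODO check that this is safe
        return EMPTY_TERM;
    }

    /** Convenience wrapper for byte arrays; rethrows IOException as an unchecked exception. */
    static long readTerm(byte[] payload, int senderVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            return readTerm(in, senderVersion);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // ByteBuffer writes big-endian by default, which matches DataInputStream.readLong.
        byte[] payload = ByteBuffer.allocate(Long.BYTES).putLong(42L).array();
        System.out.println(readTerm(payload, 7));
        // An older sender wrote nothing, so the default is returned without touching the stream.
        System.out.println(readTerm(new byte[0], 6));
    }
}
```

Whether falling back to an empty default is actually safe depends on the upgrade path, which is the open question the TODO is meant to flag.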

@DaveCTurner

Ah yes and there's another place where we assert the literal JSON representation of the cluster state: org.elasticsearch.xpack.monitoring.collector.cluster.ClusterStatsMonitoringDocTests.testToXContent. That's caught me out before.

@andrershov andrershov removed the request for review from ywelsch November 19, 2018 16:59
@andrershov

@DaveCTurner thanks for the diligent review! I've made the changes you asked for. Ready for the next pass.

@andrershov andrershov mentioned this pull request Nov 19, 2018
6 tasks
@DaveCTurner DaveCTurner left a comment

LGTM, I left three further suggestions.

However I am not too sure about XContent parsing so would like @ywelsch to look over that part specifically.

return (long)termAndConfigs[0];
}

private static VotingConfiguration lastCommittedConfig(Object[] termAndConfig) {

Could you add @SuppressWarnings("unchecked") to this method?
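
A minimal sketch of why the annotation is needed, assuming the parser hands back an Object[] whose second slot holds the list of node IDs (the slot layout and the class name are assumptions for illustration). The cast from Object to List<String> cannot be verified at runtime, so the compiler reports an unchecked warning unless it is suppressed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UncheckedCastSketch {
    /** The term is read from the first slot; casting to a primitive needs no suppression. */
    static long term(Object[] termAndConfigs) {
        return (long) termAndConfigs[0];
    }

    /** Generic type arguments are erased at runtime, so this cast is unchecked. */
    @SuppressWarnings("unchecked")
    static Set<String> lastCommittedConfig(Object[] termAndConfigs) {
        List<String> nodeIds = (List<String>) termAndConfigs[1];
        return new HashSet<>(nodeIds);
    }

    public static void main(String[] args) {
        Object[] parsed = {1L, Arrays.asList("nodeA", "nodeB")};
        System.out.println(term(parsed) + " " + lastCommittedConfig(parsed));
    }
}
```

Scoping the annotation to the one method keeps the suppression as narrow as possible.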

return new VotingConfiguration(new HashSet<>(nodeIds));
}

private static VotingConfiguration lastAcceptedConfig(Object[] termAndConfig) {

Could you add @SuppressWarnings("unchecked") to this method?

public CoordinationMetaData build() {
return new CoordinationMetaData(term, lastCommittedConfiguration, lastAcceptedConfiguration,
Collections.unmodifiableSet(new HashSet<>(votingTombstones)));

Now that I see all this in one place, I see that in VotingConfiguration we copy the set and wrap it in Collections.unmodifiableSet in the constructor whereas here I did it in the builder. Could you move it to within the constructor like with VotingConfiguration?
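
A minimal sketch of the constructor-side variant, using a hypothetical TombstonesSketch class with string IDs in place of the real tombstone type. Copying and wrapping in the constructor means every instance is protected, however it was created.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class TombstonesSketch {
    private final Set<String> votingTombstones;

    TombstonesSketch(Set<String> votingTombstones) {
        // Copy first so later changes to the caller's set are not visible here,
        // then wrap so callers of the getter cannot modify the copy either.
        this.votingTombstones = Collections.unmodifiableSet(new HashSet<>(votingTombstones));
    }

    Set<String> getVotingTombstones() {
        return votingTombstones;
    }

    public static void main(String[] args) {
        Set<String> mutable = new HashSet<>();
        mutable.add("nodeA");
        TombstonesSketch sketch = new TombstonesSketch(mutable);
        mutable.add("nodeB"); // does not affect the copy held by sketch
        System.out.println(sketch.getVotingTombstones().size());
    }
}
```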

@DaveCTurner DaveCTurner requested a review from ywelsch November 20, 2018 15:10
@andrershov

I'm done with the changes. Waiting for @ywelsch to take a look at CoordinationMetaData.toXContent and CoordinationMetaData.fromXContent, as well as MetaData.toXContent and MetaData.fromXContent.

@ywelsch ywelsch left a comment

I've left some smaller comments. I would like for this PR to implement toXContent/fromXContent for the voting tombstones as well.


@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
return builder

Can you add an example to the PR description of what the new toXContent output will look like?

Contributor Author

Done

@ywelsch ywelsch left a comment

LGTM

@andrershov andrershov merged commit a056bd8 into elastic:zen2 Nov 21, 2018
Labels

:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >enhancement v7.0.0-beta1
