Skip to content

Conversation

@xunliu
Copy link
Member

@xunliu xunliu commented Sep 19, 2018

What is this PR for?

By using the Raft protocol, multiple Zeppelin-Server groups are built into a Zeppelin cluster, the cluster State Machine is maintained through the Raft protocol, and the services in the cluster are agreed upon. The Zeppelin-Server and Zeppelin-Interperter services and processes are stored in the Cluster MetaData. Metadata information;

What type of PR is it?

[Feature]

Todos

  • add raft algorithm atomix jar
  • add cluster state machine
  • add state machine query command
  • add state machine delete command
  • add state machine put command
  • Isolate the netty JAR package introduced by atomix

What is the Jira issue?

How should this be tested?

CI pass

Screenshots (if appropriate)

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? Yes

<version>${atomix.version}</version>
<exclusions>
<exclusion>
<groupId>io.netty</groupId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the netty version of io.atomic and instead use another version of netty ?

Copy link
Member Author

@xunliu xunliu Sep 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If don't modify the dependencies, compiling the zengine module will fail, I won't know how to fix it.

[INFO] Zeppelin: Zengine .................................. FAILURE [ 3.343 s]
[INFO] Zeppelin: Server 0.9.0-SNAPSHOT .................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 05:27 min
[INFO] Finished at: 2018-09-20T10:05:32+08:00
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "hadoop-2.7" could not be activated because it does not exist.
[WARNING] The requested profile "yarn" could not be activated because it does not exist.
[WARNING] The requested profile "pyspark" could not be activated because it does not exist.
[WARNING] The requested profile "sparkr" could not be activated because it does not exist.
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.3.1:enforce (enforce) on project zeppelin-zengine: org.apache.maven.plugins.enforcer.DependencyConvergence failed with message:
[ERROR] Failed while enforcing releasability the error(s) are [
[ERROR] Dependency convergence error for com.google.guava:guava:22.0 paths to dependency are:
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-org.apache.zeppelin:zeppelin-interpreter:0.9.0-SNAPSHOT
[ERROR] +-io.atomix:atomix:3.0.0-rc4
[ERROR] +-io.atomix:atomix-utils:3.0.0-rc4
[ERROR] +-com.google.guava:guava:22.0
[ERROR] and
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-com.google.guava:guava:20.0
[ERROR] ,
[ERROR] Dependency convergence error for org.apache.commons:commons-lang3:3.7 paths to dependency are:
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-org.apache.zeppelin:zeppelin-interpreter:0.9.0-SNAPSHOT
[ERROR] +-io.atomix:atomix:3.0.0-rc4
[ERROR] +-io.atomix:atomix-utils:3.0.0-rc4
[ERROR] +-org.apache.commons:commons-lang3:3.7
[ERROR] and
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-org.apache.commons:commons-lang3:3.4
[ERROR] ,
[ERROR] Dependency convergence error for io.netty:netty-handler:4.1.27.Final paths to dependency are:
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-org.apache.zeppelin:zeppelin-interpreter:0.9.0-SNAPSHOT
[ERROR] +-io.atomix:atomix:3.0.0-rc4
[ERROR] +-io.atomix:atomix-cluster:3.0.0-rc4
[ERROR] +-io.netty:netty-handler:4.1.27.Final
[ERROR] and
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-org.apache.zeppelin:spark-interpreter:0.9.0-SNAPSHOT
[ERROR] +-org.apache.zeppelin:zeppelin-python:0.9.0-SNAPSHOT
[ERROR] +-io.grpc:grpc-netty:1.4.0
[ERROR] +-io.netty:netty-codec-http2:4.1.11.Final
[ERROR] +-io.netty:netty-handler:4.1.11.Final
[ERROR] ,
[ERROR] Dependency convergence error for io.netty:netty-transport:4.1.27.Final paths to dependency are:
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-org.apache.zeppelin:zeppelin-interpreter:0.9.0-SNAPSHOT
[ERROR] +-io.atomix:atomix:3.0.0-rc4
[ERROR] +-io.atomix:atomix-cluster:3.0.0-rc4
[ERROR] +-io.netty:netty-transport:4.1.27.Final
[ERROR] and
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-org.apache.zeppelin:zeppelin-interpreter:0.9.0-SNAPSHOT
[ERROR] +-io.atomix:atomix:3.0.0-rc4
[ERROR] +-io.atomix:atomix-cluster:3.0.0-rc4
[ERROR] +-io.netty:netty-codec:4.1.27.Final
[ERROR] +-io.netty:netty-transport:4.1.27.Final
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.9.0-SNAPSHOT
[ERROR] +-org.apache.zeppelin:zeppelin-interpreter:0.9.0-SNAPSHOT
[ERROR] +-io.atomix:atomix:3.0.0-rc4
[ERROR] +-io.atomix:atomix-cluster:3.0.0-rc4
[ERROR] +-io.netty:netty-transport-native-epoll:4.1.27.Final
[ERROR] +-io.netty:netty-transport:4.1.27.Final
[ERROR] ]
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :zeppelin-zengine

import java.util.function.Consumer;

/**
* Broadcast Service Adapter
Copy link
Contributor

@zjffdu zjffdu Sep 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add more doc to explain this class's responsibility ? e.g. How does it communicate with other components.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add more doc about who broadcast message to whom

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is added in line 25, So it didn't show up.
The next few places that need to be modified are also the problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean to be more specific on which node this service should run. Overall there's 2 kinds of nodes. Zeppelin Server nodes and Interpreter Process node. And regarding the broadcast method, who broadcast message to whom

* 3. Cluster monitoring
*/
public abstract class ClusterManager {
private static Logger logger = LoggerFactory.getLogger(ClusterManager.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger --> LOGGER
This is just code convention zeppelin use. Static fields should use uppercase

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

protected LocalRaftProtocolFactory protocolFactory
= new LocalRaftProtocolFactory(protocolSerializer);
protected List<MessagingService> messagingServices = new ArrayList<>();
protected Collection<MemberId> clusterMemberIds = new ArrayList<MemberId>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collection --> List (to make code style consistent)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
} catch (UnknownHostException e) {
e.printStackTrace();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use LOGGER.error instead of e.printStackTrace

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

e.printStackTrace();
} catch (SocketException e) {
e.printStackTrace();
return;
Copy link
Contributor

@zjffdu zjffdu Sep 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return is not necessary for constructor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

raftClient.close().get(3, TimeUnit.SECONDS);
}
} catch (InterruptedException e) {
e.printStackTrace();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

} catch (ExecutionException e) {
e.printStackTrace();
} catch (TimeoutException e) {
e.printStackTrace();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

import io.atomix.primitive.service.ServiceConfig;

/**
* Cluster primitive type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need more doc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

import java.util.Map;

/**
* Cluster State Machine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need more doc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

import org.slf4j.LoggerFactory;

/**
* Zeppelin ClusterMembershipEventListener
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need more doc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

import java.util.function.Function;

/**
* Cluster Raft client protocol.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need more docs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


@Test
public void testColumnAliasQuery() throws IOException {
Properties properties = new Properties();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something must be wrong, this is not supposed to be in your commit. It is someone else's commit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the code I submitted, I did not modify this, is it caused by Git's bug?

By using the Raft protocol, multiple Zeppelin-Server groups are built into a Zeppelin cluster, the cluster State Machine is maintained through the Raft protocol, and the services in the cluster are agreed upon. The Zeppelin-Server and Zeppelin-Interperter services and processes are stored in the Cluster MetaData. Metadata information;

[Feature]

* [x] add raft algorithm atomix jar
* [x] add cluster state machine
* [x] add state machine query command
* [x] add state machine delete command
* [x] add state machine put command
* [x] Isolate the netty JAR package introduced by atomix

* https://issues.apache.org/jira/browse/ZEPPELIN-3610

CI pass

* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes
@xunliu
Copy link
Member Author

xunliu commented Sep 26, 2018

Has passed the CI

<!--<configuration>-->
<!--<skip>true</skip>-->
<!--</configuration>-->
<!--</plugin>-->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you uncomment this ? It is my mistake in my PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not modified this pom.xml.

Copy link
Contributor

@zjffdu zjffdu Sep 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to my PR which you merged

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

* 1. RaftClient as the raft client
* 2. Threading to provide retry after cluster metadata submission failure
* 3. Cluster monitoring
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this class run ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClusterManager.java is the base class for cluster management.
My next PR will submit ClusterManagerClient.java and ClusterManagerServer.java that are extends from ClusterManager.java.

@zjffdu
Copy link
Contributor

zjffdu commented Sep 27, 2018

LGTM, will merge if no more comments

@asfgit asfgit closed this in f28e963 Sep 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants