[ZEPPELIN-1604] Add Neo4j interpreter and Network visualization #1582
<dependency>
  <groupId>org.neo4j.driver</groupId>
  <artifactId>neo4j-java-driver</artifactId>
What is the license for these? Could you add it to the LICENSE file?
@felixcheung this dependency is under the Apache 2 license.
There is another one:
<dependency>
<groupId>org.neo4j.test</groupId>
<artifactId>neo4j-harness</artifactId>
<version>${neo4j.version}</version>
<scope>test</scope>
</dependency>
which is listed under both AGPL 3.0 and GPL 3.0. Which of the two do I have to include in the LICENSE file?
Another question: I must also include the license for sigma.js, right?
Thanks for the support.
Since org.neo4j.test:neo4j-harness has 'test' scope and will not be bundled into the binary package, we don't need to include it in the zeppelin-distribution/src/bin_license/LICENSE file.
Right, sigma.js is going to be bundled in the binary package, so we need to include it in zeppelin-distribution/src/bin_license/LICENSE.
Done!
Thank you for the great contribution. Could you check https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/development/writingzeppelininterpreter.html#contributing-a-new-interpreter-to-zeppelin-releases Specifically,
zeppelin-web/bower.json
Outdated
  "select2": "^4.0.3",
- "github-markdown-css": "^2.4.0"
+ "github-markdown-css": "^2.4.0",
+ "sigma.js": "https://github.com/jacomyal/sigma.js.git#master"
Can we use a specific tag instead of master?
Done!
  SVG,
- NULL
+ NULL,
+ NETWORK
Adding a new interpreter result type means adding a new DisplaySystem, and every display system should be accessible from the interpreter's output. For example, the Table DisplaySystem:
%spark
print(s"""%table
key\tvalue
sun\t100
moon\t20""")
%sh echo -e "%table key\tvalue\nsun\t100\nmoon\t20"
Any interpreter output is serialized as a string into InterpreterResult.msg; if %table is detected, it is deserialized on the front-end side and rendered.
Can we do the same with NETWORK?
Another approach could be to remove the 'NETWORK' result type and describe the node/edge information somehow in TABLE format, letting loadNetworkData() parse it and construct the network.
What do you think?
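The mechanism described above, where a leading `%table` directive in the interpreter's string output selects the display system, can be illustrated with a small sketch. This is not Zeppelin's actual parsing code: the class `DisplayDirective` and its methods are hypothetical names, and the only assumptions taken from the comment are that the directive is the first token and that table cells are tab-separated with one row per line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, for illustration only (not Zeppelin code).
final class DisplayDirective {

    /** Returns the display type named by a leading %directive, or "TEXT" if there is none. */
    static String detectType(String output) {
        String trimmed = output.trim();
        if (!trimmed.startsWith("%")) {
            return "TEXT";
        }
        // The directive is the first whitespace-delimited token after '%'.
        String token = trimmed.substring(1).split("\\s+", 2)[0];
        return token.toUpperCase(Locale.ROOT);
    }

    /** Returns everything after the leading %directive token, or the whole output if there is none. */
    static String body(String output) {
        String trimmed = output.trim();
        if (!trimmed.startsWith("%")) {
            return trimmed;
        }
        String[] parts = trimmed.split("\\s+", 2);
        return parts.length > 1 ? parts[1] : "";
    }

    /** Splits a table body into rows of tab-separated cells. */
    static List<List<String>> parseTable(String body) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : body.split("\n")) {
            rows.add(Arrays.asList(line.split("\t")));
        }
        return rows;
    }
}
```

Applied to the `%sh` example, `detectType` yields `TABLE` and `parseTable(body(...))` yields three rows of two cells each. A `%network` directive would be detected the same way, leaving only the payload parsing to differ.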
Hi,
I updated the code base to handle simple graphs like this:
%spark
print(s"""
%network {
"nodes" : [
{"id" : 1},
{"id" : 2}
],
"edges" : [{"source" : 2, "target" : 1, "id" : 1}]
}
""")
I prefer to represent graphs in JSON format because the goal of this contribution is to support Property Graphs, not just simple graphs. Is this a viable way?
Thanks for the feedback!
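For an interpreter written in Java, emitting the same kind of payload might look like the sketch below. `NetworkPayload` is a hypothetical helper, not part of this PR. It only covers numeric ids, as in the simple-graph example, and a real implementation carrying labels and `data` properties would want a JSON library to handle escaping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical builder for a "%network" payload with numeric ids only.
final class NetworkPayload {
    private final List<Long> nodeIds = new ArrayList<>();
    // Each edge is stored as {id, source, target}.
    private final List<long[]> edges = new ArrayList<>();

    NetworkPayload node(long id) {
        nodeIds.add(id);
        return this;
    }

    NetworkPayload edge(long id, long source, long target) {
        edges.add(new long[] {id, source, target});
        return this;
    }

    /** Renders the graph as a "%network" directive followed by a JSON object. */
    String render() {
        StringJoiner nodes = new StringJoiner(", ", "[", "]");
        for (long id : nodeIds) {
            nodes.add("{\"id\" : " + id + "}");
        }
        StringJoiner rels = new StringJoiner(", ", "[", "]");
        for (long[] e : edges) {
            rels.add("{\"source\" : " + e[1] + ", \"target\" : " + e[2] + ", \"id\" : " + e[0] + "}");
        }
        return "%network {\"nodes\" : " + nodes + ", \"edges\" : " + rels + "}";
    }
}
```

Calling `new NetworkPayload().node(1).node(2).edge(1, 2, 1).render()` produces the two-node, one-edge graph from the example on a single line.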
SGTM
  <button type="button" class="btn btn-default btn-sm"
-   ng-if="paragraph.result.type == 'TABLE'"
+   ng-if="paragraph.result.type == 'TABLE' || paragraph.result.type == 'NETWORK'"
I think all other visualization types (table, bar, line, area, scatter) shouldn't be selectable when the result type is 'NETWORK', since they cannot display it.
The goal of this PR is to let the Zeppelin platform take full advantage of the property graph.
%spark
print(s"""
%network {
"nodes" : [
{"id" : 1, "label" : "User", "data" : { "fullname" : "Andrea Santurbano" }},
{"id" : 2, "label" : "User", "data" : { "fullname" : "Moon soo Lee" }}
],
"edges" : [{"source" : 2, "target" : 1, "id" : 1, "label" : "HELPS", "data" : { "project" : "Zeppelin", "githubUrl": "https://github.com/apache/zeppelin/pull/1582" } }]
}
""")
This type of graph can also be displayed in other ways, such as tabular, donut, etc.
So I think the other visualizations must stay selectable.
Thanks for the feedback!
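To illustrate why a property graph can also feed the tabular views, here is a minimal sketch that flattens nodes into `%table` output. `NodeTable` is a hypothetical name, and each node is assumed to have already been flattened into a string map (`id`, `label`, plus the entries of `data`); the PR's actual conversion may differ.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

// Hypothetical helper: turns flattened node properties into "%table" output.
final class NodeTable {

    static String toTable(List<Map<String, String>> nodes) {
        // Column set is the union of all property keys, in first-seen order.
        Set<String> columns = new LinkedHashSet<>();
        for (Map<String, String> node : nodes) {
            columns.addAll(node.keySet());
        }
        StringBuilder out = new StringBuilder("%table ");
        out.append(String.join("\t", columns));
        for (Map<String, String> node : nodes) {
            StringJoiner row = new StringJoiner("\t");
            for (String col : columns) {
                // Nodes that lack a property get an empty cell.
                row.add(node.getOrDefault(col, ""));
            }
            out.append('\n').append(row);
        }
        return out.toString();
    }
}
```

With the two `User` nodes from the example, this gives a header row of `id`, `label`, `fullname` followed by one row per node, which the existing table and chart views could consume.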
Cool! could you add examples/info on this to DisplaySystem?
Done!
Updated the documentation.
Front-end visualization code has been refactored from #1529. I think that's the biggest source of conflicts this branch has right now. Could you merge the master branch into this branch and resolve the conflicts?
@Leemoonsoo I'll study the code this weekend. I'll let you know if I need help! Thanks!!!
Let's go with (c) & (b) and then (a)?
Sorry for the late response.
Perfect, just to clarify what I have to do:
Am I right?
Right!
@conker84 wow, what a long story. Do you need any help from us (Neo4j)? Or @felixcheung @Leemoonsoo, is there anything we can do?
@jexp any help is appreciated!
@felixcheung @Leemoonsoo I'm ready to make a new PR, but my Travis build fails in one job.
Hi guys, any news?
I think this profile failed due to one of the flaky tests, so just restarting the profile should work.
I can't tell whether it is related to something I did, because I haven't touched zeppelin-server.
Then, clicking
<name>Zeppelin: Neo4j interpreter</name>

<properties>
  <neo4j.driver.version>1.0.4</neo4j.driver.version>
perhaps update to 1.2.0
<properties>
  <neo4j.driver.version>1.0.4</neo4j.driver.version>
  <test.neo4j.kernel.version>3.0.4</test.neo4j.kernel.version>
update to 3.1.2
}
List<String> line = new ArrayList<>();
for (String col : cols) {
  Object value = record.get(col);
Use Value instead.
private void setResultValue(Object value, Set<Node> nodes, Set<Relationship> relationships,
    List<String> line) {
  if (value instanceof NodeValue
Use Type.isTypeOf(value) from http://neo4j.com/docs/api/java-driver/current/org/neo4j/driver/v1/types/TypeSystem.html
Don't rely on internal types or instanceof checks.
    (PathValue) value : (PathValue) ((InternalPath) value).asValue();
  nodes.addAll(Iterables.asList(pathVal.asPath().nodes()));
  relationships.addAll(Iterables.asList(pathVal.asPath().relationships()));
} else if (value instanceof ListValue) {
Also handle Map and map values.
    setResultValue(val, nodes, relationships, line);
  }
} else {
  line.add(String.valueOf(value));
value.asString()
  }
}

private StatementResult execute(Session session, String cypherQuery,
From Neo4j 3.1, $param is also valid parameter syntax.
@Override
public void cancel(InterpreterContext context) {
}
you could abort a running transaction, e.g. with session.reset()
    Map<String, String> graphLabels) {
  Set<String> labels = new LinkedHashSet<>();
  String firstLabel = null;
  for (String label : n.labels()) {
Should they be sorted to ensure consistency?
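One way to get a deterministic order is sketched below, under the assumption that `n.labels()` yields an `Iterable<String>`, as the loop in the hunk suggests. `LabelOrder` and its methods are hypothetical names, and natural string ordering is just one possible choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper: makes label order independent of the order the server returns.
final class LabelOrder {

    /** Returns the distinct labels in natural (lexicographic) order. */
    static List<String> sortedLabels(Iterable<String> labels) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String label : labels) {
            sorted.add(label);
        }
        return new ArrayList<>(sorted);
    }

    /** Returns the first label in sorted order, or null when the node has no labels. */
    static String primaryLabel(Iterable<String> labels) {
        List<String> sorted = sortedLabels(labels);
        return sorted.isEmpty() ? null : sorted.get(0);
    }
}
```

With this, a node labelled `User` and `Admin` always reports `Admin` as its first label, regardless of the order in which the labels arrive.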
public static final String NEO4J_SERVER_URL = "neo4j.url";
public static final String NEO4J_SERVER_USER = "neo4j.user";
public static final String NEO4J_SERVER_PASSWORD = "neo4j.password";
public static final String NEO4J_MAX_CONCURRENCY = "neo4j.max.concurrency";
should this also be used for neo4j-driver session pool size?
@conker84 @felixcheung I looked at the java-driver Neo4j bits and made some comments. In general it looks great and I really like the ideas. If you need anything else from me to make this PR happen, please let me know.
@jexp thanks for the hints. I will integrate them, and when I'm ready I'll ask you for another review!
@felixcheung @Leemoonsoo see #2125
### What is this PR for?
This issue is about a new network visualization that can leverage the Property Graph Model (https://github.com/tinkerpop/gremlin/wiki/Defining-a-Property-Graph), but also simple graphs, in order to provide a set of base APIs that can be used by graph databases (like Neo4j) or graph processing frameworks (like GraphX or Giraph).

### What type of PR is it?
[Feature]

Is related to #1582

### Todos
* [x] - Added the interpreter APIs to manage graphs (under the package **org.apache.zeppelin.interpreter.graph**)
* [x] - Added the frontend APIs to manage graphs (via d3js)

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-2222

### How should this be tested?
You can download [this notebook](https://gist.github.com/conker84/9574127c2389d08164423894aa93b67f) to test the PR

### Screenshots (if appropriate)

### Video

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: conker84 <santand@gmail.com>

Closes #2125 from conker84/master and squashes the following commits:

b6062a0 [conker84] Removed package org.apache.zeppelin.interpreter.graph
e98ca7a [conker84] Comments of review 14/03/2017
b31b7b7 [conker84] Rebase of 07/04/2017
3257bea [conker84] Rebase 30/4
6e74eb9 [conker84] Rebase 30/04
Hi guys,

What is this PR for?
This contribution introduces a Neo4j Cypher interpreter and a new network visualization;
at the same time, it provides base APIs that other graph databases (or graph frameworks such as GraphX or Giraph) can use.
What type of PR is it?
[Feature]
Todos
What is the Jira issue?
[ZEPPELIN-1604]
How should this be tested?
Download and run Neo4j v3.x; you can also pull a Docker image.
In order to execute test cases, if you are running Java 7, you need to also provide an environment variable telling the tests where to find Java 8, because Neo4j-the-database needs it to run.
Use this statement to create a dummy dataset
%neo4j
UNWIND range(1,100) as id
CREATE (p:Person {id:id, name: "Name " + id, age: id % 3})
WITH collect(p) as people
UNWIND people as p1
UNWIND range(1,10) as friend
WITH p1, people[(p1.id + friend) % size(people)] as p2
CREATE (p1)-[:KNOWS {years: abs(p2.id - p1.id)}]->(p2)
Then you can write some simple queries like:
%neo4j
MATCH (p:Person)-[r:KNOWS]-(p1:Person)
RETURN p, r, p1
LIMIT 10;
%neo4j
MATCH (p:Person)-[r:KNOWS]-(p1:Person)
RETURN p.id AS ID_A, p.name AS NAME_A, r.years AS YEARS, p1.id AS ID_B, p1.name AS NAME_B
LIMIT 20;
Screenshots
Video
Questions: