
Conversation

@benbromhead

Allow users to use %sql syntax to query Cassandra tables with SparkSQL without having to manually create a CassandraSQLContext.

Expanding on "Add build profile for Spark/Cassandra integration #79"

To enable Cassandra SQL context support, add -Dzeppelin.spark.useCassandraContext=true to the ZEPPELIN_JAVA_OPTS parameter in the conf/zeppelin-env.sh file. Alternatively, you can add this parameter to the Spark interpreter's parameter list in the GUI.

Also supported are the "spark.cassandra.connection.host", "spark.cassandra.auth.username", and "spark.cassandra.auth.password" parameters.
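
For readers following along, here is a minimal sketch of how such a flag could be consumed inside SparkInterpreter. It assumes the spark-cassandra-connector 1.x CassandraSQLContext (which extends SQLContext); the wrapper class and method names are illustrative only and are not part of this PR:

    // Sketch only: selecting the SQL context based on zeppelin.spark.useCassandraContext.
    import java.util.Properties;

    import org.apache.spark.SparkContext;
    import org.apache.spark.sql.SQLContext;
    import org.apache.spark.sql.cassandra.CassandraSQLContext;

    public class CassandraContextSketch {
      static SQLContext createSQLContext(SparkContext sc, Properties property) {
        // The flag arrives either via ZEPPELIN_JAVA_OPTS
        // (-Dzeppelin.spark.useCassandraContext=true) or via the Spark
        // interpreter's parameter list in the GUI.
        boolean useCassandra = Boolean.parseBoolean(
            property.getProperty("zeppelin.spark.useCassandraContext", "false"));
        if (useCassandra) {
          // spark.cassandra.connection.host / spark.cassandra.auth.* are ordinary
          // "spark."-prefixed properties, so they reach the SparkConf the usual way.
          return new CassandraSQLContext(sc);
        }
        return new SQLContext(sc);
      }
    }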

@boneill42

Dude, nice work.

Member

I think line 268 already sets every property that starts with 'spark.' into conf. So isn't this duplicated?

@Leemoonsoo
Member

Thanks for the great contribution!

And how about adding zeppelin.spark.useCassandraContext to the property builder?

  static {
     Interpreter.register(
         "spark",
         "spark",
         SparkInterpreter.class.getName(),
         new InterpreterPropertyBuilder()
             .add("spark.app.name", "Zeppelin", "The name of spark application.")
             ...
             .add("zeppelin.spark.useHiveContext", "true",
                  "Use HiveContext instead of SQLContext if it is true.")
             ...

Both zeppelin.spark.useCassandraContext and zeppelin.spark.useHiveContext can be true or false. In that case, from the user's point of view, it's very unclear what will happen. Do you have an idea of how to handle it?

And is it going to be difficult to add a test for it?
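
For illustration, a minimal sketch of the registration with the extra property added. The "false" default and the description string are assumptions taken from this thread, not the code as merged, and the other properties are elided just as in the quoted snippet:

    static {
      Interpreter.register(
          "spark",
          "spark",
          SparkInterpreter.class.getName(),
          new InterpreterPropertyBuilder()
              .add("spark.app.name", "Zeppelin", "The name of spark application.")
              .add("zeppelin.spark.useHiveContext", "true",
                   "Use HiveContext instead of SQLContext if it is true.")
              .add("zeppelin.spark.useCassandraContext", "false",
                   "Use CassandraSQLContext instead of SQLContext if it is true.")
              // ... remaining existing properties ...
              .build());
    }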

@devsprint

While trying to make Zeppelin work with DSE 4.6, I also integrated this change. I'm able to query the Cassandra database properly. However, if there is an error, the notebook displays a generic error (java.lang.reflect.InvocationTargetException) that doesn't help at all in identifying the root cause.

If zeppelin.spark.useCassandraContext is set to true, then the value of zeppelin.spark.useHiveContext doesn't matter: even if it is set to true, Zeppelin will still bind to the Cassandra context.
I think a small change is needed to make sure that only one of the contexts can be set to true, not both....
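
A minimal sketch of the kind of guard being suggested here (a later commit in this PR describes it as raising an exception when both contexts are enabled at the same time). The exception type, message, and helper names are illustrative only, not the PR's actual code:

    // Sketch only: refuse to start if both context flags are enabled.
    import java.util.Properties;

    final class ContextGuardSketch {
      static void checkContextFlags(Properties property) {
        boolean useHive = Boolean.parseBoolean(
            property.getProperty("zeppelin.spark.useHiveContext", "true"));
        boolean useCassandra = Boolean.parseBoolean(
            property.getProperty("zeppelin.spark.useCassandraContext", "false"));
        if (useHive && useCassandra) {
          throw new IllegalStateException(
              "zeppelin.spark.useHiveContext and zeppelin.spark.useCassandraContext "
              + "cannot both be true; enable at most one of them.");
        }
      }
    }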

@benbromhead
Author

Apologies, I haven't had much bandwidth to address the comments above. After chatting with @Leemoonsoo at the Spark conference the other week, I have a good idea of how to integrate some tests and clean up the above-mentioned issues.

Stay tuned.

@devsprint

@benbromhead let me know if I can help. I have played with it all day long today, using Zeppelin forms and registering custom functions to be applied to the results. Everything works fine so far.

bbromhead added 5 commits July 1, 2015 12:09
…y for all spark.* props

Set default use of Cassandra Context to false
Conflicts:
	spark/src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java
Maven will now only run Cassandra tests when a Cassandra profile is enabled
Add some default values for the CassandraContext
Raise exception if Cassandra and Hive contexts are enabled at the same time
… set, plain old spark should be the default with an opt-in to hive/cassandra/etc ...

Change SparkSqlInterpreterTest to use Hive context as the tests use Hive specific syntax.

Note: This changes the default SparkInterpreter behaviour and those upgrading will need to change their configuration.
@benbromhead
Author

Ok I finally got around to completing some unit tests and changing the default values for the various interpreters.

In this PR I've changed the default behaviour of the SparkInterpreter to use the standard SQLContext rather than the HiveContext. This will potentially break people's notebooks on upgrade if they rely on Hive syntax/functionality rather than Spark SQL syntax without realising it.

@doanduyhai
Contributor

@benbromhead, I'm writing a Cassandra interpreter (connecting Zeppelin directly to C* without going through Spark). Is it worth the effort to have a CassandraSQL interpreter?

@benbromhead
Author

@doanduyhai I think it is... particularly if you want to do things like joins :)

Allow Cassandra-spark-1.1 profile to run Cassandra tests
Apache license for cassandra.cql file
Add extra params to InterpreterContext
Removed wrong syntax test
Member

Is there any good way to keep this value unchanged?
Changing the default value might confuse users. What do you think?

@benbromhead
Author

@Leemoonsoo yeah, I struggled to come up with a good answer for this. On the one hand I think the default behaviour should be to use the standard Spark SQLContext, but changing this will break the default behaviour for users who rely on Hive features.

You will notice that the SQL context tests actually rely on Hive syntax rather than Spark SQL-compliant syntax. My vote would be to go with what is correct early in this project's life, but I know this will be irritating for existing users.

Happy to revert to the previous behaviour.

@Leemoonsoo
Member

zeppelin.spark.useHiveContext was originally 'false' but was changed to 'true'. The reason was to give the same experience as spark-shell, which uses HiveContext by default.

I think changing zeppelin.spark.useHiveContext needs to be handled in a separate issue, if that doesn't break this contribution.

@benbromhead
Author

Ok, makes sense. I've changed it back to use the HiveContext by default.

@Leemoonsoo
Member

+1

@benbromhead
Author

Anything left to do on this before it gets merged?

@falconair

Looks like this pull request was close to getting merged months ago. Any status update on this?

@jongyoul
Member

@Leemoonsoo @benbromhead Do you guys have any update on this? It looks like it's in a pending state.

@Leemoonsoo
Member

Can someone provide additional review?

@bbromhead

The codebase has probably shifted a little since this last passed tests, etc. Do you want me to update and test against the latest master?

@corneadoug
Contributor

@bbromhead @jongyoul @Leemoonsoo any more feedback on changes needed on this PR?

@benbromhead
Author

I've updated this PR in the past and also asked for further interest.

I'm happy to update against the latest master if there is an appetite to actually merge this into the project; it would be great to get feedback either way!

@doanduyhai
Contributor

doanduyhai commented Sep 27, 2016

@benbromhead some remarks before rebasing

  1. maybe upgrade the C* version and the Spark/Cassandra connector version
  2. maybe use Achilles-embedded (see the Cassandra interpreter unit tests) instead of cassandra-unit; the latter is no longer actively maintained
  3. the SparkInterpreter class has changed a lot, so it's probably a good idea to cherry-pick your changes and apply them instead of going through a complete rebase
  4. you've hard-coded the Spark/Cassandra connector dependency directly in spark/pom.xml. I'm not sure that's the cleanest way, because people who don't use Cassandra would still need to pull this dependency. Maybe it's better to create a separate Maven build profile for this. Until now it has been done in spark-dependencies/pom.xml, but since that will be removed soon by @AhyoungRyu (see [ZEPPELIN-1332] Remove spark-dependencies & suggest new way #1339) we'll need to find an alternative

@benbromhead
Author

I'll look to address the remarks. I've also noticed the project now has a formal tracker via Jira. I'll also look to complete the requirements listed in https://zeppelin.apache.org/contribution/contributions.html

@asfgit asfgit closed this in c38a0a0 May 9, 2018
asfgit pushed a commit that referenced this pull request May 9, 2018
close #83
close #86
close #125
close #133
close #139
close #146
close #193
close #203
close #246
close #262
close #264
close #273
close #291
close #299
close #320
close #347
close #389
close #413
close #423
close #543
close #560
close #658
close #670
close #728
close #765
close #777
close #782
close #783
close #812
close #822
close #841
close #843
close #878
close #884
close #918
close #989
close #1076
close #1135
close #1187
close #1231
close #1304
close #1316
close #1361
close #1385
close #1390
close #1414
close #1422
close #1425
close #1447
close #1458
close #1466
close #1485
close #1492
close #1495
close #1497
close #1536
close #1545
close #1561
close #1577
close #1600
close #1603
close #1678
close #1695
close #1739
close #1748
close #1765
close #1767
close #1776
close #1783
close #1799
egorklimov pushed a commit to Tinkoff/zeppelin that referenced this pull request Sep 18, 2019
…w-rest-api to V_1.0.0

* commit 'dfa8bbe35bde8ebcc9f691f48b7fbfd5d3c96142':
  [ZP-329] New rest module for tzeppelin