Initial support for CassandraSQLContext for %sql queries #86
Conversation
|
Dude, nice work. |
I think line 268 already sets into conf every property that starts with 'spark.'. So isn't this duplicated?
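For context, "setting all spark.* properties into conf" presumably means a loop along these lines (illustrative only, not the code under review), which is why copying the Cassandra connection properties a second time would be redundant:

```java
import java.util.Properties;
import org.apache.spark.SparkConf;

public class SparkPropsSketch {
  // Copy every property whose key starts with "spark." into the SparkConf.
  public static SparkConf copySparkProperties(Properties props, SparkConf conf) {
    for (String key : props.stringPropertyNames()) {
      if (key.startsWith("spark.")) {
        conf.set(key, props.getProperty(key));
      }
    }
    return conf;
  }
}
```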
|
Thanks for the great contribution! And how about adding both? Is it going to be difficult to add a test for it? |
|
While trying to make Zeppelin work with DSE 4.6 I have also integrated this change. I'm able to query the Cassandra db properly. However, if there is an error, the notebook displays a generic error (java.lang.reflect.InvocationTargetException) that doesn't help at all in identifying the root cause. If zeppelin.spark.useCassandraContext is set to true, then the value of zeppelin.spark.useHiveContext doesn't matter: even if it is set to true, it will still bind to the Cassandra context. |
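To make the precedence reported above concrete, here is a minimal sketch (not the actual patch), assuming the Spark 1.x and spark-cassandra-connector APIs of the time; the boolean flags stand in for the zeppelin.spark.* properties:

```java
import org.apache.spark.SparkContext;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.cassandra.CassandraSQLContext;
import org.apache.spark.sql.hive.HiveContext;

public class ContextSelectionSketch {
  // Mirrors the behaviour described in the comment above: the Cassandra flag is
  // checked first, so useHiveContext is ignored whenever useCassandraContext is true.
  public static SQLContext select(SparkContext sc, boolean useCassandraContext,
                                  boolean useHiveContext) {
    if (useCassandraContext) {
      return new CassandraSQLContext(sc); // Hive flag never consulted
    }
    if (useHiveContext) {
      return new HiveContext(sc);
    }
    return new SQLContext(sc);
  }
}
```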
|
Apologies, I haven't had a whole heap of bandwidth to address the comments above. After chatting with @Leemoonsoo at the Spark conference the other week I have a good idea about integrating some tests and cleaning up the above-mentioned issues. Stay tuned. |
|
@benbromhead let me know if I can help. I have played with it all day long today, using Zeppelin forms and registering custom functions to be applied on results. Everything works fine so far. |
…y for all spark.* props
Set default use of Cassandra Context to false
Conflicts: spark/src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java
Maven will now only run Cassandra tests when a Cassandra profile is enabled
Add some default values for the CassandraContext
Raise exception if Cassandra and Hive contexts are enabled at the same time
… set, plain old spark should be the default with an opt-in to hive/cassandra/etc ...
Change SparkSqlInterpreterTest to use Hive context as the tests use Hive-specific syntax.
Note: This changes the default SparkInterpreter behaviour and those upgrading will need to change their configuration.
|
Ok, I finally got around to completing some unit tests and changing the default values for the various interpreters. In this PR I've changed the default behaviour of the SparkInterpreter to use the standard SQLContext rather than the HiveContext. This will potentially break people's notebooks on upgrade if they are relying on Hive syntax/functionality rather than Spark SQL syntax without realising it. |
…into cassandraSQLsupport
|
@benbromhead, I'm writing a Cassandra interpreter (connecting Zeppelin directly to C* without going through Spark). Is it worth the effort to have a CassandraSQL interpreter? |
|
@doanduyhai I think it is... particularly if you want to do things like joins :) |
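As an illustration of the join use case, a %sql paragraph bound to the CassandraSQLContext could look like the sketch below. The keyspace (shop) and tables (customers, orders) are hypothetical; only the %sql binding and keyspace.table naming come from this PR.

```
%sql
-- Hypothetical tables; assumes %sql is bound to a CassandraSQLContext
SELECT c.name, o.total
FROM shop.customers c
JOIN shop.orders o ON c.id = o.customer_id
```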
Allow Cassandra-spark-1.1 profile to run Cassandra tests
Apache license for cassandra.cql file
Add extra params to InterpreterContext
Removed wrong syntax test
Is there any good way to keep this value unchanged?
Changing the default value might confuse users. What do you think?
|
@Leemoonsoo yeah, I struggled to come up with a good answer for this. On the one hand I think the default behaviour should be to use the standard Spark SQLContext, but changing this will break the default behaviour for users who rely on Hive features. You will notice that the SQL context tests actually rely on Hive syntax rather than Spark SQL-compliant syntax. My vote would be to go with what is correct early on in this project's life, but I know this will be irritating for existing users. Happy to revert to the previous behaviour. |
|
I think changing |
|
Ok makes sense, changed back to use the HiveContext by default |
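For readers skimming the thread, the defaults agreed at this point come out roughly as follows (a sketch using the property names from this PR; the merged interpreter settings are authoritative):

```
zeppelin.spark.useHiveContext = true        # reverted to the original default, per the discussion above
zeppelin.spark.useCassandraContext = false  # the Cassandra SQL context stays opt-in
```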
|
+1 |
|
Anything left to do on this before it gets merged? |
|
Looks like this request was close to getting merged months ago, any status update on this? |
|
@Leemoonsoo @benbromhead Do you guys have any update on this? It looks like it's in a pending state. |
|
Can someone provide additional review? |
|
The codebase has probably shifted a little bit since this last passed tests etc. Do you want me to update and test against the latest master? |
|
@bbromhead @jongyoul @Leemoonsoo any more feedback on changes needed on this PR? |
|
I've updated this PR in the past and also asked whether there is further interest. I'm happy to update against the latest if there is an appetite to actually merge this into the project; it would be great to get feedback either way! |
|
@benbromhead some remarks before rebasing
|
|
I'll look to address the remarks. I've also noticed the project now has a formal tracker via Jira. I'll also look to complete the requirements listed in https://zeppelin.apache.org/contribution/contributions.html |
close #83 close #86 close #125 close #133 close #139 close #146 close #193 close #203 close #246 close #262 close #264 close #273 close #291 close #299 close #320 close #347 close #389 close #413 close #423 close #543 close #560 close #658 close #670 close #728 close #765 close #777 close #782 close #783 close #812 close #822 close #841 close #843 close #878 close #884 close #918 close #989 close #1076 close #1135 close #1187 close #1231 close #1304 close #1316 close #1361 close #1385 close #1390 close #1414 close #1422 close #1425 close #1447 close #1458 close #1466 close #1485 close #1492 close #1495 close #1497 close #1536 close #1545 close #1561 close #1577 close #1600 close #1603 close #1678 close #1695 close #1739 close #1748 close #1765 close #1767 close #1776 close #1783 close #1799
…w-rest-api to V_1.0.0 * commit 'dfa8bbe35bde8ebcc9f691f48b7fbfd5d3c96142': [ZP-329] New rest module for tzeppelin
Allow users to use %sql syntax to query Cassandra tables with SparkSQL without having to manually create a CassandraSQLContext.
Expanding on "Add build profile for Spark/Cassandra integration #79"
To enable Cassandra SQL context support, add -Dzeppelin.spark.useCassandraContext=true to the ZEPPELIN_JAVA_OPTS parameter in the conf/zeppelin-env.sh file. Alternatively, you can add this parameter to the parameter list of the Spark interpreter in the GUI.
Also supported are "spark.cassandra.connection.host", "spark.cassandra.auth.username" and "spark.cassandra.auth.password" parameters.
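Putting the description above together, a minimal conf/zeppelin-env.sh sketch might look like this; only the property names come from this PR, while the host and credential values are placeholders:

```
# Enable the CassandraSQLContext for %sql and point it at a Cassandra node.
# The values below are examples only.
export ZEPPELIN_JAVA_OPTS="-Dzeppelin.spark.useCassandraContext=true \
  -Dspark.cassandra.connection.host=127.0.0.1 \
  -Dspark.cassandra.auth.username=cassandra \
  -Dspark.cassandra.auth.password=cassandra"
```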