Spark extensions #1373
Conversation
Branch updated from d266afb to 0c1e118.
Branch updated from 39fede1 to 16bfc0a.
snazy left a comment:
Did a first pass on this one and left some comments.
;

statement
    : CREATE (BRANCH|TAG) identifier (IN catalog=identifier)? (AS reference=identifier)? #nessieCreateRef
I'd prefer to distinguish between named-reference identifiers, pure hash refs, and references that can be either.
The current syntax allows something like CREATE BRANCH 012345, which feels a bit odd (a named ref that starts with a digit).
Same for identifier (allows a leading digit) used for catalog names.
I'm not sure what you mean. Do you mean validating w/ ANTLR that the identifier is a valid reference?
We allow leading digits for branch names, don't we? Spark definitely allows it for catalogs.
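To make the concern concrete, here is a small sketch of statements the rule above would accept, assuming a SparkSession with the Nessie SQL extensions enabled; the catalog name `nessie` and the helper function are illustrative, not taken from this PR:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: statements matched by the nessieCreateRef rule above. Assumes a
// SparkSession with the Nessie SQL extensions enabled; the catalog name "nessie"
// is illustrative, not taken from this PR.
def createRefExamples(spark: SparkSession): Unit = {
  spark.sql("CREATE BRANCH dev IN nessie")      // named branch, unambiguous
  spark.sql("CREATE TAG v2 IN nessie AS dev")   // tag cut from a named reference
  spark.sql("CREATE BRANCH 012345 IN nessie")   // the case flagged above: parsed as a
                                                // *named* branch "012345", even though
                                                // it reads like a commit-hash prefix
}
```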
Codecov Report

@@             Coverage Diff             @@
##             main     #1373      +/-   ##
============================================
- Coverage     78.83%   75.64%    -3.19%
- Complexity   2049     2067      +18
============================================
  Files        259      281       +22
  Lines        12023    12646     +623
  Branches     962      1003      +41
============================================
+ Hits         9478     9566      +88
- Misses       2074     2599      +525
- Partials     471      481       +10
Note: something is going on w/ jacoco where all of the code in this PR is showing 0% coverage, even though there is 100%. Getting: Possibly because of the shaded jar? Or maybe it's Scala?
nastra left a comment:
overall LGTM, just a few small things
.getCommitLogStream(
    nessieClient.getTreeApi,
    branch,
    CommitLogParams.empty()
can we use CommitLogParams.builder().before(..).after(..).build() for now?
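If it helps, a rough sketch of what that might look like at the call site. The `.before(..)`/`.after(..)` method names are taken from the comment above; the import path, argument types, and exact semantics are assumptions and may differ across Nessie client versions:

```scala
import org.projectnessie.api.params.CommitLogParams // import path assumed

// Sketch only: the explicit builder suggested above, in place of
// CommitLogParams.empty(). The .before(..)/.after(..) names come from the
// review comment; the hash-string arguments are placeholders and the real
// signatures may differ between Nessie client versions.
val logParams = CommitLogParams.builder()
  .before("<end-hash>")    // assumed: upper bound of the commit-log range
  .after("<start-hash>")   // assumed: lower bound of the commit-log range
  .build()

// which would then be passed where the diff currently uses CommitLogParams.empty():
//   .getCommitLogStream(nessieClient.getTreeApi, branch, logParams)
```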
TODO
* clean up todos
* add features to NessieCatalog in Iceberg
* unit tests
* refactor and clean code
* update demo
Addressed all of your comments @nastra. Hopefully it looks good.
nastra left a comment:
LGTM, thanks
A long way to go yet, but this is an initial implementation of the SQL commands.
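For context, a minimal sketch of how these extensions would typically be wired into a Spark session and exercised; the extensions class name and the USE REFERENCE syntax are assumptions inferred from the module and file names in this PR, not confirmed by it:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: registering the SQL extensions and running one of the new
// statements. The extensions class name and the USE REFERENCE form are
// assumptions based on the file names in this PR.
val spark = SparkSession.builder()
  .master("local[2]")
  .config("spark.sql.extensions",
    "org.projectnessie.spark.extensions.NessieSparkSessionExtensions") // assumed class name
  .getOrCreate()

// Switch the session's Nessie catalog to a named reference
// (syntax inferred from UseReferenceExec.scala; "nessie" is illustrative).
spark.sql("USE REFERENCE dev IN nessie")
```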