@rymurr rymurr commented Jun 3, 2021

There is still a long way to go, but this is an initial implementation of the SQL commands.

TODO

  • clean up todos
  • add features to NessieCatalog in iceberg
  • unit tests
  • refactor and clean code
  • delta
  • update demo
  • call GC algo

@rymurr rymurr force-pushed the spark-extension branch 2 times, most recently from d266afb to 0c1e118 Compare June 9, 2021 09:23
@rymurr rymurr force-pushed the spark-extension branch 3 times, most recently from 39fede1 to 16bfc0a Compare June 16, 2021 15:24
Member

@snazy snazy left a comment

Did a first pass on this one and left some comments.

;

statement
: CREATE (BRANCH|TAG) identifier (IN catalog=identifier)? (AS reference=identifier)? #nessieCreateRef
Member

I'd prefer to distinguish between named-reference-identifiers and pure-hash-refs and references that can be either.
The current syntax allows something like CREATE BRANCH 012345, which feels a bit odd (like named-refs that can start with a digit).
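
To make the concern concrete, here is a minimal Python sketch that approximates the #nessieCreateRef rule with a regular expression and then classifies the parsed name. The real parser is generated by ANTLR; the `IDENT`, `HASH_RE`, and `NAME_RE` patterns and both function names are assumptions made up for illustration, not Nessie's or Spark's actual rules.

```python
import re

# All patterns below are illustrative assumptions, not Nessie's or Spark's real rules.
IDENT = r"[A-Za-z0-9_]+"                         # loose stand-in for the grammar's `identifier`
HASH_RE = re.compile(r"[0-9a-fA-F]+")            # assumed shape of a pure hash reference
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")   # assumed stricter named reference (no leading digit)

# Hand-written approximation of:
#   CREATE (BRANCH|TAG) identifier (IN catalog=identifier)? (AS reference=identifier)?
CREATE_REF_RE = re.compile(
    rf"\s*CREATE\s+(?P<kind>BRANCH|TAG)\s+(?P<name>{IDENT})"
    rf"(?:\s+IN\s+(?P<catalog>{IDENT}))?"
    rf"(?:\s+AS\s+(?P<reference>{IDENT}))?\s*",
    re.IGNORECASE,
)


def parse_create_ref(sql):
    """Return the parts of a CREATE BRANCH/TAG statement, or None if it does not match."""
    match = CREATE_REF_RE.fullmatch(sql)
    return match.groupdict() if match else None


def classify(token):
    """Classify a token as 'name', 'hash', 'either', or 'invalid' under the assumed patterns."""
    is_name = NAME_RE.fullmatch(token) is not None
    is_hash = HASH_RE.fullmatch(token) is not None
    if is_name and is_hash:
        return "either"
    if is_name:
        return "name"
    if is_hash:
        return "hash"
    return "invalid"
```

Under these assumptions, `parse_create_ref("CREATE BRANCH 012345")` succeeds because the loose identifier accepts a leading digit, while `classify("012345")` reports that the token only looks like a hash. A token such as `deadbeef` falls into the "either" bucket, which is the third case mentioned above.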

Member

Same for identifier (allows a leading digit) used for catalog names.

Contributor Author

I'm not sure what you mean. Do you mean validating with ANTLR that the identifier is a valid reference?

We allow leading digits for branch names, don't we? Spark definitely allows them for catalogs.

@rymurr rymurr force-pushed the spark-extension branch from 3588290 to 58bbee0 Compare June 17, 2021 15:29
@rymurr rymurr marked this pull request as ready for review June 17, 2021 15:29
@rymurr rymurr force-pushed the spark-extension branch from 58bbee0 to 64bc009 Compare June 17, 2021 15:59
codecov bot commented Jun 17, 2021

Codecov Report

Merging #1373 (2c9eac5) into main (3df8f79) will decrease coverage by 3.18%.
The diff coverage is 12.52%.

@@             Coverage Diff              @@
##               main    #1373      +/-   ##
============================================
- Coverage     78.83%   75.64%   -3.19%     
- Complexity     2049     2067      +18     
============================================
  Files           259      281      +22     
  Lines         12023    12646     +623     
  Branches        962     1003      +41     
============================================
+ Hits           9478     9566      +88     
- Misses         2074     2599     +525     
- Partials        471      481      +10     
Flag Coverage Δ
java 74.76% <12.52%> (-3.42%) ⬇️
python 86.32% <ø> (ø)

Impacted Files Coverage Δ
...er/extensions/NessieSparkSqlExtensionsParser.scala 0.00% <0.00%> (ø)
...ser/extensions/NessieSqlExtensionsAstBuilder.scala 0.00% <0.00%> (ø)
...atalyst/plans/logical/AssignReferenceCommand.scala 0.00% <0.00%> (ø)
...atalyst/plans/logical/CreateReferenceCommand.scala 0.00% <0.00%> (ø)
.../catalyst/plans/logical/DropReferenceCommand.scala 0.00% <0.00%> (ø)
.../catalyst/plans/logical/ShowReferenceCommand.scala 0.00% <0.00%> (ø)
...l/catalyst/plans/logical/UseReferenceCommand.scala 0.00% <0.00%> (ø)
...execution/datasources/v2/AssignReferenceExec.scala 0.00% <0.00%> (ø)
...execution/datasources/v2/CreateReferenceExec.scala 0.00% <0.00%> (ø)
...l/execution/datasources/v2/DropReferenceExec.scala 0.00% <0.00%> (ø)
... and 36 more

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3df8f79...2c9eac5. Read the comment docs.

rymurr commented Jun 18, 2021

Note: something is going on with JaCoCo: all of the code in this PR shows 0% coverage, even though the actual coverage is 100%.

Getting:

[WARNING] Execution data for class org/apache/spark/sql/catalyst/parser/extensions/NessieSqlExtensionsParser$NessieCreateRefContext does not match.
[WARNING] Execution data for class org/apache/spark/sql/execution/datasources/v2/CreateReferenceExec does not match.
[WARNING] Execution data for class org/apache/spark/sql/catalyst/parser/extensions/NessieSparkSqlExtensionsParser does no
....

Possibly because of the shaded jar? Or maybe it's Scala?
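
When triaging warnings like these, it can help to collect the affected class names from the build log and check whether they all come from the same module or jar. The following is a minimal Python sketch, assuming the log lines have exactly the format quoted above; the function name `mismatched_classes` is hypothetical.

```python
import re

# Matches the JaCoCo warning format quoted above; the class name is captured in slash form.
WARNING_RE = re.compile(r"Execution data for class (\S+) does not match")


def mismatched_classes(log_text):
    """Return the sorted, de-duplicated class names (dot-separated) that JaCoCo flagged."""
    found = {m.group(1).replace("/", ".") for m in WARNING_RE.finditer(log_text)}
    return sorted(found)
```

Truncated log lines, such as the third warning above, are skipped because they never reach the "does not match" suffix.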

Contributor

@nastra nastra left a comment

overall LGTM, just a few small things

.getCommitLogStream(
nessieClient.getTreeApi,
branch,
CommitLogParams.empty()
Contributor

can we use CommitLogParams.builder().before(..).after(..).build() for now?

Ryan Murray added 5 commits June 28, 2021 14:14
TODO
* clean up todos
* add features to NessieCatalog in iceberg
* unit tests
* refactor and clean code
* update demo
@rymurr rymurr force-pushed the spark-extension branch from ba3c7d3 to 9cf8b50 Compare June 28, 2021 13:51
rymurr commented Jun 28, 2021

Thanks for the review, @nastra and @snazy. This should be ready for another round.

rymurr commented Jun 29, 2021

Addressed all of your comments, @nastra. Hopefully it looks good now.

@rymurr rymurr requested a review from nastra June 29, 2021 12:34
Contributor

@nastra nastra left a comment

LGTM, thanks

@rymurr rymurr merged commit 24ee8ac into projectnessie:main Jul 1, 2021
@rymurr rymurr deleted the spark-extension branch July 1, 2021 11:21