Skip to content

Server-side commit range filtering#1596

Merged
nastra merged 3 commits intoprojectnessie:mainfrom
nastra:server-side-commit-range-filtering
Jul 14, 2021
Merged

Server-side commit range filtering#1596
nastra merged 3 commits intoprojectnessie:mainfrom
nastra:server-side-commit-range-filtering

Conversation

@nastra
Copy link
Contributor

@nastra nastra commented Jul 14, 2021

This change is Reviewable

@nastra nastra marked this pull request as draft July 14, 2021 11:32
@nastra nastra changed the title Server side commit range filtering Server-side commit range filtering Jul 14, 2021
rymurr
rymurr previously approved these changes Jul 14, 2021
@nastra nastra force-pushed the server-side-commit-range-filtering branch from 50663b6 to 229b813 Compare July 14, 2021 11:39
@codecov
Copy link

codecov bot commented Jul 14, 2021

Codecov Report

Merging #1596 (94d2ea4) into main (ae79227) will decrease coverage by 8.97%.
The diff coverage is 98.87%.

Impacted file tree graph

@@             Coverage Diff              @@
##               main    #1596      +/-   ##
============================================
- Coverage     86.61%   77.64%   -8.98%     
- Complexity        0     2251    +2251     
============================================
  Files            13      291     +278     
  Lines           986    13891   +12905     
  Branches        146     1041     +895     
============================================
+ Hits            854    10785    +9931     
- Misses           89     2623    +2534     
- Partials         43      483     +440     
Flag Coverage Δ
java 76.98% <98.75%> (?)
python 86.39% <100.00%> (-0.23%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...va/org/projectnessie/services/rest/StreamUtil.java 90.00% <90.00%> (ø)
...n/java/org/projectnessie/client/ClientTreeApi.java 96.38% <100.00%> (ø)
.../org/projectnessie/api/params/CommitLogParams.java 100.00% <100.00%> (ø)
python/pynessie/cli.py 85.66% <100.00%> (-0.05%) ⬇️
...java/org/projectnessie/jaxrs/AbstractTestRest.java 96.47% <100.00%> (ø)
.../org/projectnessie/services/rest/TreeResource.java 71.10% <100.00%> (ø)
...park/extensions/NessieSparkSessionExtensions.scala 0.00% <0.00%> (ø)
...sie/jaxrs/NessieJaxRsJsonParseExceptionMapper.java 25.00% <0.00%> (ø)
...c/main/java/org/projectnessie/model/Reference.java 100.00% <0.00%> (ø)
...ql/execution/datasources/v2/UseReferenceExec.scala 0.00% <0.00%> (ø)
... and 275 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ae79227...94d2ea4. Read the comment docs.

@nastra nastra requested review from rymurr and snazy July 14, 2021 13:51
@nastra nastra marked this pull request as ready for review July 14, 2021 13:51
@nastra nastra force-pushed the server-side-commit-range-filtering branch from 229b813 to c3add6c Compare July 14, 2021 14:18
@nastra nastra force-pushed the server-side-commit-range-filtering branch from 2f33969 to 94d2ea4 Compare July 14, 2021 15:01
@nastra nastra merged commit b6c7990 into projectnessie:main Jul 14, 2021
@nastra nastra deleted the server-side-commit-range-filtering branch July 14, 2021 17:54
snazy added a commit to snazy/iceberg that referenced this pull request Jul 15, 2021
Apply changes to Iceberg required by API changes in Nessie:
* [Re-introduce wrapper classes for query params of CommitLog/Entries](projectnessie/nessie#1595)
* [Server-side commit range filtering](projectnessie/nessie#1596)
* [Add hashOnRef query param to support time travel on a named ref](projectnessie/nessie#1589)
* [Only accept NamedRefs in REST API](projectnessie/nessie#1583)
snazy added a commit to snazy/iceberg that referenced this pull request Jul 15, 2021
Apply changes to Iceberg required by API changes in Nessie:
* [Re-introduce wrapper classes for query params of CommitLog/Entries](projectnessie/nessie#1595)
* [Server-side commit range filtering](projectnessie/nessie#1596)
* [Add hashOnRef query param to support time travel on a named ref](projectnessie/nessie#1589)
* [Only accept NamedRefs in REST API](projectnessie/nessie#1583)
rymurr pushed a commit to apache/iceberg that referenced this pull request Jul 15, 2021
* Bump Nessie to 0.8.2 + replace Gradle plugin with new JUnit extension

More changes in this PR in following commits.

Replace Gradle plugin with new JUnit extension.
See [Add JAX-RS tests and add JUnit/Jupyter extension](projectnessie/nessie#1566)

* Changes required by Nessie-API changes

Apply changes to Iceberg required by API changes in Nessie:
* [Re-introduce wrapper classes for query params of CommitLog/Entries](projectnessie/nessie#1595)
* [Server-side commit range filtering](projectnessie/nessie#1596)
* [Add hashOnRef query param to support time travel on a named ref](projectnessie/nessie#1589)
* [Only accept NamedRefs in REST API](projectnessie/nessie#1583)

* Bugfix: must send the Contents.id of the existing table

Nessie's `Contents.id` is a random ID generated when the `Contents.Key` is first used (think:
CREATE TABLE) and must not be changed. This change addresses a bug in the Iceberg-Nesie code
that caused a new id for every change.

* Throw `CommitStateUnknownException` for `renameTable` as well

Follow-up of #2515

* Fix race-condition & save one roundtrip to Nessie during "commit"

When commiting a change, the Nessie-API now returns the hash of the commit for the change.
This returned hash should then be used as the "expected hash" for the next commit.

The previous approach was to commit the change to Nessie and then do another request to
retrieve the new hash of HEAD.

This old approach is prone to a race condition, namely when another commit happens after
"this" commit but before retrieving the "new HEAD", so "this" instance would wrongly
ignore the other commit's changes during conflict checks.

See [Let VersionStore.create()+commit() return the current hash](projectnessie/nessie#1089)
minchowang pushed a commit to minchowang/iceberg that referenced this pull request Aug 2, 2021
* Bump Nessie to 0.8.2 + replace Gradle plugin with new JUnit extension

More changes in this PR in following commits.

Replace Gradle plugin with new JUnit extension.
See [Add JAX-RS tests and add JUnit/Jupyter extension](projectnessie/nessie#1566)

* Changes required by Nessie-API changes

Apply changes to Iceberg required by API changes in Nessie:
* [Re-introduce wrapper classes for query params of CommitLog/Entries](projectnessie/nessie#1595)
* [Server-side commit range filtering](projectnessie/nessie#1596)
* [Add hashOnRef query param to support time travel on a named ref](projectnessie/nessie#1589)
* [Only accept NamedRefs in REST API](projectnessie/nessie#1583)

* Bugfix: must send the Contents.id of the existing table

Nessie's `Contents.id` is a random ID generated when the `Contents.Key` is first used (think:
CREATE TABLE) and must not be changed. This change addresses a bug in the Iceberg-Nesie code
that caused a new id for every change.

* Throw `CommitStateUnknownException` for `renameTable` as well

Follow-up of apache#2515

* Fix race-condition & save one roundtrip to Nessie during "commit"

When commiting a change, the Nessie-API now returns the hash of the commit for the change.
This returned hash should then be used as the "expected hash" for the next commit.

The previous approach was to commit the change to Nessie and then do another request to
retrieve the new hash of HEAD.

This old approach is prone to a race condition, namely when another commit happens after
"this" commit but before retrieving the "new HEAD", so "this" instance would wrongly
ignore the other commit's changes during conflict checks.

See [Let VersionStore.create()+commit() return the current hash](projectnessie/nessie#1089)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants