Move table-metadata-pointer back to global state, use snapshot,schema,spec,sort-order ids as on-ref state by snazy · Pull Request #2626 · projectnessie/nessie

snazy · 2021-11-09T11:41:19Z

This change basically reverts #2313. It moves the Iceberg table-metadata-pointer back to Nessie's global state and stores the snapshot ID, plus the other relevant IDs (schema-ID, partition-spec-ID, sort-order-ID), in Nessie's on-reference state.

A working version of the Iceberg changes is in this branch: https://github.com/snazy/iceberg/tree/back-to-single-table-metadata
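For orientation, the split described above can be sketched as a small data model. This is a minimal illustration, not Nessie's actual classes: the names `GlobalState`, `OnRefState`, and `resolve`, as well as the example location and IDs, are hypothetical. Only the division of fields (metadata pointer global; snapshot, schema, partition-spec, and sort-order IDs per reference) comes from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalState:
    """Shared by every reference: one current table-metadata pointer per table."""
    metadata_location: str


@dataclass(frozen=True)
class OnRefState:
    """Recorded per Nessie reference/commit: IDs that select entries in the metadata."""
    snapshot_id: int
    schema_id: int
    spec_id: int
    sort_order_id: int


def resolve(global_state, ref_states, ref):
    """Combine the global pointer with the IDs recorded for `ref`."""
    return global_state.metadata_location, ref_states[ref]


# Hypothetical example: two branches share one metadata file but see different snapshots.
shared = GlobalState(metadata_location="s3://bucket/tbl/metadata/00002.metadata.json")
states = {
    "main": OnRefState(snapshot_id=11, schema_id=0, spec_id=0, sort_order_id=0),
    "dev": OnRefState(snapshot_id=12, schema_id=1, spec_id=0, sort_order_id=0),
}
main_view = resolve(shared, states, "main")
dev_view = resolve(shared, states, "dev")
```

Both views point at the same metadata file; only the IDs differ per reference.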

codecov · 2021-11-09T12:10:37Z

Codecov Report

Merging #2626 (529860b) into main (d5a50e0) will increase coverage by 0.02%.
The diff coverage is 95.16%.

@@             Coverage Diff              @@
##               main    #2626      +/-   ##
============================================
+ Coverage     84.63%   84.66%   +0.02%     
  Complexity     1943     1943              
============================================
  Files           270      270              
  Lines         11164    11184      +20     
  Branches        807      807              
============================================
+ Hits           9449     9469      +20     
  Misses         1399     1399              
  Partials        316      316

| Flag | Coverage Δ | |
|---|---|---|
| java | `84.49% <96.49%> (+0.02%)` | ⬆️ |
| javascript | `85.58% <ø> (ø)` | |
| python | `85.81% <80.00%> (+0.03%)` | ⬆️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...sie/versioned/persist/adapter/ContentAndState.java | `80.00% <ø> (ø)` | |
| python/pynessie/model.py | `91.90% <80.00%> (+0.14%)` | ⬆️ |
| ...essie/server/store/TableCommitMetaStoreWorker.java | `75.00% <90.00%> (+2.17%)` | ⬆️ |
| ...ain/java/org/projectnessie/model/IcebergTable.java | `100.00% <100.00%> (ø)` | |
| .../org/projectnessie/jaxrs/AbstractResteasyTest.java | `100.00% <100.00%> (ø)` | |
| ...java/org/projectnessie/jaxrs/AbstractTestRest.java | `95.29% <100.00%> (ø)` | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

ajantha-bhat · 2021-11-09T16:45:47Z

servers/store/src/test/java/org/projectnessie/server/store/TestStoreWorker.java

        ObjectTypes.Content.newBuilder()
            .setId(ID)
-            .setIcebergGlobal(IcebergGlobal.newBuilder().setIdGenerators("xyz"))
+            .setIcebergMetadataPointer(


I think that if we keep this design, the snapshot IDs of all branches will be present in the current metadata pointer of table@ref.

This can cause the following problems:

a) In each branch, each table loses its snapshot history, so time travel within the branch is not possible.

b) In each branch, a table does not know the list of all reachable files, because the snapshot history is lost. So if the table is dropped in that branch, we cannot know which files to clean up for that table@ref.

c) We lose the table evolution history per branch. We cannot know how the schema, sort order, and partition spec evolved for that table@ref.

d) Table properties configured for table@ref are visible and used for the same table in other references as well, which can cause unexpected behaviour for the table in those references.

e) We lose the metadata-log history per branch in a table. The metadata log is used for deleting old metadata files (keeping a fixed maximum number of metadata files at any point in time); we have an automatic action for cleaning up the metadata files.

For example, assume branch_1 has 1 commit and branch_2 has 99 commits, and the number of metadata files to retain is configured as 100. When a new commit happens on branch_2, we delete branch_1's metadata file even though it is still referenced in Nessie branch_1.
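The retention scenario above can be replayed with a small simulation. This is a simplified sketch under stated assumptions: the table has a single shared metadata log, the log keeps at most `max_files` entries, and branch_1's commit is the oldest entry. The `commit` helper and the file names are hypothetical. In Iceberg itself this cleanup is governed by the table properties `write.metadata.delete-after-commit.enabled` and `write.metadata.previous-versions-max`, whose exact counting may differ from this model.

```python
import itertools

# Hypothetical version counter used only to give each metadata file a unique name.
_version = itertools.count(1)


def commit(metadata_log, branch, max_files):
    """Append a new metadata file for `branch`, then trim the oldest entries.

    Returns the entries that were trimmed, i.e. the files that would be deleted.
    """
    metadata_log.append((branch, f"v{next(_version)}.metadata.json"))
    overflow = len(metadata_log) - max_files
    deleted = metadata_log[:overflow] if overflow > 0 else []
    del metadata_log[:len(deleted)]
    return deleted


# One shared log for the table: branch_1 committed once (oldest entry),
# then branch_2 committed 99 times.
log = []
commit(log, "branch_1", max_files=100)
for _ in range(99):
    commit(log, "branch_2", max_files=100)

# The 101st file is written by branch_2; the trimmed file belongs to branch_1.
deleted = commit(log, "branch_2", max_files=100)
```

After the last commit, `deleted` holds branch_1's only metadata file, although Nessie's branch_1 would still reference it.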

> a) In each branch, each table will lose the snapshot history. So, time travel within the branch is not possible.

Time travel would still be possible via Nessie.

> b) In each branch, each table will not know the list of all the reachable files as snapshot history is lost.

Can you elaborate why TableMetadata.snapshotLog() is required for this?

> So, if the table is dropped in that branch, we cannot know which files to clean for that table@ref.

You can still figure that out by traversing the commit-log.

The only situation I can currently imagine is that you create a table on a new branch, add data, and then just drop the branch. But every approach would run into this situation.

> c) we lose table evolution history per branch. We cannot know how schema, sort order, partition spec is evolved for that table@ref.

It's still there via the Nessie commits + the information in the referenced snapshots, right?

> d) Table properties configured for table@ref will be visible and used for the same table in other reference also. which can cause unexpected behaviour in other references for the table.

I don't think that there's an issue with exposing other references' information. Proper isolation would require way more than just thinking about it here.

What would those unexpected behaviors be?

> e) we lose metadata-log history per table in a branch. metadata-log is used for deleting old metadata files (and fixed number of maximum metadata files at given point of time). we have an auto action for cleaning the metadata files.

Why would we lose that metadata-log (TableMetadata.previousFiles())?

There will be exactly one current and valid metadata (file).
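The suggestion to traverse the commit-log can be sketched as follows. This is only an illustration and does not use the Nessie client API: plain dicts stand in for commit-log entries, and the function name, table keys, and snapshot IDs are hypothetical.

```python
def snapshot_ids_on_branch(commit_log, table_key):
    """Collect the distinct snapshot IDs recorded for `table_key` along one branch.

    `commit_log` is ordered newest-first; each entry maps table keys to the
    snapshot ID written by that commit.
    """
    seen = []
    for entry in commit_log:
        snapshot_id = entry.get(table_key)
        if snapshot_id is not None and snapshot_id not in seen:
            seen.append(snapshot_id)
    return seen


# Hypothetical commit logs (newest first) for two branches sharing an ancestor.
main_log = [{"db.tbl": 3}, {"db.other": 7}, {"db.tbl": 1}]
dev_log = [{"db.tbl": 5}, {"db.tbl": 4}, {"db.tbl": 1}]

on_main = snapshot_ids_on_branch(main_log, "db.tbl")
on_dev = snapshot_ids_on_branch(dev_log, "db.tbl")

# Snapshots referenced only from dev: candidates for cleanup if the table
# is dropped on dev and no other reference needs them.
only_on_dev = [s for s in on_dev if s not in on_main]
```

In this model, the per-branch snapshot history comes from Nessie's commits rather than from the snapshot list in the shared table metadata.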

> > b) In each branch, each table will not know the list of all the reachable files as snapshot history is lost.
>
> Can you elaborate why TableMetadata.snapshotLog() is required for this?

I meant Table.snapshots(), which will contain the snapshots of all branches. So when I drop a table on a branch, on the Iceberg side I cannot know which snapshots to delete for that branch.

> You can still figure that out by traversing the commit-log.

I thought Iceberg itself would handle the drop table. Does Iceberg need to block its current implementation and rely on Nessie during drop table?

> > d) Table properties configured for table@ref will be visible and used for the same table in other reference also. which can cause unexpected behaviour in other references for the table.
>
> I don't think that there's an issue with exposing other references' information. Proper isolation would require way more than just thinking about it here.
>
> What would those unexpected behaviors be?

For example, if the table in branch_1 is configured to use write.target-file-size-bytes = 10 MB, the same table in all branches will use that target file size when writing. This may be unexpected for the same table in other branches, as they might have been created for other experiments.

> > e) we lose metadata-log history per table in a branch. metadata-log is used for deleting old metadata files (and fixed number of maximum metadata files at given point of time). we have an auto action for cleaning the metadata files.
>
> Why would we lose that metadata-log (TableMetadata.previousFiles())?
>
> There will be exactly one current and valid metadata (file).

I meant that we lose it per branch, not for the overall table. The table in one ref now contains TableMetadata.previousFiles(), which has metadata files from all branches. So when we add a new commit in one branch, it might clean up another branch's metadata file when this configuration is enabled.

Now assume branch_1 has 1 commit and branch_2 has 99 commits, and the number of metadata files to retain is configured as 100. When a new commit happens on branch_2, we delete branch_1's metadata file even though it is referenced in Nessie branch_1.
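The table-property concern raised under d) can be illustrated with a tiny model. It is a sketch under the assumption that table properties live only in the single global table metadata; the helper names and snapshot IDs are hypothetical, and only the property name `write.target-file-size-bytes` comes from the discussion.

```python
# Hypothetical model: table properties live in the single, global table metadata,
# while each branch only records its own snapshot ID.
global_metadata = {"properties": {}}
branches = {"branch_1": {"snapshot_id": 10}, "branch_2": {"snapshot_id": 20}}


def set_property(metadata, key, value):
    """Set a table property; there is no per-branch copy to update."""
    metadata["properties"][key] = value


def effective_properties(metadata, branch_state):
    """Properties a writer on any branch would see: always the global ones.

    `branch_state` is deliberately unused, because nothing in it overrides properties.
    """
    return dict(metadata["properties"])


# A writer working on branch_1 sets a 10 MB target file size ...
set_property(global_metadata, "write.target-file-size-bytes", str(10 * 1024 * 1024))

# ... and a writer on branch_2 observes the same value.
seen_on_branch_2 = effective_properties(global_metadata, branches["branch_2"])
```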

dimas-b

Code change LGTM, but I'm not sure I understand all the implications for Iceberg to approve 🤔

As a side note, if we keep Iceberg data in opaque form, would it not be easier to just have two properties in the IcebergTable class: globalState and branchedState and delegate interpretation to the Iceberg-side Nessie code?

rymurr · 2021-11-10T10:05:08Z

model/src/main/java/org/projectnessie/model/IcebergTable.java

I think we should put actual properties for each thing here rather than an opaque String (same comment as @dimas-b, I think).

Off the top of my head, I think we should save these as actual properties:

- partition spec id
- schema id
- sequence number

but I may be missing some from the V2 spec.

@rymurr: the sequence number is a global field, so there is no need to keep it in the on_reference_state. We just have to keep fields that are specific to the current branch.

So I think the currently chosen parameters are OK. (If table properties need to be maintained per branch, we can add that as well.)

And I agree that we can have a class called TableCurrentState, which keeps these parameters instead of an opaque string.

> we just have to keep fields that are specific to current branch.

ok cool. agreed

> sequence number is a global field.

I would just like to amend this statement a bit for my own clarity. We have to save anything that can be changed without creating a snapshot. The real risk we face is that there are Iceberg transactions that don't create a snapshot. So global counters (column id, sequence number) are safe to ignore; things like schema and partition id are not.

Additionally, I think we should create a test suite (probably along with the Iceberg fix) to test all these assumptions.
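The rule "save anything that can be changed without creating a snapshot" can be sketched as a small check of the kind such a test suite might contain. This is an assumption-laden illustration: the dicts mimic Iceberg table-metadata JSON using the spec's field names, while `ON_REF_FIELDS`, `on_ref_state`, and the example values are hypothetical.

```python
# Iceberg table-metadata field names treated as per-reference state in this sketch.
ON_REF_FIELDS = (
    "current-snapshot-id",
    "current-schema-id",
    "default-spec-id",
    "default-sort-order-id",
)


def on_ref_state(table_metadata):
    """Pick the per-reference IDs out of a table-metadata dict.

    Raises KeyError if a field is absent, since no defaults are assumed.
    """
    return {field: table_metadata[field] for field in ON_REF_FIELDS}


before = {
    "current-snapshot-id": 42,
    "current-schema-id": 0,
    "default-spec-id": 0,
    "default-sort-order-id": 0,
    "last-sequence-number": 7,  # global counter, not part of on-ref state
    "last-column-id": 3,        # global counter, not part of on-ref state
}

# A schema-only change: new schema ID and column counter, but no new snapshot.
after = {**before, "current-schema-id": 1, "last-column-id": 4}

changed = {
    field
    for field in ON_REF_FIELDS
    if on_ref_state(before)[field] != on_ref_state(after)[field]
}
```

Here the snapshot ID is unchanged, yet the on-reference state differs, which is why tracking the snapshot ID alone would miss the schema change.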

rymurr · 2021-11-10T10:08:36Z

servers/store/src/test/java/org/projectnessie/server/store/TestStoreWorker.java

        ObjectTypes.Content.newBuilder()
            .setId(ID)
-            .setIcebergGlobal(IcebergGlobal.newBuilder().setIdGenerators("xyz"))
+            .setIcebergMetadataPointer(


> Now assume branch_1 has 1 commit and branch_2 has 99 commits. If metadata files to retain is configured as 100. For a new commit happens on branch_2, we delete branch_1's metadata file even though it is referenced in Nessie branch_1.

You are correct, ajantha. We should be setting this property at table create time and running checks to ensure it isn't changed.

> d) Table properties configured for table@ref will be visible and used for the same table in other reference also. which can cause unexpected behaviour in other references for the table

My thinking here is that, for the time being, table properties are global, and when we come across a use case that requires table properties per branch, we can adjust.

rymurr · 2021-11-10T10:13:13Z

servers/store/src/test/java/org/projectnessie/server/store/TestStoreWorker.java

        ObjectTypes.Content.newBuilder()
            .setId(ID)
-            .setIcebergGlobal(IcebergGlobal.newBuilder().setIdGenerators("xyz"))
+            .setIcebergMetadataPointer(


On the topic of time travel and history generally: we are in a weird position currently, as there are two timelines (Iceberg via snapshots and Nessie via commits) and both are accessible via different APIs. We should strive to have one history API for a Nessie table (which uses Nessie's history). So I am not overly worried about the Iceberg timeline and the Iceberg APIs to access it; the Iceberg timeline may be goofy, but the Nessie timeline is the source of truth. We do have to focus on extending Iceberg to ensure that certain API calls are routed to Nessie. These should be generic extension points in the Iceberg metadata handling code and should make sense for all extenders of Iceberg (not just Nessie). Make sense?
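Treating the Nessie timeline as the source of truth, a time-travel lookup could resolve a timestamp against a branch's commits instead of Iceberg's snapshot log. The following is a minimal sketch, not an existing Nessie or Iceberg API: `snapshot_as_of` and the commit tuples are hypothetical, and commit times are plain integers for brevity.

```python
from bisect import bisect_right


def snapshot_as_of(commits, timestamp):
    """Return the snapshot ID recorded by the latest commit at or before `timestamp`.

    `commits` is a list of (commit_time, snapshot_id) tuples sorted by commit_time.
    Returns None if the table had no commit on this branch yet.
    """
    times = [commit_time for commit_time, _ in commits]
    index = bisect_right(times, timestamp)
    return commits[index - 1][1] if index > 0 else None


# Hypothetical branch history as (commit_time, snapshot_id) pairs.
branch_commits = [(100, 1), (200, 4), (300, 5)]

at_250 = snapshot_as_of(branch_commits, 250)
at_300 = snapshot_as_of(branch_commits, 300)
before_first = snapshot_as_of(branch_commits, 50)
```

Because the lookup only sees commits on one branch, snapshots written on other branches never show up in the result, regardless of what the shared table metadata lists.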

servers/store/src/main/java/org/projectnessie/server/store/TableCommitMetaStoreWorker.java

ajantha-bhat · 2021-11-10T16:24:22Z

servers/store/src/main/proto/table.proto


-message IcebergGlobal {
-  string id_generators = 1;
+message IcebergRefState {


In the ContentApi descriptions and some other places in the code, it is still mentioned that the on-ref state is just the snapshot ID. Can we change that as well?
Please search for "Iceberg: snapshot-ID" in the files.

dimas-b

LGTM

ajantha-bhat

one small nit comment.

overall LGTM from my side.

ajantha-bhat · 2021-11-11T04:46:01Z

model/src/main/java/org/projectnessie/api/ContentApi.java

   * object, that contains the most up-to-date part for the globally tracked part (Iceberg:
-   * table-metadata) plus the per-Nessie-reference/hash specific part (Iceberg: snapshot-ID).
+   * table-metadata) plus the per-Nessie-reference/hash specific part (Iceberg: snapshot-ID,
+   * schema-ID, partition-spec-ID, default-sort-order-ID).


nit: In the descriptions, if we are naming this variable (default-sort-order-ID) based on the Iceberg spec, then the other three variables should be default-spec-id, current-schema-id, and current-snapshot-id.

Or maybe, instead of mentioning each field, we just link it to IcebergRefState.

rymurr · 2021-11-11T08:29:02Z

model/src/main/java/org/projectnessie/api/ContentApi.java

   * object, that contains the most up-to-date part for the globally tracked part (Iceberg:
-   * table-metadata) plus the per-Nessie-reference/hash specific part (Iceberg: snapshot-ID).
+   * table-metadata) plus the per-Nessie-reference/hash specific part (Iceberg: snapshot-ID,
+   * schema-ID, partition-spec-ID, default-sort-order-ID).


nit: I think ID should be id

rymurr · 2021-11-11T08:29:46Z

model/src/main/java/org/projectnessie/model/IcebergTable.java


-  /** Opaque representation of Iceberg's {@code TableIdGenerators}. */
-  public abstract String getIdGenerators();
+  public abstract long getSnapshotId();


Do these need default values, particularly the schema/spec ID etc.? Or do they have to be specified every time?

I don't think those should have default values. It uses what Iceberg's TableMetadata says and does not rely on a default value.

python/pynessie/model.py

…,spec,sort-order ids as on-ref state This change basically reverts projectnessie#2313. It moves the Iceberg table-metadata-pointer back to Nessie's global state and snapshot id plus other relevant ids (schema-id, partition-spec-id, sort-order-id) to Nessie's on-reference state.

snazy force-pushed the back-to-single-table-metadata branch from c6dd1cb to 02cbf25 Compare November 9, 2021 12:33

ajantha-bhat reviewed Nov 9, 2021

dimas-b reviewed Nov 9, 2021

rymurr reviewed Nov 10, 2021

snazy force-pushed the back-to-single-table-metadata branch from 02cbf25 to 707d690 Compare November 10, 2021 15:17

dimas-b reviewed Nov 10, 2021

ajantha-bhat reviewed Nov 10, 2021

snazy force-pushed the back-to-single-table-metadata branch from 707d690 to 74b7afb Compare November 10, 2021 16:41

dimas-b previously approved these changes Nov 10, 2021

ajantha-bhat reviewed Nov 11, 2021

rymurr reviewed Nov 11, 2021

snazy dismissed dimas-b’s stale review via c73e21e November 11, 2021 09:35

snazy force-pushed the back-to-single-table-metadata branch from fced329 to c73e21e Compare November 11, 2021 09:35

snazy changed the title ~~Move table-metadata-pointer back to global state, use snapshot+schema IDs as on-ref state~~ Move table-metadata-pointer back to global state, use snapshot,schema,spec,sort-order ids as on-ref state Nov 11, 2021

snazy added 4 commits November 11, 2021 15:51

review updates

c5d686e

review comments

cf082c2

re-record after merge-conflict

529860b

snazy force-pushed the back-to-single-table-metadata branch from c73e21e to 529860b Compare November 11, 2021 14:54

ajantha-bhat approved these changes Nov 11, 2021

snazy merged commit 2b6b2c3 into projectnessie:main Nov 12, 2021

snazy deleted the back-to-single-table-metadata branch November 12, 2021 09:19

This was referenced Nov 15, 2021

UI - show table/view detail #1996 #2547

Merged

New global state is breaking the Alter table scenarios #2670

Closed
