-
Notifications
You must be signed in to change notification settings - Fork 3k
Update TableTestBase tests to run with formats v1 and v2 #936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| manifestEntry(Status.EXISTING, null, newFile(10)), | ||
| manifestEntry(Status.EXISTING, null, newFile(1)), | ||
| manifestEntry(Status.DELETED, null, newFile(5)), | ||
| manifestEntry(Status.DELETED, null, newFile(2))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needed to change because it isn't possible to write entries to a manifest that inherits a sequence number but not snapshot ID, unless the snapshot ID matches the commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever we are rewriting manifests in the rewrite manifests action, we create a manifest with snapshot id = null but each entry is assigned a valid snapshot id. Will that continue to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The problem here was that entries had a null snapshot ID, but they shouldn't for rewrites.
| private ManifestEntry wrapped = null; | ||
|
|
||
| IndexedManifestEntry(long commitSnapshotId, Types.StructType partitionType) { | ||
| IndexedManifestEntry(Long commitSnapshotId, Types.StructType partitionType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copied manifests will use a null snapshot ID.
|
Let me go through this now. |
| public static ManifestWriter write(PartitionSpec spec, OutputFile outputFile) { | ||
| // always use a v1 writer for appended manifests because sequence number must be inherited | ||
| return write(1, spec, outputFile, null); | ||
| // always use a v2 writer to preserve sequence numbers, but use null for sequence number so appends inherit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean manifests will be written with the v2 schema (i.e. with sequence numbers) even though TableMetadata is v1 and the manifest list is written with v1? And this should work because we do a projection on read and sequence number is optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the constructor used to create an append or rewritten manifest, so this makes the manifests that are passed into FastAppend, MergeAppend, and RewriteManifests support sequence numbers. It is needed for the last case: when rewriting a manifest for a v2 table, we need to preserve sequence numbers.
This works for v1 because v1 tables will ignore the new fields.
|
+1, ran the tests locally. |
This updates all of the tests that extend
TableTestBaseto run with aformatVersionparameter. This checks that operations work in both v1 and v2. This doesn't add any tests that run operations in v1 and upgrade the table to v2. Those will be added in another PR.