Skip to content

Conversation

@rdblue
Copy link
Contributor

@rdblue rdblue commented Apr 11, 2020

This separates the write path for manifests into v1 and v2 using the same approach as #907. It also adds sequence number to the v2 write path.

This also adds more checks in v2 writers to validate when sequence numbers are added. When writing ManifestFile to a manifest list, this checks that the snapshot ID of the manifest matches the current write's snapshot ID.

@rdblue rdblue force-pushed the v2-manifests branch 2 times, most recently from 8a544d8 to eaf7871 Compare April 11, 2020 16:21
ManifestEntry wrapAppend(Long newSnapshotId, DataFile newFile) {
this.status = Status.ADDED;
this.snapshotId = newSnapshotId;
this.sequenceNumber = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The append/delete operations won't attach the sequence number to its entries ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a file is appended, it must have the sequence number of the snapshot where it is committed. The problem is that the sequence number isn't determined until the snapshot's commit is successful. Two committers may be racing to add a commit with the same sequence number.

That's why sequence numbers are assigned initially through inheritance. Every snapshot commit attempt writes a new root manifest list with a new sequence number based on the table metadata's last sequence number. If two writers are trying to commit different snapshots as sequence number 5, one will win and the other will retry with 6. To avoid rewriting the metadata tree below the manifest list, sequence numbers that might change are inherited instead of written into the initial files.

return specId;
case 3:
return snapshotId;
return sequenceNumber;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we're changing the position of snapshotId/addedFilesCount/ ... etc, will it break the compatibility ? I mean the new code couldn't read the old data manifest file ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the order won't break compatibility. The requirement here is that the order of fields in these classes must match the order of fields in the schema. That's why this commit freezes the v1 schema and an IndexedWriter for v1. That way, we can write exactly what we would have before for v1, but make changes for v2.

if (manifest.snapshotId() != null) {
return new BaseInheritableMetadata(manifest.snapshotId(), manifest.sequenceNumber());
} else {
return NOOP;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure in which case we will encounter the snapshot == null, mind to explain the case ? Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snapshot id is null when a manifest is appended to a table and the snapshot id hasn't been assigned yet.

If the manifest is committed to table metadata, then it will be set when writing and will always be present. That's why it is required in the v2 schema for manifest list files.

Some appended manifests are rewritten before committing. When reading those to rewrite them, this path is used.

I think I'm going to update how this works, but as a separate commit. The problem is that this allows reading a manifest without filling in the snapshot id. But the rewrite method where the reader that is configured this was is used has the snapshot id. So we should add the snapshot id to the ManifestFile and then read to avoid the special case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of questions here:

  • Now added_snapshot_id is optional. Do we plan to make it required once we start assigning the snapshot in ManifestListWriter?
  • We cannot actually remove InheritableMetadata as there might be old manifests where added_snapshot_id is null, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added_snapshot_id is optional because we used to store manifests as a list in the metadata.json file. V2 changes it to required because we always write a manifest-list with the inherited snapshot id and sequence number.

My comment about changing this wasn't about removing InheritableMetadata. It was about the path that rewrites the manifest and uses the no-op InheritableMetadata. I think that we can simplify this by making sure we set the snapshot ID on the ManifestFile before rewriting. That way any rewrite has the snapshot ID and we don't need a special case here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked about InheritableMetadata as I wondered myself whether we can remove that completely. Seems not.

Yeah, I like the idea of setting the snapshot id in ManifestFile before rewriting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into removing it entirely, although I don't mind it very much. As long as it is baked into the reader and unavoidable, it seems like a clean way to apply the inherited values.

/**
* @return the lowest sequence number of any data file in the manifest
*/
long minSequenceNumber();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need to define this min sequence number ? for manifest file filtering purpose in read path ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is additional information for job planning and metadata maintenance.

This may change, but the initial idea is that if we have the min sequence number of a manifest, we can use that to filter other manifests for planning. For example, if I have a manifest with delete files that was written at sequence number 14, but only manifests with data files written at sequence number 16 and later, I can ignore the manifest with delete files.

We can do something similar when maintaining metadata. If we detect that a delete file is older than all data files then we can remove it.

@rdblue rdblue changed the title Add v2 manifests (WIP) Add v2 manifests Apr 15, 2020
@rdblue
Copy link
Contributor Author

rdblue commented Apr 15, 2020

I'll update this after #925 is in. I extracted those changes out of this one to make this smaller.

@rdblue rdblue force-pushed the v2-manifests branch 2 times, most recently from 86cf1b8 to 5377393 Compare April 17, 2020 01:16
Copy link
Collaborator

@chenjunjiedada chenjunjiedada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,+1

@aokolnychyi
Copy link
Contributor

Let me go through this right now.

*/
public void existing(DataFile existingFile, long fileSnapshotId) {
addEntry(reused.wrapExisting(fileSnapshotId, existingFile));
public void existing(DataFile existingFile, long fileSnapshotId, long sequenceNumber) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we need to update the Javadoc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I am using this method in the rewrite manifests action. The entries metadata table will have sequenceNumber but its value will be 0, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new action should use the sequence number from the metadata table. That will be 0 for metadata written without sequence numbers, or the correct sequence number for v2 tables.

@aokolnychyi
Copy link
Contributor

@rdblue, LGTM. I had only two questions.

I think we need to update Javadoc in one place and should be good to go.

@aokolnychyi
Copy link
Contributor

It is a Javadoc change, I'll merge it.

@aokolnychyi aokolnychyi merged commit fdca3f8 into apache:master Apr 20, 2020
@rdblue
Copy link
Contributor Author

rdblue commented Apr 20, 2020

Thanks, @aokolnychyi, @chenjunjiedada, and @openinx for reviewing!

@aokolnychyi
Copy link
Contributor

Thanks for doing this, @rdblue! Shall we take a look at #936 next?

@rdblue
Copy link
Contributor Author

rdblue commented Apr 20, 2020

Yes, please do! That one should be easier since it is just updating a few tests to be parameterized.

Fokko pushed a commit to Fokko/iceberg that referenced this pull request Apr 21, 2020
@rdblue rdblue added this to the Row-level Delete milestone May 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants