rdblue (Contributor) commented Apr 22, 2020

This updates DataFile to contain metadata from ManifestEntry because the separation no longer makes sense. V1 metadata files still use ManifestEntry, but v2 will not.

GenericManifestEntry is split into two parts. The first is ManifestEntryWrapper, a wrapper class that ManifestWriter uses to override fields such as status when writing a DataFile, because the caller does not set the DataFile's status. The second is GenericDataFile.AsManifestEntry, a class that makes a DataFile appear as a ManifestEntry. GenericManifestEntry now extends AsManifestEntry, so read paths that used it don't need to be updated.
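For readers following the thread, here is a minimal sketch of the wrapper idea described above. The interfaces are simplified, hypothetical stand-ins rather than the actual Iceberg types, and the `wrap` method, `SimpleDataFile`, and `WrapperSketch` are assumptions made for illustration: the writer supplies the status and snapshot ID, and everything else comes from the wrapped file.

```java
// Hypothetical usage: one wrapper instance is reused for each file written.
class WrapperSketch {
  public static void main(String[] args) {
    ManifestEntryWrapper wrapper = new ManifestEntryWrapper();
    DataFile file = new SimpleDataFile("data/file-a.parquet", 100L);
    ManifestEntry entry = wrapper.wrap(ManifestEntry.Status.ADDED, 1L, file);
    System.out.println(entry.status() + " " + entry.file().path());
  }
}

// Simplified stand-ins; not the actual Iceberg interfaces.
interface DataFile {
  String path();
  long recordCount();
}

interface ManifestEntry {
  enum Status { EXISTING, ADDED, DELETED }

  Status status();
  Long snapshotId();
  DataFile file();
}

// Minimal immutable DataFile used only for this illustration.
final class SimpleDataFile implements DataFile {
  private final String path;
  private final long recordCount;

  SimpleDataFile(String path, long recordCount) {
    this.path = path;
    this.recordCount = recordCount;
  }

  @Override public String path() { return path; }
  @Override public long recordCount() { return recordCount; }
}

// Reusable wrapper: the writer sets status and snapshot id,
// and the remaining metadata is read from the wrapped DataFile.
final class ManifestEntryWrapper implements ManifestEntry {
  private Status status;
  private Long snapshotId;
  private DataFile file;

  ManifestEntryWrapper wrap(Status newStatus, Long newSnapshotId, DataFile newFile) {
    this.status = newStatus;
    this.snapshotId = newSnapshotId;
    this.file = newFile;
    return this;
  }

  @Override public Status status() { return status; }
  @Override public Long snapshotId() { return snapshotId; }
  @Override public DataFile file() { return file; }
}
```

Because the wrapper is mutable and returns itself from `wrap`, a writer can reuse a single instance instead of allocating a new entry per file; whether the PR's wrapper works this way is not stated in the description.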

rdblue added 4 commits April 22, 2020 15:21

static Schema wrapFileSchema(StructType fileType) {
return new Schema(STATUS, SNAPSHOT_ID, SEQUENCE_NUMBER, required(DATA_FILE_ID, "data_file", fileType));
// remove ManifestEntry fields from the file type when wrapping to avoid duplication

rdblue (Contributor Author) commented:

I should note that this PR doesn't actually add the manifest entry fields to DataFile; it just sets up that change.
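
As a rough illustration of the "avoid duplication" comment in the snippet above, the sketch below drops entry-level fields from a file's field list before it would be nested under a wrapper schema. The `Field` type, the `dropEntryFields` helper, and the set of field names are hypothetical and simplified; this is not how the PR or Iceberg's `Schema` API implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SchemaWrapSketch {
  // Entry-level field names, assumed here for illustration only.
  static final Set<String> ENTRY_FIELDS =
      new HashSet<>(Arrays.asList("status", "snapshot_id", "sequence_number"));

  // Returns the file fields with any entry-level fields dropped, in order.
  static List<Field> dropEntryFields(List<Field> fileFields) {
    List<Field> kept = new ArrayList<>();
    for (Field field : fileFields) {
      if (!ENTRY_FIELDS.contains(field.name)) {
        kept.add(field);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Field> fileFields = Arrays.asList(
        new Field("status"), new Field("file_path"), new Field("record_count"));
    for (Field field : dropEntryFields(fileFields)) {
      System.out.println(field.name);
    }
  }
}

// Hypothetical, simplified field type; a real schema field carries an id and a type too.
final class Field {
  final String name;

  Field(String name) {
    this.name = name;
  }
}
```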

rdblue (Contributor Author) commented May 7, 2020

I'm closing this because we plan to keep the separate structs.

rdblue closed this May 7, 2020