Skip to content

Conversation

@rdblue
Copy link
Contributor

@rdblue rdblue commented Apr 9, 2020

This is a follow-up to #896, which added the same constant map support for Avro.

Fixes #575 for Parquet and replaces #585. Andrei did a lot of the work for this in #585.

Co-authored-by: Andrei Ionescu [email protected]


protected abstract void set(S struct, int pos, Object value);

protected Object prepareConstant(Type type, Object value) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm moving this out of Avro and adding a callback to convert the constants to PartitionUtil.constantsMap. That way, Spark can supply a conversion function and use it in both places, instead of duplicating the conversion in Avro and Parquet readers.

@rdblue rdblue force-pushed the parquet-support-constant-map branch from 6965f40 to 6c5db2d Compare April 9, 2020 20:36
}

@Override
protected Object prepareConstant(Type type, Object value) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved into Spark.

@rdblue rdblue force-pushed the parquet-support-constant-map branch from 3e102e6 to db7a4c7 Compare April 9, 2020 21:18
@rdblue rdblue requested a review from rdsr April 10, 2020 00:22
Copy link
Contributor

@rdsr rdsr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment. Otherwise LGTM!

appender.addAll(expected);
}

// add the Avro data file to the source table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not write the data for the parameterized format for which the test is running?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just source data for the write from Spark with the target format. Since it isn't part of the test, we don't want it to change at all in ways that might affect the test.

@rdblue rdblue merged commit 3a35c07 into apache:master Apr 11, 2020
@rdblue
Copy link
Contributor Author

rdblue commented Apr 11, 2020

Thanks for reviewing @rdsr!

Fokko pushed a commit to Fokko/iceberg that referenced this pull request Apr 21, 2020
This is a follow-up to apache#896, which added the same constant map support for Avro.

Fixes #575 for Parquet and replaces #585. Andrei did a lot of the work for this in #585.

Co-authored-by: Andrei Ionescu <[email protected]>
sunchao pushed a commit to sunchao/iceberg that referenced this pull request May 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

After writing Iceberg dataset with nested partitions it cannot be read anymore

2 participants