Closed
Description
Identity partition values should be added to materialized records using values from a file's partition data. The initial implementation for Spark used a JoinedRow to join the partition values to each row read from a format, but this had a few problems:
- Only top-level fields could be set this way, not nested fields
- Values were not added in place, so a projection was required in Spark
- No support for directly reading tables with generics
Avro and Parquet are moving to implementations that pass a map from field ID to a constant value when building the reader. This lets the constant be added at the right place in the read schema, including inside nested structs, and lets the implementation be shared across in-memory representations. ORC should also add support for passing partition values as a map.
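To make the idea concrete, here is a minimal sketch of building a reader from a map of field ID to constant. It is not Iceberg's actual reader API: `ConstantReaderSketch`, `Field`, `ValueReader`, and `buildReader` are hypothetical names, and a file row is modeled as a plain map from field ID to stored value. The point is only that the constant is resolved while the reader tree is built, so it lands in place and works for nested fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model for illustration; not Iceberg's reader API.
final class ConstantReaderSketch {

  /** A schema field identified by ID; a non-empty children list marks a struct. */
  static final class Field {
    final int id;
    final String name;
    final List<Field> children;

    Field(int id, String name, Field... children) {
      this.id = id;
      this.name = name;
      this.children = List.of(children);
    }
  }

  /** Produces one value from a raw file row (assumed here to be a map of field ID to value). */
  interface ValueReader {
    Object read(Map<Integer, Object> fileRow);
  }

  /** Builds a reader for the field, substituting constants wherever the map has the field's ID. */
  static ValueReader buildReader(Field field, Map<Integer, Object> idToConstant) {
    if (idToConstant.containsKey(field.id)) {
      // Partition value: never read from the file, always return the constant.
      Object constant = idToConstant.get(field.id);
      return fileRow -> constant;
    }

    if (!field.children.isEmpty()) {
      // Struct: recurse so constants can be placed at any nesting depth.
      List<ValueReader> childReaders = new ArrayList<>();
      for (Field child : field.children) {
        childReaders.add(buildReader(child, idToConstant));
      }
      return fileRow -> {
        Map<String, Object> struct = new LinkedHashMap<>();
        for (int i = 0; i < field.children.size(); i++) {
          struct.put(field.children.get(i).name, childReaders.get(i).read(fileRow));
        }
        return struct;
      };
    }

    // Ordinary leaf: read the stored value from the file row.
    return fileRow -> fileRow.get(field.id);
  }
}
```

With a schema whose nested `location.region` field has ID 3, passing a map containing only ID 3 fills that nested field in its schema position, with no join or follow-up projection. Because the substitution happens in the reader builder rather than in a Spark-specific row type, the same approach could back generics or any other in-memory representation.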
rdsr