int96 support in parquet #1138

@thesquelched

Description

When writing Parquet, Spark prefers to write timestamps as int96 for compatibility with Hive, Presto, Impala, and similar engines. Iceberg writes timestamps as the Parquet timestamp logical type over int64. While that mismatch is less than ideal, the real problem is that Iceberg does not support int96 data at all, which makes it impossible to use Iceberg with existing Parquet data files without first rewriting the data.

Ideally, Iceberg would add support for int96-as-timestamp, similar to how Spark handles it. At a minimum, though, Iceberg should be able to understand the int96 type.
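For reference, here is a rough sketch of what int96-as-timestamp decoding involves. It assumes the layout used by Spark and Impala: 12 bytes, little-endian, with the first 8 bytes holding nanoseconds within the day and the last 4 bytes holding the Julian day number. The function names are hypothetical and are not part of Iceberg's API; this only illustrates the conversion to the microsecond values that an int64 timestamp column would hold.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number of 1970-01-01, the Unix epoch.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
MICROS_PER_DAY = 86_400 * 1_000_000


def int96_to_micros(raw: bytes) -> int:
    """Convert a 12-byte int96 timestamp to microseconds since the Unix epoch.

    Assumed layout (hypothetical helper, not Iceberg code):
    8 bytes little-endian nanoseconds of day, then 4 bytes little-endian Julian day.
    """
    if len(raw) != 12:
        raise ValueError(f"int96 value must be 12 bytes, got {len(raw)}")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # Sub-microsecond precision is truncated, since the target unit is microseconds.
    return days_since_epoch * MICROS_PER_DAY + nanos_of_day // 1_000


def micros_to_datetime(micros: int) -> datetime:
    """Render microseconds since the epoch as a UTC datetime, for inspection."""
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)


if __name__ == "__main__":
    # One day after the epoch plus one hour, encoded the way the layout above assumes.
    sample = struct.pack("<qi", 3_600 * 1_000_000_000, JULIAN_DAY_OF_UNIX_EPOCH + 1)
    print(micros_to_datetime(int96_to_micros(sample)))
```

Note that int96 carries nanosecond precision, so a reader that maps it onto a microsecond timestamp has to truncate or round; the sketch truncates.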
