
[presto][iceberg] Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3 #27396

Closed
apurva-meta wants to merge 6 commits into prestodb:master from apurva-meta:export-D97531552

Conversation

@apurva-meta
Contributor

@apurva-meta apurva-meta commented Mar 21, 2026

Summary:
Iceberg V3 introduces nanosecond-precision timestamp types (timestamp_ns
and timestamptz_ns). This diff adds support for reading tables with these
column types by mapping them to Presto's best available precision
(TIMESTAMP_MICROSECONDS for timestamp_ns, TIMESTAMP_WITH_TIME_ZONE for
timestamptz_ns).

Changes:

  • TypeConverter: Map TIMESTAMP_NANO to Presto types and ORC types
  • ExpressionConverter: Fix predicate pushdown for TIMESTAMP_MICROSECONDS
    precision (was incorrectly converting microseconds as milliseconds)
  • IcebergUtil: Handle TIMESTAMP_NANO partition values (nanos → micros)
  • PartitionData: Handle TIMESTAMP_NANO in JSON partition deserialization
  • PartitionTable: Convert nanosecond partition values to microseconds
  • TestIcebergV3: Add testNanosecondTimestampSchema integration test

Differential Revision: D97531552
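
The nanos → micros narrowing listed for IcebergUtil and PartitionTable could look roughly like the sketch below. This is only an illustration: the class and method names are hypothetical, and floor division is an assumption (the description does not say how pre-epoch values are rounded).

```java
// Hypothetical helper; the names are illustrative and not taken from the PR.
final class NanoTimestampNarrowing {
    private static final long NANOS_PER_MICRO = 1_000L;

    private NanoTimestampNarrowing() {}

    // Converts nanoseconds since the epoch to microseconds since the epoch.
    // Assumption: floor division, so -1 ns maps to -1 us instead of 0 us.
    static long nanosToMicros(long epochNanos) {
        return Math.floorDiv(epochNanos, NANOS_PER_MICRO);
    }
}
```

Plain `/` would round pre-epoch values toward zero, which shifts them forward by up to one microsecond; `Math.floorDiv` keeps ordering consistent across the epoch boundary.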

@apurva-meta apurva-meta requested review from a team, ZacBlanco and hantangwangd as code owners March 21, 2026 06:52
@sourcery-ai
Contributor

sourcery-ai bot commented Mar 21, 2026

Reviewer's Guide

Implements Iceberg V3 nanosecond timestamp (TIMESTAMP_NANO) support and fully wires Iceberg V3 deletion vectors (PUFFIN format) and row-level operations (DELETE/UPDATE/MERGE) through Presto’s Java and C++/Velox stacks, including metadata/protocol changes, routing, and comprehensive integration tests plus a DV compaction procedure.

Sequence diagram for Iceberg V3 deletion vector write path

sequenceDiagram
  actor User
  participant Coordinator
  participant Worker
  participant IcebergPageSourceProvider
  participant IcebergUpdateablePageSource
  participant IcebergDeletionVectorPageSink
  participant HdfsEnvironment
  participant IcebergAbstractMetadata
  participant IcebergTable

  User->>Coordinator: submit DELETE statement
  Coordinator->>Worker: schedule task with IcebergSplit

  Worker->>IcebergPageSourceProvider: createPageSource(split)
  IcebergPageSourceProvider->>IcebergPageSourceProvider: read storageProperties
  IcebergPageSourceProvider-->>IcebergPageSourceProvider: tableFormatVersion >= 3
  IcebergPageSourceProvider->>IcebergUpdateablePageSource: construct with deleteSinkSupplier returning IcebergDeletionVectorPageSink

  loop for each deleted row
    IcebergUpdateablePageSource->>IcebergDeletionVectorPageSink: appendPage(page_of_row_positions)
    IcebergDeletionVectorPageSink->>IcebergDeletionVectorPageSink: collect row_positions (int)
  end

  Worker->>IcebergUpdateablePageSource: finish()
  IcebergUpdateablePageSource->>IcebergDeletionVectorPageSink: finish()
  IcebergDeletionVectorPageSink->>IcebergDeletionVectorPageSink: sort positions and serialize roaring bitmap
  IcebergDeletionVectorPageSink->>HdfsEnvironment: write Puffin file with Blob deletion_vector_v2
  IcebergDeletionVectorPageSink-->>IcebergUpdateablePageSource: CommitTaskData (FileFormat.PUFFIN, POSITION_DELETES, contentOffset, contentSizeInBytes, recordCount)

  IcebergUpdateablePageSource-->>Coordinator: commit tasks (including DV CommitTaskData)
  Coordinator->>IcebergAbstractMetadata: finishDeleteWithOutput(tasks)
  IcebergAbstractMetadata->>IcebergTable: newRewrite().rewriteFiles(remove_old_DVs, add_PUFFIN_DVs)
  IcebergTable-->>User: DELETE committed with V3 deletion vectors
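
The "collect row positions, then sort" step in the diagram above can be sketched as follows. This is a minimal stand-in, not the actual IcebergDeletionVectorPageSink: the class name is hypothetical, positions are held as `long` here, and the key/low-bits split reflects the usual 64-bit Roaring layout, which is an assumption about how the serializer groups values.

```java
import java.util.Arrays;

// Hypothetical stand-in for the buffering step of a deletion-vector page sink.
final class RowPositionBuffer {
    private long[] positions = new long[16];
    private int size;

    // Buffers one deleted row position; negative positions are rejected.
    void append(long rowPosition) {
        if (rowPosition < 0) {
            throw new IllegalArgumentException("Row position must be non-negative: " + rowPosition);
        }
        if (size == positions.length) {
            positions = Arrays.copyOf(positions, size * 2);
        }
        positions[size++] = rowPosition;
    }

    // Returns the buffered positions sorted ascending with duplicates removed.
    long[] sortedDistinct() {
        return Arrays.stream(positions, 0, size).sorted().distinct().toArray();
    }

    // Assumed layout: the high 32 bits select a 32-bit bitmap...
    static int bitmapKey(long rowPosition) {
        return (int) (rowPosition >>> 32);
    }

    // ...and the low 32 bits are the value stored inside that bitmap.
    static int lowBits(long rowPosition) {
        return (int) rowPosition;
    }
}
```

Sorting before serialization lets a bitmap writer append values in order; the real sink would hand the sorted positions to a Roaring serializer and wrap the result in a Puffin blob.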

Class diagram for updated Iceberg delete/DV metadata types

classDiagram
  class CommitTaskData {
    -String path
    -long fileSizeInBytes
    -MetricsWrapper metrics
    -int partitionSpecId
    -Optional~String~ partitionDataJson
    -FileFormat fileFormat
    -Optional~String~ referencedDataFile
    -FileContent content
    -OptionalLong contentOffset
    -OptionalLong contentSizeInBytes
    -OptionalLong recordCount
    +CommitTaskData(path, fileSizeInBytes, metrics, partitionSpecId, partitionDataJson, fileFormat, referencedDataFile, content, contentOffset, contentSizeInBytes, recordCount)
    +CommitTaskData(path, fileSizeInBytes, metrics, partitionSpecId, partitionDataJson, fileFormat, referencedDataFile, content)
    +getPath() String
    +getFileSizeInBytes() long
    +getMetrics() MetricsWrapper
    +getPartitionSpecId() int
    +getPartitionDataJson() Optional~String~
    +getFileFormat() FileFormat
    +getReferencedDataFile() Optional~String~
    +getContent() FileContent
    +getContentOffset() OptionalLong
    +getContentSizeInBytes() OptionalLong
    +getRecordCount() OptionalLong
  }

  class DeleteFileJava {
    <<final>>
    -FileContent content
    -String path
    -FileFormat fileFormat
    -long recordCount
    -long fileSizeInBytes
    -List~Integer~ equalityFieldIds
    -Map~Integer,byte[]~ lowerBounds
    -Map~Integer,byte[]~ upperBounds
    -long dataSequenceNumber
    +fromIceberg(deleteFile) DeleteFileJava
    +DeleteFileJava(content, path, fileFormat, recordCount, fileSizeInBytes, equalityFieldIds, lowerBounds, upperBounds, dataSequenceNumber)
    +getContent() FileContent
    +getPath() String
    +getFileFormat() FileFormat
    +getRecordCount() long
    +getFileSizeInBytes() long
    +getEqualityFieldIds() List~Integer~
    +getLowerBounds() Map~Integer,byte[]~
    +getUpperBounds() Map~Integer,byte[]~
    +getDataSequenceNumber() long
    +toString() String
  }

  class DeleteFileCpp {
    <<struct>>
    +FileContent content
    +String path
    +FileFormat format
    +int64_t recordCount
    +int64_t fileSizeInBytes
    +List~Integer~ equalityFieldIds
    +Map~Integer,String~ lowerBounds
    +Map~Integer,String~ upperBounds
    +int64_t dataSequenceNumber
    +to_json(j, DeleteFileCpp)
    +from_json(j, DeleteFileCpp)
  }

  class FileFormatJava {
    <<enum>>
    ORC
    PARQUET
    AVRO
    METADATA
    PUFFIN
    -String ext
    -boolean splittable
    +fromIcebergFileFormat(format) FileFormatJava
  }

  class FileFormatCpp {
    <<enum>>
    ORC
    PARQUET
    AVRO
    METADATA
    PUFFIN
  }

  class FileContentJava {
    <<enum>>
    DATA
    POSITION_DELETES
    EQUALITY_DELETES
  }

  class FileContentCpp {
    <<enum>>
    DATA
    POSITION_DELETES
    EQUALITY_DELETES
  }

  CommitTaskData --> FileFormatJava : uses
  CommitTaskData --> FileContentJava : uses
  CommitTaskData --> DeleteFileJava : serialized as

  DeleteFileJava --> FileFormatJava : uses
  DeleteFileJava --> FileContentJava : uses

  DeleteFileCpp --> FileFormatCpp : uses
  DeleteFileCpp --> FileContentCpp : uses

  FileFormatJava <--> FileFormatCpp : protocol_mapping
  FileContentJava <--> FileContentCpp : protocol_mapping
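
The two CommitTaskData constructors in the diagram above suggest that the DV-specific fields are optional and default to empty for ordinary files. A trimmed sketch of that shape, with a hypothetical class name and only a subset of the fields, might be:

```java
import java.util.OptionalLong;

// Hypothetical, trimmed-down illustration; not the PR's CommitTaskData.
final class CommitTaskDataSketch {
    private final String path;
    private final long fileSizeInBytes;
    private final OptionalLong contentOffset;
    private final OptionalLong contentSizeInBytes;
    private final OptionalLong recordCount;

    // Full constructor, used when a deletion vector blob is described.
    CommitTaskDataSketch(String path, long fileSizeInBytes, OptionalLong contentOffset,
            OptionalLong contentSizeInBytes, OptionalLong recordCount) {
        this.path = path;
        this.fileSizeInBytes = fileSizeInBytes;
        this.contentOffset = contentOffset;
        this.contentSizeInBytes = contentSizeInBytes;
        this.recordCount = recordCount;
    }

    // Convenience constructor for files without DV metadata.
    CommitTaskDataSketch(String path, long fileSizeInBytes) {
        this(path, fileSizeInBytes, OptionalLong.empty(), OptionalLong.empty(), OptionalLong.empty());
    }

    String getPath() {
        return path;
    }

    long getFileSizeInBytes() {
        return fileSizeInBytes;
    }

    // True only when offset, size, and record count are all present.
    boolean hasDeletionVectorMetadata() {
        return contentOffset.isPresent() && contentSizeInBytes.isPresent() && recordCount.isPresent();
    }
}
```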

File-Level Changes

Change | Details | Files
Support Iceberg V3 TIMESTAMP_NANO types by mapping them to appropriate Presto and ORC representations and handling partition value conversions.
  • Extend TypeConverter to map TIMESTAMP_NANO (with and without zone) to TIMESTAMP_MICROSECONDS or TIMESTAMP_WITH_TIME_ZONE and to ORC TIMESTAMP type
  • Handle TIMESTAMP_NANO in IcebergUtil and PartitionTable when building Domains and converting partition values (nanos to micros/millis)
  • Allow TIMESTAMP_NANO in PartitionData JSON deserialization and add an integration test that exercises V3 tables with nanosecond timestamp columns
presto-iceberg/src/main/java/com/facebook/presto/iceberg/TypeConverter.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergUtil.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/PartitionTable.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/PartitionData.java
presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergV3.java
Fix predicate pushdown for timestamp and wire Iceberg TIMESTAMP_MICROSECONDS correctly into Iceberg expressions.
  • Adjust ExpressionConverter to treat TimestampType differently from TimeType and only convert millisecond-precision timestamps to micros, leaving higher precisions as-is
  • Keep TimeType conversion to micros-based representation unchanged
presto-iceberg/src/main/java/com/facebook/presto/iceberg/ExpressionConverter.java
Enable and validate Iceberg V3 row-level operations (DELETE/UPDATE/MERGE) and PUFFIN deletion vectors end-to-end via tests.
  • Update existing V3 tests to assert successful DELETE/UPDATE/MERGE instead of NOT_SUPPORTED errors
  • Rename and change semantics of Puffin DV tests to assert that PUFFIN deletion vectors are accepted and no longer rejected at split enumeration time
  • Add extensive integration tests covering DV creation, metadata, multi-file / multi-snapshot behavior, schema evolution, default values, partition transforms, and read/write round trips with DVs
presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergV3.java
Wire Iceberg delete file metadata, including PUFFIN DVs and data sequence numbers, through the Presto protocol to Velox and into Iceberg delete handling.
  • Extend protocol enums and JSON serialization for FileFormat to include PUFFIN and for FileContent to include EQUALITY_DELETES, and add dataSequenceNumber to DeleteFile in the protocol and Java model
  • Map PUFFIN FileFormat and EQUALITY_DELETES/FileContent into Velox equivalents, and reclassify PUFFIN POSITION_DELETES as kDeletionVector in the C++ connector split conversion
  • Propagate dataSequenceNumber from IcebergSplit through info columns and the HiveIcebergSplit to Velox, and add dataSequenceNumber to the Java-side DeleteFile model and protocol serialization
presto-native-execution/presto_cpp/main/connectors/IcebergPrestoToVeloxConnector.cpp
presto-native-execution/presto_cpp/presto_protocol/connector/iceberg/presto_protocol_iceberg.h
presto-native-execution/presto_cpp/presto_protocol/connector/iceberg/presto_protocol_iceberg.cpp
presto-iceberg/src/main/java/com/facebook/presto/iceberg/delete/DeleteFile.java
Introduce a deletion-vector write path for Iceberg V3 tables, producing PUFFIN deletion vector files instead of position-delete data files.
  • Add IcebergDeletionVectorPageSink that buffers row positions, serializes them into a roaring bitmap in Puffin format, writes a .puffin file, and emits CommitTaskData including DV-specific metadata (contentOffset, contentSizeInBytes, recordCount, referenced data file)
  • Teach IcebergPageSourceProvider and IcebergUpdateablePageSource to construct a deletion-vector page sink (ConnectorPageSink) for format-version >= 3 and to treat the delete sink generically via ConnectorPageSink APIs
  • Extend CommitTaskData to carry optional DV metadata fields and adjust finishDeleteWithOutput to build Iceberg DeleteFile differently for PUFFIN (using content offset/size/recordCount and referencedDataFile) versus standard delete files (using metrics)
presto-iceberg/src/main/java/com/facebook/presto/iceberg/delete/IcebergDeletionVectorPageSink.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergPageSourceProvider.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergUpdateablePageSource.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/CommitTaskData.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergAbstractMetadata.java
Enable Iceberg V3 deletion vector handling and compaction on the coordinator side, and relax previous V3 restrictions.
  • Remove the split-time rejection of PUFFIN deletion vectors from IcebergSplitSource so splits with DV files are allowed through
  • Increase MAX_FORMAT_VERSION_FOR_ROW_LEVEL_OPERATIONS from 2 to 3 and allow Iceberg v3 column default values by dropping the NOT_SUPPORTED checks
  • Add a new RewriteDeleteFilesProcedure, register it in IcebergCommonModule, and implement DV compaction logic that groups Puffin DVs per data file, merges roaring bitmaps, writes consolidated Puffin DV files, and commits the rewrite via Iceberg’s RewriteFiles API
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergSplitSource.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergUtil.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergAbstractMetadata.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergCommonModule.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/procedure/RewriteDeleteFilesProcedure.java
Extend internal file format enumeration to include PUFFIN and wire it through to Iceberg file-format conversions.
  • Add PUFFIN to the Java FileFormat enum and to fromIcebergFileFormat mapping so Iceberg PUFFIN formats can be represented in Presto
  • Ensure PUFFIN FileFormat values are propagated correctly through the Java and C++ protocol layers and converted appropriately in the Velox connector
presto-iceberg/src/main/java/com/facebook/presto/iceberg/FileFormat.java
presto-native-execution/presto_cpp/presto_protocol/connector/iceberg/presto_protocol_iceberg.h
presto-native-execution/presto_cpp/presto_protocol/connector/iceberg/presto_protocol_iceberg.cpp
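
The ExpressionConverter entry in the table above (scale only millisecond-precision timestamps to micros, leave higher precisions as-is) can be sketched as below. The class, enum, and method names are hypothetical; the only fact relied on is that Iceberg timestamp literals are expressed in microseconds since the epoch.

```java
// Hypothetical illustration of precision-aware literal conversion.
final class TimestampLiteralConversion {
    enum Precision { MILLISECONDS, MICROSECONDS }

    private TimestampLiteralConversion() {}

    // Returns the value as microseconds since the epoch for use in an Iceberg predicate.
    static long toIcebergMicros(long value, Precision precision) {
        switch (precision) {
            case MILLISECONDS:
                // Millisecond values must be scaled up; multiplyExact fails loudly on overflow.
                return Math.multiplyExact(value, 1_000L);
            case MICROSECONDS:
                // Already in micros: scaling again would shift the predicate by a factor of 1000.
                return value;
            default:
                throw new IllegalArgumentException("Unsupported precision: " + precision);
        }
    }
}
```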

Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high-level feedback:

  • There are multiple independent implementations of Roaring bitmap (de)serialization for deletion vectors (e.g., in tests, IcebergDeletionVectorPageSink, and RewriteDeleteFilesProcedure), which increases the risk of subtle bugs and divergence; consider centralizing this logic in a single utility and/or delegating to a well-tested Roaring library wherever possible.
  • In RewriteDeleteFilesProcedure.rewriteDeleteFiles, the commit flow is a bit obscure (e.g., calling both rewriteFiles.commit() and metadata.commit() while metadata is created ad hoc via the factory); it would help maintainability to clarify or refactor ownership of commits so it’s clear which component is responsible for finalizing Iceberg changes.
  • The Roaring bitmap deserializeRoaringBitmap in RewriteDeleteFilesProcedure implements support for both array and bitmap containers and partially handles run containers via cookie bits; it might be safer to explicitly validate the cookie and container types you expect (and fail fast otherwise) to avoid silently ignoring or misinterpreting newer/unsupported encodings.
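
To make the fail-fast suggestion in the last point concrete, here is a minimal sketch of validating the header of a 32-bit Roaring bitmap in the portable format. The class and method names are hypothetical. The cookie constants come from the Roaring format specification; the sketch covers only the 32-bit header and does not parse the 64-bit wrapper or the Puffin blob framing.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical header check; not the PR's deserializeRoaringBitmap.
final class RoaringHeaderCheck {
    // Cookie used when no run containers are present.
    static final int SERIAL_COOKIE_NO_RUNCONTAINER = 12346;
    // Cookie (low 16 bits) used when run containers may be present.
    static final int SERIAL_COOKIE = 12347;

    private RoaringHeaderCheck() {}

    // Returns true if the bitmap may contain run containers; throws on any unknown cookie.
    static boolean mayContainRunContainers(byte[] serialized) {
        if (serialized.length < Integer.BYTES) {
            throw new IllegalArgumentException("Truncated Roaring header: " + serialized.length + " bytes");
        }
        // The portable format stores the cookie as a little-endian 32-bit integer.
        int cookie = ByteBuffer.wrap(serialized).order(ByteOrder.LITTLE_ENDIAN).getInt();
        if (cookie == SERIAL_COOKIE_NO_RUNCONTAINER) {
            return false;
        }
        if ((cookie & 0xFFFF) == SERIAL_COOKIE) {
            return true;
        }
        throw new IllegalArgumentException("Unrecognized Roaring cookie: " + cookie);
    }
}
```

A reader that only supports array and bitmap containers could reject input as soon as this returns `true`, rather than attempting to decode run containers it does not understand.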


@meta-codesync meta-codesync bot changed the title from "feat: [presto][iceberg] Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3" to "feat: [presto][iceberg] Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3 (#27396)" Mar 21, 2026
apurva-meta added a commit to apurva-meta/presto that referenced this pull request Mar 21, 2026
…e support for Iceberg V3 (prestodb#27396)

Summary:

Iceberg V3 introduces nanosecond-precision timestamp types (timestamp_ns
and timestamptz_ns). This diff adds support for reading tables with these
column types by mapping them to Presto's best available precision
(TIMESTAMP_MICROSECONDS for timestamp_ns, TIMESTAMP_WITH_TIME_ZONE for
timestamptz_ns).

Changes:
- TypeConverter: Map TIMESTAMP_NANO to Presto types and ORC types
- ExpressionConverter: Fix predicate pushdown for TIMESTAMP_MICROSECONDS
  precision (was incorrectly converting microseconds as milliseconds)
- IcebergUtil: Handle TIMESTAMP_NANO partition values (nanos → micros)
- PartitionData: Handle TIMESTAMP_NANO in JSON partition deserialization
- PartitionTable: Convert nanosecond partition values to microseconds
- TestIcebergV3: Add testNanosecondTimestampSchema integration test

== RELEASE NOTES ==
General Changes
* Upgrade Apache Iceberg library from 1.10.0 to 1.10.1.
Hive Connector Changes
* Add Iceberg V3 deletion vector (DV) support using Puffin-encoded roaring bitmaps, including a DV reader, writer, page sink, and compaction procedure.
* Add Iceberg equality delete file reader with sequence number conflict resolution per the Iceberg V2+ spec: equality deletes skip when deleteFileSeqNum <= dataFileSeqNum; positional deletes and DVs skip when deleteFileSeqNum < dataFileSeqNum; sequence number 0 (V1 legacy) never skips.
* Wire dataSequenceNumber through the Presto protocol layer (Java → C++) to enable server-side sequence number conflict resolution for all delete file types.
* Add PUFFIN file format support for deletion vector discovery, enabling the coordinator to locate DV files during split creation.
* Add Iceberg V3 deletion vector write path with DV page sink and rewrite_delete_files compaction procedure for DV maintenance.
* Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3 tables.
* Add Variant type support for Iceberg V3, enabling semi-structured data columns in Iceberg tables.
* Eagerly collect delete files during split creation with improved logging for easier debugging of Iceberg delete file resolution.
* Improve IcebergSplitReader error handling and fix test file handle leaks.
* Add end-to-end integration tests for Iceberg V3 covering snapshot lifecycle (INSERT, DELETE with equality/positional/DV deletes, UPDATE, MERGE, time-travel) and all 99 TPC-DS queries.

Differential Revision: D97531552
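
The sequence-number rule in the release notes above can be read as a small predicate. The sketch below uses hypothetical names, and it assumes that a sequence number of 0 on either the delete file or the data file counts as "V1 legacy"; the release note does not say which side it refers to.

```java
// Hypothetical predicate mirroring the skip rule stated in the release notes.
final class DeleteApplicability {
    enum DeleteKind { EQUALITY, POSITIONAL, DELETION_VECTOR }

    private DeleteApplicability() {}

    // Returns true when the delete file can be skipped for the given data file.
    static boolean shouldSkip(DeleteKind kind, long deleteFileSeqNum, long dataFileSeqNum) {
        // Assumption: sequence number 0 marks a V1 legacy file, which never skips.
        if (deleteFileSeqNum == 0 || dataFileSeqNum == 0) {
            return false;
        }
        if (kind == DeleteKind.EQUALITY) {
            // Equality deletes only apply to strictly older data files.
            return deleteFileSeqNum <= dataFileSeqNum;
        }
        // Positional deletes and DVs also apply to data files in the same sequence.
        return deleteFileSeqNum < dataFileSeqNum;
    }
}
```

The difference between `<=` and `<` is what lets a positional delete or DV written in the same commit as a data file still apply to it, while an equality delete from that commit does not.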
Apurva Kumar added 5 commits March 24, 2026 16:35
…tensibility

Summary:
- Reformat FileContent enum in presto_protocol_iceberg.h from single-line
  to multi-line for better readability and future extension.
- Add blank line for visual separation before infoColumns initialization.

Protocol files are auto-generated from Java sources via chevron. The manual
edits here mirror what the generator would produce once the Java changes
are landed and the protocol is regenerated.

Differential Revision: D97531548
…equality delete conflict resolution

Summary:
Wire the dataSequenceNumber field from the Java Presto protocol to the
C++ Velox connector layer, enabling server-side sequence number conflict
resolution for equality delete files.

Changes:
- Add dataSequenceNumber field to IcebergSplit protocol (Java + C++)
- Parse dataSequenceNumber in IcebergPrestoToVeloxConnector and pass it
  through HiveIcebergSplit to IcebergSplitReader
- Add const qualifiers to local variables for code clarity

Differential Revision: D97531547
…discovery

Summary:
Iceberg V3 introduces deletion vectors stored as blobs inside Puffin files.
Previously, the coordinator's IcebergSplitSource rejected PUFFIN-format delete
files with a NOT_SUPPORTED error, preventing V3 deletion vectors from being
discovered and sent to workers.

This diff:
1. Adds PUFFIN to the FileFormat enum (both presto-trunk and
   presto-facebook-trunk) so fromIcebergFileFormat() can convert
   Iceberg's PUFFIN format to Presto's FileFormat.PUFFIN.
2. Removes the PUFFIN rejection check in presto-trunk's
   IcebergSplitSource.toIcebergSplit(), allowing deletion vector
   files to flow through to workers.
3. Updates TestIcebergV3 to verify PUFFIN files are accepted rather
   than rejected at split enumeration time.
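
Steps 1 and 2 above amount to making the format conversion total over the formats Iceberg can report. A rough sketch, with a hypothetical enum that takes the format name as a string instead of Iceberg's own FileFormat type so that it stays self-contained:

```java
import java.util.Locale;

// Hypothetical enum; the real FileFormat in presto-iceberg carries more state.
enum SketchFileFormat {
    ORC, PARQUET, AVRO, METADATA, PUFFIN;

    // Maps an Iceberg format name to this enum, failing fast on anything unknown.
    static SketchFileFormat fromIcebergName(String icebergFormatName) {
        try {
            return valueOf(icebergFormatName.toUpperCase(Locale.ROOT));
        }
        catch (IllegalArgumentException e) {
            throw new UnsupportedOperationException("Unsupported Iceberg file format: " + icebergFormatName, e);
        }
    }
}
```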

The C++ worker-side changes (protocol enum + connector conversion) will
follow in a separate diff.

Differential Revision: D97531557
…nd connector layer

Summary:
This is the C++ counterpart to the Java PUFFIN support diff. It wires
the PUFFIN file format through the Prestissimo protocol and connector
conversion layer so that Iceberg V3 deletion vector files can be
deserialized and handled by native workers.

Changes:
1. Adds PUFFIN to the C++ protocol FileFormat enum and its JSON
   serialization table in presto_protocol_iceberg.{h,cpp}.
2. Handles PUFFIN in toVeloxFileFormat() in
   IcebergPrestoToVeloxConnector.cpp, mapping it to DWRF as a
   placeholder since DeletionVectorReader reads raw binary and
   does not use the DWRF/Parquet reader infrastructure.

Differential Revision: D97531555
…age sink and compaction procedure

Summary:
- Add IcebergDeletionVectorPageSink for writing DV files during table maintenance
- Add RewriteDeleteFilesProcedure for DV compaction
- Wire DV page sink through IcebergCommonModule, IcebergAbstractMetadata, IcebergPageSourceProvider
- Add IcebergUpdateablePageSource for DV-aware page source
- Update CommitTaskData, IcebergUtil for DV support
- Add test coverage in TestIcebergV3

Differential Revision: D97531549
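
The compaction idea behind RewriteDeleteFilesProcedure (group DVs per data file, then merge them) can be sketched as below. Everything here is hypothetical: the class names are illustrative, and a sorted set of positions stands in for a Roaring bitmap so that the example runs without extra libraries.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration of per-data-file DV merging; not the PR's procedure.
final class DeletionVectorCompaction {
    // Minimal stand-in for one deletion vector: the data file it targets and its deleted positions.
    static final class DeletionVector {
        final String referencedDataFile;
        final long[] positions;

        DeletionVector(String referencedDataFile, long... positions) {
            this.referencedDataFile = referencedDataFile;
            this.positions = positions;
        }
    }

    private DeletionVectorCompaction() {}

    // Unions all deletion vectors that reference the same data file.
    static Map<String, TreeSet<Long>> mergePerDataFile(List<DeletionVector> vectors) {
        Map<String, TreeSet<Long>> merged = new TreeMap<>();
        for (DeletionVector vector : vectors) {
            TreeSet<Long> target = merged.computeIfAbsent(vector.referencedDataFile, key -> new TreeSet<>());
            for (long position : vector.positions) {
                target.add(position);
            }
        }
        return merged;
    }
}
```

After merging, each data file is left with one consolidated set of deleted positions, which a real implementation would serialize into a single Puffin DV and commit in place of the originals.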
apurva-meta added a commit to apurva-meta/presto that referenced this pull request Mar 27, 2026
…e support for Iceberg V3 (prestodb#27396)

Differential Revision: D97531552
…ort for Iceberg V3

Summary:
Iceberg V3 introduces nanosecond-precision timestamp types (timestamp_ns
and timestamptz_ns). This diff adds support for reading tables with these
column types by mapping them to Presto's best available precision
(TIMESTAMP_MICROSECONDS for timestamp_ns, TIMESTAMP_WITH_TIME_ZONE for
timestamptz_ns).

Changes:
- TypeConverter: Map TIMESTAMP_NANO to Presto types and ORC types
- ExpressionConverter: Fix predicate pushdown for TIMESTAMP_MICROSECONDS
  precision (was incorrectly converting microseconds as milliseconds)
- IcebergUtil: Handle TIMESTAMP_NANO partition values (nanos → micros)
- PartitionData: Handle TIMESTAMP_NANO in JSON partition deserialization
- PartitionTable: Convert nanosecond partition values to microseconds
- TestIcebergV3: Add testNanosecondTimestampSchema integration test

Differential Revision: D97531552
@meta-codesync meta-codesync bot changed the title from "feat: [presto][iceberg] Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3 (#27396)" to "[presto][iceberg] Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3" Mar 27, 2026
@linux-foundation-easycla

CLA Missing ID CLA Not Signed

@steveburnett
Contributor

  • Please sign the Presto CLA.

  • Please add a release note - or NO RELEASE NOTE - following the Release Notes Guidelines to pass the failing but not required CI check.

  • Please edit the PR title to follow semantic commit style to pass the failing and required CI check. See the failure in the test for advice.
