[prestissimo][iceberg] Wire PUFFIN file format through C++ protocol and connector layer #27394
apurva-meta wants to merge 4 commits into prestodb:master
Reviewer's Guide
Wires the Iceberg PUFFIN file format and delete-file data sequence numbers through the Java and C++ Presto/Iceberg protocol and connector layers so native workers can correctly route V3 deletion vector files to the DeletionVectorReader while accepting PUFFIN deletes during split enumeration.
Class diagram for updated Iceberg delete file and file format types
classDiagram
direction LR
class JavaFileFormat {
<<enum>>
ORC
PARQUET
AVRO
METADATA
PUFFIN
+fromIcebergFileFormat(format)
}
class JavaDeleteFile {
<<final>>
-FileContent content
-String path
-FileFormat format
-long recordCount
-long fileSizeInBytes
-List~Integer~ equalityFieldIds
-Map~Integer, byte[]~ lowerBounds
-Map~Integer, byte[]~ upperBounds
-long dataSequenceNumber
+fromIceberg(deleteFile)
+getContent() FileContent
+getPath() String
+getFormat() FileFormat
+getRecordCount() long
+getFileSizeInBytes() long
+getEqualityFieldIds() List~Integer~
+getLowerBounds() Map~Integer, byte[]~
+getUpperBounds() Map~Integer, byte[]~
+getDataSequenceNumber() long
}
class CppFileFormat {
<<enum class>>
ORC
PARQUET
AVRO
METADATA
PUFFIN
}
class CppFileContent {
<<enum class>>
DATA
POSITION_DELETES
EQUALITY_DELETES
}
class CppDeleteFile {
<<struct>>
+FileContent content
+String path
+FileFormat format
+int64_t recordCount
+int64_t fileSizeInBytes
+List~Integer~ equalityFieldIds
+Map~Integer, String~ lowerBounds
+Map~Integer, String~ upperBounds
+int64_t dataSequenceNumber
+to_json(j, p)
+from_json(j, p)
}
class VeloxFileFormat {
<<enum>>
ORC
PARQUET
DWRF
}
class VeloxFileContent {
<<enum>>
kData
kPositionalDeletes
kEqualityDeletes
kDeletionVector
}
class IcebergDeleteFileVelox {
<<class>>
+FileContent content
+string path
+FileFormat format
+int64_t recordCount
+int64_t fileSizeInBytes
+vector~int32_t~ equalityFieldIds
+unordered_map~int32_t, string~ lowerBounds
+unordered_map~int32_t, string~ upperBounds
+int64_t dataSequenceNumber
}
JavaDeleteFile --> JavaFileFormat : uses
CppDeleteFile --> CppFileFormat : uses
CppDeleteFile --> CppFileContent : uses
IcebergDeleteFileVelox --> VeloxFileFormat : uses
IcebergDeleteFileVelox --> VeloxFileContent : uses
CppFileFormat <--> JavaFileFormat : protocol_mapping
CppDeleteFile <--> JavaDeleteFile : JSON_protocol
class IcebergPrestoToVeloxConnector {
<<class>>
+toVeloxFileContent(content) VeloxFileContent
+toVeloxFileFormat(format) VeloxFileFormat
+toVeloxSplit(catalogId, connectorSplit, splitContext) unique_ptr~HiveIcebergSplit~
}
IcebergPrestoToVeloxConnector --> CppDeleteFile : reads
IcebergPrestoToVeloxConnector --> IcebergDeleteFileVelox : constructs
IcebergPrestoToVeloxConnector ..> VeloxFileFormat : maps_PUFFIN_to_DWRF
IcebergPrestoToVeloxConnector ..> VeloxFileContent : reclassifies_PUFFIN_DV_to_kDeletionVector
Hey - I've found 1 issue, and left some high level feedback:
- In `toVeloxFileFormat()`, consider explicitly guarding the PUFFIN→DWRF mapping with a check on file content (e.g., only for deletion-vector content) or an assertion, so that PUFFIN cannot silently be misrouted if introduced for non-DV files in the future.
- In `IcebergPrestoToVeloxConnector::toVeloxSplit`, you can avoid the `NOLINT(facebook-bugprone-unchecked-pointer-access)` by binding `*icebergSplit` to a reference after `VELOX_CHECK_NOT_NULL` and then using that reference for `dataSequenceNumber` and other fields instead of the raw pointer.
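The first suggestion could look roughly like the sketch below. It is not the PR's code: the enum and function names are hypothetical stand-ins for the protocol and Velox types, and it assumes deletion vectors arrive as position-delete content in PUFFIN format.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for the protocol and Velox enums named in the PR;
// the real definitions live in presto_protocol_iceberg.h and in Velox.
enum class ProtocolFileFormat { ORC, PARQUET, AVRO, METADATA, PUFFIN };
enum class ProtocolFileContent { DATA, POSITION_DELETES, EQUALITY_DELETES };
enum class ReaderFileFormat { ORC, PARQUET, DWRF };

// Maps a protocol format to a reader format. PUFFIN is accepted only for
// position-delete content (assumed here to be how deletion vectors arrive).
ReaderFileFormat toReaderFileFormat(
    ProtocolFileFormat format,
    ProtocolFileContent content) {
  switch (format) {
    case ProtocolFileFormat::ORC:
      return ReaderFileFormat::ORC;
    case ProtocolFileFormat::PARQUET:
      return ReaderFileFormat::PARQUET;
    case ProtocolFileFormat::PUFFIN:
      if (content != ProtocolFileContent::POSITION_DELETES) {
        throw std::invalid_argument(
            "PUFFIN is only expected for deletion vectors");
      }
      // Placeholder value: the deletion vector reader does not use it.
      return ReaderFileFormat::DWRF;
    default:
      throw std::invalid_argument("Unsupported file format");
  }
}
```

With this shape, a PUFFIN file attached to data or equality-delete content fails loudly instead of being handed to a DWRF reader.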
## Individual Comments
### Comment 1
<location path="presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergV3.java" line_range="317-326" />
<code_context>
+ try {
+ computeActual("SELECT * FROM " + tableName);
+ }
+ catch (RuntimeException e) {
+ // Verify the error is NOT the old "PUFFIN not supported" rejection.
+ // Other failures (e.g., fake .puffin file not on disk) are acceptable.
+ assertFalse(
+ e.getMessage().contains("Iceberg deletion vectors") && e.getMessage().contains("not supported"),
+ "PUFFIN deletion vectors should be accepted, not rejected: " + e.getMessage());
</code_context>
<issue_to_address>
**suggestion:** Defensive handling of a potential null exception message to avoid NPEs in the test itself.
This assertion calls `e.getMessage().contains(...)` twice; if the exception message is null, the test will throw `NullPointerException` instead of cleanly asserting on PUFFIN support. You can defensively handle this by normalizing the message first, e.g.
```java
String message = String.valueOf(e.getMessage());
assertFalse(
message.contains("Iceberg deletion vectors") && message.contains("not supported"),
"PUFFIN deletion vectors should be accepted, not rejected: " + message);
```
so the test remains stable even when the exception message is null.
```suggestion
try {
computeActual("SELECT * FROM " + tableName);
}
catch (RuntimeException e) {
// Verify the error is NOT the old "PUFFIN not supported" rejection.
// Other failures (e.g., fake .puffin file not on disk) are acceptable.
String message = String.valueOf(e.getMessage());
assertFalse(
message.contains("Iceberg deletion vectors") && message.contains("not supported"),
"PUFFIN deletion vectors should be accepted, not rejected: " + message);
}
```
</issue_to_address>
…ocol and connector layer (prestodb#27394)
Summary:
This is the C++ counterpart to the Java PUFFIN support diff. It wires the PUFFIN file format through the Prestissimo protocol and connector conversion layer so that Iceberg V3 deletion vector files can be deserialized and handled by native workers.
Changes:
1. Adds PUFFIN to the C++ protocol FileFormat enum and its JSON serialization table in presto_protocol_iceberg.{h,cpp}.
2. Handles PUFFIN in toVeloxFileFormat() in IcebergPrestoToVeloxConnector.cpp, mapping it to DWRF as a placeholder since DeletionVectorReader reads raw binary and does not use the DWRF/Parquet reader infrastructure.
== RELEASE NOTES ==
General Changes
* Upgrade Apache Iceberg library from 1.10.0 to 1.10.1.
Hive Connector Changes
* Add Iceberg V3 deletion vector (DV) support using Puffin-encoded roaring bitmaps, including a DV reader, writer, page sink, and compaction procedure.
* Add Iceberg equality delete file reader with sequence number conflict resolution per the Iceberg V2+ spec: equality deletes skip when deleteFileSeqNum <= dataFileSeqNum; positional deletes and DVs skip when deleteFileSeqNum < dataFileSeqNum; sequence number 0 (V1 legacy) never skips.
* Wire dataSequenceNumber through the Presto protocol layer (Java → C++) to enable server-side sequence number conflict resolution for all delete file types.
* Add PUFFIN file format support for deletion vector discovery, enabling the coordinator to locate DV files during split creation.
* Add Iceberg V3 deletion vector write path with DV page sink and rewrite_delete_files compaction procedure for DV maintenance.
* Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3 tables.
* Add Variant type support for Iceberg V3, enabling semi-structured data columns in Iceberg tables.
* Eagerly collect delete files during split creation with improved logging for easier debugging of Iceberg delete file resolution.
* Improve IcebergSplitReader error handling and fix test file handle leaks.
* Add end-to-end integration tests for Iceberg V3 covering snapshot lifecycle (INSERT, DELETE with equality/positional/DV deletes, UPDATE, MERGE, time-travel) and all 99 TPC-DS queries.
Differential Revision: D97531555
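The sequence number rule quoted in the release notes can be read as a small predicate. The sketch below is one interpretation, not the reader's actual code: the names are hypothetical, and it assumes "sequence number 0 never skips" applies when either the delete file or the data file carries 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of delete files for this sketch only.
enum class DeleteKind { kEquality, kPositional, kDeletionVector };

// Returns true when the delete file can be skipped for the given data file,
// following the rule stated in the release notes.
bool shouldSkipDeleteFile(
    DeleteKind kind,
    int64_t deleteFileSeqNum,
    int64_t dataFileSeqNum) {
  // Assumption: a sequence number of 0 on either side marks V1 legacy
  // metadata, so the delete file is always applied.
  if (deleteFileSeqNum == 0 || dataFileSeqNum == 0) {
    return false;
  }
  if (kind == DeleteKind::kEquality) {
    // Equality deletes skip when deleteFileSeqNum <= dataFileSeqNum.
    return deleteFileSeqNum <= dataFileSeqNum;
  }
  // Positional deletes and DVs skip when deleteFileSeqNum < dataFileSeqNum.
  return deleteFileSeqNum < dataFileSeqNum;
}
```

The only difference between the two branches is the boundary case: at equal sequence numbers an equality delete is skipped, while a positional delete or DV is still applied.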
5eb17f1 to 6424418
6424418 to d405441
d405441 to c73cb15
c73cb15 to 84905a2
…tensibility
Summary:
- Reformat FileContent enum in presto_protocol_iceberg.h from single-line to multi-line for better readability and future extension.
- Add blank line for visual separation before infoColumns initialization.
Protocol files are auto-generated from Java sources via chevron. The manual edits here mirror what the generator would produce once the Java changes are landed and the protocol is regenerated.
Differential Revision: D97531548
…equality delete conflict resolution
Summary:
Wire the dataSequenceNumber field from the Java Presto protocol to the C++ Velox connector layer, enabling server-side sequence number conflict resolution for equality delete files.
Changes:
- Add dataSequenceNumber field to IcebergSplit protocol (Java + C++)
- Parse dataSequenceNumber in IcebergPrestoToVeloxConnector and pass it through HiveIcebergSplit to IcebergSplitReader
- Add const qualifiers to local variables for code clarity
Differential Revision: D97531547
…discovery
Summary:
Iceberg V3 introduces deletion vectors stored as blobs inside Puffin files. Previously, the coordinator's IcebergSplitSource rejected PUFFIN-format delete files with a NOT_SUPPORTED error, preventing V3 deletion vectors from being discovered and sent to workers.
This diff:
1. Adds PUFFIN to the FileFormat enum (both presto-trunk and presto-facebook-trunk) so fromIcebergFileFormat() can convert Iceberg's PUFFIN format to Presto's FileFormat.PUFFIN.
2. Removes the PUFFIN rejection check in presto-trunk's IcebergSplitSource.toIcebergSplit(), allowing deletion vector files to flow through to workers.
3. Updates TestIcebergV3 to verify PUFFIN files are accepted rather than rejected at split enumeration time.
The C++ worker-side changes (protocol enum + connector conversion) will follow in a separate diff.
Differential Revision: D97531557
84905a2 to 3c83724
…nd connector layer
Summary:
This is the C++ counterpart to the Java PUFFIN support diff. It wires
the PUFFIN file format through the Prestissimo protocol and connector
conversion layer so that Iceberg V3 deletion vector files can be
deserialized and handled by native workers.
Changes:
1. Adds PUFFIN to the C++ protocol FileFormat enum and its JSON
serialization table in presto_protocol_iceberg.{h,cpp}.
2. Handles PUFFIN in toVeloxFileFormat() in
IcebergPrestoToVeloxConnector.cpp, mapping it to DWRF as a
placeholder since DeletionVectorReader reads raw binary and
does not use the DWRF/Parquet reader infrastructure.
Differential Revision: D97531555
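As a rough illustration of change 1, the sketch below shows an enum with a name table that includes PUFFIN. It is a simplified stand-in: the generated protocol code uses its own JSON serialization table, and the `FileFormatName`, `toName`, and `fromName` names here are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Enum values as listed in the reviewer's guide class diagram.
enum class FileFormat { ORC, PARQUET, AVRO, METADATA, PUFFIN };

// Hypothetical name table standing in for the JSON serialization table.
struct FileFormatName {
  FileFormat format;
  const char* name;
};

static const FileFormatName kFileFormatNames[] = {
    {FileFormat::ORC, "ORC"},
    {FileFormat::PARQUET, "PARQUET"},
    {FileFormat::AVRO, "AVRO"},
    {FileFormat::METADATA, "METADATA"},
    {FileFormat::PUFFIN, "PUFFIN"},
};

// Enum value to wire name.
std::string toName(FileFormat format) {
  for (const auto& entry : kFileFormatNames) {
    if (entry.format == format) {
      return entry.name;
    }
  }
  throw std::invalid_argument("Unknown FileFormat value");
}

// Wire name to enum value; rejects names missing from the table.
FileFormat fromName(const std::string& name) {
  for (const auto& entry : kFileFormatNames) {
    if (name == entry.name) {
      return entry.format;
    }
  }
  throw std::invalid_argument("Unknown FileFormat name: " + name);
}
```

Without the PUFFIN row, `fromName("PUFFIN")` would throw, which mirrors why a worker could not deserialize a split carrying a PUFFIN delete file before the enum and table were extended.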
3c83724 to 845bd69