As the original author of the Velox Iceberg code, I’d strongly recommend not porting that implementation directly into Bolt.

The current Iceberg implementation in Velox lives inside the Hive connector, but that was never the intended design. It was a compromise at the time to get the code merged, and in hindsight it’s a clear anti-pattern. Over time, the situation has only worsened as more Hive-specific assumptions were added and shared across both Hive and Iceberg paths.

There are several concrete issues with the current design:
kRowIndex and kRowId do not have the same meaning in Iceberg. Despite that, Iceberg is forced into Hive’s abstraction, which leads to:
This is fundamentally limiting, especially since Iceberg has many connector-specific behaviors and configs.

I’ve been working on a plan to introduce Iceberg the right way, with proper separation and extensibility. Please see #107. The first step is refactoring the connector architecture to remove Hive coupling. This work is already in progress with @ZacBlanco. Relevant PRs: #156

I strongly recommend not merging a direct port of the Velox Iceberg code at this stage. If we do, it would:
Instead, once the connector refactor is complete, we can:
This will save us substantial rework and give us a much cleaner foundation going forward. If you want, I'm willing to move quickly on the second step: extracting the common code paths from Hive and introducing the Iceberg connector. I already had that code last year.
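To make the proposed separation concrete, here is a minimal sketch of the shape it could take: shared, format-agnostic code sits in a common base, and Hive and Iceberg are sibling connectors that each define their own row-identity semantics. Every name below (`FileConnectorBase`, `HiveConnector`, `IcebergConnector`, `ConnectorRegistry`, `rowIdSemantics`) is hypothetical and chosen for illustration; none of it is the Velox or Bolt API, and the semantics strings are placeholders rather than statements about either format.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical sketch only: these types are NOT the Velox or Bolt API.

// Format-agnostic code lives in a common base instead of inside Hive.
class FileConnectorBase {
 public:
  virtual ~FileConnectorBase() = default;

  virtual std::string name() const = 0;

  // Each connector states what its row-identity column means, so Iceberg
  // is not forced to inherit Hive's interpretation.
  virtual std::string rowIdSemantics() const = 0;

  // Shared code path that both connectors reuse unchanged.
  std::string describe() const {
    return name() + ": " + rowIdSemantics();
  }
};

class HiveConnector final : public FileConnectorBase {
 public:
  std::string name() const override { return "hive"; }
  std::string rowIdSemantics() const override {
    return "row id defined by the Hive connector";  // placeholder text
  }
};

// A sibling of HiveConnector, not a subclass of it.
class IcebergConnector final : public FileConnectorBase {
 public:
  std::string name() const override { return "iceberg"; }
  std::string rowIdSemantics() const override {
    return "row id defined by the Iceberg connector";  // placeholder text
  }
};

// Minimal registry: connectors are looked up by name, independently.
class ConnectorRegistry {
 public:
  void add(std::unique_ptr<FileConnectorBase> connector) {
    const std::string key = connector->name();
    const bool inserted =
        connectors_.emplace(key, std::move(connector)).second;
    assert(inserted && "connector registered twice");
    (void)inserted;  // silence unused warning when NDEBUG is defined
  }

  // Returns nullptr when no connector with that name is registered.
  const FileConnectorBase* find(const std::string& name) const {
    auto it = connectors_.find(name);
    return it == connectors_.end() ? nullptr : it->second.get();
  }

 private:
  std::map<std::string, std::unique_ptr<FileConnectorBase>> connectors_;
};
```

With this layout, registering `HiveConnector` and `IcebergConnector` side by side lets each one evolve its own behaviors and configs, while anything truly common stays in the base class.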
Thanks for the very detailed explanation! I fully agree with your analysis and suggestion. I should avoid directly porting the existing Iceberg implementation and instead wait for the connector refactoring to introduce Iceberg properly as a standalone connector. Really appreciate your guidance!
What problem does this PR solve?
Issue Number: close #191
Type of Change
Description
Support the Iceberg connector and Iceberg functions.
Performance Impact
No Impact: This change does not affect the critical path (e.g., build system, doc, error handling).
Positive Impact: I have run benchmarks.
Negative Impact: Explained below (e.g., trade-off for correctness).
Release Note
Please describe the changes in this PR
Release Note:
Checklist (For Author)
Breaking Changes
No
Yes (Description: ...)