Use native implementation of Iceberg delete filters #13219
electrum merged 2 commits into trinodb:master
Works well in my scenario, and it is faster than #13112.
@electrum can you please extract some of the easy prep commits to a separate PR to have them merged quickly?
@lhofhansl thanks for the feedback! That's great to hear.
djsagain left a comment
Lots of very nice cleanups in this PR!
I didn't take the time to figure out Iceberg's StructLikeSet and StructProjection, but the rest looked right to me.
f8db58e to 2ed07b9
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java
This creates intermediate nodes. It would be more efficient to have a composite flat "and" predicate with a list of sub-predicates
I considered that originally, but I'm not sure it would be more efficient. We expect to only have one or a few filters, so with inlining, the JVM might be able to de-virtualize and flatten the calls into a single expression. The flat list seems to have more overhead for the common cases.
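To make the trade-off in this thread concrete, here is a minimal sketch of the two shapes being compared. It assumes the filters can be modeled as `java.util.function.LongPredicate` over row positions; the class and method names are hypothetical and not taken from the PR.

```java
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical sketch: models each delete filter as a LongPredicate that
// returns true when the row at the given position should be kept.
final class DeletePredicateSketch
{
    private DeletePredicateSketch() {}

    // Nested form: every and() call wraps the previous predicate in a new node.
    static LongPredicate nested(List<LongPredicate> filters)
    {
        LongPredicate result = position -> true;
        for (LongPredicate filter : filters) {
            result = result.and(filter);
        }
        return result;
    }

    // Flat form: a single composite predicate that loops over its sub-predicates.
    static LongPredicate flat(List<LongPredicate> filters)
    {
        List<LongPredicate> copy = List.copyOf(filters);
        return position -> {
            for (LongPredicate filter : copy) {
                if (!filter.test(position)) {
                    return false;
                }
            }
            return true;
        };
    }
}
```

Both forms give the same answers; they differ only in call structure. Which one the JIT handles better for one or two filters would have to be measured.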
We can optimize this later if it becomes an issue. An obvious one is to combine all the deletes into a single bitmap. For equality, we could use method handle combinators to combine the predicates, and possibly also to "compile" the equality comparison expression.
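As a rough illustration of the method handle combinator idea, the sketch below joins two predicates with a short-circuit "and" using `MethodHandles.guardWithTest`. The two static filters and all names are hypothetical stand-ins, and predicates are assumed to have the shape `(long) -> boolean`; it is not how the PR implements anything.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical sketch of combining (long) -> boolean predicates with method handles.
final class MethodHandleAndSketch
{
    private static final MethodType PREDICATE_TYPE = MethodType.methodType(boolean.class, long.class);

    private MethodHandleAndSketch() {}

    // Stand-in filters, invented for this example.
    static boolean notThree(long position)
    {
        return position != 3;
    }

    static boolean isEven(long position)
    {
        return position % 2 == 0;
    }

    // Looks up one of the static stand-in filters by name.
    static MethodHandle handleFor(String name)
    {
        try {
            return MethodHandles.lookup().findStatic(MethodHandleAndSketch.class, name, PREDICATE_TYPE);
        }
        catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Short-circuit AND: if first(position) then second(position) else false.
    static MethodHandle and(MethodHandle first, MethodHandle second)
    {
        MethodHandle alwaysFalse = MethodHandles.dropArguments(
                MethodHandles.constant(boolean.class, false), 0, long.class);
        return MethodHandles.guardWithTest(first, second, alwaysFalse);
    }

    // Invokes a combined predicate, wrapping the Throwable declared by invokeExact.
    static boolean test(MethodHandle predicate, long position)
    {
        try {
            return (boolean) predicate.invokeExact(position);
        }
        catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

For example, `MethodHandleAndSketch.and(handleFor("notThree"), handleFor("isEven"))` keeps position 4 and rejects positions 3 and 5.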
I combined position deletes into a single bitmap.
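A minimal sketch of what merging position deletes into one bitmap can look like. It uses `java.util.BitSet` as a simplified stand-in so the example stays self-contained, which limits positions to the `int` range; the thread refers to RoaringBitmap, and the class and method names here are hypothetical.

```java
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch: unions deleted row positions from several delete files
// into one bitmap, so each row needs a single lookup instead of one per file.
final class MergedPositionDeletesSketch
{
    private MergedPositionDeletesSketch() {}

    // Each long[] holds the deleted positions read from one delete file.
    static BitSet merge(List<long[]> deletedPositionsPerFile)
    {
        BitSet deleted = new BitSet();
        for (long[] positions : deletedPositionsPerFile) {
            for (long position : positions) {
                // BitSet is int-indexed; toIntExact fails loudly on larger positions.
                deleted.set(Math.toIntExact(position));
            }
        }
        return deleted;
    }

    // A row is live when its position is not marked in the merged bitmap.
    static boolean isLive(BitSet deleted, long position)
    {
        return position > Integer.MAX_VALUE || !deleted.get((int) position);
    }
}
```

With this shape, a position listed in several delete files is stored once, and the per-row check does not grow with the number of delete files.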
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/delete/PositionDeleteFilter.java
Agreed, though we aren't doing that today, so this isn't a regression. We can do that as a follow-up. I didn't see an easy way to get memory usage from RoaringBitmap. We could serialize the bitmap and use the serialized size as an estimate, or do a simple estimate based on the cardinality (multiply by some constant factor).
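The two estimation options mentioned here could be sketched roughly as follows. `java.util.BitSet` again stands in for the real bitmap so the example is self-contained, and both constants are assumptions picked only for illustration, not measured values.

```java
import java.util.BitSet;

// Hypothetical sketch of two rough memory estimates for a delete bitmap.
final class BitmapMemoryEstimateSketch
{
    // Assumed constants for illustration only; real values would need measuring.
    static final long ASSUMED_BYTES_PER_POSITION = 2;
    static final long ASSUMED_FIXED_OVERHEAD_BYTES = 64;

    private BitmapMemoryEstimateSketch() {}

    // Option 1: multiply the number of deleted positions by a constant factor.
    static long estimateFromCardinality(long cardinality)
    {
        if (cardinality < 0) {
            throw new IllegalArgumentException("cardinality is negative");
        }
        return Math.addExact(
                ASSUMED_FIXED_OVERHEAD_BYTES,
                Math.multiplyExact(cardinality, ASSUMED_BYTES_PER_POSITION));
    }

    // Option 2: serialize the bitmap and use the byte count as the estimate.
    static long estimateFromSerializedSize(BitSet deleted)
    {
        return deleted.toByteArray().length;
    }
}
```

The cardinality-based estimate is cheap but only as good as the chosen factor, while the serialized size tracks the actual contents at the cost of an extra serialization pass.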
Description
Improve performance of Iceberg queries for tables with updated or deleted rows.
Related issues, pull requests, and links
Documentation
(x) No documentation is needed.
Release notes
(x) Release notes entries required with the following suggested text: