Closed
Description
I ran a chbenchmark test of Iceberg on Trino and found that merge-on-read (MOR) performance is very poor when there are many delete files. The data scale is 10 warehouses. The average query duration is under 10 seconds when there are no delete files, but after I added some delete files to every table, some queries took over one hour.
Reasons:
- #5195 (Make predicates of delete only initialize once): In Trino, every page calls DeleteFilter#filter, and every call to DeleteFilter#filter initializes the delete files again.
- #5244 (Add StructLikeWrapperFactory to generate StructLikeWrapper) and #5242 (Add InternalRecordWrapperFactory to generate InternalRecordWrapper): We found that the cost of creating StructLikeWrapper and InternalRecordWrapper is high.
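To illustrate the first point, here is a minimal sketch of the "initialize once" idea. It is not the Iceberg `DeleteFilter` API: the class `LazyDeleteFilterSketch`, the `deleteLoader` supplier, and the use of `Long` keys are all hypothetical stand-ins. It only shows that caching the loaded deletes means the expensive load runs once, no matter how many pages are filtered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-in for a delete filter; NOT the real Iceberg DeleteFilter. */
public class LazyDeleteFilterSketch {
    // Assumption: simulates the expensive step of reading delete files.
    private final Supplier<Set<Long>> deleteLoader;
    // Cached after the first load so later pages reuse it.
    private Set<Long> deletedKeys;
    // Counts how many times the loader actually ran.
    private int loadCount = 0;

    public LazyDeleteFilterSketch(Supplier<Set<Long>> deleteLoader) {
        this.deleteLoader = deleteLoader;
    }

    // Loads the deleted keys on first use only (not thread-safe in this sketch).
    private Set<Long> deletedKeys() {
        if (deletedKeys == null) {
            deletedKeys = deleteLoader.get();
            loadCount++;
        }
        return deletedKeys;
    }

    /** Called once per page; returns the rows that are not deleted. */
    public List<Long> filter(List<Long> page) {
        Set<Long> deleted = deletedKeys();
        List<Long> kept = new ArrayList<>();
        for (Long key : page) {
            if (!deleted.contains(key)) {
                kept.add(key);
            }
        }
        return kept;
    }

    public int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LazyDeleteFilterSketch filter =
            new LazyDeleteFilterSketch(() -> new HashSet<>(Arrays.asList(2L, 4L)));
        System.out.println(filter.filter(Arrays.asList(1L, 2L, 3L))); // prints [1, 3]
        System.out.println(filter.filter(Arrays.asList(4L, 5L)));     // prints [5]
        System.out.println(filter.loadCount());                       // prints 1
    }
}
```

A factory that is created once and reused to produce wrappers follows the same reasoning: the per-call setup cost is paid once instead of once per row or page.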
This is the flame graph:
Query performance improved after we made these optimizations. For example, the query "select count(*) from stock" took 8 minutes before the optimizations and only 20 seconds after.
