Closed as not planned
Description
In upsert/CDC workloads we usually end up with a lot of pos-delete and eq-delete files. When we read or rewrite data from a v2 table, DeleteFilter opens all pos-delete and eq-delete files referenced by each data file to construct the posDeleteSet and eqDeleteSet.
Currently, all of that work is handled by a single thread for each CombinedScanTask, and the delete files are read serially: Iceberg cannot start reading a delete file until the previous one has been fully read. As a result, DeleteFilter spends a lot of time opening and reading delete files.
I think DeleteFilter should read delete files in parallel when constructing the posDeleteSet and eqDeleteSet, to speed up reads of v2 tables.
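
To make the proposal concrete, here is a minimal sketch of the idea in plain Java. It is not Iceberg code: `ParallelDeleteLoadSketch`, `loadPositions`, and the `readDeleteFile` function are hypothetical names, delete files are stood in for by strings, and each "read" just returns a list of deleted row positions. It only illustrates submitting one read task per delete file to a shared pool and merging the results on the calling thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelDeleteLoadSketch {

  // Hypothetical helper (not an Iceberg API): reads every delete file on the
  // given pool and merges the deleted row positions into a single set.
  public static Set<Long> loadPositions(
      List<String> deleteFiles,
      Function<String, List<Long>> readDeleteFile,
      ExecutorService pool) throws InterruptedException, ExecutionException {

    // Submit one task per delete file so the reads can overlap.
    List<Future<List<Long>>> pending = new ArrayList<>();
    for (String file : deleteFiles) {
      Callable<List<Long>> task = () -> readDeleteFile.apply(file);
      pending.add(pool.submit(task));
    }

    // Merge on the calling thread, so the result set needs no synchronization.
    Set<Long> positions = new HashSet<>();
    for (Future<List<Long>> result : pending) {
      positions.addAll(result.get());
    }
    return positions;
  }

  public static void main(String[] args) throws Exception {
    // Fake "delete files": file name -> deleted row positions (assumed data).
    Map<String, List<Long>> fakeFiles = Map.of(
        "pos-delete-1", List.of(0L, 5L),
        "pos-delete-2", List.of(5L, 9L));

    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      Set<Long> deleted = loadPositions(
          new ArrayList<>(fakeFiles.keySet()), fakeFiles::get, pool);
      System.out.println("deleted positions: " + deleted);
    } finally {
      pool.shutdown();
    }
  }
}
```

In a real change the pool would presumably be shared across tasks rather than created per read, and the same pattern would apply to building the eqDeleteSet; both points are assumptions about how this could be wired in, not a description of existing behavior.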