Avoid checking isSplittable for files smaller than the split max size #4540

Merged
electrum merged 1 commit into trinodb:master from pettyjamesm:minimum-splittable-size on Jul 27, 2020
Conversation

@pettyjamesm
Member

Cross contribution of prestodb/presto#14877

For some input formats, the isSplittable check is non-trivial and can add a significant amount of time to split generation. This change lets files smaller than the max split size skip that check and treats them as unsplittable, since they are already within the split target range.
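As a rough illustration of the idea (not the code in this PR), the short-circuit can be sketched as below. The class name `SplittableCheckSketch`, the method signature, and the use of a `BooleanSupplier` to stand in for the format-specific isSplittable call are all hypothetical, and the strict `<` comparison at the boundary is an assumption based on the wording "smaller than".

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: names and signatures are illustrative only,
// not the actual Trino split-loading code.
public class SplittableCheckSketch {
    /**
     * Decides whether a file should be treated as splittable.
     *
     * @param fileSizeBytes     size of the file being considered
     * @param maxSplitSizeBytes split size threshold (assumed to come from configuration)
     * @param expensiveCheck    stand-in for the input format's isSplittable check
     */
    public static boolean isSplittable(long fileSizeBytes, long maxSplitSizeBytes, BooleanSupplier expensiveCheck) {
        // Files below the threshold already fit in one split, so the
        // potentially costly format check is skipped entirely.
        if (fileSizeBytes < maxSplitSizeBytes) {
            return false;
        }
        // Only larger files pay for the format-specific check.
        return expensiveCheck.getAsBoolean();
    }

    public static void main(String[] args) {
        AtomicInteger checkCalls = new AtomicInteger();
        // Counts how often the "expensive" check is actually invoked.
        BooleanSupplier countingCheck = () -> {
            checkCalls.incrementAndGet();
            return true;
        };

        long maxSplitSize = 64L * 1024 * 1024; // 64 MiB, an arbitrary example value

        boolean small = isSplittable(4L * 1024, maxSplitSize, countingCheck);
        boolean large = isSplittable(512L * 1024 * 1024, maxSplitSize, countingCheck);

        System.out.println("small file splittable: " + small);
        System.out.println("large file splittable: " + large);
        System.out.println("expensive checks performed: " + checkCalls.get());
    }
}
```

With many very small files, the saving in this sketch comes from never invoking the supplier for them, so the cost of the check scales with the number of large files rather than the total file count.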

@cla-bot cla-bot bot added the cla-signed label Jul 22, 2020
@pettyjamesm pettyjamesm force-pushed the minimum-splittable-size branch 2 times, most recently from 1173622 to 385f927 Compare July 22, 2020 20:03
For some input formats, the isSplittable check is non-trivial and can
add a significant amount of time to split generation when handling a
large number of very small files. This change allows files smaller
than the max initial split size to avoid that check and considers them
unsplittable instead.
@electrum
Member

Thanks!

@pettyjamesm pettyjamesm deleted the minimum-splittable-size branch March 1, 2022 17:31
3 participants