
Fixes for file renaming and manifest features#15536

Merged: highker merged 4 commits into prestodb:master from NikhilCollooru:mFix on Jan 4, 2021

NikhilCollooru (Contributor) commented Dec 16, 2020

highker commented Dec 22, 2020

> Yes correct. We append 0s in front.

If we append 0s for bucketed tables, why would sorting be a problem? Or, even for bucketed tables, will we no longer append 0s once file renaming is enabled?

NikhilCollooru (Contributor, Author) commented

> > Yes correct. We append 0s in front.
>
> If we append 0s for bucketed tables, why would sorting be a problem? Or, even for bucketed tables, will we no longer append 0s once file renaming is enabled?

In the normal scenario, file names for bucketed tables have the form "file_prefix + padded_bucket_number":
https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/HiveWriterFactory.java#L635
Such a file name does not compress efficiently because of the file prefix.

In the file renaming case, we don't pad the file name with leading 0s because we don't know how many files the partition contains, and hence we don't know how many 0s to pad with.
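
To illustrate the sorting concern raised in the question, here is a minimal sketch of how zero-padding affects lexicographic order. It is not the Presto implementation: the class `FileNameSortSketch`, its methods, the `part_` prefix, and the padding widths are all hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not taken from HiveWriterFactory.
final class FileNameSortSketch {
    // Left-pads the number with zeros to a fixed width (the width is an assumption).
    static String paddedName(String prefix, int number, int width) {
        return prefix + String.format(Locale.ROOT, "%0" + width + "d", number);
    }

    // Builds `count` names (padded or not) and sorts them as plain strings,
    // which is how a lexicographic listing would order them.
    static List<String> sortedNames(int count, boolean padded, int width) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add(padded ? paddedName("", i, width) : Integer.toString(i));
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames(12, true, 2));  // numeric order is preserved
        System.out.println(sortedNames(12, false, 2)); // "10" and "11" sort before "2"
    }
}
```

With padding, string order matches numeric order, but choosing the width requires knowing an upper bound on the number of files. Without that bound, as in the file renaming case described above, names stay unpadded and a plain string sort no longer matches numeric order.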

highker self-requested a review December 22, 2020 02:46
highker merged commit cebe0c2 into prestodb:master Jan 4, 2021
NikhilCollooru deleted the mFix branch January 4, 2021 20:04