Fix double slash issue in Iceberg metadata location #14389
ebyhr merged 2 commits into trinodb:master
Can you link the PRs that added this issue in 393 and which one fixed it in 395?
Force-pushed from 4287d0d to 4156863.
This is a +1,580 −814 PR. Can we have a link to the specific lines of code in 393 or 394 that were causing this?
is it possible that the metadata path really contains two slashes?
for example, what if table was created with location = 's3://bucket/my_table/////'?
also, is it possible that //metadata/ substring occurs somewhere else within the location?
for example, what if my bucket is called metadata?
or, i have a table created with location = 's3://bucket/i/love/slashes/and//////metadata/my_schema/my_table'?
Actually, that was a concern when I was writing this method. I fixed the logic to replace only when the location ends with //metadata/{json file name}. Please take another look.
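For readers following the thread, here is a minimal sketch of the "replace only when the location ends with the broken suffix" idea. It is not the code merged in this PR: the class name MetadataLocationFix is hypothetical, and deriving fileName from the last slash is an assumption made to keep the example self-contained. The method name and the two suffix variables follow the names used elsewhere in this conversation.

```java
final class MetadataLocationFix
{
    private MetadataLocationFix() {}

    // Sketch only: rewrites ".../table//metadata/<file>" to ".../table/metadata/<file>".
    // Any other location is returned unchanged, including one where "//metadata/"
    // appears earlier in the path (for example, a bucket named "metadata").
    static String fixBrokenMetadataLocation(String location)
    {
        // Assumption: the metadata file name is everything after the last slash.
        String fileName = location.substring(location.lastIndexOf('/') + 1);
        String correctSuffix = "/metadata/" + fileName;
        String brokenSuffix = "//metadata/" + fileName;
        if (location.endsWith(brokenSuffix)) {
            // Cut off the broken suffix and append the correct one.
            return location.substring(0, location.length() - brokenSuffix.length()) + correctSuffix;
        }
        return location;
    }
}
```

With this shape, a location such as s3://metadata/my_table/metadata/v1.metadata.json is left alone, because the double slash after the scheme is not part of the suffix being checked.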
add
// Simulate corrupted metadata location as Trino 393-394 was doing
"Remove redundant slash" sounds a bit like "remove trailing slash" -- some operation that you do when manipulating paths.
Let's have the name express the fact this is a workaround for bad writers, not something you'd normally have to do.
fixBrokenMetadataLocation
Force-pushed from 4156863 to 9f642cb.
it's inconceivable that location contains multiple occurrences of brokenSuffix, but why would we burden the reader with having to think about that?
Use substring (or replaceFirst(Pattern.quote(...) + "$", Matcher.quoteReplacement(...)))
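To make the two suggested alternatives concrete, here is a hedged sketch of both. The class and method names (SuffixReplacement, viaSubstring, viaReplaceFirst) are hypothetical and exist only for illustration. Both variants touch only the occurrence at the very end of the string, so the reader does not need to reason about earlier occurrences of brokenSuffix the way they would with String.replace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SuffixReplacement
{
    private SuffixReplacement() {}

    // Variant 1: plain substring arithmetic, no regular expressions involved.
    static String viaSubstring(String location, String brokenSuffix, String correctSuffix)
    {
        if (!location.endsWith(brokenSuffix)) {
            return location;
        }
        return location.substring(0, location.length() - brokenSuffix.length()) + correctSuffix;
    }

    // Variant 2: regex anchored with "$". Pattern.quote makes the suffix a literal,
    // and Matcher.quoteReplacement protects any '$' or '\' in the replacement.
    static String viaReplaceFirst(String location, String brokenSuffix, String correctSuffix)
    {
        return location.replaceFirst(
                Pattern.quote(brokenSuffix) + "$",
                Matcher.quoteReplacement(correctSuffix));
    }
}
```

One caveat on the regex variant: in Java, "$" also matches just before a trailing line terminator, which should not matter for storage paths but is one reason the substring form is easier to reason about.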
Force-pushed from 9f642cb to bf0d2a7.
CI hit #14568
String correctSuffix = "/metadata/" + fileName;
String brokenSuffix = "//metadata/" + fileName;
Could this have happened with files in the //data/ directory?
I temporarily reverted #13984 and confirmed that the data directory is fine. We may need additional handling if other query engines generate such directories, though. Let's keep this as-is, since we haven't received any reports about the data directory so far.
Description
Fixes #14299
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: