Filter Iceberg splits based on $path column predicates#13012
ebyhr merged 2 commits into trinodb:master
CI hit #12950
This should make only $path column enforced, and if it is any other metadata column, we should still consider this unsupported.
That doesn't need to return Optional.
It's either Domain.none(), Domain.all(), or some specific Domain.
I don't think hadoopPath is appropriate here. It appends some #... suffix in the case of s3://,
but that suffix is not (I hope) returned when doing SELECT "$path" FROM t
Can we have some test coverage with S3? Something like:
- create a table with two files
- get "$path" for one of the files
- select where "$path" = selected_path
- verify we get data from that one file only
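The scenario in the list above can be sketched without a real S3 bucket by modelling each data file as a path plus the rows it holds. Everything in this block is hypothetical (the class name, the paths, the `scan` helper); it stands in for a real query runner and only illustrates the shape of the assertion, not the actual Trino test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory model: file path -> rows stored in that file.
final class PathPredicateScenario
{
    private PathPredicateScenario() {}

    // Step 1: "create a table with two files" (paths are made up for illustration).
    static Map<String, List<Integer>> tableWithTwoFiles()
    {
        Map<String, List<Integer>> files = new LinkedHashMap<>();
        files.put("s3://bucket/table/data/file-a.parquet", List.of(1, 2));
        files.put("s3://bucket/table/data/file-b.parquet", List.of(3, 4));
        return files;
    }

    // Model of a scan with a "$path" predicate: files whose path fails the
    // predicate are skipped entirely, so none of their rows are returned.
    static List<Integer> scan(Map<String, List<Integer>> files, Predicate<String> pathPredicate)
    {
        List<Integer> rows = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> file : files.entrySet()) {
            if (pathPredicate.test(file.getKey())) {
                rows.addAll(file.getValue());
            }
        }
        return rows;
    }
}
```

A test following the list would pick one key of `tableWithTwoFiles()` as the selected path, call `scan(files, selectedPath::equals)`, and assert that only that file's rows come back.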
I think we should use hadoopPath() unless we fix the $path result.
IcebergSplitSource
    private IcebergSplit toIcebergSplit(FileScanTask task)
    {
        return new IcebergSplit(
                hadoopPath(task.file().path().toString()),
↓
IcebergPageSourceProvider
    ReaderPageSource dataPageSource = createDataPageSource(
            session,
            hdfsContext,
            new Path(split.getPath()),
            ...
    else if (column.isPathColumn()) {
        columnAdaptations.add(ColumnAdaptation.constantColumn(nativeValueToBlock(FILE_PATH.getType(), utf8Slice(path.toString()))));
    }
It appends # when the bucket name starts with /, like s3:///bucket (not always), as you already know (#11998).
I tried to create such a bucket in S3 and MinIO, but it fails with an invalid bucket name error.
~ aws s3api create-bucket --bucket /ebyhr-test
Parameter validation failed:
Invalid bucket name "/ebyhr-test": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
The #... suffix is an internal thing, not to be exposed to users.
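To make the distinction between the internal path and the user-visible `$path` concrete, here is a minimal sketch. It is not the real hadoopPath() implementation: `normalize` is an assumed, simplified stand-in for the file system's path normalization (it only collapses repeated slashes), and the class and method names are hypothetical. It reuses only what the thread states: the suffix is added for s3:// paths when the normalized path differs from the original, and the suffix carries an encoded form of the original.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical model of an "internal" path that carries the original path as a '#' suffix.
final class InternalPathSketch
{
    private InternalPathSketch() {}

    // ASSUMPTION: stand-in for real path normalization; collapses repeated
    // slashes that appear after the "scheme://" prefix.
    static String normalize(String path)
    {
        int schemeEnd = path.indexOf("://");
        if (schemeEnd < 0) {
            return path.replaceAll("/{2,}", "/");
        }
        String prefix = path.substring(0, schemeEnd + 3);
        return prefix + path.substring(schemeEnd + 3).replaceAll("/{2,}", "/");
    }

    // Internal form: when normalization changed an s3:// path, keep the
    // original as a URL-encoded '#' suffix so it can still be recovered.
    static String internalPath(String path)
    {
        String normalized = normalize(path);
        if (path.startsWith("s3://") && !path.equals(normalized)) {
            return normalized + "#" + URLEncoder.encode(path, StandardCharsets.UTF_8);
        }
        return normalized;
    }
}
```

In this model, a `$path` value shown to users, or compared against a predicate, would be the original string rather than the result of `internalPath`. A path that merely contains `%` is left unchanged by this simplified normalization, so no suffix is added for it.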
Let's move this logic to the place that applies conditions on metadata columns, so that we know no constraint gets ignored.
Thus, here it would be checkArgument(!isMetadataColumnId(columnHandle.getId()))
and IcebergSplitSource would divide the TupleDomain into
- $path domain --> handled directly there
- the rest --> passed to toIcebergExpression
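The proposed division can be sketched with a plain map standing in for the TupleDomain. This is an illustration under stated assumptions, not the Trino code: the map of column name to allowed values, the `$`-prefix rule for recognizing metadata columns, and all names in the block are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Simplified stand-in for a TupleDomain: column name -> set of allowed values.
final class PredicateSplitSketch
{
    static final String PATH_COLUMN = "$path";

    private PredicateSplitSketch() {}

    // ASSUMPTION: metadata column names start with '$' in this model.
    static boolean isMetadataColumn(String column)
    {
        return column.startsWith("$");
    }

    // The $path constraint, to be handled directly by the split source;
    // null means the predicate does not constrain $path.
    static Set<String> pathDomain(Map<String, Set<String>> predicate)
    {
        return predicate.get(PATH_COLUMN);
    }

    // Everything else is what would be handed to the expression conversion.
    // Any other metadata column is rejected, so no constraint is silently ignored.
    static Map<String, Set<String>> dataColumnPredicate(Map<String, Set<String>> predicate)
    {
        Map<String, Set<String>> rest = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : predicate.entrySet()) {
            if (entry.getKey().equals(PATH_COLUMN)) {
                continue;
            }
            if (isMetadataColumn(entry.getKey())) {
                throw new IllegalArgumentException("Constraint on an unexpected column " + entry.getKey());
            }
            rest.put(entry.getKey(), entry.getValue());
        }
        return rest;
    }
}
```

Rejecting unknown metadata columns instead of dropping them keeps the failure loud: a predicate that cannot be applied anywhere surfaces as an error rather than as wrong query results.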
The PR seems to be in better shape now.
nit: we can defensively check here that all metadata column predicates are consumed, so that any metadata-column predicates added in the future are actually applied in the split source. I don't feel too strongly about it, though, as we'll add tests anyway.
Q: are we changing the behavior here? I.e., before this PR, hadoopPath() was effectively being returned here since that was propagated from the split, but now it's the actual path without encoding.
Yes, it's changing. I will mention it in a release note. Relates to #13012 (comment)
alexjo2144 left a comment
@homar how do you feel about changing finishOptimize to only remove delete files if the whole table is being optimized? I know we spent a while trying to figure out exactly when we could remove the files in a partition, but maintaining that logic feels pretty error prone.
Maybe rename to dataColumnPredicate or nonMetadataPredicate to signal that it's a filtered version
Renamed to dataColumnPredicate.
I am fine with that, though I am afraid that if we do that, a lot of clients will run into a situation where delete files are never removed, because they never optimize the entire table.
They should eventually get cleaned up by
The initial implementation removed delete files from a partition even if the whole table was not scanned. This was fine, but assumes the enforced predicate describes entire partitions. This assumption will not be true after trinodb#13012
Just rebased on upstream to resolve conflicts. Let's merge #13343 first.
We should do this for the file_modified_time column as well. Want to do that here, or in a separate PR?
I want to separate a PR for
@findepi Could you please review when you have time?
Can you add more tests? Specifically
@alexjo2144 Added another test case.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
Should the "Return $path without URL encoding in Iceberg" commit have any test changes/additions?
Ideally it should, but let me handle it in #13457
URL encoding (or lack of it) should be exercisable independently of double slashes,
e.g. with a path containing %.
Not sure if I understood your suggestion correctly, but just adding % to a file or directory name wouldn't work, because it doesn't satisfy the !path.equals(hadoopPath.toString()) check in hadoopPath().
Add a message
"Constraint on an unexpected column %s", columnHandle
nit: can inline pathMatchesPredicate
If the effective predicate is NONE, this method returns the ALL domain.
The NONE tuple domain needs to be handled explicitly.
OTOH, NONE is not expected here (it should be filtered out earlier), so something like:
    IcebergColumnHandle pathColumn = pathColumnHandle();
    Domain domain = effectivePredicate.getDomains().orElseThrow(() -> new IllegalArgumentException("Unexpected NONE tuple domain"))
            .get(pathColumn);
    if (domain == null) {
        return Domain.all(pathColumn.getType());
    }
    return domain;
Description
Filter Iceberg splits based on $path column predicates
Fixes #12785
Documentation
(x) No documentation is needed.
Release notes
(x) Release notes entries required with the following suggested text: