Account for memory allocated by Trino S3 staging output stream #14407
Conversation
Force-pushed da5abd6 to d9ca5e1
Force-pushed c19095e to 6f3e3cd
Force-pushed 6f3e3cd to 4e4368d
lib/trino-filesystem/src/main/java/io/trino/filesystem/hdfs/HdfsOutputFile.java
getOutputStreamRetainedSizeInBytesSupplier -> retainedSizeInBytesSupplier
you can just extend io.trino.hdfs.TrinoFileSystemCache.OutputStreamWrapper
you can just extend io.trino.hdfs.TrinoFileSystemCache.OutputStreamWrapper
It is private and I didn't want to change that, but sure, I can do it.
It is private and I didn't want to change that, but sure, I can do it.
Oh, but I mean you can make OutputStreamWrapper do what MemoryAwareOutputStreamWrapper does (provide the MemoryAware interface for the wrapped stream).
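The wrapper idea discussed in this thread can be illustrated with a standalone sketch. The MemoryAware interface and the wrapper class below are hypothetical simplifications (Trino's actual OutputStreamWrapper in io.trino.hdfs.TrinoFileSystemCache has a different shape); the supplier follows the retainedSizeInBytesSupplier naming suggested above.

```java
import java.io.FilterOutputStream;
import java.io.OutputStream;
import java.util.function.LongSupplier;

// Hypothetical stand-in for Trino's memory-accounting hook; the real
// interfaces in io.trino.hdfs differ.
interface MemoryAware
{
    long getRetainedSizeInBytes();
}

// An output stream wrapper that delegates all writes to the underlying
// stream and reports its retained size through a supplier, so the wrapper
// itself stays agnostic of how the wrapped stream buffers data.
class MemoryAwareOutputStreamWrapper
        extends FilterOutputStream
        implements MemoryAware
{
    private final LongSupplier retainedSizeInBytesSupplier;

    MemoryAwareOutputStreamWrapper(OutputStream delegate, LongSupplier retainedSizeInBytesSupplier)
    {
        super(delegate);
        this.retainedSizeInBytesSupplier = retainedSizeInBytesSupplier;
    }

    @Override
    public long getRetainedSizeInBytes()
    {
        return retainedSizeInBytesSupplier.getAsLong();
    }
}
```

Writing through the wrapper leaves the delegation untouched; only the memory-reporting concern is layered on top.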
getOutputStreamRetainedSizeInBytesSupplier -> outputStreamRetainedSizeInBytesSupplier
If transferManager does not hold on to memory between TrinoS3StagingOutputStream calls, then it should be 0.
I was not sure about this, but you are probably right.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergParquetFileWriter.java
Force-pushed 4e4368d to 32e2c21
getOutputStreamRetainedSizeInBytesSupplier -> outputStreamRetainedSizeInBytesSupplier
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergParquetFileWriter.java
I think a clearer approach is to use … then in …. This way there are fewer intermediate layers and …
I can't because …
Force-pushed e3c9d0a to 247bcee
You can add …
That is not the only problem. Those … and simply creating instances of …
Force-pushed 571ff0d to 81ae13f
@sopel39 please take another look. In …
lib/trino-filesystem/src/main/java/io/trino/filesystem/fileio/ForwardingOutputFile.java
lib/trino-filesystem/src/main/java/io/trino/filesystem/MemoryAwareFileSystem.java
lib/trino-orc/src/main/java/io/trino/orc/OutputStreamOrcDataSink.java
AggregatedMemoryContext aggregatedMemoryContext -> AggregatedMemoryContext memoryContext
AggregatedMemoryContext aggregatedMemoryContext -> AggregatedMemoryContext memoryContext
AggregatedMemoryContext aggregatedMemoryContext -> AggregatedMemoryContext memoryContext
aggregatedMemoryContext -> outputMemoryContext
Make this constructor private and add factory methods:
static OutputStreamOrcDataSink create(TrinoOutputFile outputFile)
{
    AggregatedMemoryContext context = newSimpleAggregatedMemoryContext();
    return new OutputStreamOrcDataSink(outputFile.create(context), context);
}
/**
 * Use only when output stream memory consumption tracking is not possible.
 */
static OutputStreamOrcDataSink create(OutputStream stream)
{
    return new OutputStreamOrcDataSink(stream, newSimpleAggregatedMemoryContext());
}
plugin/trino-hive/src/main/java/io/trino/plugin/hive/s3/TrinoS3FileSystem.java
make it TrinoOutputFile outputFile = fileSystem.newOutputFile(outputPath)
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergFileWriterFactory.java
Force-pushed 6ec2b56 to aef4280
lib/trino-filesystem/src/main/java/io/trino/filesystem/MemoryAwareFileSystem.java
lib/trino-orc/src/main/java/io/trino/orc/OutputStreamOrcDataSink.java
Put the static factory method above the OutputStreamOrcDataSink constructor.
lib/trino-orc/src/main/java/io/trino/orc/OutputStreamOrcDataSink.java
lib/trino-orc/src/main/java/io/trino/orc/OutputStreamOrcDataSink.java
Remove? You don't account for memory in TrinoS3StagingOutputStream.
The buffer is already created above.
You don't really have to mock these. Use real objects (e.g. io.trino.memory.context.AggregatedMemoryContext#newRootAggregatedMemoryContext with a reservation handler) to track the max allocation.
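The suggested test setup can be approximated with a small real object instead of mocks. The handler below is a hypothetical simplification of io.trino.memory.context.MemoryReservationHandler (the real interface takes an allocation tag and returns a future); it only tracks the current and peak reservation, which is what the test needs to assert on.

```java
// Simplified stand-in for a memory reservation handler: records running
// and peak reservation so a test can assert on the maximum allocation
// instead of on implementation-dependent individual values.
class TestingReservationHandler
{
    private long reserved;
    private long maxReserved;

    // A positive delta reserves memory, a negative delta releases it.
    void reserveMemory(long delta)
    {
        reserved += delta;
        maxReserved = Math.max(maxReserved, reserved);
    }

    long getReserved()
    {
        return reserved;
    }

    long getMaxReserved()
    {
        return maxReserved;
    }
}
```

A test can then assert that the peak reservation was positive without depending on exact buffer sizes.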
Force-pushed 7d0718e to 5dc2be3
Similar to OutputStreamOrcDataSink: create the AggregatedMemoryContext locally in ParquetFileWriter.
You can then simplify the calls to createParquetFileWriter.
Create the memory context locally and override the getMemoryUsage method.
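The pattern of a locally created memory context feeding into getMemoryUsage can be sketched in isolation. LocalMemoryContext and SketchFileWriter below are hypothetical stand-ins; Trino's AggregatedMemoryContext has a richer API (child contexts, reservation handlers) than this sketch.

```java
// Minimal stand-in for a memory context that an output stream updates as
// its internal buffers grow and shrink.
class LocalMemoryContext
{
    private long bytes;

    void setBytes(long bytes)
    {
        this.bytes = bytes;
    }

    long getBytes()
    {
        return bytes;
    }
}

// A file writer that creates its memory context locally, so callers don't
// have to thread a context through the constructor; getMemoryUsage folds
// the output stream's tracked memory into the writer's own usage.
class SketchFileWriter
{
    private final LocalMemoryContext outputStreamMemoryContext = new LocalMemoryContext();
    private final long writerBufferBytes;

    SketchFileWriter(long writerBufferBytes)
    {
        this.writerBufferBytes = writerBufferBytes;
    }

    LocalMemoryContext getOutputStreamMemoryContext()
    {
        return outputStreamMemoryContext;
    }

    long getMemoryUsage()
    {
        return writerBufferBytes + outputStreamMemoryContext.getBytes();
    }
}
```

Because the context is owned by the writer, callers see a single getMemoryUsage figure and never handle the context themselves.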
plugin/trino-hive/src/test/java/io/trino/plugin/hive/s3/TestTrinoS3FileSystem.java
Why is reserved not 0? Memory leak?
Not a memory leak. In the test I simply don't close the outputStream, so the memory context is not closed and thus not set to 0.
I did it on purpose, to be able to capture the value, but I can do it better.
Heh, I didn't notice this before answering the comment above ;) It is done now.
Force-pushed 5dc2be3 to 650cc11
plugin/trino-hive/pom.xml
just close is sufficient (no need for setBytes)
I would just store the max and make sure that assertThat(memoryReservationHandler.getMaxReserved()).isGreaterThan(0).
Checking that 64 is among the reserved values is fragile (it depends on the implementation).
Force-pushed 650cc11 to f884858
Description
Account for memory allocated by TrinoS3FileSystem
Fixes: #14023
Non-technical explanation
Some queries that previously killed the cluster should now be gracefully terminated.
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: