Reduce writers retained memory utilization #14823
dd5b9ec to 271064f
electrum
left a comment
Which writers are holding onto memory after they are closed? Can we fix that directly instead of making the code more complicated with this change?
Alternative approach: #14824
271064f to d7e9d07
fc5bfe7 to c7bd53a
To me, making the rollback action a Closeable looks weird: there is no resource that we are releasing.
You could use Runnable instead if the problem is that you want to avoid the extra return null.
> You could use Runnable instead if the problem is that you want to avoid extra return null.
I actually use it in a try-with-resources statement, e.g.:
try (rollbackAction) {
    avroWriter.close();
}
catch (Exception e) {
    throw new TrinoException(ICEBERG_WRITER_CLOSE_ERROR, "Error rolling back write to Avro file", e);
}
so I don't have to hand-write what the syntactic sugar already does for me. I think using Java's native try-with-resources is actually better than relying on Guava's Closer.
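To illustrate what try-with-resources provides here, the following is a minimal sketch and not code from this PR. The class name and the failing lambdas are hypothetical. It shows that a `Closeable` rollback action still runs when the body throws, and that a failure inside the rollback is attached to the primary exception as a suppressed exception. A `Runnable` could not be used this way, because try-with-resources only accepts `AutoCloseable` resources.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; none of these names come from the Trino code base.
final class TryWithResourcesRollbackSketch {
    private TryWithResourcesRollbackSketch() {}

    // Simulates a rollback where both closing the writer and deleting the file fail,
    // and returns the sequence of events that was observed.
    static List<String> run() {
        List<String> events = new ArrayList<>();

        // Assumed rollback action: stands in for "delete the partially written file".
        Closeable rollbackAction = () -> {
            events.add("rollback");
            throw new IOException("delete failed");
        };

        // Java 9+: an effectively final variable can be named directly as the resource.
        try (rollbackAction) {
            events.add("close writer");
            throw new IOException("writer close failed");
        }
        catch (IOException e) {
            // The body's exception is primary; the close() failure is suppressed.
            events.add("caught: " + e.getMessage());
            for (Throwable suppressed : e.getSuppressed()) {
                events.add("suppressed: " + suppressed.getMessage());
            }
        }
        return events;
    }
}
```

Calling `TryWithResourcesRollbackSketch.run()` records the writer close, then the rollback, then the primary exception with the rollback failure listed as suppressed, which is the bookkeeping that would otherwise have to be written by hand or delegated to Guava's Closer.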
The instance size of writers was not accounted for by the Delta, Hive and Iceberg writers. Once it is accounted for, the estimated memory for ORC and Parquet writers appears excessive.
Whenever a writer was closed, its instance was retained so that a rollback could be performed in case of an error. However, this retained an excessive amount of memory that was no longer needed. This commit makes io.trino.plugin.hive.FileWriter#commit return a rollback action reference so that page sinks don't have to keep references to the writers themselves.
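The idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `SketchFileWriter`, `SketchBufferingWriter` and `SketchPageSink` are hypothetical names, plain `java.io.Closeable` stands in for the rollback action type, and "deleting a file" is simulated by recording the path in a list.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a file writer whose commit() hands back a rollback action.
interface SketchFileWriter {
    Closeable commit();
}

// Hypothetical writer that holds a large buffer while it is open.
final class SketchBufferingWriter implements SketchFileWriter {
    private final String path;
    private final List<String> deletedPaths;
    private final byte[] buffer = new byte[1024 * 1024]; // memory we do not want retained

    SketchBufferingWriter(String path, List<String> deletedPaths) {
        this.path = path;
        this.deletedPaths = deletedPaths;
    }

    @Override
    public Closeable commit() {
        // Copy the fields into locals so the lambda captures only these two
        // references and not `this` (which would keep `buffer` reachable).
        String path = this.path;
        List<String> deletedPaths = this.deletedPaths;
        return () -> deletedPaths.add(path); // simulated "delete the written file"
    }
}

// Hypothetical page sink that keeps rollback actions instead of closed writers.
final class SketchPageSink {
    private final List<Closeable> rollbackActions = new ArrayList<>();

    void closeWriter(SketchFileWriter writer) {
        // After this call the sink holds no reference to the writer itself.
        rollbackActions.add(writer.commit());
    }

    void abort() {
        for (Closeable action : rollbackActions) {
            try {
                action.close();
            }
            catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

The design point is that the sink's retained state per closed writer shrinks to whatever the rollback action captures (here a path and a shared list), so the writer and its buffers become eligible for garbage collection once the caller drops its own reference.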
c7bd53a to bde343f
Whenever a writer was closed, its instance was
retained so that a rollback could be performed
in case of an error. However, this retained
an excessive amount of memory that was no
longer needed.
This commit introduces RollbackAction, which can
be retained after the driver is closed and should
significantly reduce memory usage during long
inserts.