Add recording distribution of outputBuffer utilization by radek-kondziolka · Pull Request #13463 · trinodb/trino

radek-kondziolka · 2022-08-02T14:11:32Z

This pull request exposes the distribution of output buffer utilization as the outputBuffers.utilization field in the operator stats.
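To make "distribution of utilization" concrete, here is a minimal sketch of a weighted distribution that can answer percentile queries such as p50 or p99. It assumes each utilization sample is weighted by how long the buffer stayed at that level; the thread does not spell out the weighting, and `WeightedDistribution` is a hypothetical stand-in, not a class from Trino or the digest type used in the PR.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a quantile digest; not a Trino class.
final class WeightedDistribution
{
    private static final class Sample
    {
        final double value;
        final double weight;

        Sample(double value, double weight)
        {
            this.value = value;
            this.weight = weight;
        }
    }

    private final List<Sample> samples = new ArrayList<>();
    private double totalWeight;

    // Record that the buffer sat at `utilization` for `weight` units of time
    // (assumption: time-weighted samples).
    synchronized void add(double utilization, double weight)
    {
        if (weight <= 0) {
            return;
        }
        samples.add(new Sample(utilization, weight));
        totalWeight += weight;
    }

    // Smallest recorded value whose cumulative weight reaches `quantile` of the total.
    synchronized double valueAt(double quantile)
    {
        if (samples.isEmpty()) {
            return Double.NaN;
        }
        List<Sample> sorted = new ArrayList<>(samples);
        sorted.sort(Comparator.comparingDouble((Sample s) -> s.value));
        double target = quantile * totalWeight;
        double cumulative = 0;
        for (Sample sample : sorted) {
            cumulative += sample.weight;
            if (cumulative >= target) {
                return sample.value;
            }
        }
        return sorted.get(sorted.size() - 1).value;
    }

    // Sum of all recorded weights.
    synchronized double total()
    {
        return totalWeight;
    }
}
```

This sketch keeps every sample and sorts on each query, which is fine for illustration but not for a hot path; a real digest compresses samples into a bounded number of centroids.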

skrzypo987

It appears that a lot of code is copied between the *OutputBuffer classes, including the new code introduced in this commit. Please check whether some of it can be deduplicated.

core/trino-main/src/main/java/io/trino/operator/TableWriterOperator.java

radek-kondziolka · 2022-08-04T07:05:20Z

@skrzypo987, that's true. However, this pull request only adds measurement of the output buffer utilization: a field to store the state, the update of that state, and its exposure. I think that is not too much, and it is very readable.

We could add a superclass, but in my view it is not a good idea to extract an abstract class only for stats. Sometimes it is better to keep duplicated code for the sake of readability and lower complexity, at least in my opinion.

core/trino-main/src/main/java/io/trino/execution/buffer/PartitionedOutputBuffer.java

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferInfo.java

skrzypo987

lgtm

sopel39

Have you manually checked that stats are there?

core/trino-main/src/main/java/io/trino/operator/output/TaskOutputOperator.java

core/trino-main/src/main/java/io/trino/operator/OperatorInfo.java

radek-kondziolka · 2022-08-11T13:26:35Z

Have you manually checked that stats are there?

@sopel39, it was hard for me to spot the difference. I ran an insert into ... select ... query with two configurations:

sink.max-buffer-size=1024

    "min" : 0.0,
    "max" : 1.0587518122047186,
    "p01" : 0.0,
    "p05" : 0.3941453357877656,
    "p10" : 0.8341365753791842,
    "p25" : 1.007034594562806,
    "p50" : 1.021537570494948,
    "p75" : 1.036792696828272,
    "p90" : 1.0482016705239794,
    "p95" : 1.0556658666995862,
    "p99" : 1.0582062227524862,
    "total" : 1595430429266,

sink.max-buffer-size=8192MB

    "min" : 0.0,
    "max" : 1.0073409504257143,
    "p01" : 0.0,
    "p05" : 0.05224226690431327,
    "p10" : 0.12749997785610248,
    "p25" : 0.3531256313305722,
    "p50" : 0.7062936963112328,
    "p75" : 0.9783176548063535,
    "p90" : 1.0037902268993106,
    "p95" : 1.0052850667382625,
    "p99" : 1.007157468224605,
    "total" : 1653018955249,

We can see the difference between the smaller and the bigger max-buffer-size in the lower percentiles.

Before we merge, I need to wait for the benchmark results.
However, for a smaller sink.max-buffer-size (such as the 32MB default) I observed utilization of about 200%. Maybe such overutilization is possible, and even expected, in some scenarios?
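One way utilization above 100% could arise is sketched below. It assumes the buffer admits a whole page whenever it is currently below its limit, rather than checking whether the page fits; that admission rule is an assumption made for illustration, not something confirmed in this thread. `SoftLimitBuffer` is a hypothetical model, not Trino's OutputBuffer.

```java
// Hypothetical model of a byte-limited buffer; not Trino's OutputBuffer.
final class SoftLimitBuffer
{
    private final long maxBytes;
    private long bufferedBytes;

    SoftLimitBuffer(long maxBytes)
    {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        this.maxBytes = maxBytes;
    }

    // Assumption: admission only checks that the buffer is below the limit now,
    // so a single large page can push bufferedBytes past maxBytes.
    boolean offer(long pageBytes)
    {
        if (bufferedBytes >= maxBytes) {
            return false; // a producer would block here until the buffer drains
        }
        bufferedBytes += pageBytes;
        return true;
    }

    // Consumer removes bytes from the buffer.
    void drain(long bytes)
    {
        bufferedBytes = Math.max(0, bufferedBytes - bytes);
    }

    // Ratio of buffered bytes to the configured limit; may exceed 1.0.
    double utilization()
    {
        return (double) bufferedBytes / maxBytes;
    }
}
```

Under this model, a 100-byte limit holding 90 bytes still accepts a 110-byte page and reports a utilization of 2.0. The smaller the limit relative to the page size, the larger the possible overshoot.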

sopel39

lgtm % comments

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBuffer.java

core/trino-main/src/main/java/io/trino/execution/buffer/LazyOutputBuffer.java

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java

core/trino-main/src/main/java/io/trino/execution/buffer/SpoolingExchangeOutputBuffer.java

core/trino-main/src/main/java/io/trino/operator/output/TaskOutputOperator.java

radek-kondziolka · 2022-08-19T08:23:37Z

@sopel39 ,
all your comments have been addressed. I am waiting for the macrobenchmark results.

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java

core/trino-main/src/main/java/io/trino/operator/output/PagePartitioner.java

core/trino-main/src/main/java/io/trino/operator/output/TaskOutputOperator.java

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java

core/trino-main/src/main/java/io/trino/execution/buffer/PartitionedOutputBuffer.java

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java

sopel39

lgtm % comments.

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java

sopel39 · 2022-08-31T16:10:16Z

CI is failing

sopel39

lgtm % comments.

How do you extract the histogram for a query?
Is lock congestion not a problem (how fast is TDigest.copyOf(bufferUtilization))?

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferInfo.java

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java

radek-kondziolka · 2022-09-05T09:51:04Z

@sopel39 ,

How do you extract the histogram for a query?

Just by downloading the JSON with stats

Is lock congestion not a problem (how fast is TDigest.copyOf(bufferUtilization))?

It does not look like it (based on a JFR profile).
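The concern behind the copyOf question is how long the lock is held while a stats snapshot is taken. A minimal sketch of the general pattern follows: writers update under a short critical section, and readers copy the state under the same lock and then work on the copy outside it. A plain `long[]` of bucket weights stands in for the digest here; `UtilizationStats` and its bucket layout are hypothetical and not the PR's implementation.

```java
import java.util.Arrays;

// Hypothetical stats holder; a long[] of bucket weights stands in for a digest.
final class UtilizationStats
{
    private static final int BUCKETS = 10;

    private final Object lock = new Object();
    // Buckets cover [0.0, 0.1), [0.1, 0.2), ...; the last one holds utilization >= 1.0.
    private final long[] weightByBucket = new long[BUCKETS + 1];

    // Writer path: compute the bucket outside the lock, hold the lock only for the update.
    void record(double utilization, long weight)
    {
        int bucket = (int) Math.min(BUCKETS, Math.max(0, Math.floor(utilization * BUCKETS)));
        synchronized (lock) {
            weightByBucket[bucket] += weight;
        }
    }

    // Reader path: the copy is the only work done under the lock;
    // serialization or percentile math can then run on the snapshot.
    long[] snapshot()
    {
        synchronized (lock) {
            return Arrays.copyOf(weightByBucket, weightByBucket.length);
        }
    }
}
```

The cost of the critical section in `snapshot()` is proportional to the size of the copied state, so a bounded-size structure keeps the time under the lock bounded as well. Whether that is cheap enough in practice is what a profile, such as the JFR one mentioned above, would show.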

sopel39

lgtm % comment

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferInfo.java

sopel39

mind compilation errors

core/trino-main/src/main/java/io/trino/execution/StageStateMachine.java

Add outputBufferUtilization field to the OutputBufferInfo to expose distribution of output buffer utilization.

sopel39 · 2022-09-07T19:50:03Z

Could you paste example utilization report for query where source stage is/isn't a bottleneck?

radek-kondziolka · 2022-09-13T17:05:09Z

Could you paste example utilization report for query where source stage is/isn't a bottleneck?

@sopel39 , shared offline

cla-bot bot added the cla-signed label Aug 2, 2022

radek-kondziolka requested review from gaurav8297, raunaqmorarka and skrzypo987 August 2, 2022 14:20

skrzypo987 reviewed Aug 3, 2022

core/trino-main/src/main/java/io/trino/operator/TableWriterOperator.java (outdated, resolved)

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from 7b6875c to cd44061 Compare August 4, 2022 07:02

sopel39 reviewed Aug 9, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/PartitionedOutputBuffer.java (outdated, resolved)

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferInfo.java (outdated, resolved)

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 4 times, most recently from 6ed60bd to d41bf05 Compare August 11, 2022 08:19

radek-kondziolka requested review from skrzypo987 and sopel39 August 11, 2022 08:21

skrzypo987 approved these changes Aug 11, 2022

sopel39 reviewed Aug 11, 2022

core/trino-main/src/main/java/io/trino/operator/output/TaskOutputOperator.java (outdated, resolved)

core/trino-main/src/main/java/io/trino/operator/OperatorInfo.java (outdated, resolved)

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from 2ae0830 to d938757 Compare August 11, 2022 13:28

sopel39 reviewed Aug 17, 2022

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from d938757 to 5b9bece Compare August 19, 2022 07:49

sopel39 reviewed Aug 19, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 3 times, most recently from ee6a954 to 1a37705 Compare August 19, 2022 12:47

sopel39 reviewed Aug 22, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from 1a37705 to 7fbd00c Compare August 23, 2022 07:56

sopel39 reviewed Aug 30, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

sopel39 reviewed Aug 30, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/PartitionedOutputBuffer.java (outdated, resolved)

sopel39 reviewed Aug 30, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

sopel39 reviewed Aug 30, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from afd266c to 787127d Compare August 31, 2022 12:38

radek-kondziolka requested a review from sopel39 August 31, 2022 12:40

sopel39 reviewed Aug 31, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from f1d033d to e35f0ed Compare September 2, 2022 06:59

sopel39 reviewed Sep 2, 2022

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferInfo.java (outdated, resolved)

core/trino-main/src/main/java/io/trino/execution/buffer/OutputBufferMemoryManager.java (outdated, resolved)

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from e35f0ed to 4b20fbe Compare September 5, 2022 09:57

sopel39 approved these changes Sep 5, 2022

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from d5e5c56 to d9b5d25 Compare September 6, 2022 12:47

sopel39 approved these changes Sep 6, 2022

core/trino-main/src/main/java/io/trino/execution/StageStateMachine.java (outdated, resolved)

sopel39 mentioned this pull request Sep 6, 2022

Refactor exchange interfaces #13968

Merged

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from a413907 to 7d495ef Compare September 7, 2022 09:58

Add recording distribution of outputBuffer utilization

c3fe23f

Add outputBufferUtilization field to the OutputBufferInfo to expose distribution of output buffer utilization.

radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from 7d495ef to c3fe23f Compare September 7, 2022 12:04

sopel39 approved these changes Sep 7, 2022

sopel39 merged commit df1dde1 into trinodb:master Sep 7, 2022

github-actions bot added this to the 395 milestone Sep 7, 2022

colebow mentioned this pull request Sep 7, 2022

Add Trino 395 release notes #13975

Merged

sopel39 mentioned this pull request Aug 22, 2024

Expose StreamingDirectExchangeBuffer buffer utilization distribution as DataSource metrics #23107

Open
