Skip to content

Add recording distribution of outputBuffer utilization#13463

Merged
sopel39 merged 1 commit intotrinodb:masterfrom
radek-kondziolka:rk/add_buffer_utilization_distribution
Sep 7, 2022
Merged

Add recording distribution of outputBuffer utilization#13463
sopel39 merged 1 commit intotrinodb:masterfrom
radek-kondziolka:rk/add_buffer_utilization_distribution

Conversation

@radek-kondziolka
Copy link
Copy Markdown
Contributor

@radek-kondziolka radek-kondziolka commented Aug 2, 2022

This pull request adds the exposure of the distribution of outputBuffer's utilization as the outputBuffers.utilization parameter in the operator stats.

Copy link
Copy Markdown
Member

@skrzypo987 skrzypo987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that a lot of code is copied between the *OutputBuffer classes, including the new code introduced in this commit. Please take a look if there is a chance to deduplicate some of it.

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from 7b6875c to cd44061 Compare August 4, 2022 07:02
@radek-kondziolka
Copy link
Copy Markdown
Contributor Author

radek-kondziolka commented Aug 4, 2022

@skrzypo987 , that's true. However, that pull request just adds measuring the output buffer utilization and it just adds the field to store state, the update of the state and the exposition of the state. I think it is not too much and it is very readable.

Obviously, we could add a super class but on my eye it is not a good idea here to separate out the abstract class only for stats. Sometimes it is better to have duplicated code for readability and to diminish complexity, I think so at least.

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 4 times, most recently from 6ed60bd to d41bf05 Compare August 11, 2022 08:19
Copy link
Copy Markdown
Member

@skrzypo987 skrzypo987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Copy Markdown
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you manually checked that stats are there?

@radek-kondziolka
Copy link
Copy Markdown
Contributor Author

radek-kondziolka commented Aug 11, 2022

Have you manually checked that stats are there?

@sopel39 , it was hard to me to catch the difference. I've run query insert into ... select ... with two configurations:

  • sink.max-buffer-size=1024
    "min" : 0.0,
    "max" : 1.0587518122047186,
    "p01" : 0.0,
    "p05" : 0.3941453357877656,
    "p10" : 0.8341365753791842
    "p25" : 1.007034594562806,
    "p50" : 1.021537570494948,
    "p75" : 1.036792696828272,
    "p90" : 1.0482016705239794,
    "p95" : 1.0556658666995862,
    "p99" : 1.0582062227524862,
    "total" : 1595430429266,
  • sink.max-buffer-size=8192MB
    "min" : 0.0,
    "max" : 1.0073409504257143,
    "p01" : 0.0,
    "p05" : 0.05224226690431327,
    "p10" : 0.12749997785610248
    "p25" : 0.3531256313305722,
    "p50" : 0.7062936963112328,
    "p75" : 0.9783176548063535,
    "p90" : 1.0037902268993106,
    "p95" : 1.0052850667382625,
    "p99" : 1.007157468224605,
    "total" : 1653018955249,

We see the difference between smaller and bigger max-buffer-size (for lower percentiles).

Before we merge I need to wait for benchmark results.
However, for smaller sink.max-buffer-size (like 32MB by default) I was observing utilization ~200%. Maybe a such overutilization is possible and even expected in some scenarios?

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from 2ae0830 to d938757 Compare August 11, 2022 13:28
Copy link
Copy Markdown
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from d938757 to 5b9bece Compare August 19, 2022 07:49
@radek-kondziolka
Copy link
Copy Markdown
Contributor Author

@sopel39 ,
all your comments were addressed. I wait for macrobenchmark results.

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 3 times, most recently from ee6a954 to 1a37705 Compare August 19, 2022 12:47
@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from 1a37705 to 7fbd00c Compare August 23, 2022 07:56
@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from afd266c to 787127d Compare August 31, 2022 12:38
Copy link
Copy Markdown
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments.

@sopel39
Copy link
Copy Markdown
Member

sopel39 commented Aug 31, 2022

CI are failing

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from f1d033d to e35f0ed Compare September 2, 2022 06:59
Copy link
Copy Markdown
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments.

  1. How do you extract histogram for query?
  2. Is lock conjestion not a problem (how fast is TDigest.copyOf(bufferUtilization))?

@radek-kondziolka
Copy link
Copy Markdown
Contributor Author

radek-kondziolka commented Sep 5, 2022

@sopel39 ,

How do you extract histogram for query?

Just by downloading the JSON with stats

Is lock conjestion not a problem (how fast is TDigest.copyOf(bufferUtilization))?

It does not look like (basing on JFR profile)

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from e35f0ed to 4b20fbe Compare September 5, 2022 09:57
Copy link
Copy Markdown
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comment

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from d5e5c56 to d9b5d25 Compare September 6, 2022 12:47
Copy link
Copy Markdown
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind compilation errors

@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch 2 times, most recently from a413907 to 7d495ef Compare September 7, 2022 09:58
Add outputBufferUtilization field to the OutputBufferInfo to
expose distribution of output buffer utilization.
@radek-kondziolka radek-kondziolka force-pushed the rk/add_buffer_utilization_distribution branch from 7d495ef to c3fe23f Compare September 7, 2022 12:04
@sopel39
Copy link
Copy Markdown
Member

sopel39 commented Sep 7, 2022

Could you paste example utilization report for query where source stage is/isn't a bottleneck?

@sopel39 sopel39 merged commit df1dde1 into trinodb:master Sep 7, 2022
@github-actions github-actions bot added this to the 395 milestone Sep 7, 2022
@radek-kondziolka
Copy link
Copy Markdown
Contributor Author

Could you paste example utilization report for query where source stage is/isn't a bottleneck?

@sopel39 , shared offline

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants