Add CPU and scheduled time timeline to operator metrics by lukasz-stec · Pull Request #29016 · trinodb/trino

lukasz-stec · 2026-04-07T16:13:26Z

Description

Introduces ResourceUsageTimeSeriesRecorder, a fixed-size bucketed sampler that doubles its bucket width as operator execution grows. CPU and wall time are recorded across addInput, getOutput, and finish phases, merged into a single snapshot, and surfaced as "CPU and scheduled time usage over time" in operator metrics.

JMH benchmark results show small overhead.

Benchmark                                                  (operatorCount)  (randomBucketWidth)  (randomStartTime)  (recordDelayMillis)  (resourceTimeSeries)  (snapshotCount)  Mode  Cnt        Score      Error  Units
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A                2  avgt   20      222.676 ±    1.748  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A               10  avgt   20      402.062 ±    6.396  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A              100  avgt   20     3616.961 ±  210.224  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A                2  avgt   20      277.844 ±    1.140  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A               10  avgt   20     1024.055 ±    5.733  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A              100  avgt   20    11280.448 ±  824.108  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A                2  avgt   20      258.718 ±    2.311  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A               10  avgt   20      748.176 ±    7.178  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A              100  avgt   20     7295.883 ±  295.682  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A                2  avgt   20      308.245 ±    3.080  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A               10  avgt   20      926.292 ±    3.346  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A              100  avgt   20     9384.697 ±  162.079  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd                2                  N/A                N/A                  N/A                 false              N/A  avgt   20     4879.238 ±    5.494  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd                2                  N/A                N/A                  N/A                  true              N/A  avgt   20     5559.705 ±   40.382  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd               10                  N/A                N/A                  N/A                 false              N/A  avgt   20    92375.376 ±  233.255  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd               10                  N/A                N/A                  N/A                  true              N/A  avgt   20    94436.608 ±  422.369  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd              100                  N/A                N/A                  N/A                 false              N/A  avgt   20  1416322.264 ± 9655.329  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd              100                  N/A                N/A                  N/A                  true              N/A  avgt   20  1421749.822 ± 7786.944  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                  100                   N/A              N/A  avgt   20       34.609 ±    0.020  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                  500                   N/A              N/A  avgt   20       34.709 ±    0.048  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                 1000                   N/A              N/A  avgt   20       34.755 ±    0.090  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                 2000                   N/A              N/A  avgt   20       34.771 ±    0.077  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                32000                   N/A              N/A  avgt   20       34.678 ±    0.030  ns/op

The operator metric serializes to json like:

        "CPU and scheduled time usage over time" : {
          "startTimeEpochSeconds" : 1775136871,
          "bucketWidthSeconds" : 1,
          "cpuNanosBuckets" : [ 52000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 151000, 489000, 179000, 17051000 ],
          "wallNanosBuckets" : [ 57208, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 144955, 493628, 177336, 123101335 ]
        }

It enables visualizations like this:

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( X) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Add "CPU and scheduled time usage over time" metric to the operator stats.

Introduces ResourceUsageTimeSeriesRecorder, a fixed-size bucketed sampler that doubles its bucket width as operator execution grows. CPU and wall time are recorded across addInput, getOutput, and finish phases, merged into a single snapshot, and surfaced as "CPU and scheduled time usage over time" in operator metrics. JMH benchmark results show small overhead. Benchmark (operatorCount) (randomBucketWidth) (randomStartTime) (recordDelayMillis) (resourceTimeSeries) (snapshotCount) Mode Cnt Score Error Units BenchmarkResourceUsageTimeSeriesRecorder.merge N/A false false N/A N/A 2 avgt 20 222.676 ± 1.748 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A false false N/A N/A 10 avgt 20 402.062 ± 6.396 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A false false N/A N/A 100 avgt 20 3616.961 ± 210.224 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A false true N/A N/A 2 avgt 20 277.844 ± 1.140 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A false true N/A N/A 10 avgt 20 1024.055 ± 5.733 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A false true N/A N/A 100 avgt 20 11280.448 ± 824.108 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A true false N/A N/A 2 avgt 20 258.718 ± 2.311 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A true false N/A N/A 10 avgt 20 748.176 ± 7.178 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A true false N/A N/A 100 avgt 20 7295.883 ± 295.682 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A true true N/A N/A 2 avgt 20 308.245 ± 3.080 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A true true N/A N/A 10 avgt 20 926.292 ± 3.346 ns/op BenchmarkResourceUsageTimeSeriesRecorder.merge N/A true true N/A N/A 100 avgt 20 9384.697 ± 162.079 ns/op BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd 2 N/A N/A N/A false N/A avgt 20 4879.238 ± 5.494 ns/op BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd 2 N/A N/A N/A true N/A avgt 20 5559.705 ± 40.382 ns/op BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd 10 N/A N/A N/A false N/A avgt 20 92375.376 ± 233.255 ns/op BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd 10 N/A N/A N/A true N/A avgt 20 94436.608 ± 422.369 ns/op BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd 100 N/A N/A N/A false N/A avgt 20 1416322.264 ± 9655.329 ns/op BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd 100 N/A N/A N/A true N/A avgt 20 1421749.822 ± 7786.944 ns/op BenchmarkResourceUsageTimeSeriesRecorder.record N/A N/A N/A 100 N/A N/A avgt 20 34.609 ± 0.020 ns/op BenchmarkResourceUsageTimeSeriesRecorder.record N/A N/A N/A 500 N/A N/A avgt 20 34.709 ± 0.048 ns/op BenchmarkResourceUsageTimeSeriesRecorder.record N/A N/A N/A 1000 N/A N/A avgt 20 34.755 ± 0.090 ns/op BenchmarkResourceUsageTimeSeriesRecorder.record N/A N/A N/A 2000 N/A N/A avgt 20 34.771 ± 0.077 ns/op BenchmarkResourceUsageTimeSeriesRecorder.record N/A N/A N/A 32000 N/A N/A avgt 20 34.678 ± 0.030 ns/op

losipiuk · 2026-04-13T15:38:35Z

+            bucket = (int) ((nowNanos - startNanos) / bucketWidthNanos);
+            int sourceOffset = 0;
+            int targetOffset = 0;
+            if (startNanosOffset > 0) {


I do not get how this is not important how much did we move the startNanos. The logic is not clear to me.

Ok - startNanosOffset will be either 0 or previous bucket width - maybe worth putting in the comment.

Comment added

losipiuk

Very nice code

losipiuk · 2026-04-13T15:46:53Z

@CodeRabbit full-review

coderabbitai · 2026-04-13T15:47:14Z

The PR description image shows two stacked area charts visualizing the new metrics feature:

CPU USAGE TIMESERIES BY OPERATOR TYPE — CPU time per bucket (0–5.07s) from 03:34:31 to 03:34:47 PM, broken down by operator types (ScanFilterAndProjectOperator, TableScanOperator, LookupJoinOperator, etc.)
WALL TIME TIMESERIES BY OPERATOR TYPE — Wall time per bucket (0–30.74s) over the same time range, similarly broken down by operator type.

@losipiuk Sure! I'll kick off a full review of this PR now.

✅ Actions performed

Full review triggered.

coderabbitai · 2026-04-13T15:55:56Z

📝 Walkthrough

Walkthrough

This change introduces time-series recording of operator CPU and wall-time usage. A new ResourceUsageTimeSeriesRecorder class records timing data into fixed-size, time-bucketed histograms with dynamic expansion and snapshot/merge capabilities. The OperationTimer.OperationTiming class is extended with an overloaded constructor accepting an optional recorder instance. OperatorContext integrates the recorder by passing it to timing operations and including the captured time-series snapshot in operator metrics. Test assertions are updated to reflect the additional metric entry, and comprehensive unit and benchmark tests validate the recorder's functionality and performance.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

core/trino-main/src/test/java/io/trino/operator/TestResourceUsageTimeSeriesRecorder.java (1)
329-390: These two offset-merge tests currently cover the same case.

testMergeWithInitialBucketOffset and testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems use identical inputs and identical assertions, so the second one is not adding coverage. Either collapse them or change one to exercise a different partial-bucket scenario.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/trino-main/src/test/java/io/trino/operator/TestResourceUsageTimeSeriesRecorder.java`
around lines 329 - 390, Both tests testMergeWithInitialBucketOffset and
testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems are identical;
remove duplication by either deleting the second test or changing it to exercise
a different partial-bucket scenario. Update the second test
(testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems) to use a
different narrow snapshot start (for example start=101 or 101 with bucketWidth
1) via ResourceUsageTimeSeriesSnapshot.create and then recompute and assert the
expected merged result from merge(wide, narrow) (adjust startTimeEpochSeconds(),
bucketWidthSeconds(), and cpuNanosBuckets() expected values accordingly) so it
covers a true partial-bucket merge case rather than duplicating the first test.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@core/trino-main/src/main/java/io/trino/operator/ResourceUsageTimeSeriesRecorder.java`:
- Around line 297-311: The constructor ResourceUsageTimeSeriesSnapshot currently
allows cpuNanosBuckets and wallNanosBuckets of differing lengths which later
causes index errors; add a validation in the constructor
(ResourceUsageTimeSeriesSnapshot(...)) that verifies cpuNanosBuckets.length ==
wallNanosBuckets.length and reject mismatch (e.g., via checkArgument or similar)
with a clear message mentioning both array names and their lengths so invalid
payloads from create(...) are rejected early.
- Around line 106-118: The bucket realignment currently uses startNanos %
bucketWidthNanos (Ticker origin), causing misalignment vs wall-clock; change the
offset calculation to use the wall-clock start time: compute startEpochNanos =
startTimeEpochSeconds * 1_000_000_000L (or equivalent) and set startNanosOffset
= startEpochNanos % bucketWidthNanos, then adjust startNanos = startNanos -
startNanosOffset (and keep the existing bucket, sourceOffset/targetOffset
adjustments) so the expansion aligns to startTimeEpochSeconds rather than the
Ticker origin in ResourceUsageTimeSeriesRecorder.

In
`@core/trino-main/src/test/java/io/trino/operator/BenchmarkResourceUsageTimeSeriesRecorder.java`:
- Around line 61-100: The benchmark currently reuses the same
RecordData.recorder and TestingTicker across invocations (setup in
RecordData.setup), causing growing timeline and drifting measurements; fix by
reinitializing the recorder and ticker for each benchmark invocation — either
add a new setup method annotated `@Setup`(Level.Invocation) that recreates
ResourceUsageTimeSeriesRecorder and TestingTicker (and resets ticker state and
arrays as needed) or reset recorder/ticker at the start of the record(...)
benchmark so every invocation starts with a fresh
ResourceUsageTimeSeriesRecorder and TestingTicker.

---

Nitpick comments:
In
`@core/trino-main/src/test/java/io/trino/operator/TestResourceUsageTimeSeriesRecorder.java`:
- Around line 329-390: Both tests testMergeWithInitialBucketOffset and
testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems are identical;
remove duplication by either deleting the second test or changing it to exercise
a different partial-bucket scenario. Update the second test
(testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems) to use a
different narrow snapshot start (for example start=101 or 101 with bucketWidth
1) via ResourceUsageTimeSeriesSnapshot.create and then recompute and assert the
expected merged result from merge(wide, narrow) (adjust startTimeEpochSeconds(),
bucketWidthSeconds(), and cpuNanosBuckets() expected values accordingly) so it
covers a true partial-bucket merge case rather than duplicating the first test.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e28dbd8b-314e-4005-a842-1c5d21069146

📥 Commits

Reviewing files that changed from the base of the PR and between 5a1c279 and 3690351.

📒 Files selected for processing (7)

core/trino-main/src/main/java/io/trino/operator/OperationTimer.java
core/trino-main/src/main/java/io/trino/operator/OperatorContext.java
core/trino-main/src/main/java/io/trino/operator/ResourceUsageTimeSeriesRecorder.java
core/trino-main/src/test/java/io/trino/operator/BenchmarkResourceUsageTimeSeriesRecorder.java
core/trino-main/src/test/java/io/trino/operator/TestResourceUsageTimeSeriesRecorder.java
core/trino-main/src/test/java/io/trino/operator/TestWorkProcessorOperatorAdapter.java
core/trino-main/src/test/java/io/trino/operator/TestWorkProcessorSourceOperatorAdapter.java

lukasz-stec

Comments addressed

lukasz-stec · 2026-04-13T18:36:05Z

+            bucket = (int) ((nowNanos - startNanos) / bucketWidthNanos);
+            int sourceOffset = 0;
+            int targetOffset = 0;
+            if (startNanosOffset > 0) {


Comment added

cla-bot bot added the cla-signed label Apr 7, 2026

lukasz-stec force-pushed the ls/2604/01-resource-usage-timeline branch 3 times, most recently from 67c610a to 2951467 Compare April 8, 2026 09:00

lukasz-stec requested review from losipiuk and wendigo April 8, 2026 11:04

lukasz-stec marked this pull request as ready for review April 8, 2026 11:04

lukasz-stec force-pushed the ls/2604/01-resource-usage-timeline branch from 2951467 to 3690351 Compare April 8, 2026 13:49

losipiuk reviewed Apr 13, 2026

View reviewed changes

losipiuk approved these changes Apr 13, 2026

View reviewed changes

coderabbitai bot reviewed Apr 13, 2026

View reviewed changes

lukasz-stec commented Apr 13, 2026

View reviewed changes

losipiuk merged commit f3867dc into trinodb:master Apr 15, 2026
98 checks passed

github-actions bot added this to the 481 milestone Apr 15, 2026

ebyhr mentioned this pull request Apr 16, 2026

Add 481 release notes #28894

Open

Conversation

lukasz-stec commented Apr 7, 2026

Description

Additional context and related issues

Release notes

Uh oh!

losipiuk Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

losipiuk Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

lukasz-stec Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

losipiuk left a comment

Choose a reason for hiding this comment

Uh oh!

losipiuk commented Apr 13, 2026

Uh oh!

coderabbitai bot commented Apr 13, 2026

Uh oh!

coderabbitai bot commented Apr 13, 2026

Walkthrough

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lukasz-stec left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

lukasz-stec Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants