Skip to content

Add CPU and scheduled time timeline to operator metrics#29016

Merged
losipiuk merged 1 commit intotrinodb:masterfrom
starburstdata:ls/2604/01-resource-usage-timeline
Apr 15, 2026
Merged

Add CPU and scheduled time timeline to operator metrics#29016
losipiuk merged 1 commit intotrinodb:masterfrom
starburstdata:ls/2604/01-resource-usage-timeline

Conversation

@lukasz-stec
Copy link
Copy Markdown
Member

Description

Introduces ResourceUsageTimeSeriesRecorder, a fixed-size bucketed sampler that doubles its bucket width as operator execution grows. CPU and wall time are recorded across addInput, getOutput, and finish phases, merged into a single snapshot, and surfaced as "CPU and scheduled time usage over time" in operator metrics.

JMH benchmark results show small overhead.

Benchmark                                                  (operatorCount)  (randomBucketWidth)  (randomStartTime)  (recordDelayMillis)  (resourceTimeSeries)  (snapshotCount)  Mode  Cnt        Score      Error  Units
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A                2  avgt   20      222.676 ±    1.748  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A               10  avgt   20      402.062 ±    6.396  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A              100  avgt   20     3616.961 ±  210.224  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A                2  avgt   20      277.844 ±    1.140  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A               10  avgt   20     1024.055 ±    5.733  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A              100  avgt   20    11280.448 ±  824.108  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A                2  avgt   20      258.718 ±    2.311  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A               10  avgt   20      748.176 ±    7.178  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A              100  avgt   20     7295.883 ±  295.682  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A                2  avgt   20      308.245 ±    3.080  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A               10  avgt   20      926.292 ±    3.346  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A              100  avgt   20     9384.697 ±  162.079  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd                2                  N/A                N/A                  N/A                 false              N/A  avgt   20     4879.238 ±    5.494  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd                2                  N/A                N/A                  N/A                  true              N/A  avgt   20     5559.705 ±   40.382  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd               10                  N/A                N/A                  N/A                 false              N/A  avgt   20    92375.376 ±  233.255  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd               10                  N/A                N/A                  N/A                  true              N/A  avgt   20    94436.608 ±  422.369  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd              100                  N/A                N/A                  N/A                 false              N/A  avgt   20  1416322.264 ± 9655.329  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd              100                  N/A                N/A                  N/A                  true              N/A  avgt   20  1421749.822 ± 7786.944  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                  100                   N/A              N/A  avgt   20       34.609 ±    0.020  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                  500                   N/A              N/A  avgt   20       34.709 ±    0.048  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                 1000                   N/A              N/A  avgt   20       34.755 ±    0.090  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                 2000                   N/A              N/A  avgt   20       34.771 ±    0.077  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                32000                   N/A              N/A  avgt   20       34.678 ±    0.030  ns/op

The operator metric serializes to json like:

        "CPU and scheduled time usage over time" : {
          "startTimeEpochSeconds" : 1775136871,
          "bucketWidthSeconds" : 1,
          "cpuNanosBuckets" : [ 52000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 151000, 489000, 179000, 17051000 ],
          "wallNanosBuckets" : [ 57208, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 144955, 493628, 177336, 123101335 ]
        }

It enables visualizations like this:
image

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( X) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Add "CPU and scheduled time usage over time" metric to the operator stats.

@cla-bot cla-bot bot added the cla-signed label Apr 7, 2026
@lukasz-stec lukasz-stec force-pushed the ls/2604/01-resource-usage-timeline branch 3 times, most recently from 67c610a to 2951467 Compare April 8, 2026 09:00
@lukasz-stec lukasz-stec requested review from losipiuk and wendigo April 8, 2026 11:04
@lukasz-stec lukasz-stec marked this pull request as ready for review April 8, 2026 11:04
Introduces ResourceUsageTimeSeriesRecorder, a fixed-size bucketed sampler
that doubles its bucket width as operator execution grows. CPU and wall time
are recorded across addInput, getOutput, and finish phases, merged into a
single snapshot, and surfaced as "CPU and scheduled time usage over time"
in operator metrics.

JMH benchmark results show small overhead.
Benchmark                                                  (operatorCount)  (randomBucketWidth)  (randomStartTime)  (recordDelayMillis)  (resourceTimeSeries)  (snapshotCount)  Mode  Cnt        Score      Error  Units
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A                2  avgt   20      222.676 ±    1.748  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A               10  avgt   20      402.062 ±    6.396  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false              false                  N/A                   N/A              100  avgt   20     3616.961 ±  210.224  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A                2  avgt   20      277.844 ±    1.140  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A               10  avgt   20     1024.055 ±    5.733  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                false               true                  N/A                   N/A              100  avgt   20    11280.448 ±  824.108  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A                2  avgt   20      258.718 ±    2.311  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A               10  avgt   20      748.176 ±    7.178  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true              false                  N/A                   N/A              100  avgt   20     7295.883 ±  295.682  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A                2  avgt   20      308.245 ±    3.080  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A               10  avgt   20      926.292 ±    3.346  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.merge                         N/A                 true               true                  N/A                   N/A              100  avgt   20     9384.697 ±  162.079  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd                2                  N/A                N/A                  N/A                 false              N/A  avgt   20     4879.238 ±    5.494  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd                2                  N/A                N/A                  N/A                  true              N/A  avgt   20     5559.705 ±   40.382  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd               10                  N/A                N/A                  N/A                 false              N/A  avgt   20    92375.376 ±  233.255  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd               10                  N/A                N/A                  N/A                  true              N/A  avgt   20    94436.608 ±  422.369  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd              100                  N/A                N/A                  N/A                 false              N/A  avgt   20  1416322.264 ± 9655.329  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.operatorStatsAdd              100                  N/A                N/A                  N/A                  true              N/A  avgt   20  1421749.822 ± 7786.944  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                  100                   N/A              N/A  avgt   20       34.609 ±    0.020  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                  500                   N/A              N/A  avgt   20       34.709 ±    0.048  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                 1000                   N/A              N/A  avgt   20       34.755 ±    0.090  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                 2000                   N/A              N/A  avgt   20       34.771 ±    0.077  ns/op
BenchmarkResourceUsageTimeSeriesRecorder.record                        N/A                  N/A                N/A                32000                   N/A              N/A  avgt   20       34.678 ±    0.030  ns/op
@lukasz-stec lukasz-stec force-pushed the ls/2604/01-resource-usage-timeline branch from 2951467 to 3690351 Compare April 8, 2026 13:49
bucket = (int) ((nowNanos - startNanos) / bucketWidthNanos);
int sourceOffset = 0;
int targetOffset = 0;
if (startNanosOffset > 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not get how this is not important how much did we move the startNanos. The logic is not clear to me.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - startNanosOffset will be either 0 or previous bucket width - maybe worth putting in the comment.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added

Copy link
Copy Markdown
Member

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice code

@losipiuk
Copy link
Copy Markdown
Member

@CodeRabbit full-review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 13, 2026

The PR description image shows two stacked area charts visualizing the new metrics feature:

  1. CPU USAGE TIMESERIES BY OPERATOR TYPE — CPU time per bucket (0–5.07s) from 03:34:31 to 03:34:47 PM, broken down by operator types (ScanFilterAndProjectOperator, TableScanOperator, LookupJoinOperator, etc.)
  2. WALL TIME TIMESERIES BY OPERATOR TYPE — Wall time per bucket (0–30.74s) over the same time range, similarly broken down by operator type.

@losipiuk Sure! I'll kick off a full review of this PR now.

✅ Actions performed

Full review triggered.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 13, 2026

📝 Walkthrough

Walkthrough

This change introduces time-series recording of operator CPU and wall-time usage. A new ResourceUsageTimeSeriesRecorder class records timing data into fixed-size, time-bucketed histograms with dynamic expansion and snapshot/merge capabilities. The OperationTimer.OperationTiming class is extended with an overloaded constructor accepting an optional recorder instance. OperatorContext integrates the recorder by passing it to timing operations and including the captured time-series snapshot in operator metrics. Test assertions are updated to reflect the additional metric entry, and comprehensive unit and benchmark tests validate the recorder's functionality and performance.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
core/trino-main/src/test/java/io/trino/operator/TestResourceUsageTimeSeriesRecorder.java (1)

329-390: These two offset-merge tests currently cover the same case.

testMergeWithInitialBucketOffset and testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems use identical inputs and identical assertions, so the second one is not adding coverage. Either collapse them or change one to exercise a different partial-bucket scenario.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/trino-main/src/test/java/io/trino/operator/TestResourceUsageTimeSeriesRecorder.java`
around lines 329 - 390, Both tests testMergeWithInitialBucketOffset and
testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems are identical;
remove duplication by either deleting the second test or changing it to exercise
a different partial-bucket scenario. Update the second test
(testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems) to use a
different narrow snapshot start (for example start=101 or 101 with bucketWidth
1) via ResourceUsageTimeSeriesSnapshot.create and then recompute and assert the
expected merged result from merge(wide, narrow) (adjust startTimeEpochSeconds(),
bucketWidthSeconds(), and cpuNanosBuckets() expected values accordingly) so it
covers a true partial-bucket merge case rather than duplicating the first test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@core/trino-main/src/main/java/io/trino/operator/ResourceUsageTimeSeriesRecorder.java`:
- Around line 297-311: The constructor ResourceUsageTimeSeriesSnapshot currently
allows cpuNanosBuckets and wallNanosBuckets of differing lengths which later
causes index errors; add a validation in the constructor
(ResourceUsageTimeSeriesSnapshot(...)) that verifies cpuNanosBuckets.length ==
wallNanosBuckets.length and reject mismatch (e.g., via checkArgument or similar)
with a clear message mentioning both array names and their lengths so invalid
payloads from create(...) are rejected early.
- Around line 106-118: The bucket realignment currently uses startNanos %
bucketWidthNanos (Ticker origin), causing misalignment vs wall-clock; change the
offset calculation to use the wall-clock start time: compute startEpochNanos =
startTimeEpochSeconds * 1_000_000_000L (or equivalent) and set startNanosOffset
= startEpochNanos % bucketWidthNanos, then adjust startNanos = startNanos -
startNanosOffset (and keep the existing bucket, sourceOffset/targetOffset
adjustments) so the expansion aligns to startTimeEpochSeconds rather than the
Ticker origin in ResourceUsageTimeSeriesRecorder.

In
`@core/trino-main/src/test/java/io/trino/operator/BenchmarkResourceUsageTimeSeriesRecorder.java`:
- Around line 61-100: The benchmark currently reuses the same
RecordData.recorder and TestingTicker across invocations (setup in
RecordData.setup), causing growing timeline and drifting measurements; fix by
reinitializing the recorder and ticker for each benchmark invocation — either
add a new setup method annotated `@Setup`(Level.Invocation) that recreates
ResourceUsageTimeSeriesRecorder and TestingTicker (and resets ticker state and
arrays as needed) or reset recorder/ticker at the start of the record(...)
benchmark so every invocation starts with a fresh
ResourceUsageTimeSeriesRecorder and TestingTicker.

---

Nitpick comments:
In
`@core/trino-main/src/test/java/io/trino/operator/TestResourceUsageTimeSeriesRecorder.java`:
- Around line 329-390: Both tests testMergeWithInitialBucketOffset and
testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems are identical;
remove duplication by either deleting the second test or changing it to exercise
a different partial-bucket scenario. Update the second test
(testMergeWithInitialBucketOffsetDoesNotDoubleCountPartialItems) to use a
different narrow snapshot start (for example start=101 or 101 with bucketWidth
1) via ResourceUsageTimeSeriesSnapshot.create and then recompute and assert the
expected merged result from merge(wide, narrow) (adjust startTimeEpochSeconds(),
bucketWidthSeconds(), and cpuNanosBuckets() expected values accordingly) so it
covers a true partial-bucket merge case rather than duplicating the first test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e28dbd8b-314e-4005-a842-1c5d21069146

📥 Commits

Reviewing files that changed from the base of the PR and between 5a1c279 and 3690351.

📒 Files selected for processing (7)
  • core/trino-main/src/main/java/io/trino/operator/OperationTimer.java
  • core/trino-main/src/main/java/io/trino/operator/OperatorContext.java
  • core/trino-main/src/main/java/io/trino/operator/ResourceUsageTimeSeriesRecorder.java
  • core/trino-main/src/test/java/io/trino/operator/BenchmarkResourceUsageTimeSeriesRecorder.java
  • core/trino-main/src/test/java/io/trino/operator/TestResourceUsageTimeSeriesRecorder.java
  • core/trino-main/src/test/java/io/trino/operator/TestWorkProcessorOperatorAdapter.java
  • core/trino-main/src/test/java/io/trino/operator/TestWorkProcessorSourceOperatorAdapter.java

Copy link
Copy Markdown
Member Author

@lukasz-stec lukasz-stec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments addressed

bucket = (int) ((nowNanos - startNanos) / bucketWidthNanos);
int sourceOffset = 0;
int targetOffset = 0;
if (startNanosOffset > 0) {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added

@losipiuk losipiuk merged commit f3867dc into trinodb:master Apr 15, 2026
98 checks passed
@github-actions github-actions bot added this to the 481 milestone Apr 15, 2026
@ebyhr ebyhr mentioned this pull request Apr 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants