misc: Propagate full IO stats (count/min/max) to runtime metrics #15408
rui-mo wants to merge 1 commit into facebookincubator:main from
jinchengchenghh left a comment:
Could you update the PR description?
Yuhta left a comment:
The change itself makes sense. The only caveat is that it changes the semantics of these stats drastically, so all stats other than sum are not comparable before and after this change. This is worth noting in the release notes.
@peterenescu has imported this pull request. If you are a Meta employee, you can view this in D86311108.
@rui-mo The test failures are relevant. Can you double-check and fix them?
To be specific, there are four velox exec tests failing:
With this change, we may remove
Hi everyone, thanks for the suggestions! I've updated the PR description and resolved the unit test failure. Could you please review it again?
@peterenescu merged this pull request in a0d77b3.
This PR is causing the values to overflow since Specifically due to
@majetideepak Thanks for pointing this out. I opened #15536, which changes RuntimeMetrics counters from signed to unsigned (int64_t -> uint64_t). Let me know your thoughts, thanks.
This PR enhances the propagation of IO statistics into RuntimeMetric by
including not just the sum, but also the count, min, and max values from the IO
layer. This affects IO-related metrics such as:
`prefetchBytes`, `ioWaitWallNanos`, `storageReadBytes`, `localReadBytes`, `ramReadBytes`.

Previously, these runtime metrics had a default count of 1, and min/max were
equal to sum because no detailed IO stats were passed down. After this change,
count/min/max reflect real observed values from the IO layer.

The metrics `maxSingleIoWaitWallNanos` and `numStorageRead` are removed
because their functionality is represented by `ioWaitWallNanos.max` and
`storageReadBytes.count`, respectively.

Users relying on these metrics for historical analysis, monitoring, or
dashboards should be aware of this impact.
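
The before/after semantics described above can be illustrated with a minimal sketch. The types `IoCounterSketch` and `RuntimeMetricSketch` and the functions `propagateSumOnly`, `propagateFull`, and `recordAll` are hypothetical stand-ins written for this illustration; they are not the actual Velox classes or the code in this PR. The zero-fill for an empty counter is likewise an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for an IO-layer counter: tracks the count, sum,
// min, and max of every recorded value.
struct IoCounterSketch {
  uint64_t count{0};
  uint64_t sum{0};
  uint64_t min{std::numeric_limits<uint64_t>::max()};
  uint64_t max{0};

  void increment(uint64_t value) {
    ++count;
    sum += value;
    min = std::min(min, value);
    max = std::max(max, value);
  }
};

// Hypothetical stand-in for a runtime metric with signed fields.
struct RuntimeMetricSketch {
  int64_t sum{0};
  int64_t count{0};
  int64_t min{0};
  int64_t max{0};
};

// Behavior described as "previously": only the sum is passed down, so the
// metric reports a count of 1 and min/max equal to the sum.
RuntimeMetricSketch propagateSumOnly(const IoCounterSketch& io) {
  const auto sum = static_cast<int64_t>(io.sum);
  return {sum, 1, sum, sum};
}

// Behavior described as "after this change": count, min, and max carry the
// values observed by the IO layer. An empty counter is reported as all
// zeros (an assumption of this sketch).
RuntimeMetricSketch propagateFull(const IoCounterSketch& io) {
  if (io.count == 0) {
    return {0, 0, 0, 0};
  }
  return {
      static_cast<int64_t>(io.sum),
      static_cast<int64_t>(io.count),
      static_cast<int64_t>(io.min),
      static_cast<int64_t>(io.max)};
}

// Helper that records a sequence of observations into a fresh counter.
IoCounterSketch recordAll(const std::vector<uint64_t>& values) {
  IoCounterSketch io;
  for (const auto value : values) {
    io.increment(value);
  }
  return io;
}
```

For three storage reads of 100, 300, and 200 bytes, `propagateSumOnly` yields sum 600, count 1, min 600, max 600, while `propagateFull` yields sum 600, count 3, min 100, max 300. In this picture, the count field plays the role that `numStorageRead` used to play, and the max field of a wait-time metric plays the role of `maxSingleIoWaitWallNanos`.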