Skip to content

Track bytes read even when its a FragmentResultCache hit#22145

Merged
NikhilCollooru merged 1 commit intoprestodb:masterfrom
NikhilCollooru:frcByteReadFix
Apr 16, 2024
Merged

Track bytes read even when its a FragmentResultCache hit#22145
NikhilCollooru merged 1 commit intoprestodb:masterfrom
NikhilCollooru:frcByteReadFix

Conversation

@NikhilCollooru
Copy link
Contributor

@NikhilCollooru NikhilCollooru commented Mar 9, 2024

Description

Track bytes read even when its a FragmentResultCache hit

Motivation and Context

Currently, the input bytes, positions read for a query are reported
as 0 when there is a 100% fragment result cache hit. But for some
usecases its helpful to report the input data size even when its
a fragment result cache hit.
This PR adds support to track the input data size for every fragment
we cache.

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Add support for tracking of the input data size even when there is a fragment result cache hit. This can be enabled by setting the config ```fragment-result-cache.input-data-stats-enabled=true```.

@NikhilCollooru
Copy link
Contributor Author

NikhilCollooru commented Mar 22, 2024

The numbers are not reported correctly even after this change. There is some bug, still need to fix it.

with 0% FRC cache hit rate : 20240322_172109_00004_ay76e (input size: 2.29GB , input rows : 16,622,497)
with 100% FRC cache hit rate: 20240322_172356_00005_ay76e (from runtime stats the input dat : 145MB , from query plan the input rows :3,829,327)

@jainxrohit
Copy link
Contributor

@NikhilCollooru Is there an easy way to test and reproduce this? I wanted to use this to debug.

@NikhilCollooru NikhilCollooru force-pushed the frcByteReadFix branch 3 times, most recently from 38398b4 to 3dbe3c7 Compare April 11, 2024 09:15
@NikhilCollooru
Copy link
Contributor Author

The trick is to use sourceOperator instead of activeOperators(0). After the change to use the sourceOperator, i see things are working correctly as expected.

@NikhilCollooru NikhilCollooru marked this pull request as ready for review April 11, 2024 09:22
@NikhilCollooru NikhilCollooru requested a review from a team as a code owner April 11, 2024 09:22
jainxrohit
jainxrohit previously approved these changes Apr 15, 2024
@jainxrohit
Copy link
Contributor

@NikhilCollooru Can we add a test cases which verifies the input data size?

…esult cache

Currently, the input bytes read for a query are reported
as 0 when there is a 100% fragment result cache hit. But for some
usecases its helpful to report the input data size even when its
a fragment result cache hit.
This PR adds support to track the input data size for every fragment
we cache.
@NikhilCollooru
Copy link
Contributor Author

@NikhilCollooru Can we add a test cases which verifies the input data size?

done.

@NikhilCollooru NikhilCollooru merged commit a846d78 into prestodb:master Apr 16, 2024
@NikhilCollooru NikhilCollooru deleted the frcByteReadFix branch April 16, 2024 20:00
@wanglinsong wanglinsong mentioned this pull request May 1, 2024
48 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants