Implementing max samples per chunk on parquet #57

alanprot · 2025-06-05T16:53:16Z

This PR introduces a new configuration option to set the maximum number of samples per chunk.

This option is particularly useful for avoiding the decoding of large chunks when a query covers only a small time range within a day. For example, Cortex typically splits queries by day, so even a query for just the last 5 minutes of the previous day might end up decoding an entire 8-hour chunk. The result is CPU time spent decoding samples that fall outside the requested range.

Capping the number of samples per chunk produces smaller chunks, which reduces decoding overhead and improves efficiency for queries that cover only part of a chunk's time range.
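
As a rough illustration of the idea, the Go sketch below cuts a series into chunks holding at most a fixed number of samples. The `Sample` type, the `splitByMaxSamples` function, the 15-second scrape interval, and the limit of 120 are all hypothetical names and values chosen for the example; they are not taken from this PR or from `convert/convert.go`, and the actual implementation may differ.

```go
package main

import "fmt"

// Sample is a hypothetical (timestamp, value) pair. It stands in for
// whatever sample representation the converter actually uses.
type Sample struct {
	T int64   // timestamp in milliseconds
	V float64 // sample value
}

// splitByMaxSamples cuts samples into consecutive chunks holding at most
// maxSamples entries each. A non-positive maxSamples disables the limit
// and returns everything as a single chunk (an assumption of this sketch).
func splitByMaxSamples(samples []Sample, maxSamples int) [][]Sample {
	if len(samples) == 0 {
		return nil
	}
	if maxSamples <= 0 {
		return [][]Sample{samples}
	}
	chunks := make([][]Sample, 0, (len(samples)+maxSamples-1)/maxSamples)
	for start := 0; start < len(samples); start += maxSamples {
		end := start + maxSamples
		if end > len(samples) {
			end = len(samples)
		}
		chunks = append(chunks, samples[start:end])
	}
	return chunks
}

func main() {
	// 8 hours of samples at an assumed 15s scrape interval: 1920 samples.
	samples := make([]Sample, 0, 1920)
	for i := 0; i < 1920; i++ {
		samples = append(samples, Sample{T: int64(i) * 15000, V: float64(i)})
	}

	chunks := splitByMaxSamples(samples, 120)
	fmt.Println(len(chunks), len(chunks[0])) // prints "16 120"
}
```

With these assumed numbers, a query touching only the last 5 minutes would need to decode one 120-sample chunk (30 minutes of data) instead of all 1920 samples in the 8-hour chunk.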

Signed-off-by: alanprot <[email protected]>

convert/convert.go

Implementing max samples per chunk on parquet

c634e35

Signed-off-by: alanprot <[email protected]>

MichaHoffmann approved these changes Jun 5, 2025

yeya24 approved these changes Jun 5, 2025

harry671003 approved these changes Jun 5, 2025

harry671003 reviewed Jun 5, 2025

Review comment on convert/convert.go (resolved)

jesusvazquez reviewed Jun 5, 2025

Review comment on convert/convert.go (resolved)

respecting max chunk samples on native histogram

1b018ad

Signed-off-by: alanprot <[email protected]>

alanprot marked this pull request as ready for review June 5, 2025 22:58

alanprot merged commit 54e249b into prometheus-community:main Jun 6, 2025
4 checks passed

alanprot deleted the implement-max-sample-per-chunk branch June 6, 2025 00:57

alanprot mentioned this pull request Jun 6, 2025

Update parquet common + fix queryStoreAfter config for parquet querier cortexproject/cortex#6799 (merged)

alanprot changed the title ~~Implementing max samples per chunk on parquet (WIP)~~ Implementing max samples per chunk on parquet Jun 6, 2025

5 participants