@alanprot (Collaborator) commented Jun 5, 2025

This PR introduces a new configuration option to set the maximum number of samples per chunk.

This option helps avoid decoding large chunks when a query covers only a small time range within a day. For example, Cortex typically splits queries by day, so a query for just the last 5 minutes of the previous day can end up decoding an entire 8-hour chunk, spending CPU cycles on samples outside the requested range.

Allowing smaller chunks reduces that decoding overhead, improving efficiency and performance for queries that cover only part of a chunk's time range.
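
To illustrate the effect, here is a minimal, self-contained sketch. It does not use this repository's API: the `chunk` type, `splitIntoChunks`, and `samplesDecoded` are hypothetical helpers, and the 15-second scrape interval and the cap of 120 samples are assumed values chosen only for the example. It models a chunk as something that must be decoded in full whenever it overlaps the query range.

```go
package main

import "fmt"

// chunk is a hypothetical stand-in for an encoded chunk. It records only the
// time range it covers and the number of samples it holds.
type chunk struct {
	minT, maxT int64 // inclusive timestamps, in seconds
	samples    int
}

// splitIntoChunks cuts evenly spaced samples in [start, end) into chunks
// holding at most maxSamples each.
func splitIntoChunks(start, end, step int64, maxSamples int) []chunk {
	if maxSamples < 1 {
		maxSamples = 1 // guard so every chunk makes progress
	}
	var chunks []chunk
	for t := start; t < end; {
		c := chunk{minT: t}
		for c.samples < maxSamples && t < end {
			c.maxT = t
			c.samples++
			t += step
		}
		chunks = append(chunks, c)
	}
	return chunks
}

// samplesDecoded counts the samples in every chunk that overlaps
// [qMin, qMax], assuming an overlapping chunk is decoded in full.
func samplesDecoded(chunks []chunk, qMin, qMax int64) int {
	total := 0
	for _, c := range chunks {
		if c.maxT >= qMin && c.minT <= qMax {
			total += c.samples
		}
	}
	return total
}

func main() {
	const (
		step     = 15       // assumed scrape interval, in seconds
		blockEnd = 8 * 3600 // one 8-hour range, in seconds
	)
	// Query only the last 5 minutes of the 8-hour range.
	qMin, qMax := int64(blockEnd-5*60), int64(blockEnd-1)

	uncapped := splitIntoChunks(0, blockEnd, step, 1<<30) // effectively no cap
	capped := splitIntoChunks(0, blockEnd, step, 120)     // assumed cap

	fmt.Println("uncapped:", len(uncapped), "chunk(s),", samplesDecoded(uncapped, qMin, qMax), "samples decoded")
	fmt.Println("capped:  ", len(capped), "chunk(s),", samplesDecoded(capped, qMin, qMax), "samples decoded")
}
```

Under these assumptions the uncapped layout decodes all 1920 samples of the single 8-hour chunk, while the capped layout decodes only the 120 samples of the one chunk that overlaps the query. The query itself needs just 20 samples, so a smaller cap narrows the gap further, at the cost of producing more chunks.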


@alanprot alanprot marked this pull request as ready for review June 5, 2025 22:58
@alanprot alanprot merged commit 54e249b into prometheus-community:main Jun 6, 2025
4 checks passed
@alanprot alanprot deleted the implement-max-sample-per-chunk branch June 6, 2025 00:57
@alanprot alanprot changed the title Implementing max samples per chunk on parquet (WIP) Implementing max samples per chunk on parquet Jun 6, 2025
