Conversation

@francoposa (Contributor) commented May 19, 2025

This approach is being taken so that, as we iterate in dev, we do not have to maintain an enormous branch and constantly rebase.

I believe this structure for the dependency code is the least likely to cause us import cycles, whether we put the converter as a subpackage of pkg/parquet or leave it standing completely on its own.

In any case, we can always adjust.

Patches w.r.t. upstream:

  • replace efficient-go errors with pkg/errors
  • satisfy mimir-prometheus' extended ChunkSeries interface with ChunkCount method
  • fix a bad use of require.Len on a labels.Labels value, which does not satisfy the len() builtin - I believe this is a stringlabels/slicelabels issue
  • further small stringlabels/slicelabels fixes
  • lint: replace == error comparisons with errors.Is (see the sketch after this list)
  • lint: replace sort.Strings usages with slices.Sort
  • lint: use context.WithCancelCause
  • apply all correct license headers with upstream attribution
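
A minimal sketch of the three lint patches above, using only standard-library APIs (errors.Is, slices.Sort, context.WithCancelCause); the surrounding names (readBlock, errBlockNotFound) are illustrative, not taken from the actual diff:

package patches

import (
	"context"
	"errors"
	"fmt"
	"slices"
)

var errBlockNotFound = errors.New("block not found")

func example(ctx context.Context, names []string) error {
	// errors.Is matches wrapped errors, which a plain == comparison misses.
	if err := readBlock(); err != nil && !errors.Is(err, errBlockNotFound) { // was: err == errBlockNotFound
		return err
	}

	// slices.Sort replaces sort.Strings for a plain []string.
	slices.Sort(names) // was: sort.Strings(names)

	// context.WithCancelCause records *why* the context was canceled.
	ctx, cancel := context.WithCancelCause(ctx)
	defer cancel(fmt.Errorf("example finished"))
	_ = ctx
	return nil
}

func readBlock() error { return errBlockNotFound }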

What this PR does

Brings in code from https://github.com/prometheus-community/parquet-common @ commit 382b6ec8 into a new package.

No new components or services are utilized yet - nothing new is deployed.

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@francoposa francoposa changed the title from "[WIP] bring in prometheus/parquet-common code to new package" to "bring in prometheus/parquet-common code to new package" May 19, 2025
@francoposa francoposa marked this pull request as ready for review May 19, 2025 22:14
@francoposa francoposa requested review from a team and stevesg as code owners May 19, 2025 22:14
@aknuds1 (Contributor) left a comment

I think it would be preferable to use Renovate's PRs for updating objstore (you can patch this PR with your fixes) and cloud.google.com/go/storage. Plus, I don't think this PR should be removing the toolchain directive and its explanatory comment.
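
For context, a toolchain directive in go.mod looks like the following; the version numbers here are placeholders, not the ones from this PR:

go 1.23

// Pin the exact toolchain so local builds and CI use the same Go release.
toolchain go1.23.4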

Comment on lines 30 to 34
func TestPromQLAcceptance(t *testing.T) {
if testing.Short() {
t.Skip("Skipping, because 'short' flag was set")
}
Contributor

This test took ~15min in CI. Is that a concern? Do we want to make it opt-in rather than opt-out? I'm mostly thinking about the impact on other Mimir devs, not us.

Contributor

Our CI health is already pretty terrible. This should be opt-in; 15m is a non-starter.

Contributor Author

It's already fixed to be opt-in with an env var.

@francoposa francoposa force-pushed the francoposa/initial-parquet-package-to-main branch from 276779b to 56fa1e6 Compare May 20, 2025 16:58
@francoposa (Contributor Author)

@aknuds1 The toolchain directive came back anyway when I copied over the go.mod from main to fix conflicts, so we should be good to go.

I also used the env var approach to prevent the super-long PromQL acceptance tests from running in CI (sketched below).
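
A sketch of that guard, mirroring the snippet quoted above (imports of os and testing omitted, as in that snippet); the env var name here is hypothetical, not necessarily the one used in the PR:

func TestPromQLAcceptance(t *testing.T) {
	if os.Getenv("MIMIR_PROMQL_ACCEPTANCE_TEST_ENABLED") == "" {
		t.Skip("Skipping; set MIMIR_PROMQL_ACCEPTANCE_TEST_ENABLED to run this ~15 min suite")
	}
	// ... run the acceptance suite ...
}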

@francoposa francoposa requested review from 56quarters and aknuds1 May 20, 2025 17:39
@francoposa francoposa changed the base branch from main to parquet-main May 23, 2025 16:35
@francoposa francoposa merged commit d2f8e2e into parquet-main May 23, 2025
30 checks passed
@francoposa francoposa deleted the francoposa/initial-parquet-package-to-main branch May 23, 2025 16:41
npazosmendez pushed a commit that referenced this pull request May 27, 2025
* bring in prometheus/parquet-common code to new package; replace efficient-go errors with pkg/errors; satisfy mimir-prometheus ChunkSeries interface

* revert breaking upgrade to thanos/objstore

* fix test require

* attempt to update go version for strange errors

* fix stringlabels issues

* update license headers with AGPL and upstream attribution

* fix errors.Is lints

* fix sort and cancel cause lints

* correct go.mod & vendor in from main to solve conflicts

* use env var to flag parquet promql acceptance

* fix deps from main again

* fix deps from main again

* fix deps from main again

* fix deps from main again
jesusvazquez pushed a commit that referenced this pull request Jun 11, 2025
francoposa added a commit that referenced this pull request Jun 12, 2025
jesusvazquez pushed a commit that referenced this pull request Jun 16, 2025
jesusvazquez pushed a commit that referenced this pull request Jun 17, 2025
npazosmendez pushed a commit that referenced this pull request Jun 24, 2025
npazosmendez pushed a commit that referenced this pull request Jun 25, 2025
npazosmendez pushed a commit that referenced this pull request Jun 26, 2025
npazosmendez pushed a commit that referenced this pull request Jun 27, 2025
jesusvazquez pushed a commit that referenced this pull request Jun 30, 2025
npazosmendez pushed a commit that referenced this pull request Jul 1, 2025
npazosmendez pushed a commit that referenced this pull request Jul 7, 2025
francoposa added a commit that referenced this pull request Jul 8, 2025
francoposa added a commit that referenced this pull request Jul 8, 2025
jesusvazquez pushed a commit that referenced this pull request Jul 10, 2025
jesusvazquez pushed a commit that referenced this pull request Jul 14, 2025
npazosmendez pushed a commit that referenced this pull request Jul 18, 2025
francoposa added a commit that referenced this pull request Jul 21, 2025
francoposa added a commit that referenced this pull request Jul 31, 2025

implement new parquet-converter service (#11499)

* bring in parquet-converter from parquet-mimir PoC

* make docs

* make reference-help

* stop using the compactor's config

* remove BlockRanges config, convert all levels of blocks

* drop unused BlockWithExtension struct

* rename ownBlock to own

* move index fetch outside of for loop

* lowercase logs

* wording: compact => convert

* some cleanup

* skip blocks for which compaction mark failed download

* simplify convertBlock function

* cleanup

* Write Compact Mark

* remove parquetIndex, we don't need it

yet at least

* use MetaFetcher to discover blocks

* make reference-help and mark as experimental

* cleanup: we don't need indexes anymore

* revert index loader changes

* basic TestParquetConverter

* make reference-help

* lint

* happy linter

* make docs

* fix: correctly initialize memberlist KV for parquet converter

* lint: sort lines

* more wording fixes: compact => convert

* licence header

* version 1

* remove parquet-converter from 'backend' and 'all' modules

it's experimental and meant to be run alone

* address docs feedback

* remove unused consts

* increase timeout for a test

TestPartitionReader_ShouldNotMissRecordsIfKafkaReturnsAFetchBothWithAnErrorAndSomeRecords

parquet-converter: Introduce metrics and ring test (#11600)

* parquet-converter: Introduce metrics and ring test

This commit introduces a ring test to verify that sharding is working as
expected.

It also introduces metrics to measure total conversions, failures and
durations.

Signed-off-by: Jesus Vazquez <[email protected]>

converter: proper error handling to measure failures
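
A hypothetical shape for the conversion metrics described above, using prometheus/client_golang; the metric names here are assumptions, not necessarily the ones added in #11600:

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

type converterMetrics struct {
	conversions prometheus.Counter
	failures    prometheus.Counter
	duration    prometheus.Histogram
}

func newConverterMetrics(reg prometheus.Registerer) *converterMetrics {
	return &converterMetrics{
		conversions: promauto.With(reg).NewCounter(prometheus.CounterOpts{
			Name: "cortex_parquet_converter_block_conversions_total",
			Help: "Total number of blocks converted to parquet.",
		}),
		failures: promauto.With(reg).NewCounter(prometheus.CounterOpts{
			Name: "cortex_parquet_converter_block_conversion_failures_total",
			Help: "Total number of block conversions that failed.",
		}),
		duration: promauto.With(reg).NewHistogram(prometheus.HistogramOpts{
			Name:    "cortex_parquet_converter_block_conversion_duration_seconds",
			Help:    "Time taken to convert a single block.",
			Buckets: prometheus.DefBuckets,
		}),
	}
}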

parquet converter in docker compose (#11633)

* add parquet-converter to docker-compose microservices setup

* format jsonnet

fix(parquet converter): close TSDB block after conversion (#11635)

parquet: vendor back from parquet-common (#11644)

introduce store-gateway.parquet-enabled flag & docs (#11722)

upgrade prometheus parquet-common dependency (#11723)

parquet store-gateways introduce stores interface (#11724)

* declare Stores interface satisfied by BucketStores and future Parquet store

* add casts for uses of existing impl which are not protected by the interface

* stub out parquet bucket stores implementation

* most minimal initialization of Parquet Bucket Stores when flag is enabled

* license header

parquet: Scaffolding for parquet bucket store Series() (#11729)

* parquet: Scaffolding for parquet bucket store

* use parquetshardopener and be sure to close them

* gci pkg/storegateway/parquet_bucket_stores.go

Signed-off-by: Jesus Vazquez <[email protected]>

---------

Signed-off-by: Jesus Vazquez <[email protected]>

fix split between Parquet Stores and each tenant's Store (#11735)

parquet store-gateways blocks sync and lazy reader (#11759)

parquet-bucket-store: finish implementing Stores interface (#11772)

We're trying to mirror the existing bucket store structure for the
parquet implementation, and in this PR I'm just trying to implement some
of the necessary methods, starting with building up the series sets for
labels calls (a hypothetical interface sketch follows this commit message):
- Series
- LabelNames
- LabelValues

---------

Signed-off-by: Jesus Vazquez <[email protected]>
Signed-off-by: Nicolás Pazos <[email protected]>
Co-authored-by: Nicolas Pazos <[email protected]>
Co-authored-by: Nicolás Pazos <[email protected]>
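
A hypothetical sketch of the Stores interface being filled in here; the real signatures in pkg/storegateway follow the store-gateway gRPC API and differ in detail:

import (
	"context"

	// Package paths assumed; in Mimir these live under pkg/storegateway.
	"github.com/grafana/mimir/pkg/storegateway/storegatewaypb"
	"github.com/grafana/mimir/pkg/storegateway/storepb"
)

type Stores interface {
	// Series streams the series matching the request to the client.
	Series(req *storepb.SeriesRequest, srv storegatewaypb.StoreGateway_SeriesServer) error
	// LabelNames returns the label names found in the matched blocks.
	LabelNames(ctx context.Context, req *storepb.LabelNamesRequest) (*storepb.LabelNamesResponse, error)
	// LabelValues returns the values of a single label name.
	LabelValues(ctx context.Context, req *storepb.LabelValuesRequest) (*storepb.LabelValuesResponse, error)
}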

fix(parquet): share `ReaderPoolMetrics` instance (#11851)

We create multiple instances of `ReaderPool`; passing the registry in and
creating the metrics on the fly causes duplicate-registration panics.
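
The general pattern behind this kind of fix, as a generic sketch (not Mimir's actual code): create and register the shared metrics once, then hand the same instance to every pool, because registering a collector with the same name twice on one registry panics.

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

type poolMetrics struct{ loads prometheus.Counter }

func newPoolMetrics(reg prometheus.Registerer) *poolMetrics {
	return &poolMetrics{
		loads: promauto.With(reg).NewCounter(prometheus.CounterOpts{
			Name: "reader_pool_loads_total",
			Help: "Total reader load operations.",
		}),
	}
}

type readerPool struct{ metrics *poolMetrics }

func newReaderPool(m *poolMetrics) *readerPool { return &readerPool{metrics: m} }

func buildPools(reg prometheus.Registerer) (*readerPool, *readerPool) {
	m := newPoolMetrics(reg) // created and registered exactly once

	// Both pools share m; calling newPoolMetrics(reg) per pool would
	// trigger prometheus' duplicate-registration panic.
	return newReaderPool(m), newReaderPool(m)
}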

fix(parquet store gateway): close things that should be closed (#11865)

feat(parquet store gateway): support download labels file without validating (#11866)

Signed-off-by: Nicolás Pazos <[email protected]>
Co-authored-by: francoposa <[email protected]>

fix(parquet store gateway): pass blockReader to bucket block constructor (#11875)

fix: don't stop nil services

fix(parquet store gateways): correctly locate labels parquet files locally (#11894)

parquet bucket store: add some debug logging (#11925)

Adding a few log statements to the existing code path with useful
information to understand when and why we are returning 0 series.

Signed-off-by: Jesus Vazquez <[email protected]>

parquet store gateways: several fixes and basic tests (#11929)

Co-authored-by: francoposa <[email protected]>
Co-authored-by: Jesus Vazquez <[email protected]>

parquet converter: include user id in converter counter metrics (#11966)

Adding user id to the converter metrics to better track converter
progress through tenants.

Signed-off-by: Jesus Vazquez <[email protected]>

Parquet converter: Implement priority queue for block conversion (#11980)

This PR redesigns the parquet converter to use a non-blocking priority
queue that prioritises recently uploaded blocks for conversion.

* Priority Queue Implementation:
- Replaces blocking nested loops with a thread-safe priority queue using
container/heap (see the sketch after this commit message)
- Blocks are prioritized by ULID timestamp, ensuring older blocks are
processed first
* Separate block discovery:
- There is a new discovery goroutine that periodically discovers users
and blocks, enqueuing them for processing
- If the block was previously processed it will be marked as converted
and skipped the next time it's discovered.
- There is a new configuration flag `parquet-converter.max-block-age`
that allows us to have a rolling window of blocks so we don't queue up
all the work at once. We can set this to 30 days and only blocks up to
30 days old will be converted; when the work is completed we can go and
increase that window again.
- There is a new processing goroutine that continuously consumes from
the priority queue and converts blocks
- The main loop remains responsive and handles only service lifecycle events
* New metrics
- Since we added a priority queue, I added 5 new metrics for queue
monitoring:
    - cortex_parquet_converter_queue_size - current queue depth
    - cortex_parquet_converter_queue_wait_time_seconds - time blocks spend queued
    - cortex_parquet_converter_queue_items_enqueued_total - total blocks enqueued
    - cortex_parquet_converter_queue_items_processed_total - total blocks processed
    - cortex_parquet_converter_queue_items_dropped_total - total blocks dropped when the queue is closed

The idea here is that by looking at the queue metrics we can gauge how
much scaling up we need to deal with the pending work. Also, before this
PR we had no idea how much work was left to be done, but now we do.

---------

Signed-off-by: Jesus Vazquez <[email protected]>
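
A compact sketch of the container/heap priority queue described in the bullets above; the item type is illustrative, locking is omitted for brevity (a real converter would guard the heap with a mutex), and the ULID handling uses github.com/oklog/ulid:

import (
	"container/heap"

	"github.com/oklog/ulid"
)

type blockItem struct {
	user string
	id   ulid.ULID
}

// blockQueue orders blocks by ULID timestamp, oldest first.
type blockQueue []blockItem

func (q blockQueue) Len() int           { return len(q) }
func (q blockQueue) Less(i, j int) bool { return q[i].id.Time() < q[j].id.Time() }
func (q blockQueue) Swap(i, j int)      { q[i], q[j] = q[j], q[i] }
func (q *blockQueue) Push(x any)        { *q = append(*q, x.(blockItem)) }
func (q *blockQueue) Pop() any {
	old := *q
	item := old[len(old)-1]
	*q = old[:len(old)-1]
	return item
}

// enqueue is what the discovery goroutine would call per discovered block.
func enqueue(q *blockQueue, b blockItem) {
	heap.Push(q, b)
}

// nextBlock pops the oldest block; the processing goroutine would call
// this in a loop and convert the returned block.
func nextBlock(q *blockQueue) blockItem {
	return heap.Pop(q).(blockItem)
}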

fix(parquet store gateway): obey query sharding matchers (#12018)

Inefficient, but at least correct query sharding. The new test on
sharding fails on the base branch.

It's not trivial to add caching to the hashes like the main path does,
because we don't have a `SeriesRef` to use as a cache key at the block
level (to match what the main path does). We could in theory use
something like the row number in the parquet file, but we don't have
easy access to that in this part of the code. In any case, the priority
right now is correctness, we'll work on optimizing later as appropriate.

For reference, see how query sharding is handled on the main path:
https://github.com/grafana/mimir/blob/604775d447c0a9e893fa6930ef8f2d403ebe6757/pkg/storegateway/series_refs.go#L1021-L1047
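
The inefficient-but-correct check this describes boils down to hashing each series' labels on the fly instead of caching by SeriesRef; labels.Labels.Hash() is the real Prometheus API, while the function itself is illustrative:

import "github.com/prometheus/prometheus/model/labels"

// keepSeries reports whether lbls belongs to the requested query shard.
func keepSeries(lbls labels.Labels, shardIndex, shardCount uint64) bool {
	// Recomputed per series per block: no SeriesRef-keyed hash cache here.
	return lbls.Hash()%shardCount == shardIndex
}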

fix(parquet store gateway): panic in Series call with SkipChunks (#12020)

`chunksIt` is `nil` when `SkipChunks` is `true`.

parquet-converter debug log messages (#12021)

Co-authored-by: Jesus Vazquez <[email protected]>

chore(parquet): Bump parquet-common dependency (#12023)

Brings the last commit from parquet-common
[0811a700a852759c16799358b4424d9888afec3f](prometheus-community/parquet-common@0811a70)

See link for the diff between the two commits
prometheus-community/parquet-common@76512c6...0811a70

---------

Co-authored-by: francoposa <[email protected]>

feature(parquet): Implement store-gateway limits (#12040)

This PR is based on the upstream work
prometheus-community/parquet-common#81

The idea is to implement a set of basic quota limiters that can protect
the gateways against potentially bad queries (a sketch follows below).

Note we had to bring in bits of the code that upstream keeps in its
querier, because we have our own chunk querier in Mimir.

---------

Signed-off-by: Jesus Vazquez <[email protected]>
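
A minimal sketch of such a quota limiter, assuming a fixed budget drawn down per query; the real limiters from prometheus-community/parquet-common#81 differ in detail:

import (
	"fmt"
	"sync"
)

type quotaLimiter struct {
	mtx       sync.Mutex
	remaining int64
}

func newQuotaLimiter(limit int64) *quotaLimiter {
	return &quotaLimiter{remaining: limit}
}

// reserve consumes n units of quota, or reports that the limit was hit
// so the gateway can fail the query instead of doing unbounded work.
func (l *quotaLimiter) reserve(n int64) error {
	l.mtx.Lock()
	defer l.mtx.Unlock()
	if n > l.remaining {
		return fmt.Errorf("quota exceeded: requested %d, remaining %d", n, l.remaining)
	}
	l.remaining -= n
	return nil
}
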
jesusvazquez pushed a commit that referenced this pull request Aug 2, 2025
jesusvazquez pushed a commit that referenced this pull request Aug 2, 2025
jesusvazquez pushed a commit that referenced this pull request Aug 8, 2025
jesusvazquez pushed a commit that referenced this pull request Aug 8, 2025
* bring in prometheus/parquet-common code to new package; replace efficient-go errors with pkg/errors; satisfy mimir-prometheus ChunkSeries interface

* revert breaking upgrade to thanos/objstore

* fix test require

* attempt to update go version for strange errors

* fix stringlabels issues

* update license headers with AGPL and upstream attribution

* fix errors.Is lints

fix errors.Is lints

* fix sort and cancel cause lints

* correct go.mod & vendor in from main to solve conflicts

* use env var to flag parquet promql acceptance

* fix deps from main again

* fix deps from main again

* fix deps from main again

* fix deps from main again

implement new parquet-converter service (#11499)

* bring in parquet-converter from parquet-mimir PoC

* make docs

* make reference-help

* stop using the compactor's config

* remove BlockRanges config, convert all levels of blocks

* drop unused BlockWithExtension struct

* rename ownBlock to own

* move index fetch outside of for loop

* lowercase logs

* wording: compact => convert

* some cleanup

* skip blocks for which compaction mark failed download

* simplfy convertBlock function

* cleanup

* Write Compact Mark

* remove parquetIndex, we don't neeed it

yet at least

* use MetaFetcher to discover blocks

* make reference-help and mark as experimental

* cleanup: we don't need indexes anymore

* revert index loader changes

* basic TestParquetConverter

* make reference-help

* lint

* happy linter

* make docs

* fix: correctly initialize memerlist KV for  parquet converter

* lint: sort lines

* more wording fixes: compact => convert

* licence header

* version 1

* remove parquet-converter from 'backend' and 'all' modules

it's experimental and meant to be run alone

* address docs feedback

* remove unused consts

* increase timeout for a test

TestPartitionReader_ShouldNotMissRecordsIfKafkaReturnsAFetchBothWithAnErrorAndSomeRecords

parquet-converter: Introduce metrics and ring test (#11600)

* parquet-converter: Introduce metrics and ring test

This commit introduces a ring test to verify that sharding is working as
expected.

It also introduces metrics to measure total conversions, failures and
durations.

Signed-off-by: Jesus Vazquez <[email protected]>

converter: proper error handling to measure failures

parquet converter in docker compose (#11633)

* add parquet-converter to docker-compose microservices setup

* format jsonnet

fix(parquet converter): close TSDB block after conversion (#11635)

parquet: vendor back from parquet-common (#11644)

introduce store-gateway.parquet-enabled flag & docs (#11722)

upgrade prometheus parquet-common dependency (#11723)

parquet store-gateways introduce stores interface (#11724)

* declare Stores interface satisfied by BucketStores and future Parquet store

* add casts to for uses of existing impl which are not protected by interface

* stub out parquet bucket stores implementation

* most minimal initialization of Parquet Bucket Stores when flag is enabled

* license header

parquet: Scaffolding for parquet bucket store Series() (#11729)

* parquet: Scaffolding for parquet bucket store

* use parquetshardopener and be sure to close them

* gci pkg/storegateway/parquet_bucket_stores.go

Signed-off-by: Jesus Vazquez <[email protected]>

---------

Signed-off-by: Jesus Vazquez <[email protected]>

fix split between Parquet Stores and each tenant's Store (#11735)

parquet store-gateways blocks sync and lazy reader (#11759)

parquet-bucket-store: finish implementing Stores interface (#11772)

We're trying to mirror the existing bucket store structure for the
parquet implementation. In this PR I'm implementing some of the
necessary methods, starting with building up the series sets for the
labels calls:
- Series
- LabelNames
- LabelValues

---------

Signed-off-by: Jesus Vazquez <[email protected]>
Signed-off-by: Nicolás Pazos <[email protected]>
Co-authored-by: Nicolas Pazos <[email protected]>
Co-authored-by: Nicolás Pazos <[email protected]>

fix(parquet): share `ReaderPoolMetrics` instance (#11851)

We create multiple instances of `ReaderPool`; passing the registry and
creating the metrics on the fly causes panics.
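
The failure mode and the fix, sketched with the Prometheus client library (the type and metric names here are simplified stand-ins):

```go
package main

import "github.com/prometheus/client_golang/prometheus"

// readerPoolMetrics is a simplified stand-in for the shared metrics type.
type readerPoolMetrics struct {
	opens prometheus.Counter
}

func newReaderPoolMetrics(reg prometheus.Registerer) *readerPoolMetrics {
	m := &readerPoolMetrics{
		opens: prometheus.NewCounter(prometheus.CounterOpts{
			Name: "reader_pool_opens_total",
			Help: "Total number of reader opens.",
		}),
	}
	reg.MustRegister(m.opens) // registering the same metric name twice panics
	return m
}

func main() {
	reg := prometheus.NewRegistry()
	shared := newReaderPoolMetrics(reg) // create the metrics once...
	_ = shared
	// ...and hand the same instance to every ReaderPool. Calling
	// newReaderPoolMetrics(reg) again would panic with a duplicate
	// collector registration error.
}
```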

fix(parquet store gateway): close things that should be closed (#11865)

feat(parquet store gateway): support download labels file without validating (#11866)

Signed-off-by: Nicolás Pazos <[email protected]>
Co-authored-by: francoposa <[email protected]>

fix(parquet store gateway): pass blockReader to bucket block constructor (#11875)

fix: don't stop nil services

fix(parquet store gateways): correctly locate labels parquet files locally (#11894)

parquet bucket store: add some debug logging (#11925)

Adding a few log statements to the existing code path with useful
information to understand when and why we are returning 0 series.

Signed-off-by: Jesus Vazquez <[email protected]>

parquet store gateways: several fixes and basic tests (#11929)

Co-authored-by: francoposa <[email protected]>
Co-authored-by: Jesus Vazquez <[email protected]>

parquet converter: include user id in converter counter metrics (#11966)

Adding the user ID to the converter metrics to better track converter
progress across tenants.

Signed-off-by: Jesus Vazquez <[email protected]>

Parquet converter: Implement priority queue for block conversion (#11980)

This PR redesigns the parquet converter to use a non-blocking priority
queue that prioritises recently uploaded blocks for conversion.

* Priority Queue Implementation:
- Replaces blocking nested loops with a thread-safe priority queue using
container/heap
- Blocks are prioritized by ULID timestamp, ensuring older blocks are
processed first
* Separate block discovery:
- There is a new discovery goroutine that periodically discovers users
and blocks, enqueuing them for processing
- If the block was previously processed it will be marked as converted
and skipped the next time it's discovered.
- There is a new configuration flag `parquet-converter.max-block-age`
that gives us a rolling window of blocks so we don't queue up all the
work at once. We can set this to 30 days so that only blocks up to 30
days old are converted; once that work is completed we can increase the
window again.
- There is a new processing goroutine that continuously consumes from
the priority queue and converts blocks
- The main loop remains responsive and handles only service lifecycle
events
* New metrics
- Since we added a priority queue, I added 5 new metrics for queue
monitoring:
    - cortex_parquet_converter_queue_size - Current queue depth
    - cortex_parquet_converter_queue_wait_time_seconds - Time blocks spend queued
    - cortex_parquet_converter_queue_items_enqueued_total - Total blocks enqueued
    - cortex_parquet_converter_queue_items_processed_total - Total blocks processed
    - cortex_parquet_converter_queue_items_dropped_total - Total blocks dropped when the queue is closed

The idea here is that the queue metrics give us a sense of how much we
need to scale up to deal with the pending work. Before this PR we had no
visibility into how much work was left to be done; now we will.

---------

Signed-off-by: Jesus Vazquez <[email protected]>
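
For illustration, a minimal sketch of the queue ordering with container/heap and ULID timestamps, assuming the oklog/ulid package; the real queue additionally handles locking, blocking waits, and shutdown:

```go
package main

import (
	"container/heap"
	"fmt"

	"github.com/oklog/ulid/v2"
)

// blockItem is one block pending conversion, identified by tenant and block ULID.
type blockItem struct {
	user string
	id   ulid.ULID
}

// blockQueue is a min-heap ordered by ULID timestamp: the oldest block pops first.
type blockQueue []blockItem

func (q blockQueue) Len() int           { return len(q) }
func (q blockQueue) Less(i, j int) bool { return q[i].id.Time() < q[j].id.Time() }
func (q blockQueue) Swap(i, j int)      { q[i], q[j] = q[j], q[i] }

func (q *blockQueue) Push(x any) { *q = append(*q, x.(blockItem)) }

func (q *blockQueue) Pop() any {
	old := *q
	item := old[len(old)-1]
	*q = old[:len(old)-1]
	return item
}

func main() {
	q := &blockQueue{}
	heap.Push(q, blockItem{user: "tenant-1", id: ulid.MustNew(2000, nil)}) // newer block
	heap.Push(q, blockItem{user: "tenant-1", id: ulid.MustNew(1000, nil)}) // older block
	next := heap.Pop(q).(blockItem)
	fmt.Println(next.id.Time()) // 1000: the older block is converted first
}
```
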

fix(parquet store gateway): obey query sharding matchers (#12018)

Inefficient, but at least correct query sharding. The new test on
sharding fails on the base branch.

It's not trivial to add caching to the hashes like the main path does,
because we don't have a `SeriesRef` to use as a cache key at the block
level (to match what the main path does). We could in theory use
something like the row number in the parquet file, but we don't have
easy access to that in this part of the code. In any case, the priority
right now is correctness, we'll work on optimizing later as appropriate.

For reference, see how query sharding is handled on the main path:
https://github.com/grafana/mimir/blob/604775d447c0a9e893fa6930ef8f2d403ebe6757/pkg/storegateway/series_refs.go#L1021-L1047
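
A sketch of the uncached check this implements, assuming the shard hash is the plain label-set hash from the Prometheus labels package (the hash Mimir actually shards on may differ):

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/model/labels"
)

// belongsToShard reports whether a series falls into shard shardIndex of
// shardCount by hashing its full label set. This is the uncached form:
// every series is re-hashed on every call, because there is no SeriesRef
// available here to key a cache on.
func belongsToShard(ls labels.Labels, shardIndex, shardCount uint64) bool {
	return ls.Hash()%shardCount == shardIndex
}

func main() {
	s := labels.FromStrings("__name__", "up", "job", "mimir")
	fmt.Println(belongsToShard(s, 0, 2), belongsToShard(s, 1, 2)) // exactly one is true
}
```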

fix(parquet store gateway): panic in Series call with SkipChunks (#12020)

`chunksIt` is `nil` when `SkipChunks` is `true`.
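
The shape of the fix, with illustrative names rather than the actual store-gateway code:

```go
package main

import "fmt"

// chunkIterator stands in for the real chunks iterator type.
type chunkIterator interface{ Next() bool }

// countChunks consumes the iterator unless chunks were skipped. The
// iterator is nil when the caller set SkipChunks, so the guard must
// come before any use of it.
func countChunks(it chunkIterator, skipChunks bool) int {
	if skipChunks || it == nil {
		return 0 // labels-only request: never touch the nil iterator
	}
	n := 0
	for it.Next() {
		n++
	}
	return n
}

func main() {
	fmt.Println(countChunks(nil, true)) // 0, instead of a nil-pointer panic
}
```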

parquet-converter debug log messages (#12021)

Co-authored-by: Jesus Vazquez <[email protected]>

chore(parquet): Bump parquet-common dependency (#12023)

Brings in the latest commit from parquet-common
[0811a700a852759c16799358b4424d9888afec3f](prometheus-community/parquet-common@0811a70)

See link for the diff between the two commits
prometheus-community/parquet-common@76512c6...0811a70

---------

Co-authored-by: francoposa <[email protected]>

feature(parquet): Implement store-gateway limits (#12040)

This PR is based on the upstream work
prometheus-community/parquet-common#81

The idea is to implement a set of basic quota limiters that can protect
the gateways against potentially bad queries.

Note we had to bring in bits of the querier code from upstream because
we have our own chunk querier in Mimir.

---------

Signed-off-by: Jesus Vazquez <[email protected]>
francoposa added a commit that referenced this pull request Aug 11, 2025
francoposa added a commit that referenced this pull request Aug 11, 2025
francoposa added a commit that referenced this pull request Aug 12, 2025
npazosmendez pushed a commit that referenced this pull request Aug 18, 2025
jesusvazquez pushed a commit that referenced this pull request Sep 19, 2025
npazosmendez pushed a commit that referenced this pull request Oct 6, 2025
* bring in prometheus/parquet-common code to new package; replace efficient-go errors with pkg/errors; satisfy mimir-prometheus ChunkSeries interface

* revert breaking upgrade to thanos/objstore

* fix test require

* attempt to update go version for strange errors

* fix stringlabels issues

* update license headers with AGPL and upstream attribution

* fix errors.Is lints

fix errors.Is lints

* fix sort and cancel cause lints

* correct go.mod & vendor in from main to solve conflicts

* use env var to flag parquet promql acceptance

* fix deps from main again

* fix deps from main again

* fix deps from main again

* fix deps from main again

implement new parquet-converter service (#11499)

* bring in parquet-converter from parquet-mimir PoC

* make docs

* make reference-help

* stop using the compactor's config

* remove BlockRanges config, convert all levels of blocks

* drop unused BlockWithExtension struct

* rename ownBlock to own

* move index fetch outside of for loop

* lowercase logs

* wording: compact => convert

* some cleanup

* skip blocks for which compaction mark failed download

* simplfy convertBlock function

* cleanup

* Write Compact Mark

* remove parquetIndex, we don't neeed it

yet at least

* use MetaFetcher to discover blocks

* make reference-help and mark as experimental

* cleanup: we don't need indexes anymore

* revert index loader changes

* basic TestParquetConverter

* make reference-help

* lint

* happy linter

* make docs

* fix: correctly initialize memerlist KV for  parquet converter

* lint: sort lines

* more wording fixes: compact => convert

* licence header

* version 1

* remove parquet-converter from 'backend' and 'all' modules

it's experimental and meant to be run alone

* address docs feedback

* remove unused consts

* increase timeout for a test

TestPartitionReader_ShouldNotMissRecordsIfKafkaReturnsAFetchBothWithAnErrorAndSomeRecords

parquet-converter: Introduce metrics and ring test (#11600)

* parquet-converter: Introduce metrics and ring test

This commit introduces a ring test to verify that sharding is working as
expected.

It also introduces metrics to measure total conversions, failures and
durations.

Signed-off-by: Jesus Vazquez <[email protected]>

converter: proper error handling to measure failures

parquet converter in docker compose (#11633)

* add parquet-converter to docker-compose microservices setup

* format jsonnet

fix(parquet converter): close TSDB block after conversion (#11635)

parquet: vendor back from parquet-common (#11644)

introduce store-gateway.parquet-enabled flag & docs (#11722)

upgrade prometheus parquet-common dependency (#11723)

parquet store-gateways introduce stores interface (#11724)

* declare Stores interface satisfied by BucketStores and future Parquet store

* add casts to for uses of existing impl which are not protected by interface

* stub out parquet bucket stores implementation

* most minimal initialization of Parquet Bucket Stores when flag is enabled

* license header

parquet: Scaffolding for parquet bucket store Series() (#11729)

* parquet: Scaffolding for parquet bucket store

* use parquetshardopener and be sure to close them

* gci pkg/storegateway/parquet_bucket_stores.go

Signed-off-by: Jesus Vazquez <[email protected]>

---------

Signed-off-by: Jesus Vazquez <[email protected]>

fix split between Parquet Stores and each tenant's Store (#11735)

fix split between Parquet Stores and each tenant's Store

parquet store-gateways blocks sync and  lazy reader (#11759)

parquet-bucket-store: finish implementing Stores interface (#11772)

We're trying to mirror the existing bucket store structure for the
parquet implementation and in this PR i'm just trying to implement some
of the necessary methods starting with building up the series sets for
labels calls.
- Series
- LabelNames
- LabelValues

---------

Signed-off-by: Jesus Vazquez <[email protected]>
Signed-off-by: Nicolás Pazos <[email protected]>
Co-authored-by: Nicolas Pazos <[email protected]>
Co-authored-by: Nicolás Pazos <[email protected]>

fix(parquet): share `ReaderPoolMetrics` instance (#11851)

We create multiple instances of `ReaderPool`, passing the registry and
creating the metrics on the fly causes panics.

fix(parquet store gateway): close things that should be closed (#11865)

feat(parquet store gateway): support download labels file without validating (#11866)

Signed-off-by: Nicolás Pazos <[email protected]>
Co-authored-by: francoposa <[email protected]>

fix(parquet store gateway): pass blockReader to bucket block constructor (#11875)

fix: don't stop nil services

fix(parquet store gateways): correctly locate labels parquet files locally (#11894)

parquet bucket store: add some debug logging (#11925)

Adding few log statements to the existing code path with useful
information to understand when and why we are returning 0 series.

Signed-off-by: Jesus Vazquez <[email protected]>

parquet store gateways: several fixes and basic tests (#11929)

Co-authored-by: francoposa <[email protected]>
Co-authored-by: Jesus Vazquez <[email protected]>

parquet converter: include user id in converter counter metrics (#11966)

Adding the user ID to the converter metrics to better track converter
progress through tenants.
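
In Prometheus client terms this presumably means moving from plain counters to counter vectors keyed by a user label; a hedged sketch with assumed metric and label names:

```go
package converter

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// newPerUserConversions is a hypothetical sketch: the same conversion
// counter as before, but with a "user" label so progress can be tracked
// per tenant. Metric and label names are assumptions.
func newPerUserConversions(reg prometheus.Registerer) *prometheus.CounterVec {
	return promauto.With(reg).NewCounterVec(prometheus.CounterOpts{
		Name: "cortex_parquet_converter_conversions_total",
		Help: "Total number of successful parquet block conversions.",
	}, []string{"user"})
}

// Usage: conversions.WithLabelValues(userID).Inc() after each block.
```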

Signed-off-by: Jesus Vazquez <[email protected]>

Parquet converter: Implement priority queue for block conversion (#11980)

This PR redesigns the parquet converter to use a non-blocking priority
queue that prioritizes recently uploaded blocks for conversion.

* Priority Queue Implementation:
- Replaces blocking nested loops with a thread-safe priority queue built
on container/heap (see the sketch after this section)
- Blocks are prioritized by ULID timestamp, ensuring older blocks are
processed first
* Separate block discovery:
- A new discovery goroutine periodically discovers users and blocks,
enqueuing them for processing
- If a block was previously processed, it is marked as converted and
skipped the next time it's discovered
- A new configuration flag `parquet-converter.max-block-age` gives us a
rolling window of blocks so we don't queue up all the work at once: set
it to 30 days and only blocks up to 30 days old will be converted; once
that work is completed, we can widen the window again
- A new processing goroutine continuously consumes from the priority
queue and converts blocks
- The main loop remains responsive and handles only service lifecycle
events
* New metrics
- Since we added a priority queue, I added 5 new metrics for queue
monitoring:
  - cortex_parquet_converter_queue_size - Current queue depth
  - cortex_parquet_converter_queue_wait_time_seconds - Time blocks spend queued
  - cortex_parquet_converter_queue_items_enqueued_total - Total blocks enqueued
  - cortex_parquet_converter_queue_items_processed_total - Total blocks processed
  - cortex_parquet_converter_queue_items_dropped_total - Total blocks dropped when the queue is closed

The idea is that the queue metrics give us a sense of how much scaling
up is needed to handle the pending work. Before this PR we had no
visibility into how much work was left; now we will.
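
As a rough sketch of the queue mechanics described above (type and field names are illustrative, not the PR's): a `container/heap` ordered by the timestamp embedded in each block's ULID, oldest first, wrapped in a mutex so the discovery and processing goroutines can share it.

```go
package converter

import (
	"container/heap"
	"sync"

	"github.com/oklog/ulid/v2"
)

// queuedBlock is a hypothetical queue entry: a tenant plus a block ULID.
type queuedBlock struct {
	userID string
	block  ulid.ULID
}

// blockHeap implements heap.Interface, ordering blocks by the timestamp
// embedded in their ULID so older blocks are popped first.
type blockHeap []queuedBlock

func (h blockHeap) Len() int           { return len(h) }
func (h blockHeap) Less(i, j int) bool { return h[i].block.Time() < h[j].block.Time() }
func (h blockHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *blockHeap) Push(x any)        { *h = append(*h, x.(queuedBlock)) }
func (h *blockHeap) Pop() any {
	old := *h
	item := old[len(old)-1]
	*h = old[:len(old)-1]
	return item
}

// conversionQueue wraps the heap with a mutex so a discovery goroutine can
// enqueue while a processing goroutine dequeues.
type conversionQueue struct {
	mu sync.Mutex
	h  blockHeap
}

func (q *conversionQueue) enqueue(b queuedBlock) {
	q.mu.Lock()
	defer q.mu.Unlock()
	heap.Push(&q.h, b)
}

// dequeue returns the oldest queued block, or ok=false if the queue is
// empty. It never blocks, matching the PR's description.
func (q *conversionQueue) dequeue() (queuedBlock, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.h.Len() == 0 {
		return queuedBlock{}, false
	}
	return heap.Pop(&q.h).(queuedBlock), true
}
```

The non-blocking dequeue is what keeps the main loop free to handle only lifecycle events: the processing goroutine polls the queue instead of blocking on it.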

---------

Signed-off-by: Jesus Vazquez <[email protected]>

fix(parquet store gateway): obey query sharding matchers (#12018)

Inefficient, but at least correct, query sharding. The new test on
sharding fails on the base branch.

It's not trivial to add caching to the hashes like the main path does,
because we don't have a `SeriesRef` to use as a cache key at the block
level (to match what the main path does). We could in theory use
something like the row number in the parquet file, but we don't have
easy access to that in this part of the code. In any case, the priority
right now is correctness; we'll work on optimizing later as appropriate.

For reference, see how query sharding is handled on the main path:
https://github.com/grafana/mimir/blob/604775d447c0a9e893fa6930ef8f2d403ebe6757/pkg/storegateway/series_refs.go#L1021-L1047
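
For context, query sharding keeps only the series whose label hash falls into the requested shard. A minimal, uncached version of the check, assuming Prometheus' `labels.StableHash` (the exact hash helper used here may differ):

```go
package storegateway

import "github.com/prometheus/prometheus/model/labels"

// seriesMatchesShard reports whether a series belongs to the given query
// shard. Without a SeriesRef-keyed cache, the hash is recomputed for every
// series on every call, which is the inefficiency mentioned above.
func seriesMatchesShard(lbls labels.Labels, shardIndex, shardCount uint64) bool {
	return labels.StableHash(lbls)%shardCount == shardIndex
}
```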

fix(parquet store gateway): panic in Series call with SkipChunks (#12020)

`chunksIt` is `nil` when `SkipChunks` is `true`.
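
The shape of the fix is presumably a guard along these lines (a hypothetical helper, not the actual diff):

```go
package storegateway

import "github.com/prometheus/prometheus/tsdb/chunks"

// appendChunks drains a chunk iterator into dst, tolerating the nil
// iterator that a SkipChunks request produces.
func appendChunks(dst []chunks.Meta, it chunks.Iterator, skipChunks bool) ([]chunks.Meta, error) {
	if skipChunks || it == nil {
		return dst, nil
	}
	for it.Next() {
		dst = append(dst, it.At())
	}
	return dst, it.Err()
}
```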

parquet-converter debug log messages (#12021)

Co-authored-by: Jesus Vazquez <[email protected]>

chore(parquet): Bump parquet-common dependency (#12023)

Brings in the latest commit from parquet-common,
[0811a700a852759c16799358b4424d9888afec3f](prometheus-community/parquet-common@0811a70).

See this link for the diff between the two commits:
prometheus-community/parquet-common@76512c6...0811a70

---------

Co-authored-by: francoposa <[email protected]>

feature(parquet): Implement store-gateway limits (#12040)

This PR is based on the upstream work
prometheus-community/parquet-common#81

The idea is to implement a set of basic quota limiters that can protect
the gateways against potentially bad queries.

Note that we had to bring in bits of code that live in the querier
upstream, because we have our own chunk querier in Mimir.

---------

Signed-off-by: Jesus Vazquez <[email protected]>
npazosmendez pushed a commit that referenced this pull request Oct 17, 2025
npazosmendez pushed a commit that referenced this pull request Oct 28, 2025