Don't write to chunk cache if query results are outside of a desired window of time #14983

Open
mveitas opened this issue Nov 16, 2024 · 7 comments · May be fixed by #15393
Labels
component/cache type/feature Something new we should do

Comments

@mveitas
Contributor

mveitas commented Nov 16, 2024

Is your feature request related to a problem? Please describe.

NOTE - This issue assumes that queriers write back to the chunk cache. I read somewhere that this happens, but I could be 100% wrong, which would make this issue invalid.

We recently had a requirement to keep our logs actively searchable for 365 days. As an organization we are looking to optimize searches within the past 30 days and keep as much data for this window in the cache as possible, to avoid having to go to object storage to retrieve data.

From my understanding, when a query is run, the chunks that are pulled back from object storage are written to the chunks cache. We are looking to avoid the cache churn caused by infrequent queries that reach beyond the desired duration (in this case 30 days). With this configuration we would reduce the amount of data being written to the chunks cache and reduce the evictions caused by older data that might never be viewed again.

Describe the solution you'd like
A configuration option that specifies a duration within which the querier is allowed to write back to the chunk cache. If a chunk falls outside of this duration, the querier would not write it back to the chunk cache.

Describe alternatives you've considered
The alternative is to build a bigger caching infrastructure that covers a longer duration and absorbs the older, infrequently accessed data written to the cache.

@mveitas
Contributor Author

mveitas commented Nov 25, 2024

Maybe this is not a valid issue now that I have discovered the query-ingesters-within configuration option.

Do the chunks that are pulled from the store still get written to the chunk cache if the chunks are outside of the query-ingesters-within window?

@Jayclifford345
Contributor

Hi @mveitas, spoke with the Loki team today. @ashwanthgoli is going to investigate a little further but it still might be worth implementing this feature on initial glance.

@Jayclifford345
Contributor

@salvacorts has also mentioned it might be worth handing queries older than 30 days to L2 cache.

We use it ourselves: e.g. we store last 3d chunks on the L1 cache (SSD-backed) and anything older than 3d is stored on the L2 (memory-backed)

I haven't tested this myself yet but I have tracked down the cache config for you:
https://grafana.com/docs/loki/latest/configure/#chunk_store_config

# Chunks will be handed off to the L2 cache after this duration. 0 to disable L2
# cache.
# CLI flag: -store.chunks-cache-l2.handoff
[l2_chunk_cache_handoff: <duration> | default = 0s]

@ashwanthgoli
Contributor

ashwanthgoli commented Nov 26, 2024

Thanks @Jayclifford345! l2 cache would be one way to reduce the churn for the desired duration. Any chunks older than l2_chunk_cache_handoff would be routed to l2 cache which can be configured under chunk_cache_config_l2

> A configuration option is provided that would allow a duration to be specified that would allow the querier to write back to the chunk cache. If the chunk falls outside of this duration, it would not write the result back to the chunk cache.

afaik there is no support to do this, happy to help if you would like to contribute.

there is a hacky way to achieve what you want, i haven't tested it but it might work.

  • configure chunk_cache_config_l2 to point to the same cache instance as l1 and set l2_chunk_cache_handoff to 30d
  • update the writeback_buffer to a very small value for l2 cache so we drop most of the chunks by limiting the length of the queue that writes to the cache

edit: in addition setting writeback_goroutines to 0 might be more effective as it stops consuming from the queue
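
The idea behind this workaround can be illustrated with a small Go sketch of a bounded, non-blocking write-back queue. This is a toy model under stated assumptions, not Loki's background cache: `writeBackQueue`, `newWriteBackQueue`, and `enqueue` are hypothetical names, and whether Loki accepts a goroutine count of 0 is untested, as noted above.

```go
package main

import "fmt"

// writeBackQueue is a hypothetical bounded queue of cache keys waiting to be
// written. It stands in for a write-back buffer with a fixed length.
type writeBackQueue struct {
	pending chan string
	dropped int
}

func newWriteBackQueue(buffer int) *writeBackQueue {
	return &writeBackQueue{pending: make(chan string, buffer)}
}

// enqueue never blocks: if the buffer is full, the write is dropped and
// counted instead of waiting for a consumer.
func (q *writeBackQueue) enqueue(key string) bool {
	select {
	case q.pending <- key:
		return true
	default:
		q.dropped++
		return false
	}
}

func main() {
	// A tiny buffer and no consumer goroutines: once the buffer fills,
	// every further write is dropped.
	q := newWriteBackQueue(2)
	for i := 0; i < 5; i++ {
		q.enqueue(fmt.Sprintf("chunk-%d", i))
	}
	fmt.Println("queued:", len(q.pending), "dropped:", q.dropped)
	// prints "queued: 2 dropped: 3"
}
```

In this model a small buffer only limits how many writes get through while consumers are busy; with no consumers at all, nothing beyond the initial buffer is ever accepted, which is why stopping consumption would be the more effective of the two knobs.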

@JStickler added the type/feature and component/cache labels on Nov 26, 2024
@mveitas
Contributor Author

mveitas commented Dec 11, 2024

@ashwanthgoli This is something we are going to take on as we have a very large retention window that we need to have in place. While your workaround with the L2 cache would probably work, it's cleaner to have a separate configuration.

@mveitas
Contributor Author

mveitas commented Dec 11, 2024

As an alternative would it make sense to create a no-op cache implementation and configure the L2 cache to use that?

@ashwanthgoli
Contributor

prefer adding a new config to not store chunks older than the configured period
simpler than configuring chunk_cache_config_l2 with no-op cache, but not opposed to this either :)
