
fix(base-cluster/loki): adjust retention settings for loki logs #1745

Merged

marvinWolff merged 2 commits into main from fix/loki-retention on Oct 27, 2025

Conversation

@tasches (Collaborator) commented Oct 17, 2025

Summary by CodeRabbit

  • Chores
    • Updated log retention to 45 days.
    • Introduced compactor tuning: 15m compaction interval, 24h deletion delay, and 150 deletion workers.
    • Removed the legacy retention mechanism and consolidated retention/deletion configuration.

@tasches tasches self-assigned this Oct 17, 2025
@coderabbitai (Bot) commented Oct 17, 2025

Walkthrough

Replaces Loki HelmRelease tableManager retention configuration with limits_config.retention_period: 45d, adds compactor settings (compaction_interval: 15m, retention_delete_delay: 24h, retention_delete_worker_count: 150), and removes the prior tableManager block.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Loki Helm values (monitoring/logs)<br>`charts/base-cluster/templates/monitoring/logs/loki.yaml` | Added `limits_config.retention_period: 45d`; added compactor settings: `compaction_interval: 15m`, `retention_delete_delay: 24h`, `retention_delete_worker_count: 150`; removed the `tableManager` block that had `retention_deletes_enabled: true` and `retention_period: 6w`. |
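
For orientation, a minimal sketch of what the resulting values block likely looks like, assuming the `loki.loki` nesting referenced in the sequence diagram below (the surrounding keys and the `retention_enabled` flag are assumptions, not shown in this PR):

```yaml
loki:
  loki:
    limits_config:
      retention_period: 45d              # replaces the removed tableManager 6w (42d) retention
    compactor:
      retention_enabled: true            # assumption: required for compactor-driven retention deletion
      compaction_interval: 15m
      retention_delete_delay: 24h
      retention_delete_worker_count: 150
```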

Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Helm as Helm chart
    participant Values as Loki values
    participant Loki as Loki components
    participant Compactor as Compactor process

    Helm->>Values: apply `loki.loki` values
    Values->>Loki: set `limits_config.retention_period = 45d`
    Values->>Compactor: configure compactor settings
    note over Compactor: compaction_interval: 15m\nretention_delete_delay: 24h\nretention_delete_worker_count: 150
    Compactor->>Loki: perform compaction & retention deletion runs
    note right of Loki: `tableManager` retention config removed
```

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Files to check closely:
    • charts/base-cluster/templates/monitoring/logs/loki.yaml — verify YAML structure, indentation, and compatibility with Loki Helm chart version and schema.
    • Ensure removal of tableManager does not leave deprecated references elsewhere.

Suggested reviewers

  • teutonet-bot

Poem

🐰 I hopped through charts with a twitch and a hum,
Set logs to forty-five days — tidy and plumb.
Compactors now chatter every fifteen minutes bright,
Old tableManager vanished into the night,
I nibble on code and watch deletions wait one day with delight.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "fix(base-cluster/loki): adjust retention settings for loki logs" directly and accurately describes the main changes in the pull request. The modifications—adding a retention_period to limits_config, configuring the compactor with retention deletion settings, and removing the legacy tableManager retention configuration—are all about adjusting how Loki handles log retention. The title is concise (63 characters, 7 words), uses clear language, and follows the conventional commits format. A teammate scanning the git history would immediately understand that this PR involves changes to Loki's retention configuration. |
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/loki-retention


@coderabbitai (Bot) left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b61010 and 711d502.

📒 Files selected for processing (1)
  • charts/base-cluster/templates/monitoring/logs/loki.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Update release-please config file for a possibly new chart
  • GitHub Check: check licenses
  • GitHub Check: lint helm chart (base-cluster)
🔇 Additional comments (2)
charts/base-cluster/templates/monitoring/logs/loki.yaml (2)

80-81: Retention configuration migrated to limits_config.

The shift from legacy tableManager to limits_config with explicit retention_period is the correct modern approach for Loki 2.9+. The 45-day retention (45d) is a reasonable increase from the previous 6-week (42d) setting.

Please confirm:

  • What is the target Loki Helm chart version? (Ensure it supports limits_config & compactor retention settings)
  • Is there a migration plan for existing Loki instances or data that may be affected by the retention change?
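
To make the migration concrete, a hedged before/after sketch (the old block is reconstructed from the change summary above; its exact placement in the chart values is an assumption):

```yaml
# Before (legacy tableManager retention, removed in this PR):
loki:
  tableManager:
    retention_deletes_enabled: true
    retention_period: 6w

# After (compactor-based retention via limits_config):
loki:
  loki:
    limits_config:
      retention_period: 45d
```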

85-87: Verify the aggressive retention_delete_delay setting.

The default retention_delete_delay is 2 hours, but this configuration sets it to 1m. The 2-hour default deliberately introduces a buffer before logs are permanently deleted, leaving a recovery window in case of accidental deletion or misconfiguration.

The 1-minute setting is unusually aggressive. The delay serves two purposes: it gives index gateways time to pull updated index files that no longer reference marked chunks, so queries don't fail against deleted chunks, and it provides a short window to cancel chunk deletion after a configuration mistake. One minute may be too tight for either purpose.

By contrast, retention_delete_worker_count: 150 matches the default value and is appropriate.

Please confirm whether the 1-minute delay is intentional for your retention strategy, or whether it should be raised to a more conservative value such as the 2-hour default.
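
If the 1-minute delay is not intentional, a more conservative sketch would restore the documented default (values illustrative, not taken from the PR):

```yaml
compactor:
  compaction_interval: 15m
  retention_delete_delay: 2h            # Loki's documented default; keeps a rollback window
  retention_delete_worker_count: 150    # matches the default
```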

Comment thread: charts/base-cluster/templates/monitoring/logs/loki.yaml

cwrau previously approved these changes Oct 17, 2025
@coderabbitai (Bot) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
charts/base-cluster/templates/monitoring/logs/loki.yaml (1)

80-81: Consider adding additional limits configuration parameters.

The limits_config currently only sets retention_period. Depending on your cluster's log ingestion profile and SingleBinary resource constraints, you may want to add parameters such as:

  • ingestion_rate_mb: Controls the per-tenant ingestion rate in MB/s
  • max_streams_per_user: Limits the number of log streams per tenant
  • max_cache_freshness_per_query: Prevents caching of very recent query results that may still change

This is not a blocker, as the current minimal configuration is valid, but consider whether rate limits or cache settings are needed for your environment.
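
Purely as an illustration, such a block might look like the following (every value here is a hypothetical starting point to be tuned per environment, not a recommendation from this PR):

```yaml
limits_config:
  retention_period: 45d
  ingestion_rate_mb: 8                  # hypothetical: per-tenant ingest rate limit
  ingestion_burst_size_mb: 16           # hypothetical: short burst allowance
  max_global_streams_per_user: 10000    # hypothetical: cap on active streams per tenant
  max_cache_freshness_per_query: 10m    # hypothetical: avoid caching very recent results
```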

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f1fd37 and 74666fe.

📒 Files selected for processing (1)
  • charts/base-cluster/templates/monitoring/logs/loki.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: check licenses
  • GitHub Check: Update release-please config file for a possibly new chart
  • GitHub Check: lint helm chart (base-cluster)
🔇 Additional comments (2)
charts/base-cluster/templates/monitoring/logs/loki.yaml (2)

80-87: ✅ Retention settings appropriately corrected.

The changes address the previous critical review comment: retention_delete_delay has been increased from 1m to 24h, which is the safe threshold for production Loki deployments. This prevents premature data loss and aligns with official Loki documentation recommendations. The retention_delete_worker_count: 150 and compaction_interval: 15m are appropriate for SingleBinary deployments.


80-87: ⚠️ Summary mentions tableManager removal, but it's not present in the provided code.

The AI summary states the PR "removes the prior tableManager block," but the full file context does not show a tableManager section. Either it was removed in a prior commit, or the summary refers to a previous iteration. This is a minor inconsistency but doesn't affect the correctness of the current changes.

@marvinWolff marvinWolff added this pull request to the merge queue Oct 27, 2025
Merged via the queue into main with commit 1985d34 Oct 27, 2025
31 of 32 checks passed
@marvinWolff marvinWolff deleted the fix/loki-retention branch October 27, 2025 10:10
github-merge-queue Bot pushed a commit that referenced this pull request Oct 27, 2025
🤖 I have created a release *beep* *boop*

---

## [10.0.1](base-cluster-v10.0.0...base-cluster-v10.0.1) (2025-10-27)


### Bug Fixes

* **base-cluster/descheduler:** don't remove pods with too many restarts ([#1744](#1744)) ([9c1ed51](9c1ed51))
* **base-cluster/ingress:** add missing `prometheus` block 🙄 ([#1767](#1767)) ([a329e1a](a329e1a))
* **base-cluster/loki:** adjust retention settings for loki logs ([#1745](#1745)) ([1985d34](1985d34))
* **base-cluster/monitoring:** use the correct prometheus datasource id ([#1764](#1764)) ([511cc84](511cc84))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **Bug Fixes**
  * Fixed descheduler to prevent removal of pods with excessive restart counts
  * Added missing Prometheus monitoring configuration to ingress
  * Adjusted log retention settings in Loki
  * Corrected Prometheus datasource ID in monitoring

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>