proposal: Change architecture for loki pipelines#4940

Merged
kalleep merged 13 commits into main from kalleep/loki-pipeline-proposal
Mar 31, 2026

@kalleep kalleep commented Nov 26, 2025

This PR adds a proposal to change how Loki pipelines work, to address some issues we see today.

@kalleep kalleep requested a review from a team as a code owner November 26, 2025 14:19
Comment thread docs/design/4940-reliable-loki-pipelines.md Outdated
Comment thread docs/design/4940-reliable-loki-pipelines.md Outdated
Comment thread docs/design/4940-reliable-loki-pipelines.md Outdated
Comment thread docs/design/4940-reliable-loki-pipelines.md Outdated
Comment thread docs/design/4940-reliable-loki-pipelines.md Outdated
@kalleep kalleep force-pushed the kalleep/loki-pipeline-proposal branch 2 times, most recently from 3086ef0 to 951a719 Compare December 10, 2025 13:43
@kalleep kalleep force-pushed the kalleep/loki-pipeline-proposal branch from 951a719 to fc4808e Compare February 4, 2026 10:29
@kalleep kalleep force-pushed the kalleep/loki-pipeline-proposal branch from fc4808e to 9d5e87f Compare March 9, 2026 10:44
@kalleep kalleep changed the title proposal: change architecture for loki pipelines proposal: Change architecture for loki pipelines Mar 9, 2026
Comment thread docs/design/4940-reliable-loki-pipelines.md Outdated
Comment thread docs/design/4940-reliable-loki-pipelines.md
Comment thread docs/design/4940-reliable-loki-pipelines.md Outdated
Comment thread docs/design/4940-reliable-loki-pipelines.md
kalleep added a commit that referenced this pull request Mar 19, 2026
### Pull Request Details
Migrate `loki.enrich` to use the shared functions and structures for loki
components. This is done to make it consistent with other components and
to prepare for #4940.

Also, previously we did not hold the lock when reading stored args. This
could lead to data races if an entry is processed at the same time as the
component is updated.
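
A minimal sketch of the locking fix described above, assuming a component that stores its arguments behind a `sync.RWMutex`. The `Component` and `Arguments` types and the `process` method are hypothetical stand-ins for illustration, not the actual `loki.enrich` code:

```go
package main

import (
	"fmt"
	"sync"
)

// Arguments is a hypothetical stand-in for a component's configuration.
type Arguments struct {
	TargetLabel string
}

// Component is a hypothetical component whose arguments can be replaced
// while entries are being processed.
type Component struct {
	mut  sync.RWMutex
	args Arguments
}

// Update swaps the stored arguments under the write lock.
func (c *Component) Update(args Arguments) {
	c.mut.Lock()
	defer c.mut.Unlock()
	c.args = args
}

// process copies what it needs from the stored arguments under the read
// lock, so a concurrent Update cannot race with it.
func (c *Component) process(line string) string {
	c.mut.RLock()
	label := c.args.TargetLabel
	c.mut.RUnlock()
	return label + "=" + line
}

func main() {
	c := &Component{args: Arguments{TargetLabel: "a"}}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // simulates config reloads
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			c.Update(Arguments{TargetLabel: "b"})
		}
	}()
	go func() { // simulates entries being processed concurrently
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = c.process("line")
		}
	}()
	wg.Wait()

	fmt.Println(c.process("line"))
}
```

Running a program like this with `go run -race` is one way to check that reads and updates no longer race.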

### Notes to the Reviewer
Moved the Consume and Drain functions to the shared loki package instead of
keeping them in the source package, since we use them in non-source components too.

### PR Checklist

- [ ] Documentation added
- [x] Tests updated
- [ ] Config converters updated
kalleep added a commit that referenced this pull request Mar 24, 2026
### Pull Request Details
We have added some shared functionality for working with loki pipelines
and should use it in our `database_observability` components. This will make it
easier in the future to start migrating to the [new architecture for loki
pipelines](#4940) and avoid common
pitfalls.

I did not review all `collectors`, only a sample. For example,
[explain_plan](https://github.com/grafana/alloy/blob/main/internal/component/database_observability/mysql/collector/explain_plans.go#L469)
can deadlock when the component is stopping, which in turn prevents
alloy from stopping, because nothing is consuming from the handler.
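
The failure mode above can be sketched in isolation: a send on an unbuffered channel blocks forever once nothing reads from it. The snippet below is an illustration under that assumption, with a hypothetical `send` helper that also watches the context. It is not the collector's actual code:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// send hands an entry to the handler channel, but gives up when ctx is
// canceled. A bare `handler <- entry` would block forever if nothing is
// consuming from the channel.
func send(ctx context.Context, handler chan<- string, entry string) bool {
	select {
	case handler <- entry:
		return true
	case <-ctx.Done():
		return false
	}
}

// sendWithNoConsumer simulates a stopping component: the handler channel
// has no reader, and the context expires shortly after.
func sendWithNoConsumer() bool {
	handler := make(chan string) // unbuffered, nobody reads from it
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return send(ctx, handler, "explain plan output")
}

func main() {
	// The send is abandoned instead of blocking shutdown.
	fmt.Println("sent:", sendWithNoConsumer())
}
```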


Changes:
1. Use [loki.Fanout](https://github.com/grafana/alloy/blob/main/internal/component/common/loki/fanout.go#L19) - This handles the internal locking of log receivers so components do not have to care about it. It also stops forwarding when the context is canceled.
2. Use [loki.Consume](https://github.com/grafana/alloy/blob/0ac6d7ca7c744268668189c0f11460a84c9ef458/internal/component/common/loki/consume.go#L13) - Runs the consume loop from a `LogsReceiver` and exits when the context is canceled.
3. Use [loki.Drain](https://github.com/grafana/alloy/blob/0ac6d7ca7c744268668189c0f11460a84c9ef458/internal/component/common/loki/drain.go#L18) - When a component is stopping we still need to forward / drain to make sure collectors can stop. Since #5613 we have a deterministic order for stopping components, so it is safe to forward to downstream components during stops, but we still have a timeout after which we drain into nothing.
4. Use [loki.NewEntry](https://github.com/grafana/alloy/blob/0ac6d7ca7c744268668189c0f11460a84c9ef458/internal/component/common/loki/entry.go#L10) - This makes sure we set the `created` timestamp of entries so they can be properly tracked by `loki_write_entry_propagation_latency_seconds`.
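
The consume and drain behaviour in items 2 and 3 can be sketched roughly as follows. This is a simplified illustration using plain string channels. The `consume` and `drain` functions and their signatures are assumptions made for this example and do not mirror the real `loki.Consume` or `loki.Drain` APIs:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// consume forwards entries from in until ctx is canceled.
func consume(ctx context.Context, in <-chan string, forward func(string)) {
	for {
		select {
		case <-ctx.Done():
			return
		case e := <-in:
			forward(e)
		}
	}
}

// drain keeps reading from in while stop runs, so producers blocked on a
// send can finish. Entries are forwarded until timeout elapses; after
// that they are read and discarded ("drained into nothing").
func drain(in <-chan string, forward func(string), timeout time.Duration, stop func()) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		stop()
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()
	expired := false
	for {
		select {
		case <-done:
			return
		case <-timer.C:
			expired = true
		case e := <-in:
			if !expired {
				forward(e)
			}
		}
	}
}

// runConsumeDemo forwards two entries, then cancels the consume loop.
func runConsumeDemo() int {
	in := make(chan string)
	ctx, cancel := context.WithCancel(context.Background())
	forwarded := 0
	exited := make(chan struct{})
	go func() {
		defer close(exited)
		consume(ctx, in, func(string) { forwarded++ })
	}()
	in <- "a"
	in <- "b"
	cancel()
	<-exited
	return forwarded
}

// runDrainDemo simulates a producer that must flush three entries
// before its stop function can return.
func runDrainDemo() int {
	in := make(chan string)
	forwarded := 0
	stop := func() {
		for i := 0; i < 3; i++ {
			in <- fmt.Sprintf("entry-%d", i)
		}
	}
	drain(in, func(string) { forwarded++ }, time.Second, stop)
	return forwarded
}

func main() {
	fmt.Println("consumed:", runConsumeDemo())
	fmt.Println("drained:", runDrainDemo())
}
```

In this sketch the producer never blocks indefinitely during shutdown, because `drain` keeps reading from the channel until `stop` returns, whether or not the timeout has passed.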

### Issue(s) fixed by this Pull Request

Part of: #5826

### PR Checklist

- [ ] Documentation added
- [ ] Tests updated
- [ ] Config converters updated
@kalleep kalleep force-pushed the kalleep/loki-pipeline-proposal branch from f1bef4e to ef2a3f9 Compare March 25, 2026 13:11

@ptodev ptodev left a comment

Makes sense. To me this is also the most obvious way to improve reliability. I'm just worried about issues with sharding caused by edge cases. I don't see anything wrong in the proposal but I just wonder if I'm missing something. Hopefully if we have thorough tests we will cover all our bases.

@kalleep kalleep merged commit e273439 into main Mar 31, 2026
50 checks passed
@kalleep kalleep deleted the kalleep/loki-pipeline-proposal branch March 31, 2026 08:32
@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Apr 15, 2026