fix(loki.source.file): Fix position tracking when component stops#5800

Merged: kalleep merged 5 commits into main from kalleep/loki-source-file-stop-without-drain on Mar 18, 2026
@kalleep kalleep commented Mar 17, 2026

Pull Request Details

Needing to call source.Drain is a sign that something else is not working correctly. This PR fixes the root cause: a tailer getting stuck while sending the next entry to the component handler.

I added a test that fails on main and passes with the fix. With this change we no longer need to drain the channel, and we only advance the position if the entry was actually sent to the component.

Issue(s) fixed by this Pull Request

Notes to the Reviewer

PR Checklist

  • Documentation added
  • Tests updated
  • Config converters updated

@kalleep kalleep requested a review from a team as a code owner March 17, 2026 10:30
@kalleep kalleep requested a review from Copilot March 17, 2026 10:56

Copilot AI left a comment


Pull request overview

This PR updates the loki.source.file tailer shutdown behavior to avoid deadlocks when cancellation occurs while a log entry send is blocked, and adjusts component shutdown accordingly.

Changes:

  • Make tailer.readLines abortable via context when a send to the receiver channel is blocked.
  • Replace the tailer “done channel” shutdown signaling with a sync.WaitGroup-based approach.
  • Simplify loki.source.file component shutdown and add a regression test for cancel-while-send-blocked.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| internal/component/loki/source/file/tailer.go | Makes blocked sends cancelable and changes shutdown coordination to use a WaitGroup. |
| internal/component/loki/source/file/tailer_test.go | Adds a regression test for cancellation during blocked sends. |
| internal/component/loki/source/file/file.go | Simplifies shutdown path now that tailers can exit cleanly without a drain routine. |

Comment thread internal/component/loki/source/file/tailer.go Outdated
Comment thread internal/component/loki/source/file/tailer_test.go
Comment thread internal/component/loki/source/file/tailer_test.go
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

@ptodev ptodev left a comment


Makes sense. I personally can't think of any downsides to this. I don't think we were making some tradeoff to get better performance or anything like that.

@kalleep kalleep merged commit 9762946 into main Mar 18, 2026
47 checks passed
@kalleep kalleep deleted the kalleep/loki-source-file-stop-without-drain branch March 18, 2026 08:01
blewis12 pushed a commit that referenced this pull request Mar 30, 2026
🤖 I have created a release *beep* *boop*
---


## [1.15.0](v1.14.0...v1.15.0)
(2026-03-26)


### ⚠ BREAKING CHANGES

* **otelcol:** Upgrade to OTel Collector v0.147.0
([#5784](#5784))
* Renamed undocumented metrics that were previously prefixed with
<component_id>_<metric_name> to loki_source_awsfirehose_<metric_name>

### Features 🌟

* **alloy-mixin:** Add filters, groupBy, and multi-select dashboard
variables ([#5611](#5611))
([3ef714e](3ef714e))
* **beyla.ebpf:** Add support for Prometheus native histograms
([#5812](#5812))
([7d806fb](7d806fb))
* **beyla.ebpf:** Bump Beyla to v3.6
([#5833](#5833))
([cd878d5](cd878d5))
* **converters:** Support converting Promtail limits_config
([#5777](#5777))
([9491385](9491385))
* **database_observability.mysql:** Add filtering of query samples and
wait events by minimum duration
([#5678](#5678))
([5a4d03b](5a4d03b))
* **database_observability.mysql:** Embed prometheus exporter within
db-o11y component
([#5711](#5711))
([88bffb0](88bffb0))
* **database_observability.postgres:** Add configurable limit to
`pg_stat_statements` query
([#5639](#5639))
([0de0a3f](0de0a3f))
* **database_observability.postgres:** Embed prometheus exporter within
db-o11y component
([#5714](#5714))
([9dc2e83](9dc2e83))
* **database_observability:** Add scaffolding for db-o11y integration
tests ([#5575](#5575))
([ca637d8](ca637d8))
* **database_observability:** Promote components to stable
([#5736](#5736))
([21a9af6](21a9af6))
* Expose Functionality to Handle syslogs with Empty MSG Field
([#5687](#5687))
([178b1e6](178b1e6))
* **helm:** Allow setting `revisionHistoryLimit` in the helm chart
([#5847](#5847))
([9713ad4](9713ad4))
* **loki.process:** Support structured metadata as source type of
stage.labels for loki.process
([#5055](#5055))
([eda3152](eda3152))
* **loki.secretfilter:** Add sampling for secretfilter entries
([#5663](#5663))
([9997802](9997802))
* **loki.source.gcplog:** Add alloy config for MaxOutstandingBytes and
MaxOutstandingMessages
([#5760](#5760))
([c2b9f0b](c2b9f0b))
* **loki.write:** Add loki pipeline latency metric
([#5702](#5702))
([cc744a1](cc744a1))
* **mixin:** Update loki dashboard
([#5848](#5848))
([b616d58](b616d58))
* **otelcol.receiver.datadog:** Expose intake proxy and
trace_id_cache_size settings
([#5776](#5776))
([0384ad4](0384ad4))
* **otelcol:** Upgrade to OTel Collector v0.147.0
([#5784](#5784))
([a9b5396](a9b5396))
* **prometheus.exporter.cloudwatch:** Use aws-sdk-go-v2 by default
([#5768](#5768))
([a2f3489](a2f3489))
* **pyroscope.ebpf:** Add comm, pid labels and kernel frame options
([#5769](#5769))
([4fa7068](4fa7068))
* **pyroscope.ebpf:** Expose OTel eBPF profiler internal metrics to
Prometheus ([#5774](#5774))
([e713392](e713392))
* **pyroscope:** Copy prometheus common/config HTTP client into
promhttp2 package
([#5810](#5810))
([0b31aaa](0b31aaa))


### Bug Fixes 🐛

* **beyla:** Inject Beyla version into binary via ldflags
([#5735](#5735))
([71c03ec](71c03ec))
* Correctly handle the deprecated topic field in otelcol.receiver.kafka
configuration ([#5726](#5726))
([538ac75](538ac75))
* **database_observability.mysql:** Ensure result sets are properly
closed ([#5893](#5893))
([f28f91c](f28f91c))
* **database_observability:** Ensure all collectors are properly stopped
([#5796](#5796))
([6bfa2a7](6bfa2a7))
* **database_observability:** Ensure that `connection_info` metric is
only emitted for a given DB instance when it is available
([#5707](#5707))
([bf0c3dc](bf0c3dc))
* **database_observability:** Solve test flakiness in MySQL and Postgres
sample collectors
([#5130](#5130))
([a7590d1](a7590d1))
* **deps:** Update module github.com/buger/jsonparser to v1.1.2
[SECURITY] ([#5834](#5834))
([b2fee8a](b2fee8a))
* **deps:** Update module github.com/buger/jsonparser to v1.1.2
[SECURITY] ([#5870](#5870))
([698b4e7](698b4e7))
* **deps:** Update module google.golang.org/grpc to v1.79.3 [SECURITY]
([#5825](#5825))
([5cfbcc4](5cfbcc4))
* **deps:** Update module google.golang.org/grpc to v1.79.3 [SECURITY]
([#5871](#5871))
([259152d](259152d))
* **deps:** Update npm dependencies
([#5876](#5876))
([f0f6a11](f0f6a11))
* **deps:** Update npm deps across repo to address CVE-2026-26996 and
CVE-2026-22029 ([#5872](#5872))
([df518dd](df518dd))
* **go:** Update build image to go v1.25.8
([#5832](#5832))
([f9b3043](f9b3043))
* **go:** Update go to 1.25.8
([#5844](#5844))
([534e7db](534e7db))
* Helm: alloy.extraPorts not working with service.type=NodePort [COPY]
([#5892](#5892))
([162c6f7](162c6f7))
* **loki.enrich:** Use shared loki functions and fix locking
([#5821](#5821))
([f916c72](f916c72))
* **loki.process:** Multiline no longer pass empty entry if start was
flushed ([#5746](#5746))
([7bdedf1](7bdedf1))
* **loki.process:** Protect against json that does not look like docker
json format ([#5761](#5761))
([0af6eaa](0af6eaa))
* **loki.secretfilter:** Fix bug where entries were being shadow dropped
([#5786](#5786))
([90243f9](90243f9))
* **loki.source.file:** Fix position tracking when component stops
([#5800](#5800))
([9762946](9762946))
* **loki.source.file:** Keep positions for compressed files when reading
is finished ([#5723](#5723))
([fb41d0a](fb41d0a))
* **loki.source.gcplog:** Update to pubsub v2 and fix shutdown semantics
([#5713](#5713))
([e9d9b69](e9d9b69))
* **loki.source.heroku:** Fix shutdown semantics and consume logs in
batches ([#5804](#5804))
([deda452](deda452))
* **loki.write:** Remove noisy log
([#5837](#5837))
([8e28f35](8e28f35))
* **loki:** Make drain forward entries with fallback timeout
([#5830](#5830))
([cfbca90](cfbca90))
* **prometheus.scrape:** Update arguments and targets even if
`scrape_native_histograms` and `extra_metrics` are updated
([#5787](#5787))
([dc4cb0a](dc4cb0a))
* **pyroscope.ebpf:** Update opentelemetry-ebpf-profiler
([#5904](#5904))
([dfaec47](dfaec47))
* Stop components in a deterministic order
([#5613](#5613))
([00cd371](00cd371))


### Chores

* Use shared source structures for aws firehose
([#5739](#5739))
([aef19dc](aef19dc))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: grafana-alloybot[bot] <167359181+grafana-alloybot[bot]@users.noreply.github.com>
@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Apr 2, 2026