Merged
24 commits
639204d
loki.write: implement sharding
kalleep Nov 19, 2025
4181ced
Update docs/sources/reference/components/loki/loki.write.md
kalleep Nov 19, 2025
cf85550
Update docs/sources/reference/components/loki/loki.write.md
kalleep Nov 20, 2025
307aeb0
Update docs/sources/reference/components/loki/loki.write.md
kalleep Nov 20, 2025
601ddec
Update docs/sources/reference/components/loki/loki.write.md
kalleep Nov 20, 2025
39dcafa
Update internal/component/common/loki/client/config.go
kalleep Nov 20, 2025
fce3f82
Update "client" to "endpoint" to better match documentation
kalleep Nov 20, 2025
ef92489
Update naming and comments
kalleep Nov 20, 2025
1b17a30
Refactor so we can reuse endpoint for wal and non wal implementation
kalleep Nov 20, 2025
8c47db9
wrapp close done in once
kalleep Nov 20, 2025
22483b9
update comment
kalleep Nov 20, 2025
b561650
fix
kalleep Nov 20, 2025
aaebdd5
Fix metric
kalleep Nov 21, 2025
fbd640f
unexport constants
kalleep Nov 21, 2025
6e405ca
unexport client metrics
kalleep Nov 21, 2025
36fbaab
fix test
kalleep Nov 21, 2025
def54ab
fix test
kalleep Nov 21, 2025
94f680e
fix race where flushAndShutdown holds mutex while we try to drain the
kalleep Nov 21, 2025
b7c1f21
Update comment
kalleep Nov 24, 2025
ef40a0a
Chaning queue_config is marked as experimental and add test to check it
kalleep Jan 12, 2026
ea9f263
Add back experimental banner
kalleep Jan 12, 2026
757dfa0
Remove duplicated validation
kalleep Jan 14, 2026
19d9b79
Add log when we fail to drain the whole queue during shutdown
kalleep Jan 14, 2026
049f018
Update docs to describe how queue size is calculated and how the memory
kalleep Jan 14, 2026
17 changes: 12 additions & 5 deletions docs/sources/reference/components/loki/loki.write.md
@@ -46,7 +46,7 @@ You can use the following blocks with `loki.write`:
| `endpoint` > [`basic_auth`][basic_auth] | Configure `basic_auth` for authenticating to the endpoint. | no |
| `endpoint` > [`oauth2`][oauth2] | Configure OAuth 2.0 for authenticating to the endpoint. | no |
| `endpoint` > `oauth2` > [`tls_config`][tls_config] | Configure TLS settings for connecting to the endpoint. | no |
| `endpoint` > [`queue_config`][queue_config] | When WAL is enabled, configures the queue client. | no |
| `endpoint` > [`queue_config`][queue_config] | Configure the queue used for the endpoint. | no |
| `endpoint` > [`tls_config`][tls_config] | Configure TLS settings for connecting to the endpoint. | no |
| [`wal`][wal] | Write-ahead log configuration. | no |

@@ -104,8 +104,9 @@ The following arguments are supported:
If no `tenant_id` is provided, the component assumes that the Loki instance at `endpoint` is running in single-tenant mode and no X-Scope-OrgID header is sent.

When multiple `endpoint` blocks are provided, the `loki.write` component creates a client for each.
Received log entries are fanned-out to these clients in succession.
That means that if one client is bottlenecked, it may impact the rest.
Received log entries are fanned-out to these endpoints in succession. That means that if one endpoint is bottlenecked, it may impact the rest.

Each endpoint has a _queue_ of batches to be sent. The `queue_config` block can be used to customize the behavior of this queue.

Endpoints can be named for easier identification in debug metrics by using the `name` argument. If the `name` argument isn't provided, a name is generated based on a hash of the endpoint settings.
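The fan-out behavior described in this hunk can be illustrated with a minimal Go sketch. It is not Alloy's implementation: the `endpoint` type and `fanOut` function are hypothetical names, and a buffered channel stands in for each endpoint's queue. It only shows why sending to endpoints in succession lets one full queue hold back the others.

```go
package main

import "fmt"

// endpoint is a hypothetical stand-in for one `endpoint` block.
// The buffered channel plays the role of that endpoint's queue.
type endpoint struct {
	name  string
	queue chan string
}

// fanOut hands the entry to every endpoint in order.
// A send blocks while that endpoint's queue is full, so later
// endpoints wait for earlier ones.
func fanOut(endpoints []*endpoint, entry string) {
	for _, e := range endpoints {
		e.queue <- entry
	}
}

func main() {
	slow := &endpoint{name: "slow", queue: make(chan string, 1)}
	fast := &endpoint{name: "fast", queue: make(chan string, 1)}
	slow.queue <- "old entry" // the slow endpoint's queue is already full

	done := make(chan struct{})
	go func() {
		fanOut([]*endpoint{slow, fast}, "new entry")
		close(done)
	}()

	// fanOut cannot get past the slow endpoint yet, so fast has nothing.
	fmt.Println(len(fast.queue)) // 0

	<-slow.queue // the slow endpoint makes progress
	<-done       // fanOut can now finish
	fmt.Println(len(fast.queue)) // 1
}
```

In this sketch the fast endpoint only receives the entry after the slow endpoint frees a slot, which is the kind of coupling the documentation warns about.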

@@ -129,15 +130,21 @@ When `retry_on_http_429` is enabled, the retry mechanism is governed by the back

{{< docs/shared lookup="stability/experimental_feature.md" source="alloy" version="<ALLOY_VERSION>" >}}
Contributor

Shouldn't this continue to be experimental?

Contributor Author

Yeah, we could keep this as experimental. But we would always use this config after this PR.

What would be considered experimental about it would be the naming and changing defaults, I guess.


The optional `queue_config` block configures, when WAL is enabled, how the underlying client queues batches of logs sent to Loki.
Refer to [Write-Ahead block](#wal) for more information.
The optional `queue_config` block configures how the endpoint queues batches of logs sent to Loki.
Comment thread
thampiotr marked this conversation as resolved.

The following arguments are supported:

| Name | Type | Description | Default | Required |
| --------------- | ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- | -------- |
| `capacity` | `string` | Controls the size of the underlying send queue buffer. This setting should be considered a worst-case scenario of memory consumption, in which all enqueued batches are full. | `10MiB` | no |
Contributor

Suggested change
| `capacity` | `string` | Controls the size of the underlying send queue buffer. This setting should be considered a worst-case scenario of memory consumption, in which all enqueued batches are full. | `10MiB` | no |
| `capacity` | `string` | Controls the size of the underlying send queue buffer of each shard. Consider this setting as the worst-case scenario of memory consumption, in which all enqueued batches are full. | `10MiB` | no |

What does it even mean 'all enqueued batches are full'? Shouldn't it say that it's the total size of all the enqueued batches instead?

Contributor Author

Yeah, this was there before and I did not check or alter it.

But essentially, whenever the capacity is full, the queue of batches is full and we cannot enqueue another one, so we would block here until we get more capacity.

Contributor

thampiotr Jan 13, 2026

This setting is per-shard, right? We should clarify this here.

| `drain_timeout` | `duration` | Configures the maximum time the client can take to drain the send queue upon shutdown. During that time, it enqueues pending batches and drains the send queue sending each. | `"1m"` | no |
| `min_shards` | `number` | Minimum number of concurrent shards sending samples to the endpoint. | `1` | no |

Each endpoint is divided into a number of concurrent _shards_ which are responsible for sending a fraction of batches. The number of shards is controlled with `min_shards` argument.
Each shard has a queue of batches it keeps in memory, controlled with the `capacity` argument.
Comment thread
thampiotr marked this conversation as resolved.

Queue size is calculated using `batch_size` and `capacity` for each shard. So if `batch_size` is 1MiB and `capacity` is 10MiB each shard would be able to queue up 10 batches.
The maximum amount of memory required for all configured shards can be calculated using `capacity` * `min_shards`.
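The sizing rule in the two lines above can be checked with a short worked example in Go. The helper names `batchesPerShard` and `maxQueueMemory` are hypothetical and only restate the documented arithmetic; the `min_shards` value of 4 is an illustrative assumption, not a default.

```go
package main

import "fmt"

const MiB = 1 << 20

// batchesPerShard returns how many full batches fit in one shard's queue.
// Hypothetical helper that mirrors the documented rule: capacity / batch_size.
func batchesPerShard(capacityBytes, batchSizeBytes int) int {
	if batchSizeBytes <= 0 {
		return 0
	}
	return capacityBytes / batchSizeBytes
}

// maxQueueMemory returns the worst-case number of bytes buffered across
// all shards: capacity * min_shards.
func maxQueueMemory(capacityBytes, shards int) int {
	return capacityBytes * shards
}

func main() {
	capacity := 10 * MiB
	batchSize := 1 * MiB
	minShards := 4 // illustrative value only

	fmt.Println(batchesPerShard(capacity, batchSize))      // 10
	fmt.Println(maxQueueMemory(capacity, minShards) / MiB) // 40
}
```

With these numbers each shard can queue 10 batches, and four shards could hold up to 40 MiB of queued batches in the worst case.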

### `tls_config`

11 changes: 7 additions & 4 deletions internal/component/common/loki/client/config.go
@@ -33,11 +33,11 @@ type Config struct {
// prevent HOL blocking in multitenant deployments.
DropRateLimitedBatches bool

// Queue controls configuration parameters specific to the queue client
Queue QueueConfig
// QueueConfig controls how shards and queues are configured for endpoint.
QueueConfig QueueConfig
}

// QueueConfig holds configurations for the queue-based remote-write client.
// QueueConfig controls how shards and queues are configured for endpoints.
type QueueConfig struct {
// Capacity is the worst case size in bytes desired for the send queue. This value is used to calculate the size of
// the buffered channel used underneath. The worst case scenario assumed is that every batch buffered in full, hence
@@ -47,6 +47,9 @@ type QueueConfig struct {
// is the 1 MiB default, and a capacity of 100 MiB, the underlying buffered channel would buffer up to 100 batches.
Capacity int

// DrainTimeout controls the maximum time that draining the send queue can take.
// MinShards is the minimum number of concurrent shards sending batches to the endpoint.
MinShards int

// DrainTimeout controls the maximum time that draining the queue can take.
DrainTimeout time.Duration
}