loadbalancingexporter: batch logs after routing #19

Merged
amir-jakoby merged 41 commits into main from saw-6744-patch-loadbalancing-exporter-to-batch-logs-after-routing-per
Mar 19, 2026

Conversation

@amir-jakoby

@amir-jakoby amir-jakoby commented Mar 19, 2026

loadbalancingexporter: Batch logs after routing

Adds optional per-backend log batching after routing with backward-compatible default-off behavior.

  • add a per-backend async log batcher with flush-on-size, bytes, timeout, shutdown, and resolver removal
  • expose log_batcher config with sane defaults while preserving the legacy path when disabled
  • extend tests and docs for batching behavior, rollout safety, and config validation

Note

Medium Risk
Changes exporter lifecycle/routing concurrency and adds an async per-backend log batching path, which could affect shutdown/rolling-update behavior and delivery ordering/latency when enabled. Default-off mitigates runtime impact, but queue/consume gating refactors touch traces/metrics/logs paths.

Overview
Adds a new log_batcher configuration to loadbalancingexporter (default disabled) that buffers logs per resolved backend and flushes on max_records, max_bytes (serialized OTLP pre-compression), flush_interval, shutdown, or backend removal.
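
The size-based triggers described above can be sketched in Go as follows. This is a minimal illustration, not the exporter's actual code: the struct, the `Enabled` field name, and the `>=` comparisons are assumptions, and the mapping of the quoted defaults (512, 1 MiB, 100ms) onto `max_records`, `max_bytes`, and `flush_interval` is inferred from the PR summary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// LogBatcherConfig is a hypothetical mirror of the log_batcher config block.
type LogBatcherConfig struct {
	Enabled       bool
	MaxRecords    int
	MaxBytes      int
	FlushInterval time.Duration
}

// defaultLogBatcherConfig returns the defaults quoted in the PR summary,
// with batching disabled so the legacy path stays active.
func defaultLogBatcherConfig() LogBatcherConfig {
	return LogBatcherConfig{
		Enabled:       false,
		MaxRecords:    512,
		MaxBytes:      1 << 20, // 1 MiB
		FlushInterval: 100 * time.Millisecond,
	}
}

// Validate checks the limits only when batching is enabled (assumed rule).
func (c LogBatcherConfig) Validate() error {
	if !c.Enabled {
		return nil
	}
	if c.MaxRecords <= 0 {
		return errors.New("log_batcher.max_records must be positive")
	}
	if c.MaxBytes <= 0 {
		return errors.New("log_batcher.max_bytes must be positive")
	}
	if c.FlushInterval <= 0 {
		return errors.New("log_batcher.flush_interval must be positive")
	}
	return nil
}

// flushReason reports which size trigger fires for the pending batch,
// or "" when neither limit has been reached.
func (c LogBatcherConfig) flushReason(pendingRecords, pendingBytes int) string {
	switch {
	case pendingRecords >= c.MaxRecords:
		return "max_records"
	case pendingBytes >= c.MaxBytes:
		return "max_bytes"
	default:
		return ""
	}
}

func main() {
	cfg := defaultLogBatcherConfig()
	cfg.Enabled = true
	fmt.Println(cfg.Validate(), cfg.flushReason(512, 10), cfg.flushReason(1, 10))
}
```

The interval, shutdown, and backend-removal triggers are time- or lifecycle-driven and are left out of this sketch.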

Refactors load balancer and wrapped exporter lifecycle to support safe draining on resolver churn (remove-under-lock + drain hook), introduces consume start/stop gating to avoid leaks on early returns, and updates queue handling to use xexporterhelper.WithQueueBatch with payload codec encoding; sending_queue.compress_in_memory is now a hard validation error.

Updates docs/tests accordingly, plus minor repo maintenance (codecov + issue templates + tidylist entries, hotreload processor metadata/docs regeneration, and toolchain/go.sum updates).

Written by Cursor Bugbot for commit 504e89f. This will update automatically on new commits. Configure here.


Summary by cubic

Adds optional post-routing, per-backend log batching in loadbalancingexporter to reduce small RPCs and CPU; default is off. Implements SAW-6744 and restores queue payload codec support via xexporterhelper while preserving existing queue compression behavior.

  • New Features

    • Per-backend async log batcher; flush on max_records, serialized OTLP max_bytes (pre-compression), flush_interval, shutdown, or resolver removal.
    • log_batcher config with strict validation and defaults (512, 1 MiB, 100ms); legacy direct-send path remains when disabled.
    • Route-first merge with independent per-backend concurrency and telemetry for pending size/bytes, flush reasons, errors, and dropped/overflow counts.
  • Bug Fixes

    • Safer resolver churn and shutdown: synchronize ring updates; remove under lock then drain outside via hook; honor shutdown context; drain inflight logs; reject log sends when not started or stopping; prevent creating new per-backend batchers after shutdown.
    • Consume gating across logs/metrics/traces to avoid early-return leaks; release consume slots on errors; surface enqueue failures when stopping; avoid duplicate enqueue errors.
    • Faster batching hot path: pre-group by endpoint, de-duplicate resource/scope, and use incremental serialized-bytes accounting to avoid O(n²) re-serialization.
    • Queue updates: use xexporterhelper.WithQueueBatch with a payload-encoding wrapper for queue compression; make sending_queue.compress_in_memory a hard validation error.
    • Minor refactor: inlined exporter lookup helper to simplify routing hot paths; no behavior change.
    • Docs/telemetry and repo maintenance: updated log_batcher README, regenerated hotreloadprocessor and logstometricsprocessor metadata/docs/tests, added Codecov component mappings and module listings; no functional changes.
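
The "pre-group by endpoint" step mentioned in the list above might look roughly like the Go sketch below. It is illustrative only: `record`, `pickEndpoint`, and `groupByEndpoint` are hypothetical names, and plain modulo hashing stands in for the exporter's hash ring.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// record is a hypothetical stand-in for a log record with a routing key.
type record struct {
	RoutingKey string
	Body       string
}

// pickEndpoint maps a routing key to one endpoint. Modulo hashing is used
// here for brevity; the real exporter uses a consistent hash ring.
// endpoints must be non-empty.
func pickEndpoint(endpoints []string, key string) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	return endpoints[int(h.Sum32()%uint32(len(endpoints)))]
}

// groupByEndpoint routes every record first, so each backend later receives
// one grouped enqueue instead of one enqueue per record.
func groupByEndpoint(endpoints []string, recs []record) map[string][]record {
	groups := make(map[string][]record, len(endpoints))
	for _, r := range recs {
		ep := pickEndpoint(endpoints, r.RoutingKey)
		groups[ep] = append(groups[ep], r)
	}
	return groups
}

func main() {
	endpoints := []string{"backend-1:4317", "backend-2:4317"}
	recs := []record{{"svc-a", "x"}, {"svc-b", "y"}, {"svc-a", "z"}}
	total := 0
	for _, group := range groupByEndpoint(endpoints, recs) {
		total += len(group)
	}
	fmt.Println("records grouped:", total)
}
```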

Written for commit 504e89f. Summary will update on new commits.

Summary by CodeRabbit

  • New Features

    • Optional per-backend log batching for the load-balancing exporter with configurable flush triggers (max records, max bytes, flush interval). Disabled by default.
  • Improvements

    • More robust exporter lifecycle and routing: safer removal/drain flow, gated consume start/stop, and batching-aware routing.
    • Refined queue and payload handling with clearer compression validation.
  • Documentation

    • README updated with log_batcher configuration and examples.
  • Tests

    • Extensive new and updated unit tests covering batching, routing, lifecycle, and flush behaviors.
  • Chores

    • Toolchain and internal listings updated.

Expose optional per-backend log batching with backward-compatible defaults and preserve the legacy path when disabled.

Refs: SAW-6744
@coderabbitai

coderabbitai Bot commented Mar 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds an optional per-backend in-memory log batching feature to the loadbalancing exporter, including config/validation, a concurrency-safe logBatcher with per-endpoint batchers, payload codec adapter, lifecycle integration with load balancer/exporters, telemetry, and comprehensive tests and docs.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration & Docs: `.chloggen/saw-6744-loadbalancing-log-batcher.yaml`, `exporter/loadbalancingexporter/README.md`, `exporter/loadbalancingexporter/config.go`, `exporter/loadbalancingexporter/config_test.go` | Add log_batcher config block and defaults; validation for enabled mode; docs/example update and tests for unmarshalling/validation. |
| Log Batcher Impl & Tests: `exporter/loadbalancingexporter/log_batcher.go`, `exporter/loadbalancingexporter/log_batcher_test.go` | New internal logBatcher and backendLogBatcher implementing per-endpoint batching, flush triggers (records/bytes/interval), telemetry, lifecycle (Remove/Shutdown) and unit tests for concurrency and flushing. |
| Load Balancer Removal & Exporter Lifecycle: `exporter/loadbalancingexporter/loadbalancer.go`, `exporter/loadbalancingexporter/loadbalancer_test.go`, `exporter/loadbalancingexporter/wrapped_exporter.go` | Separate locked exporter selection from async draining; add onExporterRemove hook; mark exporters stopping with atomic flag and add consume/start/stop helpers to guard per-exporter consumption. |
| Log Exporter Integration & Tests: `exporter/loadbalancingexporter/log_exporter.go`, `exporter/loadbalancingexporter/log_exporter_test.go` | When enabled, group records by resolved endpoint and enqueue to logBatcher; add grouping/enqueue/retry and consumeBatch routines; include batcher in shutdown and adapt tests/metrics. |
| Factory, Queue & Payload Codec: `exporter/loadbalancingexporter/factory.go`, `exporter/loadbalancingexporter/factory_test.go`, `exporter/loadbalancingexporter/payload_codec.go`, `exporter/loadbalancingexporter/helpers.go` | Add default log_batcher values; refactor resilience wiring to typed QueueBatchSettings; inject payload codec via payloadCodecEncoding adapter; add mergeLogs helper. |
| Consume Lifecycle Guarding (Traces/Metrics): `exporter/loadbalancingexporter/metrics_exporter.go`, `exporter/loadbalancingexporter/trace_exporter.go`, `exporter/loadbalancingexporter/metrics_exporter_test.go`, `exporter/loadbalancingexporter/trace_exporter_test.go` | Use tryStartConsume/doneConsume and deferred cleanup to return errExporterIsStopping on early stopping and ensure started consumes are released; add tests for early-return behavior. |
| Routing Test Helper: `exporter/loadbalancingexporter/routing_test.go` | Add test helper findRoutingIDForEndpoint to locate a routing ID that maps to a given endpoint on a hash ring. |
| Misc / Module / Tidylist: `internal/tools/go.mod`, `receiver/lokireceiver/go.mod`, `internal/tidylist/tidylist.txt` | Bump internal tools Go version, remove an indirect dependency from lokireceiver go.mod, and add two processors to tidylist. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant LogExporter as Log Exporter
    participant LoadBalancer as Load Balancer
    participant LogBatcher as Log Batcher
    participant BackendBatcher as Backend Batcher
    participant Downstream as Downstream Exporter

    Client->>LogExporter: ConsumeLogs(plog.Logs)
    LogExporter->>LogExporter: iterate resources/scopes/records
    loop per record
        LogExporter->>LoadBalancer: resolve endpoint/exporter for record
        LoadBalancer-->>LogExporter: endpoint + exporter
        LogExporter->>LogBatcher: Enqueue(endpoint, single-record Logs)
        LogBatcher->>BackendBatcher: append record, update pending counters
        alt flush triggered (count/bytes/interval)
            BackendBatcher->>BackendBatcher: drain pending logs
            BackendBatcher->>Downstream: ConsumeLogs(merged batch)
            Downstream-->>BackendBatcher: result
            BackendBatcher->>LogBatcher: emit telemetry (counts/bytes/errors)
        end
    end
    LogExporter->>Client: return result

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly Related PRs

Suggested Reviewers

  • sawmills-architect-review

Poem

🐰 I hop through logs both small and grand,

Per-backend baskets held in careful hand,
I count the bytes and wait the ticking bell,
Then flush one tidy batch — all's well, all's well! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 7.94% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: adding optional per-backend log batching after routing to the loadbalancing exporter. |
| Description check | ✅ Passed | The pull request description is comprehensive and well-structured, covering all required aspects of the changes. |

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51e7d42b16

Comment thread exporter/loadbalancingexporter/log_batcher.go Outdated
Comment thread exporter/loadbalancingexporter/log_exporter.go Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 1 file (changes from recent commits).

Requires human review: This PR introduces significant new logic for concurrent per-backend log batching, including new goroutines, lifecycle management, and buffering which require human review for safety.

@cubic-dev-ai cubic-dev-ai Bot left a comment

6 issues found across 11 files

Confidence score: 2/5

  • There is a concrete regression risk in exporter/loadbalancingexporter/log_exporter.go: blocking Enqueue without context cancellation can cause ConsumeLogs to stall indefinitely when backend queues are full, which is user-facing under load.
  • exporter/loadbalancingexporter/log_batcher.go has a likely data-loss path: flush() drains pending before async send completion, and failures are only debug-logged, so dropped batches may not be retried or surfaced.
  • exporter/loadbalancingexporter/loadbalancer.go and exporter/loadbalancingexporter/log_exporter.go both indicate concurrency hazards during resolver churn (global routing blocked under updateLock, plus race between routing and enqueue), which raises instability risk in dynamic backend updates.
  • Pay close attention to exporter/loadbalancingexporter/log_exporter.go, exporter/loadbalancingexporter/log_batcher.go, exporter/loadbalancingexporter/loadbalancer.go, exporter/loadbalancingexporter/log_exporter_test.go, exporter/loadbalancingexporter/config.go - blocking behavior, potential log loss, lock contention/races, reduced rollout-safety test coverage, and premature config exposure should be addressed before merge.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="exporter/loadbalancingexporter/config.go">

<violation number="1" location="exporter/loadbalancingexporter/config.go:44">
P2: According to linked Linear issue SAW-6744, this should stay on internal defaults for the first implementation; adding `log_batcher` here exposes a customer-facing config knob before that rollout contract is satisfied.</violation>
</file>

<file name="exporter/loadbalancingexporter/log_exporter.go">

<violation number="1" location="exporter/loadbalancingexporter/log_exporter.go:142">
P1: Enqueue is blocking without context cancellation, so a slow backend can stall `ConsumeLogs` indefinitely once the backend queue fills.</violation>

<violation number="2" location="exporter/loadbalancingexporter/log_exporter.go:142">
P2: Race condition between routing and enqueue during resolver changes: `exporterAndEndpoint` resolves the exporter, then `Enqueue` calls `getOrCreateBackend`. If `removeExtraExporters` runs between these two steps, it calls `batcher.Remove` (which drains and deletes the backend) and then starts `exp.Shutdown()` on the child exporter. The subsequent `Enqueue` → `getOrCreateBackend` will re-create a new backend for the removed endpoint referencing the now-shutting-down exporter, causing log records to be sent to a closed exporter and potentially dropped. Consider holding the loadbalancer read-lock across both the endpoint lookup and the enqueue, or checking exporter liveness inside `getOrCreateBackend`.</violation>
</file>

<file name="exporter/loadbalancingexporter/log_batcher.go">

<violation number="1" location="exporter/loadbalancingexporter/log_batcher.go:281">
P1: Async flush errors (size-triggered and timeout-triggered) are only debug-logged, but the batch data has already been moved out of `pending` in `flush()` (`drained := *pending; *pending = plog.NewLogs()`). If `b.send()` fails, the drained logs are silently dropped with no retry and no way for the upstream caller to detect the failure — `Enqueue` already returned `nil`. At minimum, the `droppedRecords` counter should be incremented on flush error so the data loss is observable, and the log level should be `Warn` or `Error` rather than `Debug`.</violation>
</file>

<file name="exporter/loadbalancingexporter/log_exporter_test.go">

<violation number="1" location="exporter/loadbalancingexporter/log_exporter_test.go:600">
P2: This weakens the rolling-update test so it no longer verifies convergence to the final backend set.

According to linked Linear issue SAW-6744, resolver-removal behavior must be validated, and `NotEmpty` allows stale intermediate resolver states to pass.</violation>
</file>

<file name="exporter/loadbalancingexporter/loadbalancer.go">

<violation number="1" location="exporter/loadbalancingexporter/loadbalancer.go:208">
P1: Calling `onExporterRemove` synchronously while holding `updateLock` can block global routing during resolver churn if a backend drain is slow.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread exporter/loadbalancingexporter/log_exporter.go Outdated
Comment thread exporter/loadbalancingexporter/log_batcher.go Outdated
Comment thread exporter/loadbalancingexporter/loadbalancer.go Outdated
Comment thread exporter/loadbalancingexporter/config.go
Comment thread exporter/loadbalancingexporter/log_exporter.go Outdated
Comment thread exporter/loadbalancingexporter/log_exporter_test.go Outdated

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] REQUEST_CHANGES — 2 concerns before merge.

Solid design: per-backend goroutines, correct shutdown ordering (batcher drains before load balancer closes), proper timer drain in flush(). Two gaps need addressing: a metric that's registered but never written, and an unbounded block on Enqueue with no context escape hatch.

Comment thread exporter/loadbalancingexporter/log_batcher.go Outdated
Comment thread exporter/loadbalancingexporter/log_batcher.go
Comment thread exporter/loadbalancingexporter/loadbalancer.go Outdated
Comment thread exporter/loadbalancingexporter/log_batcher_test.go Outdated

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] REQUEST_CHANGES — 2 concerns, 2 unresolved threads

Adds a per-backend async log batcher to the loadbalancing exporter with configurable flush-on-size, flush-on-bytes, flush-on-timeout, and drain-on-resolver-removal. Architecture is sound and the default-off flag preserves backward compatibility. Two concurrency issues need addressing before merge.

Comment thread exporter/loadbalancingexporter/log_batcher.go Outdated
Comment thread exporter/loadbalancingexporter/loadbalancer.go Outdated
Comment thread exporter/loadbalancingexporter/log_batcher.go

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] ✅ APPROVE — Prior issues addressed. Clean to merge.

Prior review had 2 concerns. Status:
✅ log_batcher.go — Fixed: Enqueue now takes ctx context.Context, adds case <-ctx.Done() escape hatch. acquireBackend calls inflight.Add(1) inside b.mu.RLock() — TOCTOU gap fully closed.
✅ log_batcher.go — Fixed: droppedRecords incremented in flush() on send failure.

Unresolved threads: 0

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 8 files (changes from recent commits).

Requires human review: This PR introduces significant new asynchronous logic for log batching and modifies core load balancing lifecycle management, which requires human architectural review.

Comment thread exporter/loadbalancingexporter/log_batcher.go

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] ✅ All review threads resolved and concerns addressed. Approved.

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] COMMENT — Clarification on Cursor Bugbot finding.

The Bugbot claim that onExporterRemove is called while holding updateLock.Lock() is incorrect. Looking at onBackendChanges (loadbalancer.go ~L154-L171), the lock is explicitly released (lb.updateLock.Unlock()) before lb.drainRemovedExporters(ctx, removed) is called. The blocking flush happens outside the lock — ConsumeLogs callers are not stalled during drain.
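
The lock scope described here (remove under the lock, drain after releasing it) can be sketched in Go as follows. This is an assumption-laden illustration: `balancer`, `has`, and the use of plain functions as drain hooks are hypothetical; only `updateLock` and `onBackendChanges` are names taken from the review.

```go
package main

import (
	"fmt"
	"sync"
)

// balancer is a hypothetical, stripped-down load balancer. Each endpoint
// maps to a drain function standing in for a real exporter.
type balancer struct {
	updateLock sync.RWMutex
	exporters  map[string]func()
}

// has reports whether an endpoint is still routable.
func (lb *balancer) has(endpoint string) bool {
	lb.updateLock.RLock()
	defer lb.updateLock.RUnlock()
	_, ok := lb.exporters[endpoint]
	return ok
}

// onBackendChanges removes stale exporters while holding the lock, then
// drains them only after the lock is released, so routing is not blocked
// by a slow drain.
func (lb *balancer) onBackendChanges(keep map[string]bool) []string {
	lb.updateLock.Lock()
	var removedNames []string
	var drains []func()
	for ep, drain := range lb.exporters {
		if !keep[ep] {
			removedNames = append(removedNames, ep)
			drains = append(drains, drain)
			delete(lb.exporters, ep)
		}
	}
	lb.updateLock.Unlock()

	for _, drain := range drains {
		drain() // may be slow; the lock is no longer held here
	}
	return removedNames
}

func main() {
	lb := &balancer{exporters: map[string]func(){}}
	lb.exporters["old:4317"] = func() {
		// Taking the read lock here would deadlock if the drain ran
		// while updateLock was still held for writing.
		fmt.Println("draining, still routable:", lb.has("old:4317"))
	}
	lb.exporters["new:4317"] = func() {}
	fmt.Println(lb.onBackendChanges(map[string]bool{"new:4317": true}))
}
```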

However, the // TODO: set a timeout? comment on line ~162 is real: drainRemovedExporters uses context.Background() with no deadline, which can block the resolver goroutine indefinitely if a removed backend is unreachable. This is a resolver-goroutine concern, not a pipeline-stall — but it does mean backend churn against unreachable endpoints could pile up goroutines. Worth addressing in a follow-up, but not a blocker given the async Shutdown path.

My original REQUEST_CHANGES on this PR stands — two unresolved concerns remain (metric registered but never written; unbounded Enqueue block with no context escape). The Bugbot issue is a false positive on lock scope.

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 48 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="exporter/loadbalancingexporter/log_batcher.go">

<violation number="1" location="exporter/loadbalancingexporter/log_batcher.go:173">
P1: This wait ignores `ctx` and can deadlock shutdown when an in-flight enqueue is stuck behind a full backend queue.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread exporter/loadbalancingexporter/log_batcher.go Outdated

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: ✅ APPROVE — all prior concerns addressed.

Prior review had 2 blocking concerns. Status:
log_batcher.go:Enqueue — Fixed: ctx context.Context added with <-ctx.Done() branch.
log_batcher.go:flush — Fixed: droppedRecords.Add(...) now fires in the error path.

New code (acquireBackend double-checked lock, drainRemovedExporters, markStopping) is correct. stopAndFlush ctx-cancel path is safe — buffered done channel prevents goroutine leak on early cancellation. Unresolved threads: 0.

Comment thread exporter/loadbalancingexporter/log_batcher.go Outdated

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: code concerns resolved, 2 unresolved threads block approval.

Prior review had 2 blocking concerns:
log_batcher.go: Enqueue now takes ctx context.Context; select has a case <-ctx.Done() branch. Fixed.
log_batcher.go: droppedRecords.Add is now called on the flush error path. Fixed.

New commits also look clean: acquireBackend correctly guards against a stopping exporter; ctx propagation through consumeLogRecordEnqueue is correct; rolling-update test updated to accept context.Canceled appropriately.

Unresolved threads: 2 (filed by other reviewers). Resolve all threads before merge.

…g batcher

Measure incoming chunk size before merging into pending batch and
accumulate incrementally instead of calling sizer.LogsSize(pending)
after every enqueue. Since ConsumeLogs enqueues one record at a time,
the old pattern re-serialized the entire accumulated batch on every call
— O(n²) in records-per-batch.

Proto repeated fields (ResourceLogs) are wire-additive, so incremental
accounting is accurate. pendingBytes is reset to 0 in flush(), so the
counter stays correct across flush boundaries.
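
The accounting change in this commit message can be illustrated with the toy Go sketch below. It is not the exporter's code: `batch`, `sizeOf`, and the byte-slice records are hypothetical, and `sizeOf` simply sums lengths where the real code uses an OTLP sizer. In this toy the sizes are additive by construction, which mirrors the wire-additivity argument made above.

```go
package main

import "fmt"

// batch is a hypothetical pending batch with a running byte counter.
type batch struct {
	records      [][]byte
	pendingBytes int
}

// sizeOf stands in for a serialized-size calculation. It walks every
// record, so calling it on the whole pending batch after each enqueue
// would cost O(n) per call and O(n²) per batch.
func sizeOf(records [][]byte) int {
	total := 0
	for _, r := range records {
		total += len(r)
	}
	return total
}

// add measures only the incoming chunk and accumulates the result.
func (b *batch) add(chunk [][]byte) {
	b.pendingBytes += sizeOf(chunk)
	b.records = append(b.records, chunk...)
}

// flush hands back the pending records and resets the counter, so the
// accounting stays correct across flush boundaries.
func (b *batch) flush() [][]byte {
	out := b.records
	b.records = nil
	b.pendingBytes = 0
	return out
}

func main() {
	b := &batch{}
	b.add([][]byte{[]byte("hello")})
	b.add([][]byte{[]byte("log"), []byte("line")})
	fmt.Println(b.pendingBytes == sizeOf(b.records))
	b.flush()
	fmt.Println(b.pendingBytes)
}
```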

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: COMMENT — prior concerns fixed, one nit remains.

Prior review had 2 🟡 concerns. Status:
log_batcher.go: Enqueue ctx + <-ctx.Done() branch added; acquireBackend inflight WaitGroup pattern is correct.
log_batcher.go: droppedRecords now incremented on the flush error path in flush().
🟢 log_batcher.go:421: Unregister() error silently dropped (see inline).

Unresolved threads: 0

Comment thread exporter/loadbalancingexporter/log_batcher.go

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: code looks good, but 3 unresolved threads remain.

Prior review had 2 🟡 concerns. Status:
Enqueue context cancellation — fixed: ctx parameter added, select includes ctx.Done arm, callers propagate from ConsumeLogs.
droppedRecords never incremented — fixed: metric now written in flush() on send error.

Unresolved threads: 3. Resolve all threads before merge.

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] ✅ APPROVE — all prior concerns addressed.

Enqueue now takes ctx context.Context with a case <-ctx.Done() escape hatch — backpressure propagation concern resolved.
droppedRecords metric incremented in flush error path — observability gap closed.
0 unresolved threads.

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: ✅ APPROVE — both prior findings resolved.

Prior review had 2 concerns:
Enqueue — ctx param added, case <-ctx.Done() branch present. Backpressure escape hatch works correctly.
droppedRecords: b.telemetry.droppedRecords.Add(...) now fires in the if err != nil block of flush. Metric is live.

Unresolved threads: 0

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 1 file (changes from recent commits).

Requires human review: Significant refactor of exporter lifecycle and introduction of complex async batching logic with high line count; requires human architectural and concurrency review.

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] ✅ APPROVE — prior concerns resolved.

Both prior findings addressed: Enqueue now propagates context with ctx.Done() escape, and droppedRecords is correctly incremented on flush error. No new concerns on the changed code path.

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: APPROVE — both prior concerns addressed.

Prior review had 2 concerns:
log_batcher.go: Enqueue now takes ctx context.Context with a case <-ctx.Done() branch; backpressure propagation fixed.
log_batcher.go: droppedRecords.Add(ctx, int64(records), ...) is now called in the flush() error path; metric is live.

Zero unresolved threads. Clean to merge.

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: APPROVE

Prior review had 2 concerns. Status:
Enqueue now takes ctx context.Context with <-ctx.Done() cancellation branch — backpressure propagation fixed.
droppedRecords metric now incremented on flush error — observability gap closed.

New code (inflight tracking, acquireBackend with isStopping guard, scheduleBackendCleanup, mergeQueuedRequests batch drain) reviewed — no new concerns.

Unresolved threads: 0

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: APPROVE

Prior review had 2 concerns. Status:
log_batcher.go — Fixed: Enqueue now accepts context.Context with cancellation select; acquireBackend tracks inflight via sync.WaitGroup with proper shutdown/remove drain ordering.
log_batcher.go — Fixed: droppedRecords metric now incremented on send() failure in flush().

New code also correctly introduces atomic started flag, wrappedExporter stopping coordination with mutex-guarded consumeWG, bounded channel drain in mergeQueuedRequests, and resource/scope dedup in mergeLogs/insertLogRecord. No new concerns.

Comment thread exporter/loadbalancingexporter/log_batcher.go

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 7 files (changes from recent commits).

Requires human review: Significant refactor of exporter lifecycle and routing synchronization. Changes core data paths and adds complex asynchronous batching logic requiring human review.

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] ✅ APPROVE — both prior concerns addressed, zero unresolved threads.

Enqueue now takes ctx context.Context with case <-ctx.Done(): return ctx.Err() — backpressure escape hatch in place.
droppedRecords incremented in flush() on send failure — metric now reflects actual data loss.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread exporter/loadbalancingexporter/loadbalancer.go Outdated

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] Re-review: prior issues resolved, 1 unresolved thread blocks approval.

Prior review had 2 concerns. Status:
exporter/loadbalancingexporter/log_batcher.go: Enqueue now takes ctx context.Context with a case <-ctx.Done() escape: fixed.
exporter/loadbalancingexporter/log_batcher.go: droppedRecords.Add(ctx, int64(records), ...) is now called in the flush error branch: fixed.

Code is clean. Would approve, but 1 unresolved review thread remains. Resolve all threads before merge.

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 1 file (changes from recent commits).

Requires human review: Large-scale change (3k+ lines) refactoring core exporter lifecycle, routing concurrency, and adding complex async batching logic that requires human architectural review.

@amir-jakoby amir-jakoby merged commit d5bed5a into main Mar 19, 2026
155 of 156 checks passed
@amir-jakoby amir-jakoby deleted the saw-6744-patch-loadbalancing-exporter-to-batch-logs-after-routing-per branch March 19, 2026 17:16

@sawmills-architect-review sawmills-architect-review Bot left a comment

[ARCH-REVIEW] ✅ All prior concerns addressed. Approved.

Prior review had 2 concerns:
Enqueue now accepts ctx with <-ctx.Done() select case — callers can cancel on slow backends.
droppedRecords metric now incremented on send failure in flush().

Unresolved threads: 0

1 participant