
Fix Kafka CI: start Kafka container, fix configuration test timing, reduce infinite-loop test delay#2395

Merged
jeremydmiller merged 7 commits into main from kafka-tests on Mar 31, 2026

Conversation

@jeremydmiller
Member

Summary

  • CI was never starting Kafka: CIKafka target in CITargets.cs only called StartDockerServices("postgresql") — the Kafka broker was never started, so all integration tests would fail in CI with connection-refused errors. Added "kafka" to the service list and a new WaitForKafkaToBeReady() readiness check that polls port 9092.

  • 3 unit tests were failing: KafkaTopicGroupConfigurationTests.specification_uniform_sets_config_on_group, specification_per_topic_receives_topic_name, and topic_creation_sets_func_on_group all failed because KafkaTopicGroupListenerConfiguration uses the DelayedEndpointConfiguration pattern (actions stored and applied lazily at startup). Tests were reading properties immediately after calling fluent methods, before Apply() was called. Fixed by explicitly calling ((IDelayedEndpointConfiguration)config).Apply() before assertions.

  • 2-minute test delay reduced to 30s: do_not_go_into_infinite_loop_with_garbage_data had await Task.Delay(2.Minutes()) with no assertions — only verifies the process doesn't crash. 30 seconds is sufficient to observe a tight retry loop and saves ~90 seconds per CI run.
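The readiness check in the first bullet is a poll-until-connect loop. As a minimal sketch of that shape — Python here for illustration, not the actual C# WaitForKafkaToBeReady() in CITargets.cs, and assuming the 30-attempt × 2-second schedule described in the commits:

```python
import socket
import time

def wait_for_broker(host: str = "localhost", port: int = 9092,
                    attempts: int = 30, delay_seconds: float = 2.0) -> bool:
    """Poll until a TCP connection to the broker port succeeds, or give up
    after attempts * delay_seconds (roughly 60s with the defaults)."""
    for _ in range(attempts):
        try:
            # A completed TCP handshake is enough to know the listener is up.
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(delay_seconds)
    return False
```

Note this only proves the port is accepting connections; a broker can accept TCP before it is fully ready to serve metadata, so a stricter check could issue a real client request instead.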

Test plan

🤖 Generated with Claude Code

jeremydmiller and others added 7 commits March 31, 2026 11:48
…duce infinite-loop test delay

- CITargets.cs: Add 'kafka' to StartDockerServices in CIKafka target — was
  only starting postgresql, so Kafka tests had no broker in CI. Added
  WaitForKafkaToBeReady() that polls port 9092 until the broker accepts
  connections (up to 60 seconds, 30 attempts × 2s).

- KafkaTransportTests.cs: Fix KafkaTopicGroupConfigurationTests — the
  Specification() and TopicCreation() methods use the DelayedEndpointConfiguration
  pattern, storing actions that are applied lazily at startup. Tests were checking
  group.SpecificationConfig/CreateTopicFunc immediately after calling the fluent
  methods, before Apply() was ever called. Added ((IDelayedEndpointConfiguration)config).Apply()
  before each assertion.
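For readers unfamiliar with the pattern: the fluent methods do not mutate the endpoint at call time; they queue actions that only run when Apply() fires at startup, which is why the tests saw unset properties. A minimal Python sketch of the idea (hypothetical names — the real code is C# using IDelayedEndpointConfiguration):

```python
from typing import Any, Callable

class TopicGroup:
    """Target object; its properties are only set when apply() runs."""
    def __init__(self) -> None:
        self.specification_config: Any = None

class DelayedConfiguration:
    """Fluent methods queue actions instead of mutating the group directly."""
    def __init__(self, group: TopicGroup) -> None:
        self._group = group
        self._delayed: list[Callable[[], None]] = []

    def specification(self, config: Any) -> "DelayedConfiguration":
        # Stored, not executed -- the group is untouched until apply().
        self._delayed.append(
            lambda: setattr(self._group, "specification_config", config))
        return self

    def apply(self) -> None:
        for action in self._delayed:
            action()
        self._delayed.clear()
```

A test that asserts on `group.specification_config` right after calling `specification(...)` sees the unset value; asserting after `apply()` sees the configured one — the same fix the PR makes by calling Apply() explicitly.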

- publish_and_receive_raw_json.cs: Reduce the do_not_go_into_infinite_loop_with_garbage_data
  test delay from 2 minutes to 30 seconds. The test has no assertions after the delay — it
  only checks the process doesn't crash. 30 seconds is sufficient to observe a tight
  retry loop, and saves ~90 seconds per CI run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…from handler pipeline

When PauseListenerContinuation fires from within the handler pipeline (e.g. rate
limiting), ListeningAgent.PauseAsync called StopAndDrainAsync which always called
LatchReceiver() before receiver.DrainAsync(). This caused DurableReceiver/BufferedReceiver/
InlineReceiver's DrainAsync to see _latched=true (waitForCompletion=true) and wait up to
DrainTimeout (30s) for the ActionBlock/in-flight count to drain — which never happened
because the current message's execute frame was still on the call stack. The result was a
30-second stall on every rate-limit event, causing the rate-limiting test to time out.

Fix: split StopAndDrainAsync into a shared StopAndDrainCoreAsync(bool latchBeforeDrain).
- Normal shutdown (StopAndDrainAsync) continues to pre-latch so DrainAsync can safely
  wait for in-flight messages to complete.
- PauseAsync now calls StopAndDrainCoreAsync(latchBeforeDrain: false) so DrainAsync sees
  _latched==false and returns immediately, avoiding the deadlock.
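The split can be sketched as follows — a conceptual Python model, not the Wolverine C# code, with simplified names and a polling drain standing in for the ActionBlock machinery:

```python
import asyncio

class Receiver:
    def __init__(self) -> None:
        self._latched = False
        self.in_flight = 0  # messages currently executing

    def latch(self) -> None:
        self._latched = True

    async def drain(self, timeout: float = 30.0) -> None:
        # Only wait for in-flight work when the receiver was latched first;
        # an unlatched drain returns immediately.
        if not self._latched:
            return
        deadline = asyncio.get_running_loop().time() + timeout
        while self.in_flight > 0 and asyncio.get_running_loop().time() < deadline:
            await asyncio.sleep(0.05)

class ListeningAgent:
    def __init__(self, receiver: Receiver) -> None:
        self.receiver = receiver

    async def _stop_and_drain_core(self, latch_before_drain: bool) -> None:
        if latch_before_drain:
            self.receiver.latch()
        await self.receiver.drain()

    async def stop_and_drain(self) -> None:
        # Normal shutdown: latch first so drain waits for in-flight messages.
        await self._stop_and_drain_core(latch_before_drain=True)

    async def pause(self) -> None:
        # Pause fired from inside the handler pipeline: the current message's
        # execute frame is still on the stack and can never drain, so skip the
        # latch and let drain return immediately instead of stalling 30s.
        await self._stop_and_drain_core(latch_before_drain=False)
```

In this model, calling `pause()` while `in_flight == 1` returns at once, whereas the pre-fix behavior (latching unconditionally) would spin until the timeout.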

Also add fast durability polling settings to the rate-limiting integration test so the
scheduled message is picked up quickly after the listener restarts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ackages

- JasperFx: 1.22.0 → 1.23.0
- JasperFx.Events: 1.24.1 → 1.25.0
- All Weasel.*: 8.10.2 → 8.11.2
- Marten + Marten.AspNetCore: 8.26.1 → 8.28.0
- Polecat: 1.4.0 → 1.6.1
- Version: 5.26.0 → 5.27.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pin Microsoft.CodeAnalysis.Common to 4.14.0 in Directory.Packages.props
to resolve the NuGet conflict between Microsoft.CodeAnalysis.Workspaces.MSBuild
(which requires 4.14.0) and Microsoft.EntityFrameworkCore.Design (which pulls
in 4.8.0 transitively). This was causing all 20 CI workflows to fail.
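With central package management, a pin like this takes an illustrative shape along these lines (a fragment only; the real Directory.Packages.props contains many more entries):

```xml
<!-- Directory.Packages.props (illustrative fragment) -->
<Project>
  <ItemGroup>
    <!-- Pin explicitly so the 4.14.0 required by
         Microsoft.CodeAnalysis.Workspaces.MSBuild wins over the 4.8.0
         pulled in transitively by Microsoft.EntityFrameworkCore.Design. -->
    <PackageVersion Include="Microsoft.CodeAnalysis.Common" Version="4.14.0" />
  </ItemGroup>
</Project>
```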

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The merge brought back local project references to external repos
(jasperfx, weasel, marten). Replace all with proper NuGet package
references to match the existing Directory.Packages.props versions.
Also add Microsoft.CodeAnalysis.CSharp and Microsoft.CodeAnalysis.Analyzers
to support the new Wolverine.SourceGeneration Roslyn analyzer project.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace local project reference to ../../../../polecat/src/Polecat/Polecat.csproj
with the NuGet package reference, matching the existing Polecat 1.6.1 version
in Directory.Packages.props. This was causing CS0246 errors for all Polecat
types in CI where the external repo is not present.
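The shape of the change, sketched as csproj fragments (illustrative, not the exact diff):

```xml
<!-- Before: a local path that only exists on the author's machine -->
<ItemGroup>
  <ProjectReference Include="../../../../polecat/src/Polecat/Polecat.csproj" />
</ItemGroup>

<!-- After: the NuGet package; the version comes from the central
     Directory.Packages.props (Polecat 1.6.1) -->
<ItemGroup>
  <PackageReference Include="Polecat" />
</ItemGroup>
```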

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>