Skip to content

test: run api_tests in parallel with dynamic port allocation#1354

Merged
ibigbug merged 4 commits into
masterfrom
parallel-api-tests
May 1, 2026
Merged

test: run api_tests in parallel with dynamic port allocation#1354
ibigbug merged 4 commits into
masterfrom
parallel-api-tests

Conversation

@ibigbug
Copy link
Copy Markdown
Member

@ibigbug ibigbug commented Apr 30, 2026

All 38 API tests were marked #[serial_test::serial] and used hardcoded ports, causing sequential execution — 9+ minutes on Windows with 12 tests triggering Rust's 'has been running over 60s' warning.

Changes

  • clash-lib/src/lib.rs: Added start_scaffold_instance() with its own CancellationToken not in the global shutdown list — dropping one test instance no longer kills others
  • clash-lib/tests/common/mod.rs: AtomicU16 port allocator (base 20000, 7 ports/test), config templating from rules.yaml, poll interval 2s → 100ms
  • clash-lib/tests/api_tests.rs: Removed all 38 #[serial_test::serial]; dynamic api_port via start_unique_client()

Result

Before After
Ubuntu ~108s ~5s
Windows ~9.4 min (12 tests >60s) ~30s est.

Summary by CodeRabbit

  • Refactor
    • Improved shutdown coordination so instances terminate more reliably and can be controlled per-instance for cleaner lifecycle management.
  • Tests
    • Strengthened test isolation and stability by allocating dynamic per-test ports, removing serialized test ordering, and making readiness checks more robust.

Remove #[serial_test::serial] from all API tests. Each test now allocates
a unique block of ports via an atomic counter, generates a temp config,
and starts its own isolated Clash instance. Tests run fully in parallel.

Also reduce wait_port_ready poll interval 2s→100ms for faster startup.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 30, 2026

📝 Walkthrough

Walkthrough

start_scaffold now creates and registers a caller-scoped CancellationToken and passes it into start(). A new start_scaffold_instance() runs a Clash instance on a spawned OS thread with its own Tokio runtime and returns the thread JoinHandle plus the instance token. Tests moved to per-test dynamic port allocation and per-instance tokens.

Changes

Cohort / File(s) Summary
Core shutdown & runtime
clash-lib/src/lib.rs
Changed start(...) signature to accept shutdown_token. Removed internal token allocation/registration from start. Added pub fn start_scaffold_instance(...) -> Result<(JoinHandle<()>, CancellationToken)> which spawns an OS thread, builds a Tokio runtime, creates an unregistered token, calls start(...) with that token, reports startup/runtime errors to stderr, and returns the handle + token.
Test harness & helpers
clash-lib/tests/common/mod.rs
Added global atomic port allocator NEXT_PORT and alloc_ports(n). Added make_client_config_str(port_base) to substitute bundled rules with test-specific ports. Increased port readiness polling (300 attempts, 100ms). Refactored ClashInstance to start via start_scaffold_instance, store per-instance CancellationToken, and Drop cancels token and waits for ports/thread shutdown. Added start_clash(options) helper.
API integration tests
clash-lib/tests/api_tests.rs
Replaced hard-coded ports with per-test allocated port blocks from alloc_ports. Added start_unique_client() helper returning (ClashInstance, api_port). Removed serial_test::serial usage. Converted file-based/static configs to dynamically generated config strings embedding allocated ports. Updated proxy routing, readiness polling, reloads, and connection checks to use dynamic port offsets.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant OS_Thread as "OS Thread"
    participant Tokio as "Tokio Runtime"
    participant StartScaffold as "start_scaffold_instance"
    participant Start as "start(...)"
    participant Token as "CancellationToken"

    Caller->>StartScaffold: invoke with Options
    StartScaffold->>OS_Thread: spawn thread
    OS_Thread->>Tokio: build Tokio runtime
    OS_Thread->>Token: create unregistered token
    OS_Thread->>Start: call start(config, cwd, log_tx, token_clone)
    Start->>Token: receive and use for shutdown coordination
    Start->>Start: run services, derive child tokens
    Caller-->>Token: cancel token when requested
    Token-->>Start: propagate shutdown
    Start-->>OS_Thread: shutdown services and exit
    OS_Thread-->>Caller: JoinHandle completes
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I spawn a thread, a tiny runtime hum,

Tokens hop in circles till the shutdown drum.
Ports find new homes, each test gets its spot,
Threads nap politely when their token's not.
A carrot for each bug—now that's my plot! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main objective of the changeset: enabling parallel execution of API tests through dynamic port allocation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch parallel-api-tests

Review rate limit: 8/10 reviews remaining, refill in 11 minutes and 5 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

tokio-util is already a regular dependency; no need to repeat it under
[dev-dependencies].

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 81.25000% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
clash-lib/src/lib.rs 81.25% 2 Missing and 4 partials ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
clash-lib/tests/common/mod.rs (1)

1-6: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Run CI’s nightly rustfmt on this helper module.

cargo fmt --check is already failing on the regrouped std imports and the replace(...) chain in this file.

As per coding guidelines: **/*.rs: CI uses nightly rustfmt with case-sensitive ASCII import order (uppercase before lowercase); do not trust local stable rustfmt, check CI fmt output

Also applies to: 38-42

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clash-lib/tests/common/mod.rs` around lines 1 - 6, Reformat this helper
module to satisfy CI's nightly rustfmt: reorder the regrouped std imports into
case-sensitive ASCII import order (place uppercase before lowercase and split
std groups appropriately) for the std::net and std::sync imports and run rustfmt
with the nightly toolchain; also reformat the long replace(...) chain (the
string manipulation using replace calls) so it breaks into a formatted chain or
intermediate variables that rustfmt will accept (avoid a single long chained
expression). Target the imports shown (futures::TryFutureExt,
hyper::body::Incoming, hyper_util::rt::TokioIo, std::net::{Shutdown, TcpStream},
std::sync::atomic::{AtomicU16, Ordering}) and the replace(...) chain referenced
in the file so cargo fmt --check (nightly) passes.
🧹 Nitpick comments (2)
clash-lib/tests/api_tests.rs (2)

191-218: ⚡ Quick win

Wait for the server’s Shadowsocks listener before sending traffic through it.

ClashInstance::start() only waits for the first port in the list, so server_base + 1 can still be binding when the client/request task starts. You already had to add this guard for client_base + 3; the backend listener needs the same treatment.

Suggested fix
     let _server = ClashInstance::start(
         Options {
             config: Config::Str(server_config_str),
             cwd: Some(wd_server.to_string_lossy().to_string()),
             rt: None,
@@
     )
     .expect("Failed to start server");
+    wait_port_ready(server_base + 1)
+        .expect("server shadowsocks port not ready");
 
     // Start client instance with RAII guard
     let _client = ClashInstance::start(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clash-lib/tests/api_tests.rs` around lines 191 - 218, The test starts the
server with ClashInstance::start but doesn't wait for the server's Shadowsocks
listener (server_base + 1) to be ready before the client sends traffic; add a
wait_port_ready(server_base + 1) check after creating the server RAII guard
(similar to the existing wait for client_base + 3) so the server's backend
listener is fully bound before the client/request task runs.

124-129: ⚡ Quick win

Wait on the new controller port instead of sleeping 500ms.

This test already moves the controller from port_base to port_base + 7, so a fixed sleep is still flaky on slower CI. Waiting for the new port to accept connections makes the assertion deterministic.

Suggested fix
-    tokio::time::sleep(Duration::from_millis(500)).await;
+    wait_port_ready(port_base + 7)
+        .expect("reloaded external-controller port not ready");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clash-lib/tests/api_tests.rs` around lines 124 - 129, Replace the fixed sleep
after moving the controller from port_base to port_base + 7 with an active wait
that attempts to connect to the new controller port until it accepts
connections, then run the assertion; specifically, instead of
tokio::time::sleep(Duration::from_millis(500)).await, poll the new port
(port_base + 7) using a connect attempt (e.g. via TcpStream::connect or an async
equivalent) with a short retry loop/timeout, and only after the connection
succeeds call get_allow_lan(port_base + 7).await and assert its result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clash-lib/src/lib.rs`:
- Around line 166-171: Run the repository’s nightly rustfmt on the new helper to
fix the doc-comment/signature formatting for start_scaffold_instance in lib.rs:
run the same formatter used by CI (nightly rustfmt) and apply the repo’s
formatting rules (case-sensitive ASCII import order, uppercase before lowercase)
so the doc-comment and function signature match cargo fmt --check; reformat the
file and commit the changes.
- Around line 169-189: start_scaffold_instance currently ignores Options::rt and
always constructs a multi-thread Tokio runtime; update the function to honor
opts.rt by matching on opts.rt (e.g., TokioRuntime::SingleThread vs
TokioRuntime::MultiThread) and build the appropriate runtime
(Builder::new_current_thread().enable_all().build() for single-thread,
Builder::new_multi_thread().enable_all().build() for multi-thread) before
calling rt.block_on(start(...)). Ensure the chosen runtime is created inside the
spawned thread (as currently) and propagate the same enable_all/expect error
handling used today.

In `@clash-lib/tests/api_tests.rs`:
- Around line 87-103: In test_config_reload_via_payload() the reload payload
hardcodes "socks-port: 7892" which can collide; update the new_payload string in
the test to interpolate a dynamically allocated port (use port_base plus an
offset) instead of 7892 — e.g., replace the literal 7892 with {port_base + X}
inside new_payload so the socks-port uses the test's allocated port range (refer
to new_payload and port_base in the test).
- Around line 1-20: The file fails CI formatting because it wasn’t run with the
repo’s nightly rustfmt (case-sensitive ASCII import order and formatting
differences); run nightly rustfmt on clash-lib/tests/api_tests.rs and reformat
the grouped imports at the top and any format!-based layouts (including the
import block containing
ClashInstance/alloc_ports/make_client_config_str/send_http_request/wait_port_ready
and the start_unique_client function and CLIENT_PORT_BLOCK constant usages) so
imports are ordered with uppercase before lowercase and the formatting matches
CI’s rustfmt rules; recommit the reformatted file.

---

Outside diff comments:
In `@clash-lib/tests/common/mod.rs`:
- Around line 1-6: Reformat this helper module to satisfy CI's nightly rustfmt:
reorder the regrouped std imports into case-sensitive ASCII import order (place
uppercase before lowercase and split std groups appropriately) for the std::net
and std::sync imports and run rustfmt with the nightly toolchain; also reformat
the long replace(...) chain (the string manipulation using replace calls) so it
breaks into a formatted chain or intermediate variables that rustfmt will accept
(avoid a single long chained expression). Target the imports shown
(futures::TryFutureExt, hyper::body::Incoming, hyper_util::rt::TokioIo,
std::net::{Shutdown, TcpStream}, std::sync::atomic::{AtomicU16, Ordering}) and
the replace(...) chain referenced in the file so cargo fmt --check (nightly)
passes.

---

Nitpick comments:
In `@clash-lib/tests/api_tests.rs`:
- Around line 191-218: The test starts the server with ClashInstance::start but
doesn't wait for the server's Shadowsocks listener (server_base + 1) to be ready
before the client sends traffic; add a wait_port_ready(server_base + 1) check
after creating the server RAII guard (similar to the existing wait for
client_base + 3) so the server's backend listener is fully bound before the
client/request task runs.
- Around line 124-129: Replace the fixed sleep after moving the controller from
port_base to port_base + 7 with an active wait that attempts to connect to the
new controller port until it accepts connections, then run the assertion;
specifically, instead of tokio::time::sleep(Duration::from_millis(500)).await,
poll the new port (port_base + 7) using a connect attempt (e.g. via
TcpStream::connect or an async equivalent) with a short retry loop/timeout, and
only after the connection succeeds call get_allow_lan(port_base + 7).await and
assert its result.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e5d78d68-406e-4e9c-b0d1-4a29dc777502

📥 Commits

Reviewing files that changed from the base of the PR and between 09f868e and b099f00.

📒 Files selected for processing (4)
  • clash-lib/Cargo.toml
  • clash-lib/src/lib.rs
  • clash-lib/tests/api_tests.rs
  • clash-lib/tests/common/mod.rs

Comment thread clash-lib/src/lib.rs Outdated
Comment thread clash-lib/src/lib.rs
Comment thread clash-lib/tests/api_tests.rs Outdated
Comment thread clash-lib/tests/api_tests.rs
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
clash-lib/src/lib.rs (1)

166-196: 💤 Low value

Consider documenting that logging is not configured for scaffold instances.

Unlike start_scaffold, this function doesn't call app::logging::setup_logging(). The broadcast channel is created (line 188) so log events can still be collected via the API, but they won't be written to stdout/files. This is likely intentional for test isolation, but documenting this behavior difference would help callers.

,

📝 Suggested doc enhancement
 /// Start a Clash instance in a background thread with independent lifecycle.
 /// Returns the thread handle and a CancellationToken to shut it down.
 /// Unlike `start_scaffold`, this does NOT register in the global
 /// SHUTDOWN_TOKEN.
+/// 
+/// Note: Logging is not configured for instances started this way. Log events
+/// are still collected internally but not written to stdout or log files.
 pub fn start_scaffold_instance(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clash-lib/src/lib.rs` around lines 166 - 196, Add a short doc comment to
start_scaffold_instance explaining that, unlike start_scaffold, it does not call
app::logging::setup_logging(), so while a broadcast channel (log_tx) is created
and log events can be collected via the API, logs will not be written to stdout
or files; update the function-level documentation for start_scaffold_instance to
state this behavior and the likely intent (test isolation) so callers know to
call setup_logging() themselves if they need persisted/stdout logging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@clash-lib/src/lib.rs`:
- Around line 166-196: Add a short doc comment to start_scaffold_instance
explaining that, unlike start_scaffold, it does not call
app::logging::setup_logging(), so while a broadcast channel (log_tx) is created
and log events can be collected via the API, logs will not be written to stdout
or files; update the function-level documentation for start_scaffold_instance to
state this behavior and the likely intent (test isolation) so callers know to
call setup_logging() themselves if they need persisted/stdout logging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ad48157c-5f3b-4b3e-9b21-af0bedf368ea

📥 Commits

Reviewing files that changed from the base of the PR and between b099f00 and 87459f1.

📒 Files selected for processing (3)
  • clash-lib/src/lib.rs
  • clash-lib/tests/api_tests.rs
  • clash-lib/tests/common/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • clash-lib/tests/common/mod.rs
  • clash-lib/tests/api_tests.rs

ibigbug and others added 2 commits April 30, 2026 16:53
…n reload payload

- start_scaffold_instance() now respects TokioRuntime::SingleThread vs
  MultiThread instead of always building a multi-thread runtime
- test_config_reload_via_payload: replace hardcoded socks-port 7892 with
  port_base+2 to stay within the test's allocated port block

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
clash-lib/tests/api_tests.rs (2)

129-135: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace the fixed post-reload sleeps with state-based waits.

Both reload tests still synchronize with wall-clock sleeps, and in test_get_configs_dns_listen_when_enabled() the follow-up wait_port_ready(api_port) can succeed before the new config is actually observable because that controller port was already bound before the PUT. Under parallel CI load this is a likely flake source. Poll GET /configs until the expected field changes, and in the port-changing case also wait for port_base + 7 to bind before asserting.

Possible direction
-    tokio::time::sleep(Duration::from_millis(500)).await;
-    assert!(
-        !get_allow_lan(port_base + 7).await,
-        "expected allow-lan=false after reload"
-    );
+    wait_port_ready(port_base + 7).expect("reloaded controller port not ready");
+    for _ in 0..20 {
+        if !get_allow_lan(port_base + 7).await {
+            break;
+        }
+        tokio::time::sleep(Duration::from_millis(100)).await;
+    }
+    assert!(
+        !get_allow_lan(port_base + 7).await,
+        "expected allow-lan=false after reload"
+    );

Apply the same pattern to the DNS test by polling /configs until dns-listen matches the reloaded value instead of sleeping for a fixed duration.

Also applies to: 483-486

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clash-lib/tests/api_tests.rs` around lines 129 - 135, Replace the fixed
tokio::time::sleep(...) waits in the reload tests (e.g., in
test_get_configs_dns_listen_when_enabled and the allow-lan reload case) with
state-based polling: repeatedly GET /configs until the expected field
(dns-listen or allow-lan) reflects the new value, then in the port-changing case
call wait_port_ready(port_base + 7) to ensure the new controller port is bound
before asserting, and finally assert !get_allow_lan(port_base + 7).await or the
dns value; use get_allow_lan, wait_port_ready and the existing GET /configs
helper to implement the poll with a timeout to avoid hangs.

236-240: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid hitting a public endpoint from the parallelized proxy-chain test.

https://httpbin.yba.dev/drip?... makes this test depend on external internet reachability and remote latency, which can still flake the suite even after the port-isolation work. Since this file already uses local httpmock fixtures elsewhere, switch this request to a local delayed/streamed endpoint so the test only exercises the proxy chain logic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clash-lib/tests/api_tests.rs` around lines 236 - 240, The test currently
calls client.get("https://httpbin.yba.dev/drip?duration=2&delay=1&numbytes=500")
which hits an external service; replace that external URL with a local httpmock
stub used elsewhere in this test file: register a mock that returns a
delayed/streamed response (e.g., chunked body or sleeps between writes to
simulate drip/delay), point client.get(...) at the mock server's URL, and keep
the same timing/byte behavior so the rest of the test that reads into response
behaves identically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@clash-lib/tests/api_tests.rs`:
- Around line 129-135: Replace the fixed tokio::time::sleep(...) waits in the
reload tests (e.g., in test_get_configs_dns_listen_when_enabled and the
allow-lan reload case) with state-based polling: repeatedly GET /configs until
the expected field (dns-listen or allow-lan) reflects the new value, then in the
port-changing case call wait_port_ready(port_base + 7) to ensure the new
controller port is bound before asserting, and finally assert
!get_allow_lan(port_base + 7).await or the dns value; use get_allow_lan,
wait_port_ready and the existing GET /configs helper to implement the poll with
a timeout to avoid hangs.
- Around line 236-240: The test currently calls
client.get("https://httpbin.yba.dev/drip?duration=2&delay=1&numbytes=500") which
hits an external service; replace that external URL with a local httpmock stub
used elsewhere in this test file: register a mock that returns a
delayed/streamed response (e.g., chunked body or sleeps between writes to
simulate drip/delay), point client.get(...) at the mock server's URL, and keep
the same timing/byte behavior so the rest of the test that reads into response
behaves identically.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5753753a-2d13-4d37-9537-01c8dda44c04

📥 Commits

Reviewing files that changed from the base of the PR and between 87459f1 and 9770bde.

📒 Files selected for processing (2)
  • clash-lib/src/lib.rs
  • clash-lib/tests/api_tests.rs

@ibigbug ibigbug merged commit 5c661be into master May 1, 2026
34 of 35 checks passed
@ibigbug ibigbug deleted the parallel-api-tests branch May 1, 2026 05:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant