test: run api_tests in parallel with dynamic port allocation#1354
Conversation
Remove #[serial_test::serial] from all API tests. Each test now allocates a unique block of ports via an atomic counter, generates a temp config, and starts its own isolated Clash instance. Tests run fully in parallel. Also reduce wait_port_ready poll interval 2s→100ms for faster startup. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
📝 WalkthroughWalkthroughstart_scaffold now creates and registers a caller-scoped CancellationToken and passes it into start(). A new start_scaffold_instance() runs a Clash instance on a spawned OS thread with its own Tokio runtime and returns the thread JoinHandle plus the instance token. Tests moved to per-test dynamic port allocation and per-instance tokens. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant OS_Thread as "OS Thread"
participant Tokio as "Tokio Runtime"
participant StartScaffold as "start_scaffold_instance"
participant Start as "start(...)"
participant Token as "CancellationToken"
Caller->>StartScaffold: invoke with Options
StartScaffold->>OS_Thread: spawn thread
OS_Thread->>Tokio: build Tokio runtime
OS_Thread->>Token: create unregistered token
OS_Thread->>Start: call start(config, cwd, log_tx, token_clone)
Start->>Token: receive and use for shutdown coordination
Start->>Start: run services, derive child tokens
Caller-->>Token: cancel token when requested
Token-->>Start: propagate shutdown
Start-->>OS_Thread: shutdown services and exit
OS_Thread-->>Caller: JoinHandle completes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 8/10 reviews remaining, refill in 11 minutes and 5 seconds. Comment |
tokio-util is already a regular dependency; no need to repeat it under [dev-dependencies]. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
clash-lib/tests/common/mod.rs (1)
1-6:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winRun CI’s nightly rustfmt on this helper module.
cargo fmt --checkis already failing on the regroupedstdimports and thereplace(...)chain in this file.As per coding guidelines:
**/*.rs: CI uses nightly rustfmt with case-sensitive ASCII import order (uppercase before lowercase); do not trust local stable rustfmt, check CI fmt outputAlso applies to: 38-42
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clash-lib/tests/common/mod.rs` around lines 1 - 6, Reformat this helper module to satisfy CI's nightly rustfmt: reorder the regrouped std imports into case-sensitive ASCII import order (place uppercase before lowercase and split std groups appropriately) for the std::net and std::sync imports and run rustfmt with the nightly toolchain; also reformat the long replace(...) chain (the string manipulation using replace calls) so it breaks into a formatted chain or intermediate variables that rustfmt will accept (avoid a single long chained expression). Target the imports shown (futures::TryFutureExt, hyper::body::Incoming, hyper_util::rt::TokioIo, std::net::{Shutdown, TcpStream}, std::sync::atomic::{AtomicU16, Ordering}) and the replace(...) chain referenced in the file so cargo fmt --check (nightly) passes.
🧹 Nitpick comments (2)
clash-lib/tests/api_tests.rs (2)
191-218: ⚡ Quick winWait for the server’s Shadowsocks listener before sending traffic through it.
ClashInstance::start()only waits for the first port in the list, soserver_base + 1can still be binding when the client/request task starts. You already had to add this guard forclient_base + 3; the backend listener needs the same treatment.Suggested fix
let _server = ClashInstance::start( Options { config: Config::Str(server_config_str), cwd: Some(wd_server.to_string_lossy().to_string()), rt: None, @@ ) .expect("Failed to start server"); + wait_port_ready(server_base + 1) + .expect("server shadowsocks port not ready"); // Start client instance with RAII guard let _client = ClashInstance::start(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clash-lib/tests/api_tests.rs` around lines 191 - 218, The test starts the server with ClashInstance::start but doesn't wait for the server's Shadowsocks listener (server_base + 1) to be ready before the client sends traffic; add a wait_port_ready(server_base + 1) check after creating the server RAII guard (similar to the existing wait for client_base + 3) so the server's backend listener is fully bound before the client/request task runs.
124-129: ⚡ Quick winWait on the new controller port instead of sleeping 500ms.
This test already moves the controller from
port_basetoport_base + 7, so a fixed sleep is still flaky on slower CI. Waiting for the new port to accept connections makes the assertion deterministic.Suggested fix
- tokio::time::sleep(Duration::from_millis(500)).await; + wait_port_ready(port_base + 7) + .expect("reloaded external-controller port not ready");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clash-lib/tests/api_tests.rs` around lines 124 - 129, Replace the fixed sleep after moving the controller from port_base to port_base + 7 with an active wait that attempts to connect to the new controller port until it accepts connections, then run the assertion; specifically, instead of tokio::time::sleep(Duration::from_millis(500)).await, poll the new port (port_base + 7) using a connect attempt (e.g. via TcpStream::connect or an async equivalent) with a short retry loop/timeout, and only after the connection succeeds call get_allow_lan(port_base + 7).await and assert its result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clash-lib/src/lib.rs`:
- Around line 166-171: Run the repository’s nightly rustfmt on the new helper to
fix the doc-comment/signature formatting for start_scaffold_instance in lib.rs:
run the same formatter used by CI (nightly rustfmt) and apply the repo’s
formatting rules (case-sensitive ASCII import order, uppercase before lowercase)
so the doc-comment and function signature match cargo fmt --check; reformat the
file and commit the changes.
- Around line 169-189: start_scaffold_instance currently ignores Options::rt and
always constructs a multi-thread Tokio runtime; update the function to honor
opts.rt by matching on opts.rt (e.g., TokioRuntime::SingleThread vs
TokioRuntime::MultiThread) and build the appropriate runtime
(Builder::new_current_thread().enable_all().build() for single-thread,
Builder::new_multi_thread().enable_all().build() for multi-thread) before
calling rt.block_on(start(...)). Ensure the chosen runtime is created inside the
spawned thread (as currently) and propagate the same enable_all/expect error
handling used today.
In `@clash-lib/tests/api_tests.rs`:
- Around line 87-103: In test_config_reload_via_payload() the reload payload
hardcodes "socks-port: 7892" which can collide; update the new_payload string in
the test to interpolate a dynamically allocated port (use port_base plus an
offset) instead of 7892 — e.g., replace the literal 7892 with {port_base + X}
inside new_payload so the socks-port uses the test's allocated port range (refer
to new_payload and port_base in the test).
- Around line 1-20: The file fails CI formatting because it wasn’t run with the
repo’s nightly rustfmt (case-sensitive ASCII import order and formatting
differences); run nightly rustfmt on clash-lib/tests/api_tests.rs and reformat
the grouped imports at the top and any format!-based layouts (including the
import block containing
ClashInstance/alloc_ports/make_client_config_str/send_http_request/wait_port_ready
and the start_unique_client function and CLIENT_PORT_BLOCK constant usages) so
imports are ordered with uppercase before lowercase and the formatting matches
CI’s rustfmt rules; recommit the reformatted file.
---
Outside diff comments:
In `@clash-lib/tests/common/mod.rs`:
- Around line 1-6: Reformat this helper module to satisfy CI's nightly rustfmt:
reorder the regrouped std imports into case-sensitive ASCII import order (place
uppercase before lowercase and split std groups appropriately) for the std::net
and std::sync imports and run rustfmt with the nightly toolchain; also reformat
the long replace(...) chain (the string manipulation using replace calls) so it
breaks into a formatted chain or intermediate variables that rustfmt will accept
(avoid a single long chained expression). Target the imports shown
(futures::TryFutureExt, hyper::body::Incoming, hyper_util::rt::TokioIo,
std::net::{Shutdown, TcpStream}, std::sync::atomic::{AtomicU16, Ordering}) and
the replace(...) chain referenced in the file so cargo fmt --check (nightly)
passes.
---
Nitpick comments:
In `@clash-lib/tests/api_tests.rs`:
- Around line 191-218: The test starts the server with ClashInstance::start but
doesn't wait for the server's Shadowsocks listener (server_base + 1) to be ready
before the client sends traffic; add a wait_port_ready(server_base + 1) check
after creating the server RAII guard (similar to the existing wait for
client_base + 3) so the server's backend listener is fully bound before the
client/request task runs.
- Around line 124-129: Replace the fixed sleep after moving the controller from
port_base to port_base + 7 with an active wait that attempts to connect to the
new controller port until it accepts connections, then run the assertion;
specifically, instead of tokio::time::sleep(Duration::from_millis(500)).await,
poll the new port (port_base + 7) using a connect attempt (e.g. via
TcpStream::connect or an async equivalent) with a short retry loop/timeout, and
only after the connection succeeds call get_allow_lan(port_base + 7).await and
assert its result.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: e5d78d68-406e-4e9c-b0d1-4a29dc777502
📒 Files selected for processing (4)
clash-lib/Cargo.tomlclash-lib/src/lib.rsclash-lib/tests/api_tests.rsclash-lib/tests/common/mod.rs
There was a problem hiding this comment.
🧹 Nitpick comments (1)
clash-lib/src/lib.rs (1)
166-196: 💤 Low valueConsider documenting that logging is not configured for scaffold instances.
Unlike
start_scaffold, this function doesn't callapp::logging::setup_logging(). The broadcast channel is created (line 188) so log events can still be collected via the API, but they won't be written to stdout/files. This is likely intentional for test isolation, but documenting this behavior difference would help callers.,
📝 Suggested doc enhancement
/// Start a Clash instance in a background thread with independent lifecycle. /// Returns the thread handle and a CancellationToken to shut it down. /// Unlike `start_scaffold`, this does NOT register in the global /// SHUTDOWN_TOKEN. +/// +/// Note: Logging is not configured for instances started this way. Log events +/// are still collected internally but not written to stdout or log files. pub fn start_scaffold_instance(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clash-lib/src/lib.rs` around lines 166 - 196, Add a short doc comment to start_scaffold_instance explaining that, unlike start_scaffold, it does not call app::logging::setup_logging(), so while a broadcast channel (log_tx) is created and log events can be collected via the API, logs will not be written to stdout or files; update the function-level documentation for start_scaffold_instance to state this behavior and the likely intent (test isolation) so callers know to call setup_logging() themselves if they need persisted/stdout logging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@clash-lib/src/lib.rs`:
- Around line 166-196: Add a short doc comment to start_scaffold_instance
explaining that, unlike start_scaffold, it does not call
app::logging::setup_logging(), so while a broadcast channel (log_tx) is created
and log events can be collected via the API, logs will not be written to stdout
or files; update the function-level documentation for start_scaffold_instance to
state this behavior and the likely intent (test isolation) so callers know to
call setup_logging() themselves if they need persisted/stdout logging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ad48157c-5f3b-4b3e-9b21-af0bedf368ea
📒 Files selected for processing (3)
clash-lib/src/lib.rsclash-lib/tests/api_tests.rsclash-lib/tests/common/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- clash-lib/tests/common/mod.rs
- clash-lib/tests/api_tests.rs
…n reload payload - start_scaffold_instance() now respects TokioRuntime::SingleThread vs MultiThread instead of always building a multi-thread runtime - test_config_reload_via_payload: replace hardcoded socks-port 7892 with port_base+2 to stay within the test's allocated port block Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
clash-lib/tests/api_tests.rs (2)
129-135:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReplace the fixed post-reload sleeps with state-based waits.
Both reload tests still synchronize with wall-clock sleeps, and in
test_get_configs_dns_listen_when_enabled()the follow-upwait_port_ready(api_port)can succeed before the new config is actually observable because that controller port was already bound before the PUT. Under parallel CI load this is a likely flake source. PollGET /configsuntil the expected field changes, and in the port-changing case also wait forport_base + 7to bind before asserting.Possible direction
- tokio::time::sleep(Duration::from_millis(500)).await; - assert!( - !get_allow_lan(port_base + 7).await, - "expected allow-lan=false after reload" - ); + wait_port_ready(port_base + 7).expect("reloaded controller port not ready"); + for _ in 0..20 { + if !get_allow_lan(port_base + 7).await { + break; + } + tokio::time::sleep(Duration::from_millis(100)).await; + } + assert!( + !get_allow_lan(port_base + 7).await, + "expected allow-lan=false after reload" + );Apply the same pattern to the DNS test by polling
/configsuntildns-listenmatches the reloaded value instead of sleeping for a fixed duration.Also applies to: 483-486
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clash-lib/tests/api_tests.rs` around lines 129 - 135, Replace the fixed tokio::time::sleep(...) waits in the reload tests (e.g., in test_get_configs_dns_listen_when_enabled and the allow-lan reload case) with state-based polling: repeatedly GET /configs until the expected field (dns-listen or allow-lan) reflects the new value, then in the port-changing case call wait_port_ready(port_base + 7) to ensure the new controller port is bound before asserting, and finally assert !get_allow_lan(port_base + 7).await or the dns value; use get_allow_lan, wait_port_ready and the existing GET /configs helper to implement the poll with a timeout to avoid hangs.
236-240:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid hitting a public endpoint from the parallelized proxy-chain test.
https://httpbin.yba.dev/drip?...makes this test depend on external internet reachability and remote latency, which can still flake the suite even after the port-isolation work. Since this file already uses localhttpmockfixtures elsewhere, switch this request to a local delayed/streamed endpoint so the test only exercises the proxy chain logic.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clash-lib/tests/api_tests.rs` around lines 236 - 240, The test currently calls client.get("https://httpbin.yba.dev/drip?duration=2&delay=1&numbytes=500") which hits an external service; replace that external URL with a local httpmock stub used elsewhere in this test file: register a mock that returns a delayed/streamed response (e.g., chunked body or sleeps between writes to simulate drip/delay), point client.get(...) at the mock server's URL, and keep the same timing/byte behavior so the rest of the test that reads into response behaves identically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@clash-lib/tests/api_tests.rs`:
- Around line 129-135: Replace the fixed tokio::time::sleep(...) waits in the
reload tests (e.g., in test_get_configs_dns_listen_when_enabled and the
allow-lan reload case) with state-based polling: repeatedly GET /configs until
the expected field (dns-listen or allow-lan) reflects the new value, then in the
port-changing case call wait_port_ready(port_base + 7) to ensure the new
controller port is bound before asserting, and finally assert
!get_allow_lan(port_base + 7).await or the dns value; use get_allow_lan,
wait_port_ready and the existing GET /configs helper to implement the poll with
a timeout to avoid hangs.
- Around line 236-240: The test currently calls
client.get("https://httpbin.yba.dev/drip?duration=2&delay=1&numbytes=500") which
hits an external service; replace that external URL with a local httpmock stub
used elsewhere in this test file: register a mock that returns a
delayed/streamed response (e.g., chunked body or sleeps between writes to
simulate drip/delay), point client.get(...) at the mock server's URL, and keep
the same timing/byte behavior so the rest of the test that reads into response
behaves identically.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 5753753a-2d13-4d37-9537-01c8dda44c04
📒 Files selected for processing (2)
clash-lib/src/lib.rsclash-lib/tests/api_tests.rs
All 38 API tests were marked
#[serial_test::serial]and used hardcoded ports, causing sequential execution — 9+ minutes on Windows with 12 tests triggering Rust's 'has been running over 60s' warning.Changes
clash-lib/src/lib.rs: Addedstart_scaffold_instance()with its ownCancellationTokennot in the global shutdown list — dropping one test instance no longer kills othersclash-lib/tests/common/mod.rs:AtomicU16port allocator (base 20000, 7 ports/test), config templating fromrules.yaml, poll interval 2s → 100msclash-lib/tests/api_tests.rs: Removed all 38#[serial_test::serial]; dynamicapi_portviastart_unique_client()Result
Summary by CodeRabbit