statement-store: fix benchmark EMFILE by pooling RPC connections #11070

Merged
DenzelPenzel merged 2 commits into master from denzelpenzel/statement-store-fix-bench on Feb 20, 2026
Conversation

@DenzelPenzel
Contributor

@DenzelPenzel DenzelPenzel commented Feb 13, 2026

Description

  • Fix "Too many open files" (EMFILE) error in all statement-store benchmarks
  • Replace per-participant RPC connections with a shared connection pool (100 per node)
  • Participants share connections via RpcClient::clone(), which multiplexes requests over the same transport

Root Cause

Each of the ~50,000 benchmark participants called node.rpc().await? to create its own
TCP/WebSocket connection, exhausting the OS per-process file descriptor limit (EMFILE, errno 24).

Fix

Introduce an RPC_POOL_SIZE = 100 constant. Create a pool of connections per node, then
distribute them round-robin to participants. This reduces the total number of file descriptors from ~50,000 to at most 600 (6 nodes x 100).
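
A minimal sketch of the pooling idea, not the code from this PR. `MockRpcClient` is a hypothetical stand-in for the real `RpcClient`: it models "clone shares the transport" with an `Arc`, so the number of distinct transports can be counted. The helper names (`build_pool`, `assign_round_robin`, `distinct_transports`) and the choice to cap the pool at the participant count are assumptions made for illustration only.

```rust
use std::sync::Arc;

/// Pool size per node, as named in the PR description.
const RPC_POOL_SIZE: usize = 100;

/// Hypothetical stand-in for the real `RpcClient`. Cloning it shares the
/// same underlying transport (modelled by an `Arc`) instead of opening a
/// new connection.
#[derive(Clone)]
struct MockRpcClient {
    /// Identifier of the simulated connection this client uses.
    transport: Arc<usize>,
}

/// Opens the simulated connections for one node: at most `RPC_POOL_SIZE`,
/// and never more than there are participants (an assumption of this sketch).
fn build_pool(participants: usize) -> Vec<MockRpcClient> {
    let size = RPC_POOL_SIZE.min(participants);
    (0..size)
        .map(|id| MockRpcClient { transport: Arc::new(id) })
        .collect()
}

/// Gives every participant a clone of a pooled client, round-robin.
/// The pool must be non-empty whenever `participants > 0`.
fn assign_round_robin(pool: &[MockRpcClient], participants: usize) -> Vec<MockRpcClient> {
    (0..participants)
        .map(|i| pool[i % pool.len()].clone())
        .collect()
}

/// Counts how many distinct simulated connections a set of clients uses.
fn distinct_transports(clients: &[MockRpcClient]) -> usize {
    let mut ids: Vec<usize> = clients.iter().map(|c| *c.transport).collect();
    ids.sort_unstable();
    ids.dedup();
    ids.len()
}
```

With 50,000 participants on one node, `build_pool(50_000)` yields 100 clients and `assign_round_robin` hands out 50,000 handles that still map onto only 100 simulated connections, which is where the "6 nodes x 100 = 600" upper bound comes from.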

Test plan

  • Run statement_store_many_nodes_bench with zombienet to verify no EMFILE error
  • Verify benchmarks complete successfully with pooled connections

@DenzelPenzel DenzelPenzel added the R0-no-crate-publish-required The change does not require any crates to be re-published. label Feb 13, 2026
Contributor

@alexggh alexggh left a comment

Nice, thank you!

@DenzelPenzel DenzelPenzel force-pushed the denzelpenzel/statement-store-fix-bench branch 2 times, most recently from f474f39 to 210c2d0 Compare February 16, 2026 09:44
@AndreiEres
Contributor

Do we have the same problem in the latency bench? Or only one_node/many_nodes benches?

@DenzelPenzel
Contributor Author

> Do we have the same problem in the latency bench? Or only one_node/many_nodes benches?

Actually, I found this issue in all of the bench tests.

@DenzelPenzel DenzelPenzel requested a review from alexggh February 16, 2026 16:46
Comment thread cumulus/zombienet/zombienet-sdk/tests/zombie_ci/statement_store_bench.rs Outdated
@DenzelPenzel DenzelPenzel force-pushed the denzelpenzel/statement-store-fix-bench branch 2 times, most recently from b417255 to 895d5dc Compare February 17, 2026 09:40
info!("");

let mut rpc_clients = Vec::new();
let mut rpc_pools: Vec<Vec<RpcClient>> = Vec::new();
Contributor

Please check that when running a test with 1000 submitters we don't create 10_000*nodes RPC clients.

@DenzelPenzel DenzelPenzel force-pushed the denzelpenzel/statement-store-fix-bench branch from 895d5dc to 0fc865d Compare February 19, 2026 14:01
@DenzelPenzel DenzelPenzel added this pull request to the merge queue Feb 20, 2026
Merged via the queue into master with commit dc18933 Feb 20, 2026
222 of 225 checks passed
@DenzelPenzel DenzelPenzel deleted the denzelpenzel/statement-store-fix-bench branch February 20, 2026 09:51
3 participants