What went wrong?
On an EC2 i4i instance, running iotune in the usual way via rpk iotune fails with an EAGAIN error in io_setup.
This is most likely due to the large number of CPUs (128) combined with AIO control block (aio cb) calculations that are slightly or very off: at this instance size we try to consume more or less exactly all 1M (1048576) aio cbs, as evidenced by the log message about reducing the networking io cbs.
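The numbers in the log below can be reproduced with some simple arithmetic. The following is a rough sketch, not Seastar's actual code: the per-shard constants of 1024 storage control blocks and 2 preemption control blocks are assumptions, chosen because together with the logged default of 10000 networking control blocks they yield exactly the logged `requested:1411328` and the adjusted value of 7166 for 128 shards. The function names are hypothetical.

```python
# Sketch of the AIO slot arithmetic suggested by the log.
# ASSUMPTION: the per-shard constants below are inferred so that the
# results match the logged values; they are not taken from the issue.
STORAGE_IOCBS = 1024           # assumed per-shard storage control blocks
PREEMPT_IOCBS = 2              # assumed per-shard preemption control blocks
DEFAULT_NETWORK_IOCBS = 10000  # from the log: "adjusted from 10000 to 7166"


def requested_slots(shards, network_iocbs=DEFAULT_NETWORK_IOCBS):
    """Total AIO slots requested across all shards."""
    return shards * (STORAGE_IOCBS + PREEMPT_IOCBS + network_iocbs)


def adjusted_network_iocbs(shards, available):
    """Per-shard networking control blocks that fit in the available slots."""
    other = shards * (STORAGE_IOCBS + PREEMPT_IOCBS)
    return (available - other) // shards


if __name__ == "__main__":
    shards = 128          # CPU count stated in the issue
    available = 1048576   # aio-max-nr from the log
    print("requested:", requested_slots(shards))
    adjusted = adjusted_network_iocbs(shards, available)
    print("adjusted network iocbs:", adjusted)
    print("slots left after adjustment:",
          available - requested_slots(shards, adjusted))
```

Under these assumptions the adjusted request equals the available slot count exactly, leaving no slack for anything else that needs AIO slots, which would be consistent with io_setup returning EAGAIN.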
What should have happened instead?
iotune completes successfully.
How to reproduce the issue?
Run rpk iotune on an i4i instance with aio-max-nr set to 1M (1048576).
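Before reproducing, it may help to confirm how many AIO slots are actually free on the host. This is a minimal sketch that reads the standard Linux files `/proc/sys/fs/aio-max-nr` and `/proc/sys/fs/aio-nr`; the helper name `aio_headroom` is hypothetical and not part of rpk or Seastar.

```python
from pathlib import Path


def aio_headroom(proc_fs="/proc/sys/fs"):
    """Return (aio_max_nr, aio_nr, free) read from the given directory.

    aio-max-nr is the system-wide limit on AIO slots and aio-nr is the
    number currently allocated; the difference is what remains available.
    """
    base = Path(proc_fs)
    aio_max_nr = int((base / "aio-max-nr").read_text().split()[0])
    aio_nr = int((base / "aio-nr").read_text().split()[0])
    return aio_max_nr, aio_nr, aio_max_nr - aio_nr


if __name__ == "__main__":
    limit, used, free = aio_headroom()
    print(f"aio-max-nr={limit} aio-nr={used} free={free}")
```

With aio-max-nr at 1048576 and nothing else holding AIO slots, the free count should match the `available:1048576` value in the log.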
Additional information
Log:
[client 0:0] Overriding evaluation directories with: ["/mnt/vectorized/0ada61eec5354e22856072fe0cbe7cff"]
[client 0:0] Starting iotune...
[client 0:0] 02:46:43.693 DEBUG Running 'iotune-redpanda' with '[`--evaluation-directory` `/mnt/vectorized/0ada61eec5354e22856072fe0cbe7cff` `--format` `seastar` `--properties-file` `/mnt/vectorized/0ada61eec5354e22856072fe0cbe7cff/io-config.yaml` `--duration` `600`]'
[client 0:0] 02:46:43.693 DEBUG Running command 'iotune-redpanda' with arguments '[--evaluation-directory /mnt/vectorized/0ada61eec5354e22856072fe0cbe7cff --format seastar --properties-file /mnt/vectorized/0ada61eec5354e22856072fe0cbe7cff/io-config.yaml --duration 600]'
[client 0:0] error during iotune execution: err=exit status 1, stderr=WARN 2024-03-18 02:46:43,896 seastar - Requested AIO slots too large, please increase request capacity in /proc/sys/fs/aio-max-nr. configured:1048576 available:1048576 requested:1411328
[client 0:0] WARN 2024-03-18 02:46:43,896 seastar - max-networking-io-control-blocks adjusted from 10000 to 7166, since AIO slots are unavailable
[client 0:0] INFO 2024-03-18 02:46:43,896 seastar - Reactor backend: linux-aio
[client 0:0] INFO 2024-03-18 02:46:44,355 [shard 0:n/a ] seastar - Perf-based stall detector creation failed (EACCESS), try setting /proc/sys/kernel/perf_event_paranoid to 1 or less to enable kernel backtraces: falling back to posix timer.
[client 0:0] INFO 2024-03-18 02:46:44,367 [shard 0:n/a ] cpu_profiler - Perf-based cpu profiler creation failed (EACCESS), try setting /proc/sys/kernel/perf_event_paranoid to 1 or less to enable kernel backtraces: falling back to posix timer.
[client 0:0] INFO 2024-03-18 02:46:44,378 [shard 0:main] seastar - Created fair group io-queue-0 for 64 queues, capacity rate 2147483:2147483, limit 12582912, rate 16777216 (factor 1), threshold 2000, per tick grab 196608
[client 0:0] INFO 2024-03-18 02:46:44,378 [shard 0:main] seastar - IO queue uses 0.75ms latency goal for device 0
[client 0:0] INFO 2024-03-18 02:46:44,378 [shard 0:main] seastar - Created io group dev(0), length limit 4194304:4194304, rate 2147483647:2147483647
[client 0:0] INFO 2024-03-18 02:46:44,378 [shard 0:main] seastar - Created io queue dev(0) capacities: 512:2000:2000 1024:3000:3000 2048:5000:5000 4096:9000:9000 8192:17000:17000 16384:33000:33000 32768:65000:65000 65536:129000:129000 131072:257000:257000
[client 0:0] INFO 2024-03-18 02:46:44,837 [shard 109:main] seastar - Created fair group io-queue-0 for 64 queues, capacity rate 2147483:2147483, limit 12582912, rate 16777216 (factor 1), threshold 2000, per tick grab 196608
[client 0:0] INFO 2024-03-18 02:46:44,837 [shard 109:main] seastar - IO queue uses 0.75ms latency goal for device 0
[client 0:0] INFO 2024-03-18 02:46:44,837 [shard 109:main] seastar - Created io group dev(0), length limit 4194304:4194304, rate 2147483647:2147483647
[client 0:0] ERROR 2024-03-18 02:46:44,956 [shard 0:main] seastar - Exiting on unhandled exception: std::__1::system_error (error system:11, io_setup: Resource temporarily unavailable)
[client 0:0]
This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.
Version & Environment
Redpanda version: 23.3
JIRA Link: CORE-1885