ignore flaky test by bw-solana · Pull Request #8278 · anza-xyz/agave

bw-solana · 2025-10-01T08:32:07Z

Problem

test_scheduler_drop_idle seems to be flaky and hit this assert:

thread 'tests::test_scheduler_drop_idle' panicked at unified-scheduler-pool/src/lib.rs:2886:9:
--
  | assertion `left == right` failed
  | left: 0
  | right: 1

See https://buildkite.com/anza/agave/builds/31204#01999eda-ad53-4c13-8e73-77882137f119 for an example.

This is causing CI to fail and/or retry and take a long time and slowing down merging into master

Summary of Changes

ignore this flaky test

gregcusack

lgtm based on slack discussions

codecov-commenter · 2025-10-01T09:17:03Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.1%. Comparing base (617d477) to head (70043d8).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff            @@
##           master    #8278     +/-   ##
=========================================
- Coverage    83.2%    83.1%   -0.1%     
=========================================
  Files         836      836             
  Lines      366600   366592      -8     
=========================================
- Hits       305201   304982    -219     
- Misses      61399    61610    +211

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Remove #[ignore] attribute and fix timing to eliminate race condition in test_scheduler_drop_idle, which was disabled in PR anza-xyz#8278. The test verifies the scheduler pool's cleaner thread correctly removes idle schedulers while preserving recently-pooled ones. Root cause: The original test used a 100ms idle threshold with 1000ms sleep, but timing was still unreliable due to system variations. The race occurred when old_scheduler and new_scheduler had unclear age differences. Fix: - Use explicit 300ms idle threshold for this test (instead of 100ms) - Sleep 350ms before returning new_scheduler (provides 50ms+ safety margin) - This guarantees old_scheduler is idle while new_scheduler is not Result: - old_scheduler: 350ms old (> 300ms threshold) → definitely idle → removed - new_scheduler: ~0ms old (< 300ms threshold) → definitely not idle → kept - Test is now deterministic and matches expected checkpoint sequence Fixes anza-xyz#8279 Signed-off-by: AvhiMaz <avhimazumder5@outlook.com>

…9042) * fix: re-enable test_scheduler_drop_idle by resolving race condition Remove #[ignore] attribute and fix timing to eliminate race condition in test_scheduler_drop_idle, which was disabled in PR #8278. The test verifies the scheduler pool's cleaner thread correctly removes idle schedulers while preserving recently-pooled ones. Root cause: The original test used a 100ms idle threshold with 1000ms sleep, but timing was still unreliable due to system variations. The race occurred when old_scheduler and new_scheduler had unclear age differences. Fix: - Use explicit 300ms idle threshold for this test (instead of 100ms) - Sleep 350ms before returning new_scheduler (provides 50ms+ safety margin) - This guarantees old_scheduler is idle while new_scheduler is not Result: - old_scheduler: 350ms old (> 300ms threshold) → definitely idle → removed - new_scheduler: ~0ms old (< 300ms threshold) → definitely not idle → kept - Test is now deterministic and matches expected checkpoint sequence Fixes #8279 Signed-off-by: AvhiMaz <avhimazumder5@outlook.com> * Address all reviewer feedback on test_scheduler_drop_idle - Move SHORTENED_MAX_POOLING_DURATION constant to test_scheduler_drop_stale where it's used - Convert test_max_pooling_duration from let to const TEST_MAX_POOLING_DURATION following conventions - Create TEST_WAIT_FOR_IDLE constant (500ms) with explicit safety margin tied to TEST_MAX_POOLING_DURATION - Increase safety margin from 50ms to 200ms for CI reliability - Add detailed comments explaining 300ms pooling duration tradeoff (speed vs race condition window) - Update assertion comment with explicit timing guarantees: old_scheduler ~500ms old (exceeds 300ms threshold), new_scheduler ~50ms old (below threshold) Signed-off-by: AvhiMaz <avhimazumder5@outlook.com> * refactor: tie test duration constants together and improve comment accuracy Signed-off-by: AvhiMaz <avhimazumder5@outlook.com> * fix: cargo fmt Signed-off-by: AvhiMaz <avhimazumder5@outlook.com> --------- Signed-off-by: AvhiMaz <avhimazumder5@outlook.com>

…nza-xyz#9042) * fix: re-enable test_scheduler_drop_idle by resolving race condition Remove #[ignore] attribute and fix timing to eliminate race condition in test_scheduler_drop_idle, which was disabled in PR anza-xyz#8278. The test verifies the scheduler pool's cleaner thread correctly removes idle schedulers while preserving recently-pooled ones. Root cause: The original test used a 100ms idle threshold with 1000ms sleep, but timing was still unreliable due to system variations. The race occurred when old_scheduler and new_scheduler had unclear age differences. Fix: - Use explicit 300ms idle threshold for this test (instead of 100ms) - Sleep 350ms before returning new_scheduler (provides 50ms+ safety margin) - This guarantees old_scheduler is idle while new_scheduler is not Result: - old_scheduler: 350ms old (> 300ms threshold) → definitely idle → removed - new_scheduler: ~0ms old (< 300ms threshold) → definitely not idle → kept - Test is now deterministic and matches expected checkpoint sequence Fixes anza-xyz#8279 Signed-off-by: AvhiMaz <avhimazumder5@outlook.com> * Address all reviewer feedback on test_scheduler_drop_idle - Move SHORTENED_MAX_POOLING_DURATION constant to test_scheduler_drop_stale where it's used - Convert test_max_pooling_duration from let to const TEST_MAX_POOLING_DURATION following conventions - Create TEST_WAIT_FOR_IDLE constant (500ms) with explicit safety margin tied to TEST_MAX_POOLING_DURATION - Increase safety margin from 50ms to 200ms for CI reliability - Add detailed comments explaining 300ms pooling duration tradeoff (speed vs race condition window) - Update assertion comment with explicit timing guarantees: old_scheduler ~500ms old (exceeds 300ms threshold), new_scheduler ~50ms old (below threshold) Signed-off-by: AvhiMaz <avhimazumder5@outlook.com> * refactor: tie test duration constants together and improve comment accuracy Signed-off-by: AvhiMaz <avhimazumder5@outlook.com> * fix: cargo fmt Signed-off-by: AvhiMaz <avhimazumder5@outlook.com> --------- Signed-off-by: AvhiMaz <avhimazumder5@outlook.com>

ignore flaky test

70043d8

gregcusack self-requested a review October 1, 2025 08:33

bw-solana mentioned this pull request Oct 1, 2025

Re-enable test_scheduler_drop_idle #8279

Closed

bw-solana marked this pull request as ready for review October 1, 2025 08:38

bw-solana requested a review from steviez October 1, 2025 08:38

gregcusack approved these changes Oct 1, 2025

View reviewed changes

lijunwangs approved these changes Oct 1, 2025

View reviewed changes

yihau merged commit 9151384 into anza-xyz:master Oct 1, 2025
42 of 43 checks passed

yihau mentioned this pull request Oct 2, 2025

revert to upstream quinn #8307

Merged

AvhiMaz mentioned this pull request Nov 12, 2025

fix: re-enable test_scheduler_drop_idle by resolving race condition #9042

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ignore flaky test#8278

ignore flaky test#8278
yihau merged 1 commit intoanza-xyz:masterfrom
bw-solana:ignore_flaky_test

bw-solana commented Oct 1, 2025 •

edited

Loading

Uh oh!

gregcusack left a comment

Uh oh!

codecov-commenter commented Oct 1, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

bw-solana commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Summary of Changes

Uh oh!

gregcusack left a comment

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Oct 1, 2025

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

bw-solana commented Oct 1, 2025 •

edited

Loading