[CI][Fix] deterministic seed for flaky CI runs on structured outputs#24380
Conversation
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
There was a problem hiding this comment.
Code Review
This pull request addresses flaky CI runs by introducing a deterministic seed for tests that use structured outputs. The change re-enables a previously flaky test and adds a seed to the LLM constructor within test_structured_output. This is a good improvement for test stability. My review suggests extending this fix to other non-deterministic tests in the same file to prevent future flakiness and ensure comprehensive CI reliability.
| guided_decoding_backend=guided_decoding_backend, | ||
| guided_decoding_disable_any_whitespace=(guided_decoding_backend | ||
| in {"xgrammar", "guidance"}), | ||
| seed=120, |
There was a problem hiding this comment.
While adding a seed here helps make this test deterministic, the fix for CI flakiness seems incomplete. Other tests in this file also use non-deterministic sampling parameters but are not seeded, which could lead to continued flakiness.
Specifically, the LLM initializations in the following tests should also have a deterministic seed (e.g., seed=120):
test_structured_output_with_reasoning_matricestest_structured_output_auto_modetest_structured_output_batched_with_non_guided_requests
Applying this change consistently will help ensure CI stability.
|
Thanks for fixing! |
…llm-project#24380) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
…llm-project#24380) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
…llm-project#24380) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
…llm-project#24380) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
This PR re-enables ministral tests for structured outputs with fixed seed in the hope to avoid flakiness in CI runs.
Signed-off-by: Aaron Pham contact@aarnphm.xyz