[CI Failure]: Nightly B200 LM Eval Failure #28401

@robertgshaw2-redhat

Description

Name of failing test

FAILED evals/gsm8k/test_gsm8k_correctness.py::test_gsm8k_correctness_param[Qwen1.5-MoE-W4A16-CT-tp1] - AssertionError: Accuracy too low: 0.000 < 0.450 - 0.080
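The assertion message reads as a lower-bound check: the measured accuracy (0.000) is compared against an expected value (0.450) minus a tolerance (0.080), so the effective threshold is 0.370. A minimal sketch of that kind of check follows. The function and parameter names are hypothetical and are not taken from the vLLM test code; only the message format mirrors the failure line above.

```python
from typing import Optional


def check_accuracy(measured: float, expected: float, tolerance: float) -> None:
    """Fail when measured accuracy is below expected - tolerance.

    Hypothetical helper: the names are illustrative, not vLLM's.
    """
    lower_bound = expected - tolerance
    assert measured >= lower_bound, (
        f"Accuracy too low: {measured:.3f} < {expected:.3f} - {tolerance:.3f}"
    )


def failure_message(
    measured: float, expected: float, tolerance: float
) -> Optional[str]:
    """Return the assertion message, or None when the check passes."""
    try:
        check_accuracy(measured, expected, tolerance)
    except AssertionError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Values from the reported failure: measured 0.000, expected 0.450, tolerance 0.080.
    print(failure_message(0.000, 0.450, 0.080))
```

Under this reading, a measured accuracy of exactly 0.000 is far outside the tolerance band, which points to the model producing no correct answers at all rather than to a marginal regression.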

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

Crash

📝 History of failing test

https://buildkite.com/vllm/ci/builds/38163#019a66fd-11a1-407d-8b8e-047044b28f45

CC List.

No response

Metadata

Labels

ci-failure: Issue about an unexpected test failure in CI
