{ai}[foss/2024a] PyTorch v2.9.1 #25240
…2.9.1_avoid-multiprocess-tests-hanging-forever.patch, PyTorch-2.9.1_fix-hypothesis-deadline.patch, PyTorch-2.9.1_fix-iteration-in-fligh-reporter.patch, PyTorch-2.9.1_fix-test_dist2-decorators.patch, PyTorch-2.9.1_ignore-warning-incompatible-pointer-types.patch, PyTorch-2.9.1_skip-RingFlexAttentionTest.patch, PyTorch-2.9.1_skip-tests-requiring-SM90.patch
|
Diff of new easyconfig(s) against existing ones is too long for a GitHub comment. Use |
|
Test report by @verdurin |
|
@verdurin You ran out of space in /dev/shm |
Yes, retrying with a different |
|
Test report by @verdurin |
|
Test report by @Flamefire |
|
Test report by @pavelToman: ERROR EasyBuild encountered an error: Couldn't find file PyTorch-check-cutlass.py anywhere |
|
Test report by @Flamefire |
|
@boegelbot please test @ jsc-zen3 |
|
@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)...
- notification for comment with ID 3893382597 processed
Message to humans: this is just bookkeeping information for me, |
|
Test report by @Flamefire |
|
Test report by @boegel |
|
@boegel That's unfortunate. Any hint in the full log about those 2:
The rest look OK-ish: |
This was with 16 cores and 30 GB of RAM available, which may be too tight? Edit: I've kicked off another test with more RAM available... |
|
SIGKILL does indeed sound like the process was OOM-killed, or like Slurm memory limits were exceeded. The
|
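For anyone trying to confirm the SIGKILL theory from a return code rather than from the full log, here is a minimal sketch. It assumes a POSIX system, where Python's `subprocess` reports a child killed by signal N as return code -N and shells report it as 128 + N. The helper name `describe_exit` is hypothetical and is not part of EasyBuild or the PyTorch test suite. It cannot tell the OOM killer apart from a scheduler memory limit; it only flags that the process was killed.

```python
import signal


def describe_exit(returncode: int) -> str:
    """Turn a process return code into a short human-readable description."""
    if returncode == 0:
        return "exited normally"
    # On POSIX, subprocess reports death-by-signal N as returncode == -N.
    # Shells report the same event as 128 + N (e.g. 137 for SIGKILL).
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return f"failed with exit code {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a known signal number on this platform: treat as a plain exit code.
        return f"failed with exit code {returncode}"
    # SIGKILL cannot be caught, so it is what the kernel OOM killer sends;
    # a scheduler enforcing a memory limit may send it as well.
    hint = " (possibly OOM killer or a scheduler memory limit)" if name == "SIGKILL" else ""
    return f"killed by {name}{hint}"


print(describe_exit(-9))
print(describe_exit(1))
```

The 128 + N mapping is only a shell convention, so a test that deliberately exits with a code above 128 would be misreported; the full log or the scheduler's accounting data remains the authoritative source.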
|
Test report by @boegel |
|
Test report by @boegelbot |
|
Going in, thanks @Flamefire! |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
(created using eb --new-pr)