{ai,lib}[GCCcore/12.2.0,foss/2022b] PyTorch v2.1.2, NCCL v2.18.3 w/ CUDA 12.0.0 #20520
|
Test report by @SebastianAchilles |
|
Test report by @SebastianAchilles |
|
Test report by @SebastianAchilles |
That first one failed with
I see that every now and then in various different tests, especially
I'll do a larger repeated run for both PRs over the weekend, so I'll have the results to compare on Tuesday (Monday is a public holiday here). |
Updated software
|
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
f27d797 to a9a5a6b
|
Test report by @akesandgren |
github_account = 'NVIDIA'
source_urls = [GITHUB_SOURCE]
sources = ['v%(version)s-1.tar.gz']
patches = ['NCCL-2.16.2_fix-cpuid.patch']
Doesn't this one also need NCCL-2.18.3_fix-cudaMemcpyAsync.patch, like NCCL-2.18.3-GCCcore-12.3.0-CUDA-12.1.1.eb does?
Makes sense I guess, added
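For reference, a minimal sketch of how the reviewed lines might look once the second patch is added. Easyconfigs use Python syntax, so this runs as plain Python. The order of the patches, the `name` and `version` values (taken from the PR title), and the placeholder standing in for EasyBuild's `GITHUB_SOURCE` constant are assumptions. Checksums are left out because they are not shown in this thread.

```python
# Sketch only: the lines under review with the second patch added.
# GITHUB_SOURCE is normally supplied by EasyBuild; a placeholder string
# stands in for it here so the fragment runs on its own (assumption).
GITHUB_SOURCE = '<GITHUB_SOURCE template provided by EasyBuild>'

name = 'NCCL'       # assumed from the PR title
version = '2.18.3'  # assumed from the PR title

github_account = 'NVIDIA'
source_urls = [GITHUB_SOURCE]
sources = ['v%(version)s-1.tar.gz']
patches = [
    'NCCL-2.16.2_fix-cpuid.patch',
    'NCCL-2.18.3_fix-cudaMemcpyAsync.patch',  # added per the review comment
]

# Illustration: expand the %(version)s template in the source filename
resolved_source = sources[0] % {'version': version}
print(resolved_source)
```

The final two lines are only there to show which tarball name the `sources` template expands to; they would not appear in a real easyconfig.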
|
Test report by @akesandgren |
|
Going in, thanks @Flamefire! |
(created using eb --new-pr)
This is meant as an alternative to #20155, using a newer NCCL version, as the older one currently included in foss/2022b doesn't seem to work with PyTorch 2.1.2.
Update: Seems #20155 works now, so putting this one on hold.
Requires: