AWS OFI NCCL v1.7.3
This release is intended only for use on AWS P* instances. A general release that supports other Libfabric networks will be made in the near future. This release includes the following changes:
- Do not disable LL and LL128 protocols on P5 instances.
- Add support for g5.48xlarge instance types.
- Fix a block-in-use leak in the freelist implementation.
- For NCCL 2.18.5 or later, don't disable NVLS support.
- Fix a bug in the handling of retry errors from Libfabric in the RDMA transport (P5 instance types).
This release has been tested on P3dn, P4d/P4de, and P5 using the EFA provider in Libfabric.
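
As an illustration of the NCCL 2.18.5 version gate mentioned in the change list, the following is a minimal sketch of how such a check could be expressed. It is not the plugin's actual code: the function names `nccl_version_code` and `should_disable_nvls` are hypothetical, and the encoding is assumed to mirror the `NCCL_VERSION` macro in `nccl.h` (the integer form that `ncclGetVersion` reports).

```python
def nccl_version_code(major, minor, patch):
    """Encode a version as an integer (hypothetical helper).

    Assumption: mirrors the NCCL_VERSION macro in nccl.h, which uses a
    different multiplier for releases up to 2.8.x.
    """
    if major <= 2 and minor <= 8:
        return major * 1000 + minor * 100 + patch
    return major * 10000 + minor * 100 + patch


# First NCCL version for which these notes say NVLS is left enabled.
NVLS_SAFE_VERSION = nccl_version_code(2, 18, 5)


def should_disable_nvls(runtime_version_code):
    """Return True when the runtime NCCL is older than 2.18.5 (hypothetical helper)."""
    return runtime_version_code < NVLS_SAFE_VERSION


if __name__ == "__main__":
    for version in [(2, 18, 3), (2, 18, 5), (2, 19, 3)]:
        code = nccl_version_code(*version)
        print(version, code, "disable NVLS" if should_disable_nvls(code) else "keep NVLS")
```

Comparing encoded integers rather than version strings avoids lexical ordering mistakes such as treating 2.9 as newer than 2.18.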