v0.7.0
The testing and benchmarking infrastructure has been entirely rewritten to be significantly more comprehensive and cleaner. There are also now scripts for nicely plotting benchmark results.
Numerous bugfixes and similar improvements:
- Aluminum no longer attempts to use bitwise reductions for `long double`.
- Fixed bug in the host-transfer `Allreduce` on one processor.
- Fix in-place bugs in the NCCL `Gather`, `Gatherv`, `Scatter`, and `Scatterv` operations.
- Fix MPI type for `long int`.
- The `throw_al_exception` macro works outside of the `Al` namespace.
- Added a check for mismatches between the version of HWLOC Aluminum was compiled with and the one that is used at runtime.
- All internal Aluminum headers are now included with the `aluminum/` prefix to avoid conflicts with other projects.
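
For readers curious what a compile-time versus runtime HWLOC version check can look like, here is a minimal sketch. It is not Aluminum's actual implementation: the helper names `api_major`, `hwloc_versions_compatible`, and `describe_mismatch` are hypothetical, and the sketch assumes only hwloc's documented convention that the API version is an unsigned integer with the major version in the upper 16 bits.

```cpp
#include <sstream>
#include <string>

// hwloc encodes its API version as an unsigned integer with the major
// version in the upper 16 bits (for example, 0x00020000 for the 2.x API).
constexpr unsigned api_major(unsigned api_version) {
  return api_version >> 16;
}

// Hypothetical helper: true when the version a program was compiled
// against and the version loaded at runtime share the same major version.
constexpr bool hwloc_versions_compatible(unsigned compiled, unsigned runtime) {
  return api_major(compiled) == api_major(runtime);
}

// Hypothetical helper: returns an empty string when the versions are
// compatible, otherwise a human-readable description of the mismatch.
std::string describe_mismatch(unsigned compiled, unsigned runtime) {
  if (hwloc_versions_compatible(compiled, runtime)) {
    return {};
  }
  std::ostringstream msg;
  msg << "HWLOC version mismatch: compiled against 0x" << std::hex << compiled
      << ", running with 0x" << runtime;
  return msg.str();
}
```

In a program that includes `<hwloc.h>`, the two arguments would typically be the `HWLOC_API_VERSION` macro (fixed at compile time) and the result of `hwloc_get_api_version()` (reported by the library loaded at runtime).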