This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

CUB 1.6.3

brycelelbach released this 19 May 08:45

Summary

CUB 1.6.3 improves support for Windows, changes the cub::BlockLoad and cub::BlockStore interfaces to take the local data type, and improves radix sort performance on SM6x (Pascal) GPUs.

Breaking Changes

  • cub::BlockLoad and cub::BlockStore are now templated on the local data type instead of the iterator type. This permits output iterators whose value_type is void (e.g. discard iterators).
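
In practice this should mean that the first template argument of cub::BlockLoad or cub::BlockStore becomes the element type (say, int) where it used to be the iterator type (say, int*). The host-only C++ sketch below illustrates why the change helps. ToyBlockStore and DiscardIterator are hypothetical stand-ins written for this note, not CUB or Thrust classes. Because the data type is named explicitly and the iterator type is deduced per call, the store never has to ask the iterator for its value_type.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical stand-in for a discard iterator: it accepts writes and drops
// them. Like many pure output iterators, its value_type is void.
struct DiscardIterator {
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef void difference_type;
    typedef void pointer;
    typedef void reference;

    struct Sink {
        template <typename U>
        Sink& operator=(const U&) { return *this; }  // swallow the value
    };

    Sink operator[](std::size_t) const { return Sink(); }
};

// A class templated on the iterator would have to derive its element type
// from the iterator's value_type, which is void here and cannot be used to
// declare a local array of items.
static_assert(
    std::is_void<std::iterator_traits<DiscardIterator>::value_type>::value,
    "DiscardIterator reports void as its value_type");

// Hypothetical sketch of the new interface shape: T is a class template
// parameter, and the iterator type is deduced by each Store call.
template <typename T, int ITEMS_PER_THREAD>
struct ToyBlockStore {
    template <typename OutputIteratorT>
    static void Store(OutputIteratorT out, const T (&items)[ITEMS_PER_THREAD]) {
        for (int i = 0; i < ITEMS_PER_THREAD; ++i) {
            out[i] = items[i];
        }
    }
};

// Stores the same items through a raw pointer and through the discard
// iterator, then returns the sum of what landed in the real buffer.
inline int StoreDemo() {
    const int items[4] = {1, 2, 3, 4};
    int buffer[4] = {0, 0, 0, 0};

    ToyBlockStore<int, 4>::Store(buffer, items);             // int* output
    ToyBlockStore<int, 4>::Store(DiscardIterator(), items);  // void value_type

    assert(buffer[3] == 4);
    return buffer[0] + buffer[1] + buffer[2] + buffer[3];
}
```

Calling StoreDemo() exercises both output paths with the same ToyBlockStore<int, 4> specialization. The same reasoning applies to output iterators whose value_type is some type other than T, which is the situation described in #69 below.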

Other Enhancements

  • Radix sort tuning policies updated for SM6x (Pascal) GPUs: 6.2 billion 4-byte keys/s on GP100.
  • Improved support for Windows (warnings, alignment, etc.).

Bug Fixes

  • #74: cub::WarpReduce executes reduction operator for out-of-bounds items.
  • #72: cub::InequalityWrapper::operator should be non-const.
  • #71: cub::KeyValuePair won't work if Key has non-trivial constructor.
  • #69: cub::BlockStore::Store doesn't compile if OutputIteratorT::value_type isn't T.
  • #68: cub::TilePrefixCallbackOp::WarpReduce doesn't permit PTX arch specialization.