CLBlast 1.5.2
CLBlast version 1.5.2. Changes since previous release (version 1.5.1):
- Changed XAMAX/XAMIN to be more likely to return the first rather than the last min/max index, and updated the API docs accordingly
- Added batched routines to pyclblast
- Added CLBLAST_VERSION_MAJOR/MINOR/PATCH defines in headers to store version numbering
- Several small improvements to the benchmark script (thanks to 'baryluk')
- Fixed a bug in the caching when using a context with multiple devices
- Fixed a bug in the tuners related to the global workgroup size not being a multiple of the local workgroup size
- Various minor fixes and enhancements
- Added tuned parameters for various devices (see doc/tuning.md)
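
The new CLBLAST_VERSION_MAJOR/MINOR/PATCH defines make compile-time version checks possible. The following is a minimal sketch of one way to use them, not an official CLBlast idiom. MY_VERSION_NUMBER, MY_CLBLAST_VERSION and clblast_headers_at_least are hypothetical names introduced here for illustration, and the header name in the comment is an assumption. Headers older than 1.5.2 do not provide the defines, so the sketch treats a missing define as version 0.0.0.

```c
/* In real code, include the CLBlast header first (assumed: <clblast_c.h>).
   Releases before 1.5.2 do not provide the version defines, so fall back
   to 0.0.0, meaning "older than 1.5.2, exact version unknown". */
#ifndef CLBLAST_VERSION_MAJOR
#define CLBLAST_VERSION_MAJOR 0
#define CLBLAST_VERSION_MINOR 0
#define CLBLAST_VERSION_PATCH 0
#endif

/* Hypothetical helper: pack a version triple into one comparable integer.
   Assumes minor and patch numbers stay below 100. */
#define MY_VERSION_NUMBER(major, minor, patch) \
  ((major) * 10000 + (minor) * 100 + (patch))

/* Version of the CLBlast headers seen by this translation unit. */
#define MY_CLBLAST_VERSION \
  MY_VERSION_NUMBER(CLBLAST_VERSION_MAJOR, CLBLAST_VERSION_MINOR, \
                    CLBLAST_VERSION_PATCH)

/* Returns 1 if the headers are at least the given version, else 0. */
static int clblast_headers_at_least(int major, int minor, int patch) {
  return MY_CLBLAST_VERSION >= MY_VERSION_NUMBER(major, minor, patch);
}
```

The same comparison also works in the preprocessor, for example `#if MY_CLBLAST_VERSION >= MY_VERSION_NUMBER(1, 5, 2)`, to guard code that depends on behaviour introduced in this release. Note that this reflects the headers used at compile time, not necessarily the library linked at run time.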