Performance optimizations

Improved performance of GEMM-based convolutions.
Improved softmax performance.
Added arbitrary eltwise fusion support in GEMM-based convolutions and inner product.

New functionality

Introduced bfloat16 data type support in reorders, (de-)convolution, pooling, batch normalization, local response normalization, eltwise, inner product, shuffle, sum, and concat. The implementation relies on new instructions targeting a future Intel Xeon Scalable processor (codename Cooper Lake). On processors with Intel AVX-512 support, bfloat16 arithmetic is emulated.
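
For readers unfamiliar with bfloat16, the sketch below illustrates the general idea behind software emulation: a bfloat16 value is the upper 16 bits of an IEEE fp32 value, so arithmetic can be emulated by widening to fp32, computing there, and rounding back. This is a minimal illustration under stated assumptions, not the library's actual implementation. The helper names (`float_to_bf16`, `bf16_to_float`, `bf16_dot`) are hypothetical and not part of the Intel MKL-DNN API, and the choice of round-to-nearest-even is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not part of the Intel MKL-DNN API.

// Convert fp32 to bfloat16 bits (assumed rounding mode: round-to-nearest-even).
inline std::uint16_t float_to_bf16(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        // NaN: set a mantissa bit that survives truncation so the result stays NaN.
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }
    // Add 0x7fff plus the lowest kept bit, so exact ties round to an even result.
    std::uint32_t rounding_bias = 0x7fffu + ((bits >> 16) & 1u);
    return static_cast<std::uint16_t>((bits + rounding_bias) >> 16);
}

// Widen bfloat16 bits to fp32; this direction is exact.
inline float bf16_to_float(std::uint16_t h) {
    std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Emulated dot product: bfloat16 inputs, fp32 accumulation, bfloat16 result.
inline std::uint16_t bf16_dot(const std::uint16_t* a, const std::uint16_t* b,
                              std::size_t n) {
    assert(a != nullptr && b != nullptr);
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc += bf16_to_float(a[i]) * bf16_to_float(b[i]);
    }
    return float_to_bf16(acc);
}
```

Accumulating in fp32 and rounding once at the end keeps the emulated result close to what a wider accumulator would produce. Hardware bfloat16 instructions may differ in rounding and accumulation details.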

Thanks to the contributors

This release contains contributions from many Intel Performance Libraries developers. We would also like to thank everyone who asked questions and reported issues.


v0.20
