v3.9.0

github-actions released this 21 Feb 12:45
4d14bf9

Changelog

  • Add cublasStrsm_v2, cublasSgelsBatched.
  • [Runtime] Do nothing instead of panicking (release mode only).
  • Minor fixes.
  • [Nightly] Enable FP16 Conv2d.
  • [Nightly] Enable PyTorch Flash Attention 2.

cuDNN on Windows

Starting with ZLUDA v3.9.0, the nightly build includes FP16 Conv2d and Flash Attention 2 support.

Supported Architectures

  • gfx908, gfx90a
  • gfx940, gfx941, gfx942
  • gfx1030
  • gfx1100, gfx1101, gfx1102
  • gfx1150
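
The list above can be turned into a quick lookup before installing anything. The following is a minimal sketch, not part of ZLUDA: the function name `is_cudnn_supported` is hypothetical, and how the architecture string is obtained for a given GPU is outside its scope. It assumes the string may carry feature suffixes after a colon and compares only the base name.

```python
# Architectures listed in these release notes as supported for cuDNN on Windows.
SUPPORTED_ARCHS = frozenset({
    "gfx908", "gfx90a",
    "gfx940", "gfx941", "gfx942",
    "gfx1030",
    "gfx1100", "gfx1101", "gfx1102",
    "gfx1150",
})


def is_cudnn_supported(arch: str) -> bool:
    """Return True if `arch` (hypothetical helper input, e.g. "gfx1100") is in the list."""
    # Assumption: some tools append feature flags after a colon
    # (e.g. "gfx90a:sramecc+:xnack-"); only the base name is compared.
    base = arch.strip().lower().split(":", 1)[0]
    return base in SUPPORTED_ARCHS


if __name__ == "__main__":
    for name in ("gfx1100", "gfx1031"):
        print(name, is_cudnn_supported(name))
```

The check is an exact match on purpose: an architecture that is merely close to a listed one (such as gfx1031 next to gfx1030) is reported as unsupported, since the notes do not mention it.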

To enable cuDNN acceleration on a supported device, download the HIP SDK extension and unpack it over your existing HIP SDK 6.2 installation.

HIP SDK extension: DOWNLOAD
(unzip, then copy the extracted folders into path/to/AMD/ROCm/6.2, merging them with the existing ones)
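
For those who prefer to script the copy step, here is a minimal sketch of merging an already-unzipped extension into the ROCm directory. It is an illustration only: `merge_extension` is a hypothetical helper, both paths are placeholders you must supply, and it assumes Python 3.8 or newer (for `dirs_exist_ok`). Files with the same name in the destination are overwritten, which is the assumed meaning of pasting the folders over the installation.

```python
import shutil
from pathlib import Path


def merge_extension(extracted_dir, rocm_dir):
    """Copy every top-level entry of `extracted_dir` into `rocm_dir`, merging folders.

    Hypothetical helper: returns the names of the top-level entries that were copied.
    """
    src = Path(extracted_dir)
    dst = Path(rocm_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"extracted extension not found: {src}")
    if not dst.is_dir():
        raise FileNotFoundError(f"HIP SDK directory not found: {dst}")

    copied = []
    for entry in sorted(src.iterdir()):
        target = dst / entry.name
        if entry.is_dir():
            # Merge into an existing folder instead of failing; same-named files are replaced.
            shutil.copytree(entry, target, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, target)
        copied.append(entry.name)
    return copied
```

A call would look like `merge_extension("path/to/unzipped/extension", "path/to/AMD/ROCm/6.2")`, with both arguments replaced by real locations. Writing into the SDK directory may require an elevated prompt, depending on where the SDK is installed.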

※ The HIP SDK extension does not include hipBLASLt.