v0.3: Merge pull request #248 from raffenet/0.3-changes

@raffenet raffenet released this 05 Oct 19:50
91873ad

Changes in 0.3

  • Detect the CUDA device capabilities at configure time by default.
    If no device is found on the build system, build for the "major"
    CUDA capabilities only, to cut down on build time and library size
    (thanks to Jeff Hammond for contributing)

  • Add support for mixed memory types (thanks to ParTec AG for
    contributing)

  • Add HIP backend for stream APIs

  • Add automatic HIP SM detection

  • Add automatic CUDA SM detection

  • Add support for user-specified CUDA compiler

  • Add support for compiling for multiple devices with the --ze-native
    option

  • Add support for --pup-max-nesting < 2 in genpup.py

  • Add support for --ze-revision-id, which is passed to the ocloc
    compiler

  • Other bug fixes and code cleanup