chipStar enables porting HIP and CUDA applications to platforms which support SPIR-V as the device intermediate representation. It supports OpenCL and Level Zero as the low-level runtime alternatives.
chipStar was initially built by combining the prototyping work done in the (now obsolete) HIPCL and HIPLZ projects.
If you wish to cite chipStar in academic publications, please refer to the HIPCL poster abstract when discussing the OpenCL backend and/or the HIPLZ conference paper when mentioning the Level Zero backend. The core developers of chipStar are writing a proper article on the integrated chipStar project, but it is still in progress.
The name chipStar comes from cuda and hip and the word Star, which means asterisk, a typical shell wildcard, denoting the intention to make "CUDA and HIP applications run everywhere". The project was previously called CHIP-SPV.
While chipStar 1.1 can already be used to run various large HPC applications successfully, it is still under heavy development, with plenty of known issues and unimplemented features. There are also known performance optimizations that are still to be done. However, we consider chipStar ready for wider testing and welcome community contributions in the form of reproducible bug reports and good-quality pull requests.
Release notes for 1.1, 1.0 and 0.9.
- CMake >= 3.20.0
- Clang and LLVM 17 (Clang/LLVM 15 and 16 might also work)
  - Can be installed, for example, by adding LLVM's Debian/Ubuntu repository and installing the packages 'clang-17 llvm-17 clang-tools-17'.
  - For the best results, install Clang/LLVM from a chipStar LLVM/Clang branch which has fixes that are not yet in the LLVM upstream project. See below for a scripted way to build and install the patched versions.
- SPIRV-LLVM-Translator (llvm-spirv) from a branch matching the LLVM major version (e.g. llvm_release_170 for LLVM 17).
  - Make sure the built llvm-spirv binary is installed into the same path as the clang binary; otherwise clang might find and use a different llvm-spirv, leading to errors.
  - For the best results, install it from a chipStar branch which has fixes that are not yet upstreamed.
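The requirement that llvm-spirv lives next to clang can be checked with a few lines of shell. The following is a minimal sketch; the helper name `check_llvm_spirv_next_to_clang` and the example install path are assumptions made for illustration and are not part of chipStar.

```shell
# Sketch: verify that an executable llvm-spirv sits in the same directory
# as the given clang binary.
# check_llvm_spirv_next_to_clang is a hypothetical helper, not a chipStar tool.
check_llvm_spirv_next_to_clang() {
    clang_bin=$1
    clang_dir=$(dirname "$clang_bin")
    if [ -x "$clang_dir/llvm-spirv" ]; then
        echo "OK: $clang_dir/llvm-spirv"
        return 0
    fi
    echo "MISSING: no llvm-spirv in $clang_dir" >&2
    return 1
}

# Example usage (the install path is an assumption):
#   check_llvm_spirv_next_to_clang "$HOME/local/llvm-17/bin/clang"
```

Note that the check looks at the directory of the path it is given, so if clang is reached through a symlink, pass the resolved path.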
It's recommended to use the chipStar forks of LLVM and SPIRV-LLVM-Translator. For this you can use a script included in the chipStar repository:
# chipStar/scripts/configure_llvm.sh <version 15/16/17> <install_dir> <static/dynamic>
chipStar/scripts/configure_llvm.sh 17 /opt/install/llvm/17.0 dynamic
cd llvm-project/llvm/build_17
make -j 16
<sudo> make install
Or you can do the steps manually:
git clone --depth 1 https://github.com/CHIP-SPV/llvm-project.git -b chipStar-llvm-17
cd llvm-project/llvm/projects
git clone --depth 1 https://github.com/CHIP-SPV/SPIRV-LLVM-Translator.git -b chipStar-llvm-17
# Return to the llvm-project root so that "-S llvm" below resolves correctly
cd ../..
# -DLLVM_ENABLE_PROJECTS="clang;openmp"  OpenMP is optional but many apps use it
# -DLLVM_TARGETS_TO_BUILD                Speed up compilation by building only the necessary CPU host target
# -DCMAKE_INSTALL_PREFIX                 Where to install LLVM
cmake -S llvm -B build \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS="clang;openmp" \
-DLLVM_TARGETS_TO_BUILD=X86 \
-DCMAKE_INSTALL_PREFIX=$HOME/local/llvm-17
make -C build -j8 all install
- An OpenCL 2.0 or 3.0 driver with at least the following features supported:
  - Coarse-grained buffer Shared Virtual Memory
  - Generic address space
  - SPIR-V input
  - Program scope variables
- Further OpenCL extensions or features might be needed depending on the compiled CUDA/HIP application. For example, to support warp primitives, the OpenCL driver should also support additional subgroup features such as shuffles, ballots and cl_intel_required_subgroup_size.
- Intel Compute Runtime or oneAPI
- oneAPI Level Zero Loader
- For HIP-SYCL and HIP-MKL Interoperability: oneAPI
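Whether an OpenCL driver advertises the optional subgroup extensions can be checked by searching the output of `clinfo`. The sketch below assumes `clinfo` is installed; the helper name `missing_cl_extensions` is hypothetical, and the Khronos extension names in the usage comment (`cl_khr_subgroup_shuffle`, `cl_khr_subgroup_ballot`) are one plausible reading of "shuffles" and "ballots", not a list confirmed by chipStar.

```shell
# Sketch: read `clinfo` output on stdin and print the extensions (given as
# arguments) that are NOT listed. Succeeds only when none are missing.
# missing_cl_extensions is a hypothetical helper, not a chipStar tool.
missing_cl_extensions() {
    report=$(cat)
    missing=""
    for ext in "$@"; do
        # -w matches whole words, so cl_khr_subgroup_shuffle does not
        # match cl_khr_subgroup_shuffle_relative.
        if ! printf '%s\n' "$report" | grep -q -w -- "$ext"; then
            missing="$missing $ext"
        fi
    done
    # Strip the leading space before printing.
    echo "${missing# }"
    [ -z "$missing" ]
}

# Example usage (extension list is an assumption):
#   clinfo | missing_cl_extensions cl_khr_subgroup_shuffle \
#       cl_khr_subgroup_ballot cl_intel_required_subgroup_size
```

With several platforms or devices installed, the full `clinfo` output covers all of them, so an extension counts as present if any device lists it.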
You can download and unpack the latest released source package or clone the development branch via git. We aim to keep the main development branch stable, but it might have stability issues during the development cycle.
To clone the sources from GitHub:
git clone https://github.com/CHIP-SPV/chipStar.git
cd chipStar
git submodule update --init --recursive
mkdir build && cd build
# LLVM_CONFIG_BIN is optional if LLVM can be found in PATH or if not using a version-suffixed
# binary (for example, llvm-config-17)
cmake .. \
-DLLVM_CONFIG_BIN=/path/to/llvm-config \
-DCMAKE_INSTALL_PREFIX=/path/to/install
make all build_tests install -j8
NOTE: If you don't have libOpenCL.so (for example from the ocl-icd-opencl-dev
package), but only libOpenCL.so.1 installed, CMake fails to find it and disables the OpenCL backend. This issue describes a workaround.
There's a script, check.py, which can be used to run the unit tests and which filters out known failing tests for different platforms. Its usage is as follows:
# BACKEND={opencl/level0-{reg,imm}/pocl}
# ^ Which backend/driver/platform you wish to test:
# "opencl" = Intel OpenCL runtime, "level0" = Intel LevelZero runtime with regular command lists (reg) or immediate command lists (imm), "pocl" = PoCL OpenCL runtime
# DEVICE={cpu,igpu,dgpu} # What kind of device to test.
# ^ This selects the expected test pass lists.
# 'igpu' is an Intel Iris Xe iGPU, 'dgpu' a typical recent Intel dGPU such as Data Center GPU Max series or an Arc.
# PARALLEL={N} # How many tests to run in parallel.
# export CHIP_PLATFORM=N # If there are multiple OpenCL platforms present on the system, selects which one to use
python3 $SOURCE_DIR/scripts/check.py -m off --num-threads $PARALLEL $BUILD_DIR $DEVICE $BACKEND
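A mistyped DEVICE or BACKEND value is easy to catch before the test run starts. The wrapper below is a sketch only: `run_chipstar_check` and the `DRY_RUN` switch are hypothetical names, the accepted values are copied from the comments above, and defaulting PARALLEL to 1 is an assumption.

```shell
# Sketch: validate DEVICE and BACKEND, then invoke check.py as shown above.
# run_chipstar_check and DRY_RUN are hypothetical, not part of chipStar.
run_chipstar_check() {
    build_dir=$1
    device=$2
    backend=$3
    case $device in
        cpu|igpu|dgpu) ;;
        *) echo "unknown DEVICE: $device" >&2; return 2 ;;
    esac
    case $backend in
        opencl|level0-reg|level0-imm|pocl) ;;
        *) echo "unknown BACKEND: $backend" >&2; return 2 ;;
    esac
    # SOURCE_DIR is expected in the environment; PARALLEL defaults to 1 here.
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} python3 "$SOURCE_DIR/scripts/check.py" -m off \
        --num-threads "${PARALLEL:-1}" "$build_dir" "$device" "$backend"
}

# Example usage (paths are assumptions):
#   SOURCE_DIR=$HOME/chipStar PARALLEL=8 run_chipstar_check "$HOME/chipStar/build" igpu opencl
```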
Please refer to the user documentation for instructions on how to use the installed chipStar to build CUDA/HIP programs.
CHIP_BE=<opencl/level0> # Selects the backend to use. If both Level Zero and OpenCL are available, Level Zero is used by default
CHIP_PLATFORM=<N> # If there are multiple platforms present on the system, selects which one to use. Defaults to 0
CHIP_DEVICE=<N> # If there are multiple devices present on the system, selects which one to use. Defaults to 0
CHIP_LOGLEVEL=<trace/debug/info/warn/err/crit> # Sets the log level. If compiled in RELEASE, only err/crit are available
CHIP_DUMP_SPIRV=<ON/OFF(default)> # Dumps the generated SPIR-V code to a file
CHIP_JIT_FLAGS=<flags> # String to override the default JIT flags. Defaults to -cl-kernel-arg-info -cl-std=CL3.0
CHIP_L0_COLLECT_EVENTS_TIMEOUT=<N(30s default)> # Timeout in seconds for collecting Level Zero events
CHIP_L0_IMM_CMD_LISTS=<ON(default)/OFF> # Use immediate command lists in Level Zero
Example:
$ clinfo -l
Platform #0: Intel(R) OpenCL Graphics
`-- Device #0: Intel(R) Arc(TM) A380 Graphics
Platform #1: Intel(R) OpenCL Graphics
`-- Device #0: Intel(R) UHD Graphics 770
Based on these values, to run on the OpenCL iGPU (the UHD Graphics 770, device 0 on platform 1):
export CHIP_BE=opencl
export CHIP_PLATFORM=1
export CHIP_DEVICE=0
NOTE: Level Zero doesn't have a clinfo equivalent. Normally, even if you have more than one Level Zero device, there will only be a single platform, so set CHIP_PLATFORM=0 and then set CHIP_DEVICE to the device you want to use.
You can check the name of the device by running a sample which prints the name, such as build/samples/0_MatrixMultiply/MatrixMultiply.
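For the OpenCL backend, the mapping from the `clinfo -l` listing to CHIP_PLATFORM and CHIP_DEVICE can be automated. The following is a sketch assuming the listing format shown in the example above; `find_cl_device` is a hypothetical helper name, and it selects the first device whose name contains the given text.

```shell
# Sketch: read `clinfo -l` output on stdin and print export lines for the
# first device whose line contains the literal text given as $1.
# find_cl_device is a hypothetical helper, not a chipStar tool.
find_cl_device() {
    awk -v pat="$1" '
        # "Platform #1: ..." -> remember the platform index.
        /^Platform #/ { split($2, p, /[#:]/); platform = p[2] }
        # "... Device #0: <name>" -> report if the name matches.
        /Device #/ && index($0, pat) {
            match($0, /Device #[0-9]+/)
            device = substr($0, RSTART + 8, RLENGTH - 8)
            print "export CHIP_PLATFORM=" platform
            print "export CHIP_DEVICE=" device
            found = 1
            exit
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example usage:
#   export CHIP_BE=opencl
#   eval "$(clinfo -l | find_cl_device 'UHD Graphics')"
```

The function exits with a non-zero status when no device matches, so a failed lookup does not silently leave the variables unset.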
Clang++ can fail to link C++ programs when the latest installed GCC version doesn't include libstdc++: by default, Clang++ picks the latest GCC installation it finds, regardless of whether libstdc++ is present. The problem is discussed here.
The issue can be resolved by defining a Clang++ configuration file which forces the GCC installation to the one we want. Example:
echo --gcc-install-dir=/usr/lib/gcc/x86_64-linux-gnu/11 > ~/local/llvm-17/bin/x86_64-unknown-linux-gnu-clang++.cfg
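To pick a suitable directory for `--gcc-install-dir`, it can help to list which GCC version directories actually contain libstdc++. The sketch below assumes a Debian/Ubuntu-style layout in which `libstdc++.so` or `libstdc++.a` sits directly in each GCC version directory; other distributions may differ. `list_gcc_dirs_with_libstdcxx` is a hypothetical helper name.

```shell
# Sketch: print every version directory under ROOT that contains libstdc++.
# Assumes a Debian/Ubuntu-style layout; the helper name is hypothetical.
list_gcc_dirs_with_libstdcxx() {
    root=$1
    for dir in "$root"/*/; do
        dir=${dir%/}
        if [ -e "$dir/libstdc++.so" ] || [ -e "$dir/libstdc++.a" ]; then
            echo "$dir"
        fi
    done
}

# Example usage (the root path matches the example above):
#   list_gcc_dirs_with_libstdcxx /usr/lib/gcc/x86_64-linux-gnu
```

Any directory it prints is a candidate value for the configuration file; directories it skips are the ones likely to cause the link failure.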
When running the tests on OpenCL devices which do not support double-precision floats, multiple tests will fail with errors.
On Intel iGPUs, it might be possible to enable software emulation of double-precision floats by setting the two environment variables below. Kernels using doubles then work, but with the major overhead of software emulation:
export IGC_EnableDPEmulation=1
export OverrideDefaultFP64Settings=1
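Native double-precision support is advertised by OpenCL drivers through the `cl_khr_fp64` extension, so the emulation variables can be set only when it is absent. This is a sketch assuming `clinfo` is installed; `has_native_fp64` is a hypothetical helper name.

```shell
# Sketch: succeed if the `clinfo` output on stdin lists cl_khr_fp64.
# has_native_fp64 is a hypothetical helper, not a chipStar tool.
has_native_fp64() {
    grep -q -w cl_khr_fp64
}

# Example usage:
#   if ! clinfo | has_native_fp64; then
#       export IGC_EnableDPEmulation=1
#       export OverrideDefaultFP64Settings=1
#   fi
```

On systems with several devices the full `clinfo` output covers all of them, so the check succeeds if any device reports the extension; narrow the input to the device of interest if that matters.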