diff --git a/README.md b/README.md
index a310acf4ee72..4e5bf22346e3 100644
--- a/README.md
+++ b/README.md
@@ -211,6 +211,25 @@ For detailed instructions on how to debug Triton's frontend, please refer to thi
 - `LLVM_ENABLE_TIMING` dumps the timing information for each LLVM pass.
 - `TRITON_DEFAULT_FP_FUSION` overrides the default behavior of allowing fp fusion (mul+add->fma).
 - `MLIR_ENABLE_REMARK` enables the performance warnings that are emitted as remarks.
+- `TRITON_KERNEL_DUMP` enables the dumping of the IR from each compilation stage and the final ptx.
+- `TRITON_DUMP_DIR` specifies the directory to save the dumped IR and ptx when `TRITON_KERNEL_DUMP` is set to 1.
+- `TRITON_KERNEL_OVERRIDE` enables the override of the compiled kernel with a user-specified IR/ptx at the beginning of each compilation stage.
+- `TRITON_OVERRIDE_DIR` specifies the directory from which to load the IR/ptx files when `TRITON_KERNEL_OVERRIDE` is set to 1.
+
+**Kernel Override Steps**
+
+```bash
+export TRITON_ALWAYS_COMPILE=1
+export TRITON_KERNEL_DUMP=1
+export TRITON_DUMP_DIR=
+export TRITON_KERNEL_OVERRIDE=1
+export TRITON_OVERRIDE_DIR=
+# Step 1: Run the kernel once to dump kernel's IRs and ptx in $TRITON_DUMP_DIR
+# Step 2: Copy $TRITON_DUMP_DIR/ to $TRITON_OVERRIDE_DIR
+# Step 3: Delete the stages that you do not want to override and modify the stage you do want to override
+# Step 4: Run the kernel again to see the overridden result
+```
+
 
 # Changelog
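
Steps 2 and 3 of the override workflow in this change are manual file operations, so a small helper may make them easier to repeat. The sketch below is illustrative only: `prepare_override` is a hypothetical function, not part of Triton, and it assumes that each compilation stage is dumped as a separate file whose extension names the stage, and that the override directory should end up holding only the stage you intend to edit. Check both assumptions against the actual contents of your dump directory before relying on it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Triton) covering Steps 2 and 3.
# Assumption: one dumped file per stage, with the stage as the file extension.
prepare_override() {
  local src="$1" dst="$2" keep_ext="$3"

  mkdir -p "$dst"

  # Step 2: copy the dumped tree into the override directory.
  cp -R "$src"/. "$dst"/

  # Step 3: delete every copied file except the stage to be overridden.
  # The original dump under "$src" is left untouched.
  find "$dst" -type f ! -name "*.${keep_ext}" -delete
}
```

A call such as `prepare_override "$TRITON_DUMP_DIR" "$TRITON_OVERRIDE_DIR" ttgir` (where `ttgir` stands in for whichever stage extension appears in your dump) would leave only that stage's file in the override directory. You would then edit it by hand and rerun the kernel with `TRITON_KERNEL_OVERRIDE=1` as in Step 4.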