Skip to content

Fix CI break for rocprof trace decoder#3966

Merged
ApoKalipse-V merged 2 commits into
mainfrom
users/gbaraldi/fix_trace_decoder_amdgpu
Mar 13, 2026
Merged

Fix CI break for rocprof trace decoder#3966
ApoKalipse-V merged 2 commits into
mainfrom
users/gbaraldi/fix_trace_decoder_amdgpu

Conversation

@ApoKalipse-V
Copy link
Copy Markdown
Contributor

@ApoKalipse-V ApoKalipse-V commented Mar 13, 2026

Motivation

Fixes breakage by PR: #3723
The build infra requires DIST_AMDGPU_TARGETS even if the list isn't used by the underlying project.

Technical Details

Test Plan

Test Result

Submission Checklist

@ApoKalipse-V ApoKalipse-V requested a review from ScottTodd March 13, 2026 17:11
@ApoKalipse-V ApoKalipse-V changed the title Fix cmake for rocprof trace decoder Fix CI break for rocprof trace decoder Mar 13, 2026
@ScottTodd
Copy link
Copy Markdown
Member

@ApoKalipse-V
Copy link
Copy Markdown
Contributor Author

Copy link
Copy Markdown
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test run at https://github.com/ROCm/TheRock/actions/runs/23063324268/job/66996069227 got past the previously failing stage (https://github.com/ROCm/TheRock/actions/runs/23055657477/job/66969204535#step:12:189). Looks fine to merge this fix-forward (given that multi-arch CI is not yet running on PRs, if it was then I'd prefer a revert instead of a fix-forward)

@ApoKalipse-V ApoKalipse-V merged commit f521dd8 into main Mar 13, 2026
54 of 59 checks passed
@ApoKalipse-V ApoKalipse-V deleted the users/gbaraldi/fix_trace_decoder_amdgpu branch March 13, 2026 19:26
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Mar 13, 2026
@ScottTodd
Copy link
Copy Markdown
Member

Ah, this didn't fully fix the build: https://github.com/ROCm/TheRock/actions/runs/23067115978/job/67013062466#step:13:215

[199/260] Configure sub-project composable_kernel (in background)
[200/260] Building sub-project composable_kernel (in background)
FAILED: ml-libs/composable_kernel/stamp/build.stamp /__w/TheRock/TheRock/build/ml-libs/composable_kernel/stamp/build.stamp 
cd /__w/TheRock/TheRock/build/ml-libs/composable_kernel/build && /opt/python/cp312-cp312/bin/python3.12 /__w/TheRock/TheRock/build_tools/teatime.py --log-timestamps --label composable_kernel --interactive /__w/TheRock/TheRock/build/logs/composable_kernel_build.log -- /usr/local/therock-tools/bin/cmake -E env --unset=ROCM_PATH --unset=ROCM_DIR --unset=HIP_PATH --unset=HIP_DIR -- /usr/local/therock-tools/bin/cmake --build /__w/TheRock/TheRock/build/ml-libs/composable_kernel/build && /usr/local/therock-tools/bin/cmake -E touch /__w/TheRock/TheRock/build/ml-libs/composable_kernel/stamp/build.stamp
[composable_kernel] [1/1686] Building CXX object library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o
[composable_kernel] FAILED: library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o 
[composable_kernel] ccache /__w/TheRock/TheRock/build/core/clr/dist/lib/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TILE_USE_WMMA=1 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DCK_USE_XDL -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/__w/TheRock/TheRock/rocm-libraries/projects/composablekernel/library/include -I/__w/TheRock/TheRock/rocm-libraries/projects/composablekernel/include -I/__w/TheRock/TheRock/build/ml-libs/composable_kernel/build/include -I/__w/TheRock/TheRock/build/profiler/rocprofiler-sdk/stage/include -I/__w/TheRock/TheRock/build/profiler/rocprof-trace-decoder/stage/include -I/__w/TheRock/TheRock/build/profiler/roctracer/stage/include -I/__w/TheRock/TheRock/build/base/half/stage/include -I/__w/TheRock/TheRock/build/third-party/sysdeps/linux/libdrm/build/stage/lib/rocm_sysdeps/include -I/__w/TheRock/TheRock/build/third-party/sysdeps/linux/numactl/build/stage/lib/rocm_sysdeps/include -isystem /__w/TheRock/TheRock/build/core/clr/dist/include -Wno-documentation-unknown-command -Wno-documentation-pedantic -Wno-unused-command-line-argument -Wno-explicit-specialization-storage-class --hip-path=/__w/TheRock/TheRock/build/core/clr/dist --hip-device-lib-path=/__w/TheRock/TheRock/build/core/clr/dist/lib/llvm/amdgcn/bitcode -O3 -DNDEBUG -std=c++20 -fPIC   -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-c++20-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -fbracket-depth=1024 -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -Werror -Weverything -fcolor-diagnostics --offload-compress -x hip --offload-arch=gfx1200 --offload-arch=gfx1201 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o -c /__w/TheRock/TheRock/rocm-libraries/projects/composablekernel/library/src/tensor_operation_instance/gpu/conv1d_bwd_data/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp
[composable_kernel] clang++: error: no such include directory: '/__w/TheRock/TheRock/build/profiler/rocprof-trace-decoder/stage/include' [-Werror,-Wmissing-include-dirs]

We may need to revert this and #3723

@ScottTodd
Copy link
Copy Markdown
Member

I'm working with @marbre to test some potential fix-forwards. If those don't fully resolve this we may revert instead.

marbre added a commit that referenced this pull request Mar 17, 2026
## Motivation

Following up on these PRs:
* #3723
* #3966

the build was still broken:
```
[199/260] Configure sub-project composable_kernel (in background)
[200/260] Building sub-project composable_kernel (in background)
FAILED: ml-libs/composable_kernel/stamp/build.stamp /__w/TheRock/TheRock/build/ml-libs/composable_kernel/stamp/build.stamp 
cd /__w/TheRock/TheRock/build/ml-libs/composable_kernel/build && /opt/python/cp312-cp312/bin/python3.12 /__w/TheRock/TheRock/build_tools/teatime.py --log-timestamps --label composable_kernel --interactive /__w/TheRock/TheRock/build/logs/composable_kernel_build.log -- /usr/local/therock-tools/bin/cmake -E env --unset=ROCM_PATH --unset=ROCM_DIR --unset=HIP_PATH --unset=HIP_DIR -- /usr/local/therock-tools/bin/cmake --build /__w/TheRock/TheRock/build/ml-libs/composable_kernel/build && /usr/local/therock-tools/bin/cmake -E touch /__w/TheRock/TheRock/build/ml-libs/composable_kernel/stamp/build.stamp
[composable_kernel] [1/1686] Building CXX object library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o
[composable_kernel] FAILED: library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o 
[composable_kernel] ccache /__w/TheRock/TheRock/build/core/clr/dist/lib/llvm/bin/clang++ -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_TILE_USE_WMMA=1 -DCK_TIME_KERNEL=1 -DCK_USE_OCP_FP8 -DCK_USE_WMMA -DCK_USE_WMMA_FP8 -DCK_USE_XDL -DDPP_KERNELS -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/__w/TheRock/TheRock/rocm-libraries/projects/composablekernel/library/include -I/__w/TheRock/TheRock/rocm-libraries/projects/composablekernel/include -I/__w/TheRock/TheRock/build/ml-libs/composable_kernel/build/include -I/__w/TheRock/TheRock/build/profiler/rocprofiler-sdk/stage/include -I/__w/TheRock/TheRock/build/profiler/rocprof-trace-decoder/stage/include -I/__w/TheRock/TheRock/build/profiler/roctracer/stage/include -I/__w/TheRock/TheRock/build/base/half/stage/include -I/__w/TheRock/TheRock/build/third-party/sysdeps/linux/libdrm/build/stage/lib/rocm_sysdeps/include -I/__w/TheRock/TheRock/build/third-party/sysdeps/linux/numactl/build/stage/lib/rocm_sysdeps/include -isystem /__w/TheRock/TheRock/build/core/clr/dist/include -Wno-documentation-unknown-command -Wno-documentation-pedantic -Wno-unused-command-line-argument -Wno-explicit-specialization-storage-class --hip-path=/__w/TheRock/TheRock/build/core/clr/dist --hip-device-lib-path=/__w/TheRock/TheRock/build/core/clr/dist/lib/llvm/amdgcn/bitcode -O3 -DNDEBUG -std=c++20 -fPIC   -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Wno-missing-field-initializers -Wno-error=deprecated-declarations -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-unused-template -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-unused-lambda-capture -Wno-nvcc-compat -Wno-c++20-compat -Wno-bit-int-extension -Wno-pass-failed -Wno-switch-default -Wno-unique-object-duplication -fbracket-depth=1024 -Wno-nrvo -fno-offload-uniform-block -mllvm --lsr-drop-solution=1 -mllvm -enable-post-misched=0 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -Werror -Weverything -fcolor-diagnostics --offload-compress -x hip --offload-arch=gfx1200 --offload-arch=gfx1201 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/conv1d_bwd_data/CMakeFiles/device_conv1d_bwd_data_instance.dir/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp.o -c /__w/TheRock/TheRock/rocm-libraries/projects/composablekernel/library/src/tensor_operation_instance/gpu/conv1d_bwd_data/device_conv1d_bwd_data_xdl_nwc_kxc_nwk_int8_instance.cpp
[composable_kernel] clang++: error: no such include directory: '/__w/TheRock/TheRock/build/profiler/rocprof-trace-decoder/stage/include' [-Werror,-Wmissing-include-dirs]
```

## Technical Details

`INTERFACE_INCLUDE_DIRS`/`INTERFACE_LINK_DIRS` leaked transitively
through rocprofiler-sdk to consumers in other stages (e.g.
composable_kernel, MIOpen), causing `-Wmissing-include-dirs` errors

The trace decoder appears to be a runtime-only dependency (loaded by
`rocprofv3 --att`), so it needs no compile-time interface dirs. This
decouples it from the rocprofiler-sdk subproject dependency chain and
instead lists it directly in the rocprofiler-sdk artifact
`SUBPROJECT_DEPS` for packaging.

## Test Plan

* Multi-arch
* Passed multi-arch CI builds:
https://github.com/ROCm/TheRock/actions/runs/23157787783
* This might later fail some tests (including after ROCm itself, in
pytorch), I haven't fully tested that
* Non-multi-arch
  * CI on this PR itself (build + test)

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Marius Brehler <marius.brehler@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants