Add initial runfile installer code written by David Bielecki. #3643

Closed
pbhandar-amd wants to merge 6 commits into main from users/pbhandar/runfile-installer
@pbhandar-amd pbhandar-amd commented Feb 26, 2026

Motivation

Add code for the runfile installer, which will be another install method alongside tarballs, Python wheels, and the package manager.
The installer provides a single ROCm installer for all gfx architectures. It handles dependency installation, post-installation setup of ROCm, and other installation options from a single unified .run file. Provided the ROCm/amdgpu dependencies are already installed on a system, the runfile installer may be used to install ROCm and/or the amdgpu driver offline.
The runfile installer is an existing installation method for ROCm and/or the amdgpu driver in the legacy ROCm releases (https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/rocm-runfile-installer.html). We are adding support for ROCm releases from TheRock, starting with the re-introduction of the installer in 7.12 (ROCM-1861).

Technical Details

File changes/additions:

  • build_tools/packaging/runfile-installer (an addition to https://github.com/ROCm/TheRock/tree/main/build_tools/packaging)
    • runfile-installer - the installer project. It includes the main build script, build-runfile-installer.sh, which builds the installer .run file.
    • runfile-installer/build-installer
      • Contains the build scripts used by build-runfile-installer.sh to set up and build the output .run file for the installer.
    • runfile-installer/package-extractor
      • Contains the package-extractor scripts used in the build pipeline to extract the packages that make up the installer content, i.e. core-sdk.
        • We use the current packages built by TheRock as the source for the installer (this implementation is carried forward from the legacy installer).
    • runfile-installer/package-puller
      • Pulls/downloads the packages needed for the installer (core-sdk / gfxXYZ).
    • runfile-installer/rocm-installer
      • The main installer, which will contain everything needed to install ROCm and the amdgpu driver, as well as the GUI frontend.
    • runfile-installer/UI
      • Source for the GUI frontend to the runfile installer.
      • The GUI is optional; users can use the installer's command-line interface instead.

The Build:
- Run the build-runfile-installer.sh script with the arguments required for the type of installer build job: dev, nightly, or prerelease.
- Builds are run in the ManyLinux Docker image from TheRock, either locally or via the build jobs being developed for TheRock.

build-runfile-installer.sh
    |
    --> setup-installer --> package-puller --> build-installer --> package-extractor --> GUI build --> .run create (makeself)
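
The stage order above can be sketched as a fail-fast runner. This is only an illustration, not the logic of build-runfile-installer.sh: the stage identifiers are taken from the diagram, while `run_pipeline` and the `run_stage` callback are hypothetical names introduced here.

```python
from typing import Callable, List, Optional, Tuple

# Stage order as listed in the PR description. The identifiers are
# illustrative labels, not real script names or command-line flags.
STAGES = [
    "setup-installer",
    "package-puller",
    "build-installer",
    "package-extractor",
    "gui-build",
    "makeself",
]


def run_pipeline(
    run_stage: Callable[[str], bool],
    stages: List[str] = STAGES,
) -> Tuple[List[str], Optional[str]]:
    """Run stages in order and stop at the first failure.

    Returns (completed stages, name of the failed stage or None).
    """
    completed: List[str] = []
    for name in stages:
        if not run_stage(name):
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Hypothetical stand-in for real stage execution: fail one stage.
    done, failed = run_pipeline(lambda name: name != "package-extractor")
    print("completed:", done)
    print("failed at:", failed)
```

Reporting the failed stage by name lines up with the build-failure categories listed under Test Result (package pull, package extraction, GUI build, makeself).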

Test Plan

  • For the legacy runfile installer, built installers were tested by SQA and the Installer team. The plan is the same for the TheRock version.
  • Installer testing is performed across the 15+ Linux distros supported by ROCm using internal validation scripts.
  • For TheRock, we would like to add a build test that validates the construction of the runfile installer .run file.

Test Result

Build pass:
- The file rocm-installer.run is created

Build failure:
- Package-pull failure - failed to download the packages needed to build the installer.
- Package-extraction failure - failed to install the tools needed to extract packages, or a failure during extraction.
- GUI build failure
- makeself failure when constructing the rocm-installer .run file
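
A build test along these lines could start as a simple artifact check. The sketch below is an assumption about what such a test might verify, not an existing TheRock test: `check_runfile` is a hypothetical helper, and the requirement that the .run file be non-empty and carry an execute bit is assumed rather than stated in this PR.

```python
import tempfile
from pathlib import Path
from typing import List


def check_runfile(path: Path) -> List[str]:
    """Return a list of problems; an empty list means the check passes."""
    if not path.is_file():
        return [f"{path.name} was not created"]
    problems: List[str] = []
    info = path.stat()
    if info.st_size == 0:
        problems.append(f"{path.name} is empty")
    # Assumption: the build output should have at least one execute bit set.
    if info.st_mode & 0o111 == 0:
        problems.append(f"{path.name} is not executable")
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "rocm-installer.run"
        print(check_runfile(artifact))      # artifact missing
        artifact.write_bytes(b"#!/bin/sh\n")
        artifact.chmod(0o755)
        print(check_runfile(artifact))      # no problems reported
```

A check like this only covers the "Build pass" criterion; the install pass and failure cases below still need a real system to run on.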

Install pass:
- ROCm and amdgpu install without error using the command-line or GUI interface to the installer
- Basic validation of ROCm/amdgpu passes

Install failure:
- ROCm or amdgpu fails during install from command-line or GUI
- ROCm or amdgpu installs, but is not functional.

Submission Checklist

@pbhandar-amd pbhandar-amd marked this pull request as ready for review February 26, 2026 16:55
Member

@ScottTodd ScottTodd left a comment

Please fill in the PR description. I'm not going to look at this until it meets that bare minimum bar.

@stellaraccident stellaraccident self-requested a review February 26, 2026 22:05
Collaborator

@stellaraccident stellaraccident left a comment

No tests? How are we supposed to maintain this?

Separately: it looks like you leaked gpg keys. Please explain to me internally how this is ok and whether we need to rotate anything.

@astrelsky
Contributor

> No tests? How are we supposed to maintain this?
>
> Separately: it looks like you leaked gpg keys. Please explain to me internally how this is ok and whether we need to rotate anything.

$ gpg --list-packets rocm-keyring.gpg
# off=0 ctb=99 tag=6 hlen=3 plen=525
:public key packet:
        version 4, algo 1, created 1470083360, expires 0
        pkey[0]: [4096 bits]
        pkey[1]: [17 bits]
        keyid: 9386B48A1A693C5C
# off=528 ctb=b4 tag=13 hlen=2 plen=40
:user ID packet: "AMD MLSE DevOps <dl.MLSE.DevOps@amd.com>"
# off=570 ctb=89 tag=2 hlen=3 plen=574
:signature packet: algo 1, keyid 9386B48A1A693C5C
        version 4, created 1643876755, md5len 0, sigclass 0x13
        digest algo 2, begin of digest 8d 40
        hashed subpkt 27 len 1 (key flags: 03)
        hashed subpkt 11 len 5 (pref-sym-algos: 9 8 7 3 2)
        hashed subpkt 21 len 5 (pref-hash-algos: 8 2 9 10 11)
        hashed subpkt 22 len 3 (pref-zip-algos: 2 3 1)
        hashed subpkt 30 len 1 (features: 01)
        hashed subpkt 23 len 1 (keyserver preferences: 80)
        hashed subpkt 2 len 4 (sig created 2022-02-03)
        hashed subpkt 9 len 4 (key expires after 10y186d11h56m)
        subpkt 16 len 8 (issuer key ID 9386B48A1A693C5C)
        data: [4096 bits]
# off=1147 ctb=b9 tag=14 hlen=3 plen=525
:public sub key packet:
        version 4, algo 1, created 1470083360, expires 0
        pkey[0]: [4096 bits]
        pkey[1]: [17 bits]
        keyid: 30C07AF01A6D36BA
# off=1675 ctb=89 tag=2 hlen=3 plen=549
:signature packet: algo 1, keyid 9386B48A1A693C5C
        version 4, created 1643876791, md5len 0, sigclass 0x18
        digest algo 10, begin of digest c0 1c
        hashed subpkt 27 len 1 (key flags: 0C)
        hashed subpkt 2 len 4 (sig created 2022-02-03)
        hashed subpkt 9 len 4 (key expires after 10y186d11h57m)
        subpkt 16 len 8 (issuer key ID 9386B48A1A693C5C)
        data: [4096 bits]
$ gpg --list-packets rocm-internal.gpg
# off=0 ctb=99 tag=6 hlen=3 plen=269
:public key packet:
        version 4, algo 1, created 1664219662, expires 0
        pkey[0]: [2048 bits]
        pkey[1]: [17 bits]
        keyid: AC344D255E48C714
# off=272 ctb=b4 tag=13 hlen=2 plen=59
:user ID packet: "compute-artifactory (Signing key) <jenkins-compute@amd.com>"
# off=333 ctb=89 tag=2 hlen=3 plen=313
:signature packet: algo 1, keyid AC344D255E48C714
        version 4, created 1664219662, md5len 0, sigclass 0x13
        digest algo 2, begin of digest 50 32
        hashed subpkt 2 len 4 (sig created 2022-09-26)
        hashed subpkt 27 len 1 (key flags: 03)
        hashed subpkt 11 len 6 (pref-sym-algos: 9 8 7 3 2 1)
        hashed subpkt 21 len 5 (pref-hash-algos: 8 2 9 10 11)
        hashed subpkt 22 len 3 (pref-zip-algos: 2 3 1)
        hashed subpkt 30 len 1 (features: 01)
        hashed subpkt 23 len 1 (keyserver preferences: 80)
        subpkt 16 len 8 (issuer key ID AC344D255E48C714)
        data: [2047 bits]
# off=649 ctb=b9 tag=14 hlen=3 plen=269
:public sub key packet:
        version 4, algo 1, created 1664219662, expires 0
        pkey[0]: [2048 bits]
        pkey[1]: [17 bits]
        keyid: 8165C203DA3B8702
# off=921 ctb=89 tag=2 hlen=3 plen=287
:signature packet: algo 1, keyid AC344D255E48C714
        version 4, created 1664219662, md5len 0, sigclass 0x18
        digest algo 2, begin of digest b1 98
        hashed subpkt 2 len 4 (sig created 2022-09-26)
        hashed subpkt 27 len 1 (key flags: 0C)
        subpkt 16 len 8 (issuer key ID AC344D255E48C714)
        data: [2048 bits]

Hopefully this helps answer that question @stellaraccident

@stellaraccident
Collaborator

Thanks. So public keys only. I would have preferred that this was described in the PR description.
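
The same conclusion can be checked mechanically from the packet tags in the `gpg --list-packets` output. The sketch below relies on the OpenPGP tag numbers from RFC 4880 (5 = secret key, 7 = secret subkey, 6 = public key, 14 = public subkey). The helper names are hypothetical, and the regular expression assumes the `# off=... ctb=... tag=N ...` header format shown in this thread.

```python
import re
from typing import List

# OpenPGP packet tags that carry secret key material (RFC 4880, section 4.3).
SECRET_TAGS = {5: "secret key", 7: "secret subkey"}

# Matches the per-packet header lines printed by `gpg --list-packets`,
# in the format quoted in this thread.
HEADER_RE = re.compile(r"^# off=\d+ ctb=[0-9a-f]+ tag=(\d+) ", re.MULTILINE)


def packet_tags(listing: str) -> List[int]:
    """Extract the packet tag numbers from a --list-packets listing."""
    return [int(tag) for tag in HEADER_RE.findall(listing)]


def has_secret_key_material(listing: str) -> bool:
    """True if any packet in the listing is a secret key or secret subkey."""
    return any(tag in SECRET_TAGS for tag in packet_tags(listing))


# Header lines excerpted from the rocm-keyring.gpg listing above.
SAMPLE = """\
# off=0 ctb=99 tag=6 hlen=3 plen=525
:public key packet:
# off=528 ctb=b4 tag=13 hlen=2 plen=40
:user ID packet: "AMD MLSE DevOps <dl.MLSE.DevOps@amd.com>"
# off=570 ctb=89 tag=2 hlen=3 plen=574
:signature packet: algo 1, keyid 9386B48A1A693C5C
# off=1147 ctb=b9 tag=14 hlen=3 plen=525
:public sub key packet:
"""

if __name__ == "__main__":
    print(packet_tags(SAMPLE))
    print(has_secret_key_material(SAMPLE))
```

Only tags 6, 13, 2, and 14 appear in both listings in this thread, which is consistent with the files containing public keys only.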

@astrelsky
Contributor

astrelsky commented Feb 26, 2026

> Thanks. So public keys only. I would have preferred that this was described in the PR description.

Better safe than sorry, especially when they have internal in their name.

@dbieleck

The GPG keys will be removed.

In terms of testing, we have some test scripts that we use as part of the BM testing, which are run on the 15+ distros we are documenting as supported by ROCm. As part of this testing, we validate that the install was successful and that base functionality is working. @stellaraccident, are you looking for tests that just validate the build? We're not sure what's done for the rpm/deb package build/generation and what's tested there. For the runfile installer, we could do something similar to the package testing.

In terms of maintenance, my team and I would be doing that, i.e. adding new features to the installer, fixing bugs, adding new distros, updating the GUI frontend, etc. This goes back to something I was wondering: whether it makes sense to keep just the one or two base build scripts in TheRock and have the installer parts and GUI source in a separate repo that is cloned as part of the build. We're open to suggestions...

pbhandar-amd and others added 4 commits February 27, 2026 14:22
    - Removed GPG keys etc. from pullers - no longer required.
    - Removed amdgpu hidden repo config templates - no longer required.
[UI]
    - Fixed a permission issue seen when using dkms status within the GUI to detect amdgpu-dkms on Debian 12/13.
    - Added missing distros to the support list.

[Build]
    - Fixed an issue with package-puller doing a partial package name match, which would cause incorrect or double downloads of packages depending on their order in the repo. Now we do an exact match on the package name.
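
To illustrate the package-puller fix described in the commit message above, here is a minimal sketch of the difference between a partial and an exact name match. It is not the package-puller code: the partial match is modeled as a substring test (an assumption), and the package names and function names are hypothetical.

```python
from typing import List


def partial_match(wanted: str, available: List[str]) -> List[str]:
    """Buggy behaviour, modeled as a substring test (an assumption)."""
    return [name for name in available if wanted in name]


def exact_match(wanted: str, available: List[str]) -> List[str]:
    """Fixed behaviour: only the package with exactly this name."""
    return [name for name in available if name == wanted]


# Hypothetical repo listing; the names are for illustration only.
AVAILABLE = ["core-sdk-devel", "core-sdk", "core-sdk-docs"]

if __name__ == "__main__":
    # The substring test also selects the -devel and -docs packages, and
    # which one comes first depends on the order of the repo listing.
    print(partial_match("core-sdk", AVAILABLE))
    print(exact_match("core-sdk", AVAILABLE))
```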
@ScottTodd ScottTodd added the "ci:skip" label (Skip all CI builds/tests for this PR) Mar 2, 2026
@ScottTodd
Member

I'm adding the "ci:skip" label to this PR for now, since these new files are not referenced by any CI workflows and continuous syncing of this branch keeps retriggering (expensive) CI workflows.

I see that the PR has a description now, but this sort of large code dump is going to take significant time to review from design, policy, and code perspectives. I ran some automated code review tooling on this PR offline and it quickly spotted many style guide violations (https://github.com/ROCm/TheRock/tree/main/docs/development/style_guides), testing gaps (no unit tests), and security issues (command injection, out of bounds access, buffer overflows, missing error handling, etc.).

I don't think we'll be able to accept this code in a bulk drop in its current form (especially given the confusing authorship in the PR title - code needs a clear owner, it isn't just written once by one person, added to a repo by another, and then called finished), but finding the right contribution and maintenance model is beyond the scope of PR review threads - may need a higher bandwidth office hours discussion.

@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Mar 2, 2026
jayhawk-commits pushed a commit that referenced this pull request Mar 11, 2026
## Motivation

Bump rocm-systems from 93bc019 to 093b66c (includes fix for hip-tests
issue and revert for mathlib hiprtc issues and revert for rccl-test,
added revert for miopen failures due to PR 653):

Commits:
093b66c (HEAD, origin/develop, origin/HEAD) Revert "SWDEV-546177 -
hipModuleGetLoadingMode API impl (#653)" (#3858)
d8a0adb [AMD-SMI] Hide libamd_smi.so internal symbols (#3777)
d4da458 [rocprofiler-sdk] [Documentation ] Updating changelog (#3827)
19fadeb (origin/users/abchoudh/fix_dispatch_count) [RCCL][Tuner
Plugin] Enable tuning of RCCL tuning constants (#3757)
b4f5f8a rocr: Fix IPC dmabuf hang with large allocations (#3211)
64efea0 RCCL: allow users to override max and per job memory & fix
defaults. (#3797)
9b3dd10 Removing ready_for_review (#3849)
7e43880 [rocprofiler-systems] Update ROCm version to 7.2.0 in CI
workflows for Debian, RedHat, and Ubuntu (#3431)
1fdb6b9 [rocshmem] add gda/topology unit tests (#3715)
be1ea24 Move hipMipmappedArrayGetMemoryRequirements test to common
tests
e4513f0 Update amdgpu-windows-interop with latest changes, pal
58aa0bab2ced0cc9ebe8d2d0932db6774feb4e49 2026-03-04(#3773)
b1f964d [rocprofiler-compute] Ensure long kernel name fully shows in
compute analyze (#3665)
4dcf1e3 SWDEV-567112 - Replace test names (#3787)
33f5f30 ROCM-2428 - fixes hipStreamBatchMemOp invalid operation
checks (#3099)
139f4bf [SWDEV-556456] Align HIP_UUID with rocminfo (#3614)
8e89285 Reduce buffers alignment to 4 bytes (#3821)
51be29a AIRUNTIME-125: Consolidate Windows optimization and debug
flags (#3825)
1407392 [AMD-SMI] CI: Fix root workflow to use ASIC-specific test
filters (#3807)
63f78a9 (origin/users/mcao/fix_rocpdsummary) [ROCM-SMI] Fix DRM
include dirs leaking absolute build paths to consumers (#3808)
caf2f7e [ROCM-186] amd-smi: Add support for a VRAM and GTT tuning
interface (#3636)
a0712d4 [TheRock CI] Update projects_to_test lists (#3749)
02090c4 rocrtst: install gfx .hsaco files to share/rocrtst (#3744)
4a0a1cb Merge other simd table (#3696)
0d07657 Add missing kwargs from
rocprofiler_add_integration_validate_test in .cmake-format.yaml (#2336)
3a3df30 Optimize device counting service GPU interactions (#1583)
95d9da0 Add SPM Enable flag in build infrastructure (#3677)
12bb943 [rocprofiler-sdk] On-demand GPU profile queue
creation/destruction (#3586)
941057c  Navi4 tuning table iter 1 (#3052)
dbf2b73 [AMD-SMI] Display N/A for cu_occupancy when file is
unavailable (#3589)
b0efc7c [RCCL] [UT] Add ROCTX test (#3625)
ba7a20e Reducing the p2pnChannels for half-subscription A2A on
multi-node MI350 (#3381)
75238c9 [clr] Fix memory leak in getOrCreateHostcallBuffer (#3699)
af2ee0e [hip-tests] ASAN Check for image support before we create
context (#3834)
ad44966 Update windows ci subtree in include amdgpu-windows-interop
(#3814)
c8ad252 [rocprofiler-register] Fix compilation with system fmt/glog
(#1243)
7818815 Update README to include dbgapi and debug agent components
(#3731)
88e4a78 ROCProfiler and ROCTracer: Modifying deprecation note (#3831)
b5918a5 [ROCM-3124-3125-3126] CUID file generation hangs on MI350
systems/CUID test failures/Segmentation fault in CUID example code
(#3548)
97a5dd9 Update copyright to use SPDX IDs (#3805)
511730a [rocshmem]: add flood-amo tester (#3653)
2d650a0 [clr] Fix heap use after free error in device allocations
(#3789)
b6b179a Disable hipHostRegister_Negative test for ASAN (#3832)
39ec318 [RCCL] Add GDA alltoallv via rocshmem integration (#3613)
fb0f4d5 [RCCL] [CUMEM] Fix cuMem multi-process runs (#3811)
c3de7d4 SWDEV-526201 - Fix and enable disabled HIP tests from warp
group (#3089)
8d9a8ca roofline: code cleanup and refactor vector types (#3813)
8957e49 Don't wait on command completion if worker thread is
destroyed (#3790)
9e7586a [rocshmem] Add barrier APIs and expose `ROCSHMEM_TEAM_WORLD`
on device (#3651)
91b0923 Revert "fix local gpu release static build failure (#3667)"
(#3799)
0fda754 libhsakmt: Add secondary KFD context creation support
ee43db9 Revert "Update TheRock reference to 20260303 commit (#3709)"
(#3826)
86e28b9 Added fix to update GL2C counters instance count for GFX11.5
(#3100)
93f69f7 Adjust includes to match use (#3742)
e9fbc3f (develop) Update TheRock reference to 20260303 commit (#3709)
be0675a (HEAD) Revert "Support fp8 types in hiprtc (#2605)" (#3792)
3e3a94a [rocprofiler-systems] Add trace_cache support for
std::optional<T> serialization (#3490)
0b42a7f clr: Eliminate unnecessary kernel name string copies (#3774)
b6b0d77 rocr: Add hsa_amd_memory_async_batch_copy API for batched
memory copies (#3259)
486e6d1 Resolve staircase RS regression with 48 max channels (#3684)
eb59c85 [gfx942][gfx950] Leverage new cache bypass builtins for
simple protocol where available (#2847)
4d74d27 (origin/users/raramakr/rocm-smi-target) Revert "Auto Labeler:
Add ci:regression-detection label to rccl PRs (#3543)" (#3769)
8f07955 [AMD-SMI] CI: Use ASIC-specific test blacklists in workflows
(#3775)
7cef5b6 Fix MFMA total FLOPS calculation (#3371)
aea3751 Remove duplicated tests (#3235)
b6c656f Remove duplicated tests in memory module (#3087)
ca3137d [rocprofiler-sdk] Install integration tests without building
for therock & Misc. fixes (#3047)
0ab5c41 [rdc] Enable on-demand queue mode in rocprofiler-sdk to
prevent inference degradation (#3629)
a1eb2a1 rocr/wsl: a library should not output to std::out by default
(#3718)
b7da296 Reenable flood_put/get testers on mlx5 since they should work
after pr2732 (#3748)
000e24d [rocprofiler-sdk] Add automatic late-start support to
rocprofiler_force_configure (#2168)
64ea87f [hip-tests] Fix memory leaks in hipMemPoolTrimTo tests
(#3643)
543a7d7 rocr: Include code object allocs in lightweight coredump
a58da37 [rocdecode] - update rocdecode ctest (#3768)
f88e4ee [rocprofiler-systems] Make CDash submit non-fatal and add
GitHub Actions logging (#3525)
cb14deb [rocprofiler-systems] Update nlohmann-json submodule (#3391)
4492530 SWDEV-567112 - Introduce new mechanism for tagging and
disabling tests - Part 2 (#3707)
8ca9913 disabling rccl from full build (linux), covered in RCCL CI
(#3770)
c4fdb20 [ATT] Re-enable tests. Add option to specify perf to target
CU only (#2819)
615aab9 ROCM-3816 Out of Memory fix (#3588)
8ffad41 Fix rocm_smi64 exporting invalid absolute paths to consumers
(#3717)
042d76a rocr: Remove dependency on KFD in Runtime::VMemoryHandleMap
(#2515)
555db59 [AMD-SMI] CPU: Added support for family 1A Models 50h-57h
(#3206)
3affa2c [SWDEV-555935] Fix shared mutex and self-heal (#3729)
ba0bf0f Replace hipMemGetInfo with ihipMemGetInfo and use it for
internal calls. (#2845)
c5cef9b Fix HIP_RETURN on all HIP API calls. (#2838)
241ce7b Revert "memory: fix "contiguous_bytes" calculation in generic
conversion (#3285)" (#3755)
8a690f4 [kpack/clr] Windows PE/COFF support for kpack artifact
splitting and runtime loading (#3728)
863bdf8 MFMA pre-processor guards for ipc.hip (#3724)
90bb9b1 Release queue outside of vgpusAccess lock (#3705)
de45239 clr: Add build support of ROCR and PAL backends together
(#3722)
dfb7abc [rocprofiler-sdk] RCCL API changes for
RCCL_API_TRACE_VERSION_PATCH = 3 (#3477)
d69d4f2 [AICOMRCCL-633] - Fixed warnings in tests (#3402)
067d86d rocr/wsl: Disable AQL Queue usage with flag ROCR_USE_PM4
(#3663)
594eb60 [TheRock CI] rocm-systems build full ROCm stack (#3182)
27d17e8 [ROCProfiler-SDK] Fix SWDEV-556922: Handle comments before
checking for pmc: (#1723)
c80d904 memory: fix "contiguous_bytes" calculation in generic
conversion (#3285)
669987c [hip-tests] ASAN - add missing release handles (#3735)
a24bbd7 fix local gpu release static build failure (#3667)
259b2ff Speed up DeviceId (#2803)
65d9264 Simplify MPI trace merge logic and remove legacy guards
(#3562)
1076c08 use system to look for zcat path instead (#3720)
22f1d19 [AICOMRCCL-355] Enable threshold-based p2p-batching (#3000)
a2e4c79 Partially flatten template tests cases (#2597)
e242abe Pass space separated gfx target list to RCCL build command
(#3701)
4f78aea SWDEV-570074 - Refactor Memset memory object handling.
(#2228)
b3ad12d Support Nvidia build on theRock for HIP-tests (#3335)
a1cf15e Support fp8 types in hiprtc (#2605)
8ef84b0 [rocprofiler-systems] Add HPC examples to automated testing
(#3437)
db3a70d Free memory which was allocated in tests (#3710)
27e6809 [rocprofiler-systems]: Fix rhel CI failure on for MPI and UCX
tests (#3700)
0d9aaf5 rccl/topo_expl: fix build issue. (#3719)
be04d75 Fix zcat path used for checking kernel configs (#3423)
cab60a7 rocr/thunk/win: Add CU mask support (#3518)
5b3d826 [CUMEM] Initial support for cuMem APIs (#2763)
0606ff4 [HIP] [PLAT-194496] Improve Stress_hipMalloc_HighSizeAlloc
reliability (#3550)
05750a7 fix hip-test name in config (#3716)
33f777f hsakmt: Remove --high functionality from run_kfdtest.sh
(#2486)
e4c46e3 Hide the retain under direct dispatch check (#3698)
bfe0ca0 Add rocprof trace decoder to CI tests (#3690)
a769b6f [rocSHMEM] Edgar/abstract allocator ipc part1 (#3411)
659fb52 [AMD-SMI] Fix bugs, improve error handling, and clean up
NIC/switch code (#3654)
0eb26ea hsakmt: Fix Import/Export of dmabuf_fd for WSL/Windows
(#3348)
a122936 [SWDEV-567812] Add UBB power and power_limit fields to
npm_info (#3262)
c3bec09 [rocprofiler-sdk][rocprofv3][rocpd] Updates for KFD data
(#340)
7c44d47 SWDEV-547659 - Remove HIP_VERSION_GITHASH in logs (#448)
74b6487 SWDEV-547008 - Documentation fix for function return values
(#463)
af21cd4 SWDEV-545553 - Improve clarity and robustness of CALLBACK
unit tests (#546)
180d639 SWDEV-544900 - Change hip-test test case name (#547)
feeca99 Doc improvements (#3688)
c1822b6 ROCprofiler-SDK: deprecation of legacy tools (#3609)
5d7aff8 Fix rocprof-compute-viewer link (#3459)
0b0b484 AIRUNTIME-129 - Fix Ocl test failures of 2D image with
pitches. (#3584)
ac569b8 Fix memory tests config (#3687)
603fe7a [hip-tests] Enable hipMipmappedArrayGetMemoryRequirements
test via cmake
4fad445 [hip] Docs: Updates to some memory management pages
8cc5955 AICOMRCCL-656 fix memory leak in ncclCommInitRankFunc (#3628)
94a4595 Fix missing amd_comgr linkage in pc-sampling integration test
(#3453)
2a68565 rocrtst: CMAke file: strip xnack/feature suffixes from gfxNum
in build_kernel (#3652)
c3542bf [rocprofv3] Deprecating input text files for counter
collection (#1562)
ff122e7 SWDEV-573073 - Cleanup hipHostAlloc/Malloc/Register tests
(#3017)
5b1deaf SWDEV-567112 - Introduce new mechanism for tagging and
disabling tests - Part 1 - Core (#2351)
6e0cc30 rocrtst: MaxSingleAllocationTest: skip CPU NUMA nodes >0
(#3208)
d65f601 [AICOMRCCL-667] rccl: Change GDR selection logic. (#3607)
f1c44ab Patch Back to Old Repo: fixes from manual runs (#3621)
fe53bcd [AMD-SMI] Allow amdsmi init to succeed when no NIC hardware
is present (#3403)
b25600e [ROCM SMI] Fix fw pldm version not displayed in default
amd-smi (#3594)
169d2ef root to module wiring, remove legacy source collection
(#3482)
7469781 [LRT][clr] SWDEV-512963-Fix CTS test failures for 1D buffer
copy (#3520)
c8f55d9 Adding rocprof trace decoder (#3576)
425e983 Trace decoder codeowners (#3600)
a176efd [hip-tests] Add return statements to HIP_SKIP_TEST (#3647)
32687cf rocrtst: CPUAccessToGPUMemoryTest: Cap host allocation to 512
MB under ASAN (#3407)
97c0206 Update codeowners for thunk DXG (#3334)
be44b28 [rocdecode][rocjpeg] - ctest CMakeLists cleanup (#3632)
80ff0b8 Various memory leak fixes in hip-tests (#3605)
0988f67 fix typo in help text (#3314)
9f823c5 Fix CUID file lookup by loading files before searching
entries (#3436)
064c892 SWDEV-546177 - hipModuleGetLoadingMode API impl (#653)
006213e ROCM-2696: Ignare size and base if null ptr (#3336)
6060b99 Improve atomic min max test perf (#2580)
3fbcc13 Change printf capture impl (#1127)
93bc019 (tag: hip-version_7.12.60610,
origin/users/mradosav-amd/rocprofsys-selective-region) [ROCM-CORE]
Update rdhc script to support rocm install prefix
(ROCm/rocm-systems#3596)

[AICOMRCCL-355]:
https://amd-hub.atlassian.net/browse/AICOMRCCL-355?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
Labels

ci:skip Skip all CI builds/tests for this PR

Projects

Status: Done

5 participants