
[release/7.11] Cherry-pick: Handle missing Artifactory during packaging #3100

Merged
araravik-psd merged 7 commits into
release/therock-7.11 from
users/raramakr/cherry-pick
Jan 29, 2026

Conversation

@raramakr
Contributor

@raramakr raramakr commented Jan 26, 2026

Packaging builds were failing when Artifactory directories for certain gfx
architectures were missing. These cases are now handled gracefully, allowing the
build to continue and providing clearer diagnostics. Tracked in https://amd-hub.atlassian.net/browse/ROCM-1283

A real‑time package summary is now generated based on build status, improving
visibility into which components succeeded, failed, or were skipped.

Debian release metadata has been updated to include required hash fields
(md5, sha256, etc.), ensuring compliance with Debian tooling expectations.
Additionally, the large regenerate_repo_metadata_from_s3 function has been
split into smaller deb/rpm‑specific functions for better maintainability and
clearer separation of responsibilities.

PRs cherry-picked as part of this PR:
#3017 : Fix for packaging build failures due to missing artifactory
#3059 : Prevent the uploading of .txt files to package repository
#3006 : Fix for missing hash values like md5, sha256 in the Release file.

raramakr and others added 3 commits January 26, 2026 16:12
- Skip packages whose Artifactory directories are missing, except the
metapackage
- Generate a build status report (built vs. skipped) in a text file
- Print a summary of built and skipped packages to the terminal
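
The skip-and-report behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `partition_packages`, `write_status_report`, the `metapackage` default, the one-directory-per-package layout, and the report wording are all hypothetical choices made for the example.

```python
from pathlib import Path


def partition_packages(pkg_names, artifact_dir, metapackage="rocm"):
    """Split package names into (to_build, skipped).

    Assumption: each package maps to a directory of the same name under
    artifact_dir, and the metapackage is never skipped.
    """
    to_build, skipped = [], []
    for name in pkg_names:
        if name == metapackage or (Path(artifact_dir) / name).is_dir():
            to_build.append(name)
        else:
            skipped.append(name)
    return to_build, skipped


def write_status_report(built, skipped, report_path):
    """Write a plain-text built/skipped report and return its text."""
    lines = ["Built packages:"]
    lines += [f"  {name}" for name in built]
    lines.append("Skipped packages (artifact directory missing):")
    lines += [f"  {name}" for name in skipped]
    text = "\n".join(lines) + "\n"
    Path(report_path).write_text(text)
    return text
```

Printing the returned text gives the terminal summary, so the file and the console output cannot drift apart.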

Test Results:
https://github.com/user-attachments/files/24756331/TestResults.txt

---------

Co-authored-by: raramakr <raramakr@amd.com>
Improve packaging summary logic and output handling
- generate packaging summary based on actual build successes and
failures
- write only built_packages.txt and include skipped packages in the
output
- ensure packaging summary is printed automatically when a build fails
- remove dual rpm/deb build logic; only one package type is produced per
build
- prevent built_packages.txt from being uploaded to the repository.
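
One way to keep `built_packages.txt` out of the repository is to select upload candidates by file suffix. The sketch below assumes a flat output directory and a hypothetical helper name, `files_to_upload`; the PR may filter in a different place or by a different rule.

```python
from pathlib import Path

# Assumption: only these suffixes count as repository content.
PACKAGE_SUFFIXES = {".deb", ".rpm"}


def files_to_upload(output_dir):
    """Return package files in output_dir, leaving out reports such as built_packages.txt."""
    return sorted(
        p for p in Path(output_dir).iterdir()
        if p.is_file() and p.suffix in PACKAGE_SUFFIXES
    )
```

An allow-list of suffixes also keeps out any other stray files a build may leave behind, which a deny-list for `.txt` alone would not.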

https://github.com/user-attachments/files/24810406/TestOutput.txt

---------

Co-authored-by: raramakr <raramakr@amd.com>
The Debian Release file did not have hash values such as md5 and sha256.
This change adds them to the Release file. The large
regenerate_repo_metadata_from_s3 function has also been split into smaller
deb- and rpm-specific functions for easier maintenance.

## Motivation

Current Debian standards expect hash values such as md5 and sha256 in the
Release file, and the existing Release file did not have them. This change
adds them. The large regenerate_repo_metadata_from_s3 function has also been
split into smaller deb- and rpm-specific functions for easier maintenance.
Internal tracking ticket: ROCM-1331
## Test Plan

The Debian package build should not fail, and setting up the repo should not
show any warnings.

## Test Result
The test run succeeded:
https://github.com/ROCm/TheRock/actions/runs/21155463197/job/60839415580
Packages are available at
https://rocm.devreleases.amd.com/deb/20260120-21085375476/dists/stable/index.html
Setting up this repo does not show any warnings.
```
 echo "deb [trusted=yes] https://rocm.devreleases.amd.com/deb/20260120-21085375476 stable main" > /etc/apt/sources.list.d/rocm.list
apt update
Hit:1 http://archive.ubuntu.com/ubuntu noble InRelease
Ign:2 https://rocm.devreleases.amd.com/deb/20260120-21085375476 stable InRelease
Hit:3 http://security.ubuntu.com/ubuntu noble-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu noble-updates InRelease
Hit:5 http://archive.ubuntu.com/ubuntu noble-backports InRelease
Get:6 https://rocm.devreleases.amd.com/deb/20260120-21085375476 stable Release [759 B]
Ign:7 https://rocm.devreleases.amd.com/deb/20260120-21085375476 stable Release.gpg
Get:8 https://rocm.devreleases.amd.com/deb/20260120-21085375476 stable/main amd64 Packages [9958 B]
Fetched 10.7 kB in 1s (18.4 kB/s)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
20 packages can be upgraded. Run 'apt list --upgradable' to see them.
```
The Release file now contains hash values:

https://rocm.devreleases.amd.com/deb/20260120-21085375476/dists/stable/Release
```
Origin: AMD ROCm
Label: ROCm dev Packages
Suite: stable
Codename: stable
Architectures: amd64
Components: main
Description: ROCm APT Repository
Date: Tue, 20 Jan 2026 00:54:45 UTC
MD5Sum:
 38847cd83192631a2ab1657567fe2f6f            49251 main/binary-amd64/Packages
 a5448e780baf917188ac93c19dcef69f             9958 main/binary-amd64/Packages.gz
SHA1:
 6d3e01aa59e971cd1c042613fb80b04c387d918c            49251 main/binary-amd64/Packages
 abff58d8f3e8d6366a3eac30b66d8c31d9a09cb6             9958 main/binary-amd64/Packages.gz
SHA256:
 f8f4963e74fefc4deb4c0326281e860f453cf202a412ebc9a5071443fdc62f83            49251 main/binary-amd64/Packages
 a63db6d17ab66c627330666e5e223ce149e2745bbad8cfb473cdde62017d689d             9958 main/binary-amd64/Packages.gz


```
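
For reference, the three hash sections shown above can be produced with Python's standard `hashlib`. This is a sketch under assumptions, not the function from the PR: `release_hash_sections` and its arguments are hypothetical, and the column padding only approximates the output above.

```python
import hashlib
from pathlib import Path


def release_hash_sections(dists_dir, rel_paths):
    """Return the MD5Sum/SHA1/SHA256 lines of a Release file.

    rel_paths are index files relative to dists_dir, for example
    "main/binary-amd64/Packages".
    """
    sections = [("MD5Sum", "md5"), ("SHA1", "sha1"), ("SHA256", "sha256")]
    lines = []
    for header, algo in sections:
        lines.append(f"{header}:")
        for rel in rel_paths:
            data = (Path(dists_dir) / rel).read_bytes()
            digest = hashlib.new(algo, data).hexdigest()
            # Each entry: leading space, digest, size in bytes, relative path.
            lines.append(f" {digest} {len(data):>16} {rel}")
    return lines
```

Each index file is listed once per algorithm with its digest and byte size, which is what lets a client verify a downloaded `Packages` or `Packages.gz` against the Release file.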
Contributor

@nunnikri nunnikri left a comment

Looks good to me.
Started a job to see if it works fine. https://github.com/ROCm/TheRock/actions/runs/21378125093

@marbre
Member

marbre commented Jan 28, 2026

@raramakr this needs to go to main first. Once it has landed, we can cherry-pick it and target the release branch. Furthermore, make sure to assign @HereThereBeDragons and @araravik-psd as reviewers if you have something that must go into the release branch; note that this is limited to isolated, infra-related cherry-picks.

@raramakr
Contributor Author

> @raramakr this needs to go to main first. Once it has landed, we can cherry-pick it and target the release branch. Furthermore, make sure to assign @HereThereBeDragons and @araravik-psd as reviewers if you have something that must go into the release branch; note that this is limited to isolated, infra-related cherry-picks.

The commits in the PR are already in the main branch.
Will add the mentioned reviewers

@ScottTodd ScottTodd removed their request for review January 28, 2026 21:59
@ScottTodd
Member

Cherry-picks should be clearly labeled as such, linking to the commits they are cherry-picking with some rationale for why the commits are also needed on the release branch. This PR was a poor use of reviewer time and attention.

@raramakr
Contributor Author

> Cherry-picks should be clearly labeled as such, linking to the commits they are cherry-picking with some rationale for why the commits are also needed on the release branch. This PR was a poor use of reviewer time and attention.

Noted. Will take care of this in future PRs.
Update the PR description with the details as well

@HereThereBeDragons HereThereBeDragons changed the title from "Handle missing Artifactory during packaging" to "[release/7.11] Cherry-pick: Handle missing Artifactory during packaging" Jan 29, 2026
Contributor

@HereThereBeDragons HereThereBeDragons left a comment

as these are cherry picked prs i guess it is ok to merge as-is
there are some questions that need to be answered.
and quite a few quality improvements that should go into main.

especially:
run_command() should behave like subprocess.run() and as such should have a list of strings with each command and param as an entry of the list, and not a single string containing the entire command as whole.

please either create a pr with those fixes, or create an issue for it to be tracked and not get lost.
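
The requested calling convention could look like the sketch below. It is not the repository's `run_command`; the signature, the `cwd` parameter, and the return value are assumptions made for illustration. Only `subprocess.run` and its `check`, `capture_output`, and `text` arguments are standard-library facts.

```python
import subprocess


def run_command(cmd, cwd=None):
    """Run cmd, given as a list of strings, without going through a shell."""
    if isinstance(cmd, str):
        raise TypeError("pass the command as a list of strings, not a single string")
    # check=True raises CalledProcessError on a non-zero exit status.
    result = subprocess.run(cmd, cwd=cwd, check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical call site (repo_dir is a placeholder path):
# run_command(["createrepo_c", "--no-database", str(repo_dir)])
```

Passing a list avoids shell quoting problems when a path contains spaces, because every argument reaches the program exactly as written.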

if not config.enable_rpath:
    create_nonversioned_rpm_package(pkg_name, config)

create_versioned_rpm_package(pkg_name, config)
Contributor

is an else missing here?

Contributor

@HereThereBeDragons for rpath packages, the requirement is that only versioned packages are needed. A follow-up patch will add a comment to make this clear.

Contributor

ok. my question here was whether create_versioned_rpm_package would overwrite the output of create_nonversioned_rpm_package, because there is no else branch.

Contributor

@nunnikri nunnikri Jan 29, 2026

No, it won't. It's a totally different set of packages; the nonversioned packages depend on the versioned ones. By default we need to create both versioned and nonversioned packages, but if rpath is enabled we need to create only the versioned ones. The code is a bit confusing, so we will make it clearer with comments.

Contributor

@nunnikri so if we build both by default, it would be nice to add that as a comment here, since we are not passing enable_rpath by default now.
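
The control flow discussed in this thread, with the clarifying comments that were asked for, might read as in the sketch below. The wrapper `create_rpm_packages` and its callable parameters are hypothetical stand-ins so the example runs on its own; the real code calls `create_nonversioned_rpm_package` and `create_versioned_rpm_package` with a `config` object.

```python
def create_rpm_packages(pkg_name, enable_rpath, make_versioned, make_nonversioned):
    """Return the list of packages created for pkg_name.

    make_versioned and make_nonversioned are stand-ins for the real builders.
    """
    created = []
    # Nonversioned packages depend on the versioned ones and are only
    # wanted for regular (non-rpath) builds.
    if not enable_rpath:
        created.append(make_nonversioned(pkg_name))
    # Versioned packages are needed in every case, so there is
    # deliberately no else branch here.
    created.append(make_versioned(pkg_name))
    return created
```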

print(f"Create {pkg_type} package.")
if pkg_type == "rpm":
    output_list = create_rpm_package(pkg_name, config)
else:
Contributor

good practice would be

if pkg_type == "rpm":
    ...
elif pkg_type == "deb":
    ...
else:
    <output error>

this is a fail-safe to catch problems if new package types are added

Contributor

The pkg_type check is already done in line #782. That's why the fail-safe else is not added here. But we can see some more options to optimize the code and will do that as a follow-up PR.
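
The fail-safe pattern suggested above can be written as a small runnable sketch. `build_for_type` and its callable arguments are hypothetical; the point is only that an unknown package type raises instead of silently taking the deb path.

```python
def build_for_type(pkg_type, build_rpm, build_deb):
    """Dispatch on pkg_type and fail loudly on anything unexpected."""
    if pkg_type == "rpm":
        return build_rpm()
    elif pkg_type == "deb":
        return build_deb()
    else:
        raise ValueError(f"Unsupported package type: {pkg_type!r}")
```

Even when the type is validated earlier, the explicit `else` costs little and keeps the function correct if it is later called from a path that skips that validation.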

"""

pkg_list = []
skipped = []
Contributor

shouldn't this be skipped_pkg_list or skip_list or skipped_list for consistency, as already used in other functions in this file?

Contributor

ACK. will do as a follow up PR

for entry in dir_entries:
    path = Path(artifact_dir) / entry

    if entry.startswith(artifact_name) and path.is_dir():
Contributor

is startswith the correct match? what would be the full dir_name? and can you predict the name and with this reduce the loop duration?

Contributor

ACK. will look at optimizing the loop
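
To make the reviewer's question concrete, the sketch below contrasts a prefix scan with a direct check for a known directory name. Both helper names and the directory names in the usage note are hypothetical, since the real naming scheme is not stated in this thread.

```python
from pathlib import Path


def find_artifact_dirs(artifact_dir, artifact_name):
    """Prefix match: every directory whose name starts with artifact_name."""
    return sorted(
        p for p in Path(artifact_dir).iterdir()
        if p.is_dir() and p.name.startswith(artifact_name)
    )


def artifact_dir_exists(artifact_dir, full_dir_name):
    """Exact match: a single filesystem check when the full name is known."""
    return (Path(artifact_dir) / full_dir_name).is_dir()
```

With directories named `blas_gfx942` and `blas_extra_gfx942`, the prefix `blas` matches both, which is the kind of over-matching the question hints at. The exact check also avoids listing the whole directory.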

shutil.copy2(rpm_file, new_arch_dir / Path(rpm_file).name)

# Generate repodata for new packages with clean paths (no baseurl)
run_command(
Contributor

split into ["createrepo_c", "--no-database", ...]

Contributor

ACK. will do as the follow up PR

)

# MD5Sum section
if md5_entries:
Contributor

i don't understand those ifs here. you either have all the hashes or none. the logic above does not allow having md5 hashes but not sha1 hashes.

Contributor

As per the Debian standard, at least one hash value is needed. It's not mandatory to have all 3.
md5sum and sha1 are added for backward compatibility with older systems. Newer systems will look for sha256.

Contributor

i understand, but we are calculating always all 3 hashes. as such we will also always add either none or all 3.
i dont see any selection logic where you can say "only sha256". i am just saying: those 3 ifs can be put together in a single one

Contributor

I understand your concern. Since the Debian standard allows continuing even if generating one hash value fails, I coded them as separate ifs. For example, even if hashlib.sha1() fails, processing can continue with md5 and sha256. But I will discuss with other experts in this area and make a decision here.

Comment thread build_tools/packaging/linux/upload_package_repo.py Outdated
Comment thread build_tools/packaging/linux/upload_package_repo.py Outdated
current_entry = []
current_filename = None

for line in f:
Contributor

this needs more explanation

Contributor

ACK. Will add more comments to make the code more readable.
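
One plausible reading of the loop under review is that it walks a Debian `Packages` index stanza by stanza. The sketch below shows that technique with explanatory comments. It is an assumption about intent, not the PR's code: `parse_packages_index` is a hypothetical name, and only the variable names `current_entry` and `current_filename` come from the snippet above.

```python
def parse_packages_index(text):
    """Map each stanza's Filename field to the full stanza text.

    A Packages index is a sequence of "Key: value" stanzas separated by
    blank lines, one stanza per package.
    """
    entries = {}
    current_entry = []       # lines of the stanza being read
    current_filename = None  # value of its Filename field, once seen

    # A trailing empty string guarantees the last stanza is flushed too.
    for line in text.splitlines() + [""]:
        if line.strip():
            current_entry.append(line)
            if line.startswith("Filename:"):
                current_filename = line.split(":", 1)[1].strip()
            continue
        # Blank line: the stanza is complete, so store it and reset.
        if current_entry and current_filename:
            entries[current_filename] = "\n".join(current_entry)
        current_entry = []
        current_filename = None
    return entries
```

Keying on `Filename` would let newly uploaded packages replace existing stanzas for the same file when the index is regenerated.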

job_type: Job type for Release file metadata (default: 'nightly')
"""
if pkg_type == "rpm":
    regenerate_rpm_metadata_from_s3(s3, bucket, prefix, uploaded_packages)
Contributor

no job_type for rpm?

Contributor

job_type is used to distinguish dev/nightly or prerelease builds

Contributor

rpms seems not to take job_type into account, while the function for deb does

Contributor

ah.. now I understand your comment. Debian repo creation needs to create a Release file, which is a text file with metadata about the repo. We differentiate the repo metadata between dev/nightly/prerelease, hence that parameter is passed. In the rpm case no such provision is available, hence it's not used.

nunnikri and others added 2 commits January 29, 2026 07:39
Co-authored-by: Laura Promberger <laura.promberger@amd.com>
Co-authored-by: Laura Promberger <laura.promberger@amd.com>
@nunnikri
Contributor

> as these are cherry picked prs i guess it is ok to merge as-is. there are some questions that need to be answered, and quite a few quality improvements that should go into main.
>
> especially: run_command() should behave like subprocess.run() and as such should have a list of strings with each command and param as an entry of the list, and not a single string containing the entire command as whole.
>
> please either create a pr with those fixes, or create an issue for it to be tracked and not get lost.

Sure. I will raise a new PR against main that addresses all the comments, and cherry-pick it to the release branch later if needed.

Comment thread build_tools/packaging/linux/build_package.py
Co-authored-by: nunnikri <71024015+nunnikri@users.noreply.github.com>
Contributor

@HereThereBeDragons HereThereBeDragons left a comment

Thanks @nunnikri for all the comments and changes.

One last request is fixing the pre-commit complaint:
https://github.com/ROCm/TheRock/actions/runs/21487752843/job/61901605662?pr=3100

diff --git a/build_tools/packaging/linux/upload_package_repo.py b/build_tools/packaging/linux/upload_package_repo.py
index d0ec10a..08c36ac 100644
--- a/build_tools/packaging/linux/upload_package_repo.py
+++ b/build_tools/packaging/linux/upload_package_repo.py
@@ -369,7 +369,9 @@ def regenerate_deb_metadata_from_s3(
             with open(existing_packages, "r") as f:
                 content = f.read()
                 pkg_count = content.count("\nPackage: ")
-            print(f"✅ Downloaded existing Packages file containing ({pkg_count} packages)")
+            print(
+                f"✅ Downloaded existing Packages file containing ({pkg_count} packages)"
+            )
         except Exception as e:
             print(f"⚠️  No existing Packages file found (new repo?): {e}")
             existing_packages = None

if you want to automate it you can install it in your local python env, see
https://github.com/ROCm/TheRock/tree/main/docs/development/style_guides

@raramakr
Contributor Author


Fixed the pre-commit errors

@araravik-psd araravik-psd merged commit 5fe8711 into release/therock-7.11 Jan 29, 2026
11 of 12 checks passed
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Jan 29, 2026
@araravik-psd araravik-psd deleted the users/raramakr/cherry-pick branch January 29, 2026 18:09
jayhawk-commits pushed a commit that referenced this pull request Mar 11, 2026
## Motivation

Bump rocm-systems from 93bc019 to 093b66caa3 (includes fix for hip-tests
issue and revert for mathlib hiprtc issues and revert for rccl-test,
added revert for miopen failures due to PR 653):

Commits:
093b66caa3 (HEAD, origin/develop, origin/HEAD) Revert "SWDEV-546177 -
hipModuleGetLoadingMode API impl (#653)" (#3858)
d8a0adbc9f [AMD-SMI] Hide libamd_smi.so internal symbols (#3777)
d4da458f94 [rocprofiler-sdk] [Documentation ] Updating changelog (#3827)
19fadeb082 (origin/users/abchoudh/fix_dispatch_count) [RCCL][Tuner
Plugin] Enable tuning of RCCL tuning constants (#3757)
b4f5f8a6a8 rocr: Fix IPC dmabuf hang with large allocations (#3211)
64efea0435 RCCL: allow users to override max and per job memory & fix
defaults. (#3797)
9b3dd101bb Removing ready_for_review (#3849)
7e43880a64 [rocprofiler-systems] Update ROCm version to 7.2.0 in CI
workflows for Debian, RedHat, and Ubuntu (#3431)
1fdb6b9827 [rocshmem] add gda/topology unit tests (#3715)
be1ea24a96 Move hipMipmappedArrayGetMemoryRequirements test to common
tests
e4513f04c8 Update amdgpu-windows-interop with latest changes, pal
58aa0bab2ced0cc9ebe8d2d0932db6774feb4e49 2026-03-04(#3773)
b1f964d796 [rocprofiler-compute] Ensure long kernel name fully shows in
compute analyze (#3665)
4dcf1e3ce0 SWDEV-567112 - Replace test names (#3787)
33f5f302e5 ROCM-2428 - fixes hipStreamBatchMemOp invalid operation
checks (#3099)
139f4bfff8 [SWDEV-556456] Align HIP_UUID with rocminfo (#3614)
8e8928544c Reduce buffers alignment to 4 bytes (#3821)
51be29a647 AIRUNTIME-125: Consolidate Windows optimization and debug
flags (#3825)
1407392240 [AMD-SMI] CI: Fix root workflow to use ASIC-specific test
filters (#3807)
63f78a98d7 (origin/users/mcao/fix_rocpdsummary) [ROCM-SMI] Fix DRM
include dirs leaking absolute build paths to consumers (#3808)
caf2f7e1eb [ROCM-186] amd-smi: Add support for a VRAM and GTT tuning
interface (#3636)
a0712d4c2a [TheRock CI] Update projects_to_test lists (#3749)
02090c42c9 rocrtst: install gfx .hsaco files to share/rocrtst (#3744)
4a0a1cbfce Merge other simd table (#3696)
0d07657d78 Add missing kwargs from
rocprofiler_add_integration_validate_test in .cmake-format.yaml (#2336)
3a3df301dc Optimize device counting service GPU interactions (#1583)
95d9da0098 Add SPM Enable flag in build infrastructure (#3677)
12bb9435b2 [rocprofiler-sdk] On-demand GPU profile queue
creation/destruction (#3586)
941057c2c0  Navi4 tuning table iter 1 (#3052)
dbf2b7369f [AMD-SMI] Display N/A for cu_occupancy when file is
unavailable (#3589)
b0efc7c639 [RCCL] [UT] Add ROCTX test (#3625)
ba7a20ea18 Reducing the p2pnChannels for half-subscription A2A on
multi-node MI350 (#3381)
75238c98a2 [clr] Fix memory leak in getOrCreateHostcallBuffer (#3699)
af2ee0e8ad [hip-tests] ASAN Check for image support before we create
context (#3834)
ad4496678e Update windows ci subtree in include amdgpu-windows-interop
(#3814)
c8ad252208 [rocprofiler-register] Fix compilation with system fmt/glog
(#1243)
781881544d Update README to include dbgapi and debug agent components
(#3731)
88e4a7837e ROCProfiler and ROCTracer: Modifying deprecation note (#3831)
b5918a5f35 [ROCM-3124-3125-3126] CUID file generation hangs on MI350
systems/CUID test failures/Segmentation fault in CUID example code
(#3548)
97a5dd993c Update copyright to use SPDX IDs (#3805)
511730ab45 [rocshmem]: add flood-amo tester (#3653)
2d650a0065 [clr] Fix heap use after free error in device allocations
(#3789)
b6b179ad81 Disable hipHostRegister_Negative test for ASAN (#3832)
39ec318c8d [RCCL] Add GDA alltoallv via rocshmem integration (#3613)
fb0f4d53b1 [RCCL] [CUMEM] Fix cuMem multi-process runs (#3811)
c3de7d4bf6 SWDEV-526201 - Fix and enable disabled HIP tests from warp
group (#3089)
8d9a8ca161 roofline: code cleanup and refactor vector types (#3813)
8957e49028 Don't wait on command completion if worker thread is
destroyed (#3790)
9e7586a5fa [rocshmem] Add barrier APIs and expose `ROCSHMEM_TEAM_WORLD`
on device (#3651)
91b09235b0 Revert "fix local gpu release static build failure (#3667)"
(#3799)
0fda754b1b libhsakmt: Add secondary KFD context creation support
ee43db95b0 Revert "Update TheRock reference to 20260303 commit (#3709)"
(#3826)
86e28b9fae Added fix to update GL2C counters instance count for GFX11.5
(#3100)
93f69f7de4 Adjust includes to match use (#3742)
e9fbc3f1a2 (develop) Update TheRock reference to 20260303 commit (#3709)
be0675a1a6 (HEAD) Revert "Support fp8 types in hiprtc (#2605)" (#3792)
3e3a94a4ef [rocprofiler-systems] Add trace_cache support for
std::optional<T> serialization (#3490)
0b42a7f472 clr: Eliminate unnecessary kernel name string copies (#3774)
b6b0d77b29 rocr: Add hsa_amd_memory_async_batch_copy API for batched
memory copies (#3259)
486e6d12d2 Resolve staircase RS regression with 48 max channels (#3684)
eb59c85ac4 [gfx942][gfx950] Leverage new cache bypass builtins for
simple protocol where available (#2847)
4d74d27f0e (origin/users/raramakr/rocm-smi-target) Revert "Auto Labeler:
Add ci:regression-detection label to rccl PRs (#3543)" (#3769)
8f0795517c [AMD-SMI] CI: Use ASIC-specific test blacklists in workflows
(#3775)
7cef5b64c1 Fix MFMA total FLOPS calculation (#3371)
aea37512ba Remove duplicated tests (#3235)
b6c656fdd4 Remove duplicated tests in memory module (#3087)
ca3137d8f9 [rocprofiler-sdk] Install integration tests without building
for therock & Misc. fixes (#3047)
0ab5c41f65 [rdc] Enable on-demand queue mode in rocprofiler-sdk to
prevent inference degradation (#3629)
a1eb2a1f7c rocr/wsl: a library should not output to std::out by default
(#3718)
b7da296cc8 Reenable flood_put/get testers on mlx5 since they should work
after pr2732 (#3748)
000e24de2f [rocprofiler-sdk] Add automatic late-start support to
rocprofiler_force_configure (#2168)
64ea87f592 [hip-tests] Fix memory leaks in hipMemPoolTrimTo tests
(#3643)
543a7d765f rocr: Include code object allocs in lightweight coredump
a58da378d4 [rocdecode] - update rocdecode ctest (#3768)
f88e4ee44d [rocprofiler-systems] Make CDash submit non-fatal and add
GitHub Actions logging (#3525)
cb14debc3a [rocprofiler-systems] Update nlohmann-json submodule (#3391)
449253009a SWDEV-567112 - Introduce new mechanism for tagging and
disabling tests - Part 2 (#3707)
8ca991393d disabling rccl from full build (linux), covered in RCCL CI
(#3770)
c4fdb20b74 [ATT] Re-enable tests. Add option to specify perf to target
CU only (#2819)
615aab95ed ROCM-3816 Out of Memory fix (#3588)
8ffad41b24 Fix rocm_smi64 exporting invalid absolute paths to consumers
(#3717)
042d76a626 rocr: Remove dependency on KFD in Runtime::VMemoryHandleMap
(#2515)
555db59b2a [AMD-SMI] CPU: Added support for family 1A Models 50h-57h
(#3206)
3affa2c7a3 [SWDEV-555935] Fix shared mutex and self-heal (#3729)
ba0bf0f3db Replace hipMemGetInfo with ihipMemGetInfo and use it for
internal calls. (#2845)
c5cef9b18e Fix HIP_RETURN on all HIP API calls. (#2838)
241ce7ba83 Revert "memory: fix "contiguous_bytes" calculation in generic
conversion (#3285)" (#3755)
8a690f482e [kpack/clr] Windows PE/COFF support for kpack artifact
splitting and runtime loading (#3728)
863bdf8aa8 MFMA pre-processor guards for ipc.hip (#3724)
90bb9b1921 Release queue outside of vgpusAccess lock (#3705)
de4523910c clr: Add build support of ROCR and PAL backends together
(#3722)
dfb7abc2d8 [rocprofiler-sdk] RCCL API changes for
RCCL_API_TRACE_VERSION_PATCH = 3 (#3477)
d69d4f23db [AICOMRCCL-633] - Fixed warnings in tests (#3402)
067d86dcaa rocr/wsl: Disable AQL Queue usage with flag ROCR_USE_PM4
(#3663)
594eb60d42 [TheRock CI] rocm-systems build full ROCm stack (#3182)
27d17e8ea0 [ROCProfiler-SDK] Fix SWDEV-556922: Handle comments before
checking for pmc: (#1723)
c80d90439d memory: fix "contiguous_bytes" calculation in generic
conversion (#3285)
669987c83f [hip-tests] ASAN - add missing release handles (#3735)
a24bbd75a4 fix local gpu release static build failure (#3667)
259b2ff913 Speed up DeviceId (#2803)
65d9264bf4 Simplify MPI trace merge logic and remove legacy guards
(#3562)
1076c083cb use system to look for zcat path instead (#3720)
22f1d19db3 [AICOMRCCL-355] Enable threshold-based p2p-batching (#3000)
a2e4c794d2 Partially flatten template tests cases (#2597)
e242abe219 Pass space separated gfx target list to RCCL build command
(#3701)
4f78aea66d SWDEV-570074 - Refactor Memset memory object handling.
(#2228)
b3ad12d834 Support Nvidia build on theRock for HIP-tests (#3335)
a1cf15ea9a Support fp8 types in hiprtc (#2605)
8ef84b0a50 [rocprofiler-systems] Add HPC examples to automated testing
(#3437)
db3a70dfa0 Free memory which was allocated in tests (#3710)
27e6809c7e [rocprofiler-systems]: Fix rhel CI failure on for MPI and UCX
tests (#3700)
0d9aaf59d8 rccl/topo_expl: fix build issue. (#3719)
be04d75765 Fix zcat path used for checking kernel configs (#3423)
cab60a7b27 rocr/thunk/win: Add CU mask support (#3518)
5b3d826c05 [CUMEM] Initial support for cuMem APIs (#2763)
0606ff491f [HIP] [PLAT-194496] Improve Stress_hipMalloc_HighSizeAlloc
reliability (#3550)
05750a77cc fix hip-test name in config (#3716)
33f777f3e9 hsakmt: Remove --high functionality from run_kfdtest.sh
(#2486)
e4c46e3480 Hide the retain under direct dispatch check (#3698)
bfe0ca0279 Add rocprof trace decoder to CI tests (#3690)
a769b6f54e [rocSHMEM] Edgar/abstract allocator ipc part1 (#3411)
659fb52243 [AMD-SMI] Fix bugs, improve error handling, and clean up
NIC/switch code (#3654)
0eb26ea571 hsakmt: Fix Import/Export of dmabuf_fd for WSL/Windows
(#3348)
a122936abb [SWDEV-567812] Add UBB power and power_limit fields to
npm_info (#3262)
c3bec090c5 [rocprofiler-sdk][rocprofv3][rocpd] Updates for KFD data
(#340)
7c44d47740 SWDEV-547659 - Remove HIP_VERSION_GITHASH in logs (#448)
74b6487a6a SWDEV-547008 - Documentation fix for function return values
(#463)
af21cd44f1 SWDEV-545553 - Improve clarity and robustness of CALLBACK
unit tests (#546)
180d639044 SWDEV-544900 - Change hip-test test case name (#547)
feeca99950 Doc improvements (#3688)
c1822b6336 ROCprofiler-SDK: deprecation of legacy tools (#3609)
5d7aff8462 Fix rocprof-compute-viewer link (#3459)
0b0b4846f0 AIRUNTIME-129 - Fix Ocl test failures of 2D image with
pitches. (#3584)
ac569b87e0 Fix memory tests config (#3687)
603fe7a5cf [hip-tests] Enable hipMipmappedArrayGetMemoryRequirements
test via cmake
4fad4452d9 [hip] Docs: Updates to some memory management pages
8cc59559fe AICOMRCCL-656 fix memory leak in ncclCommInitRankFunc (#3628)
94a4595a5d Fix missing amd_comgr linkage in pc-sampling integration test
(#3453)
2a68565dce rocrtst: CMAke file: strip xnack/feature suffixes from gfxNum
in build_kernel (#3652)
c3542bfb2b [rocprofv3] Deprecating input text files for counter
collection (#1562)
ff122e7ed7 SWDEV-573073 - Cleanup hipHostAlloc/Malloc/Register tests
(#3017)
5b1deaf29d SWDEV-567112 - Introduce new mechanism for tagging and
disabling tests - Part 1 - Core (#2351)
6e0cc309e1 rocrtst: MaxSingleAllocationTest: skip CPU NUMA nodes >0
(#3208)
d65f601195 [AICOMRCCL-667] rccl: Change GDR selection logic. (#3607)
f1c44ab200 Patch Back to Old Repo: fixes from manual runs (#3621)
fe53bcd715 [AMD-SMI] Allow amdsmi init to succeed when no NIC hardware
is present (#3403)
b25600efdb [ROCM SMI] Fix fw pldm version not displayed in default
amd-smi (#3594)
169d2ef763 root to module wiring, remove legacy source collection
(#3482)
7469781988 [LRT][clr] SWDEV-512963-Fix CTS test failures for 1D buffer
copy (#3520)
c8f55d9b86 Adding rocprof trace decoder (#3576)
425e983502 Trace decoder codeowners (#3600)
a176efd648 [hip-tests] Add return statements to HIP_SKIP_TEST (#3647)
32687cf183 rocrtst: CPUAccessToGPUMemoryTest: Cap host allocation to 512
MB under ASAN (#3407)
97c0206753 Update codeowners for thunk DXG (#3334)
be44b28bb6 [rocdecode][rocjpeg] - ctest CMakeLists cleanup (#3632)
80ff0b8942 Various memory leak fixes in hip-tests (#3605)
0988f67a85 fix typo in help text (#3314)
9f823c53f1 Fix CUID file lookup by loading files before searching
entries (#3436)
064c89261b SWDEV-546177 - hipModuleGetLoadingMode API impl (#653)
006213e112 ROCM-2696: Ignare size and base if null ptr (#3336)
6060b99d83 Improve atomic min max test perf (#2580)
3fbcc13602 Change printf capture impl (#1127)
93bc01937c (tag: hip-version_7.12.60610,
origin/users/mradosav-amd/rocprofsys-selective-region) [ROCM-CORE]
Update rdhc script to support rocm install prefix
(ROCm/rocm-systems#3596)

[AICOMRCCL-355]:
https://amd-hub.atlassian.net/browse/AICOMRCCL-355?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

Labels

None yet

Projects

Status: Done


6 participants