Revert "[origami] Major refactoring of codebase (#2718)" #3416

Merged: jayhawk-commits merged 2 commits into develop from users/bnemanich/revert_origami_changes on Dec 17, 2025

@bnemanich (Contributor) commented Dec 16, 2025

The commit being reverted was found to introduce memory corruption in TheRock's Python packaging build (see ROCm/TheRock#2522). It needs to be reverted so that the rocm-libraries submodule pointer in TheRock can move forward.

Error

corrupted size vs. prev_size in fastbins

Affected Versions

  • Python 3.11, 3.12, 3.13
  • PyTorch 2.7, 2.8, 2.9, and main branches
  • TheRock with the rocm-libraries submodule pointed to this commit or newer

Ways to Reproduce

  1. Build ROCm via TheRock.
  2. Build pytorch wheels via TheRock.
  3. Run rocm-sdk test.
  4. Run python ./external-builds/pytorch/run_pytorch_smoke_tests.py.
    The error occurs during steps 2-4. I have not seen a workflow get past step 4.

Investigation Test Results A

Investigation Test Results B

Revert Sequence

  • Cloned latest rocm-libraries.
  • Used CLI to execute git revert 283f438
  • Manually resolved the merge conflict in projects/hipblaslt/library/src/amd_detail/rocblaslt/src/rocroller/solution_selection.cpp, using the diffs in PR #2718 ([origami] Major refactoring of codebase) as a reference

Verifying This Pull Request

Verification Results

@jayhawk-commits (Collaborator) commented

The Azure CI results pass but are not being reported back to the summary step check.

@jayhawk-commits (Collaborator) commented

Updated the PR description with a TheRock run that gets past the previous failure points. Will merge.

jayhawk-commits merged commit b3055ed into develop on Dec 17, 2025
84 checks passed
jayhawk-commits deleted the users/bnemanich/revert_origami_changes branch on December 17, 2025 01:40
minsukim-amd added a commit that referenced this pull request Dec 17, 2025
minsukim-amd added a commit that referenced this pull request Dec 18, 2025
## Testing Plan
- Azure CI (deprecated): Runs all the Catch2 and new Python tests. ✅
- TheRock CI (integration):
  - CI workflow: https://github.com/ROCm/TheRock/actions/runs/20316557318 ✅
  - TheRock branch: https://github.com/ROCm/TheRock/tree/users/neoblizz/minsukim-refactor ✅ (Some tests fail but build passed)
- TheRock CI (libraries: hipblaslt, etc.) ✅
- Math CI (performance/hipblaslt) ✅
- Reviews from older PRs are addressed, with a few exceptions. ✅

## PRs History
- Original PR: #1859
- Rebased PR: #2718
- Reverted PR: #3416 (comment)
- Hot-fix PR: #3417 (review) ([TheRock CI Report](https://github.com/ROCm/TheRock/actions/runs/20293546191))
- **And now this!**

## Technical Details

### Refactor
- [x] New file `types.hpp`, consolidates various origami types.
- [x] New structs to replace the growing tuple.
- [x] API updates to make it scalable, down from 20+ parameters to a few.
- [x] Lots of redundant code removal.
- [x] Reorganize data types into `types.hpp` (see `hardware.hpp` for what needs to be moved)
- [x] Refactor extract APIs
- [x] Add enums for transpose
- [x] Refactor debug/log reporting
- [x] Remove mutable and statics
- [x] Decouple `latency` out of `config_t` 
- [x] Rebase develop into this branch
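
As a rough illustration of the struct-over-tuple, reduced-parameter, and transpose-enum items in the list above, the sketch below shows the general shape such a change can take. Every name in it (`transpose_t`, `problem_t`, `describe`, `flop_count`, and the field names) is hypothetical and is not taken from origami's `types.hpp`; the actual types in the refactor may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration only; none of these names come from origami.

// A scoped enum replaces a bare bool or char flag for transposition.
enum class transpose_t { none, transpose };

// A named aggregate replaces a positional tuple of sizes and flags.
struct problem_t
{
    std::size_t m = 0;
    std::size_t n = 0;
    std::size_t k = 0;
    transpose_t trans_a = transpose_t::none;
    transpose_t trans_b = transpose_t::none;
};

// One struct parameter instead of a long positional parameter list.
inline std::string describe(const problem_t& p)
{
    auto tag = [](transpose_t t) { return t == transpose_t::none ? "N" : "T"; };
    return std::string(tag(p.trans_a)) + tag(p.trans_b) + " "
           + std::to_string(p.m) + "x" + std::to_string(p.n) + "x" + std::to_string(p.k);
}

// Multiply-add count of a GEMM of this shape (2 * m * n * k).
inline std::size_t flop_count(const problem_t& p)
{
    assert(p.m > 0 && p.n > 0 && p.k > 0); // sketch-level precondition
    return 2 * p.m * p.n * p.k;
}
```

With a struct like this, call sites name the fields they set (`p.m = 128; p.trans_b = transpose_t::transpose;`) instead of relying on argument position, which is the main readability gain over a long tuple or parameter list.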

### Python APIs & Unit Tests
- [x] Update the tests
- [x] Update rocroller
- [x] Update hipblaslt/tensilelite/scripts to use the new API

### Testing Infrastructure
- Replace YAML-based tests with Catch2 C++ tests
- Replace GTest with Catch2 framework
- Convert YAML-driven parameterized tests to pure C++ tests
- Update common.hpp with direct C++ helper functions
- Update CMakeLists.txt to use Catch2 instead of GTest
- Remove Boost dependency (was only for YAML)
- All test data now hardcoded in C++ for type safety
- Better IDE support, debugging, and error messages
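
The YAML-to-C++ conversion described above amounts to moving test cases out of an external data file and into typed, compiled tables. The following is a minimal sketch of that pattern, assuming a made-up function under test; `ceil_div`, `case_t`, `cases`, `run_cases`, and `require_all_pass` are hypothetical and do not come from the origami test suite. It counts failures by hand so that it compiles without Catch2.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical function under test: number of tiles needed to cover `extent`.
constexpr std::size_t ceil_div(std::size_t extent, std::size_t tile)
{
    return (extent + tile - 1) / tile;
}

// One typed test case; a mistyped or missing field is a compile error.
struct case_t
{
    std::size_t extent;
    std::size_t tile;
    std::size_t expected;
};

// Test data hardcoded in C++ instead of being loaded from a YAML file.
constexpr std::array<case_t, 4> cases{{
    {0, 16, 0},
    {1, 16, 1},
    {16, 16, 1},
    {17, 16, 2},
}};

// Returns the number of failing cases.
inline std::size_t run_cases()
{
    std::size_t failures = 0;
    for (const case_t& c : cases)
    {
        if (ceil_div(c.extent, c.tile) != c.expected)
            ++failures;
    }
    return failures;
}

// Aborts if any case fails (only in builds where NDEBUG is not defined).
inline void require_all_pass()
{
    assert(run_cases() == 0);
}
```

In a Catch2 test the loop body would typically be a `CHECK(...)` or `REQUIRE(...)` inside a `TEST_CASE`, so each failing row is reported individually by the framework.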

### Questions
- Rocroller: coordinate on the API design, should they be reused?
- Reuse dim3_t from hip_runtime.h (does HIP's dim3 have the same functionality)?
- Should transpose/data-types be part of config as well for precomputing?

## Motivation

This PR reapplies the Origami project's refactor. Its goals are:

1. Make it simpler to make model updates.
2. Well-defined, scalable, clean interface.

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

## Work Moved to Future PRs

- Logger APIs
- Runtime Options
- Move instruction map into separate file
- Additional tests: #3301
- Minor batch-count specific changes: #3289
- Move Origami's Azure CI -> TheRock CI @ibrahimw1

---------

Co-authored-by: Brad Nemanich <Brad.Nemanich@amd.com>
Co-authored-by: neoblizz <osama94@gmail.com>