[AIHPBLAS-3494] Fixing segfault by adding params to RequiredParameters #7349

Closed
dsaffars wants to merge 3 commits into develop from users/dsaffars/fix_duplicate_kernel

@dsaffars
Contributor

@dsaffars dsaffars commented May 12, 2026

Adding KRingShift and BAddrInterleave to RequiredParameters.py to generate more unique solutions and fix the segfault.

Motivation

When multiple solution libraries with similar params are added to the libraries, the parameters defined in RequiredParameters.py are used to generate unique solutions. The parameters above are added so that solutions differing only in those parameters stay unique.
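The collision described above can be illustrated with a minimal sketch. It is not the TensileLite implementation: `solution_name`, `unique_names`, the two tuples, and the 0/1 parameter values are hypothetical, and the real naming logic may differ. It only assumes that a solution's identity is built from the parameters listed as required, so two solutions that differ only in an unlisted parameter end up with the same name.

```python
# Hypothetical sketch, NOT the TensileLite implementation.
# Assumption: a solution's unique name is built only from "required" parameters.
REQUIRED_BEFORE = ("MacroTile0", "MacroTile1", "DepthU")
REQUIRED_AFTER = REQUIRED_BEFORE + ("KRingShift", "BAddrInterleave")


def solution_name(solution: dict, required: tuple) -> str:
    """Build a name from only the parameters listed in `required`."""
    return "_".join(f"{key}{solution.get(key, 0)}" for key in required)


def unique_names(solutions: list, required: tuple) -> set:
    """Collect the distinct names produced for a list of solutions."""
    return {solution_name(s, required) for s in solutions}


# Three solutions that differ only in the two parameters named in this PR.
# The values 0 and 1 are illustrative placeholders.
base = {"MacroTile0": 128, "MacroTile1": 192, "DepthU": 128}
solutions = [
    {**base, "KRingShift": 0, "BAddrInterleave": 0},
    {**base, "KRingShift": 1, "BAddrInterleave": 0},
    {**base, "KRingShift": 0, "BAddrInterleave": 1},
]

print(len(unique_names(solutions, REQUIRED_BEFORE)))  # 1: all three collide
print(len(unique_names(solutions, REQUIRED_AFTER)))   # 3: each one is distinct
```

With the shorter list, all three solutions share one name, so a lookup by name can return a kernel built with different settings than the caller expects.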

Test Plan

Before this fix, multiple sizes fell to a wrong solution, which caused a segfault.
After adding the params, the correct solution is selected and the segfault is resolved.

Test Result

hipblaslt-bench --api_method c -m 6144 -n 931 -k 5120 --alpha 1.000000 --beta 0.000000 --transA T --transB N --batch_count 1 --a_type bf16_r --b_type bf16_r --c_type bf16_r --d_type bf16_r --scale_type f32_r --bias_type f32_r --compute_type f32_r --rotating 0 --cold_iters 0 --iters 1

Before:
Memory access fault by GPU node-2.

After:
Is supported 1 / Total solutions: 1
[0]:transA,transB,grouped_gemm,batch_count,m,n,k,alpha,lda,stride_a,beta,ldb,stride_b,ldc,stride_c,ldd,stride_d,a_type,b_type,c_type,d_type,compute_type,scaleA,scaleB,scaleC,scaleD,amaxD,swizzle_a,swizzle_b,activation_type,bias_vector,bias_type,aux_type,rotating_buffer,flush,...
T,N,0,1,6144,931,5120,1,5120,31457280,0,5120,4766720,6144,5720064,6144,5720064,bf16_r,bf16_r,bf16_r,bf16_r,f32_r,0,0,0,0,0,0,0,none,0,f32_r,bf16_r,0,0,...
--Solution index: 19888
--Solution name: Cijk_Alik_Bljk_BBS_BH_Bias_HA_S_SAV_UserArgs_MT128x192x128_MI16x16x1_SN_...

Adding AssertFree1DivByMT1LowbitGT1 and BAddrInterleave to differentiate kernels
in Origami and Equality.
@nakajee
Contributor

nakajee commented May 12, 2026

These parameters came from the following PR.
#3679

We need to decide which way to go: either revert PR3679 or add the parameters to RequiredParameters.py.

By the way, 4 parameters were added in PR3679. We need to check whether 2 more parameters also need to be added to RequiredParameters.py.
The criterion is:
any parameter that makes a difference in the kernel asm code needs to be added to RequiredParameters.py.
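One way to check this criterion is to group generated kernels by name and flag any name that maps to more than one distinct asm body. The sketch below is only an illustration: `find_name_collisions` is a hypothetical helper, and the kernel names and asm strings are placeholders rather than real TensileLite output.

```python
from collections import defaultdict


def find_name_collisions(kernels):
    """Return the sorted names that map to more than one distinct asm body.

    `kernels` is an iterable of (name, asm_text) pairs. This is a hypothetical
    helper for illustration, not part of TensileLite.
    """
    asm_by_name = defaultdict(set)
    for name, asm in kernels:
        asm_by_name[name].add(asm)
    return sorted(name for name, bodies in asm_by_name.items() if len(bodies) > 1)


# Placeholder data: kernel_A is emitted twice with different asm,
# kernel_B twice with identical asm.
kernels = [
    ("kernel_A", "s_mov_b32 s0, 0"),
    ("kernel_A", "s_mov_b32 s0, 1"),
    ("kernel_B", "s_nop 0"),
    ("kernel_B", "s_nop 0"),
]

print(find_name_collisions(kernels))  # ['kernel_A']
```

A name reported here would mean some parameter affects the asm without being part of the name, which is the situation the criterion is meant to rule out.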

@codecov-commenter

codecov-commenter commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (69.24%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #7349       +/-   ##
============================================
- Coverage    62.77%   52.15%   -10.62%     
============================================
  Files          972      727      -245     
  Lines       185666   165378    -20288     
  Branches     22071    26572     +4501     
============================================
- Hits        116547    86246    -30301     
- Misses       61133    72722    +11589     
+ Partials      7986     6410     -1576     
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| TensileLite | 26.23% <ø> (?) | |
| hipBLAS | ? | |
| hipBLASLt | 40.24% <ø> (ø) | |
| hipDNN | 85.56% <ø> (+<0.01%) ⬆️ | Carriedforward from d0fbbdd |
| hipFFT | 49.22% <ø> (ø) | Carriedforward from d0fbbdd |
| hipSOLVER | 69.24% <ø> (ø) | Carriedforward from d0fbbdd |
| rocBLAS | ? | |
| rocFFT | 47.60% <ø> (ø) | Carriedforward from d0fbbdd |
| rocRAND | 57.02% <ø> (ø) | Carriedforward from d0fbbdd |

*This pull request uses carry forward flags.

Files with missing lines Coverage Δ
...t/tensilelite/Tensile/Common/RequiredParameters.py 100.00% <ø> (ø)

... and 503 files with indirect coverage changes


@amd-ysanchez
Contributor

Both 'BAddrInterleave' and 'KRingShift' generate different asm code.

'AssertFree1DivByMT1LowbitGT1' is a derived parameter, set to state["MacroTile1"] when 'BAddrInterleave' is set; likewise, 'AssertKRingShiftTailWrapOnly' is derived when 'KRingShift' is set. Both are used as runtime restrictions in Contractions.py.

I guess adding 'BAddrInterleave' and 'KRingShift' should be enough.
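The reasoning above can be sketched as follows: if a derived parameter is a pure function of parameters already used for uniqueness, adding it separately distinguishes nothing new. This is a hypothetical illustration, not the actual TensileLite derivation code. The `derive` helper and the truthiness check for "is set" are assumptions, and the value derived for 'AssertKRingShiftTailWrapOnly' is not stated in this thread, so it is left out.

```python
def derive(state: dict) -> dict:
    """Return a copy of `state` with the derived parameter filled in.

    Hypothetical sketch: only the rule quoted in this thread is modeled
    (AssertFree1DivByMT1LowbitGT1 = state["MacroTile1"] when BAddrInterleave
    is set). Treating "set" as "truthy" is an assumption.
    """
    derived = dict(state)
    if state.get("BAddrInterleave"):
        derived["AssertFree1DivByMT1LowbitGT1"] = state["MacroTile1"]
    return derived


# Two states that agree on MacroTile1 and BAddrInterleave always agree on the
# derived value, so the derived parameter adds no extra distinguishing power.
a = derive({"MacroTile1": 192, "BAddrInterleave": 1})
b = derive({"MacroTile1": 192, "BAddrInterleave": 1})
c = derive({"MacroTile1": 192, "BAddrInterleave": 0})

print(a["AssertFree1DivByMT1LowbitGT1"])    # 192
print(a == b)                               # True
print("AssertFree1DivByMT1LowbitGT1" in c)  # False
```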

@nakajee
Copy link
Copy Markdown
Contributor

nakajee commented May 14, 2026

I prefer to revert PR#3679 because it includes more issues than just this kernel name related issue.

@dsaffars
Contributor Author

The bug has been fixed in #7443.

@dsaffars dsaffars closed this May 25, 2026