[AIHPBLAS-3494] Fixing segfault by adding params to RequiredParameters#7349
dsaffars wants to merge 3 commits into
Conversation
Adding AssertFree1DivByMT1LowbitGT1 and BAddrInterleave to differentiate kernels in Origami and Equality.
These parameters came from PR3679. We need to decide which way to go: either revert PR3679 or add the parameters to RequiredParameters.py. Note that PR3679 added 4 parameters, so we also need to check whether the other 2 should be added to RequiredParameters.py.
Codecov Report
✅ All modified and coverable lines are covered by tests.
❌ Your project status has failed because the head coverage (69.24%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## develop #7349 +/- ##
============================================
- Coverage 62.77% 52.15% -10.62%
============================================
Files 972 727 -245
Lines 185666 165378 -20288
Branches 22071 26572 +4501
============================================
- Hits 116547 86246 -30301
- Misses 61133 72722 +11589
+ Partials 7986 6410 -1576
Both 'BAddrInterleave' and 'KRingShift' generate different asm code. 'AssertFree1DivByMT1LowbitGT1' is a derived parameter that is set to state["MacroTile1"] when 'BAddrInterleave' is set, and 'AssertKRingShiftTailWrapOnly' is likewise derived when 'KRingShift' is set. Both derived parameters are used as runtime restrictions in Contractions.py. I think adding 'BAddrInterleave' and 'KRingShift' should be enough.
AssertFree1DivByMT1LowbitGT1 is a derived param.
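To illustrate the point about derived parameters, here is a minimal Python sketch. It is not the Tensile code: the function name `derive` is hypothetical, and the value assigned to 'AssertKRingShiftTailWrapOnly' is a placeholder, since the discussion only says that it is derived when 'KRingShift' is set. Only the rule for 'AssertFree1DivByMT1LowbitGT1' is taken from the comment above.

```python
def derive(state):
    """Hypothetical helper: return a copy of state with derived parameters added."""
    derived = dict(state)
    if state.get("BAddrInterleave"):
        # From the discussion: set to state["MacroTile1"] when BAddrInterleave is set.
        derived["AssertFree1DivByMT1LowbitGT1"] = state["MacroTile1"]
    if state.get("KRingShift"):
        # Placeholder value (assumption): the real derived value is not given here.
        derived["AssertKRingShiftTailWrapOnly"] = True
    return derived


with_interleave = derive({"MacroTile1": 192, "BAddrInterleave": 1, "KRingShift": 0})
without_interleave = derive({"MacroTile1": 192, "BAddrInterleave": 0, "KRingShift": 0})
```

Under these assumptions the derived values are fully determined by the base parameters, which is why distinguishing solutions on 'BAddrInterleave' and 'KRingShift' alone would also distinguish them on the derived ones.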
I prefer to revert PR#3679 because it introduces more issues than just this kernel-name issue.
The bug has been fixed in #7443.
Adding KRingShift and BAddrInterleave to generate more unique solutions and fix the segfault.
Motivation
When multiple solution libraries with similar parameters are added, the parameters defined in RequiredParameters.py are used to generate unique solutions. The above parameters are added so that solutions differing only in those parameters remain unique.
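The effect can be sketched in Python as follows. This is a simplified illustration, not the actual Tensile naming code: `solution_name`, the parameter sets, and the name format are all hypothetical, and only the parameter names 'BAddrInterleave', 'KRingShift', and 'MacroTile1' come from this PR.

```python
def solution_name(state, required_params):
    """Hypothetical: build a name from only the parameters in required_params."""
    return "_".join(f"{key}{state[key]}" for key in sorted(required_params))


# Two solutions that differ only in BAddrInterleave.
base = {"MacroTile0": 128, "MacroTile1": 192, "BAddrInterleave": 0, "KRingShift": 0}
variant = dict(base, BAddrInterleave=1)

# Assumed required sets before and after the fix (illustrative only).
required_before = {"MacroTile0", "MacroTile1"}
required_after = required_before | {"BAddrInterleave", "KRingShift"}

collide_before = solution_name(base, required_before) == solution_name(variant, required_before)
collide_after = solution_name(base, required_after) == solution_name(variant, required_after)
print(collide_before, collide_after)
```

In this sketch the two solutions map to the same name while the distinguishing parameter is missing from the required set, and to different names once it is included.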
Test Plan
Before this fix, multiple sizes fall to a wrong solution, which causes a segfault.
After adding the params, the correct solution is selected and the segfault is resolved.
Test Result
hipblaslt-bench --api_method c -m 6144 -n 931 -k 5120 --alpha 1.000000 --beta 0.000000 --transA T --transB N --batch_count 1 --a_type bf16_r --b_type bf16_r --c_type bf16_r --d_type bf16_r --scale_type f32_r --bias_type f32_r --compute_type f32_r --rotating 0 --cold_iters 0 --iters 1
Before:
Memory access fault by GPU node-2.
After:
Is supported 1 / Total solutions: 1
[0]:transA,transB,grouped_gemm,batch_count,m,n,k,alpha,lda,stride_a,beta,ldb,stride_b,ldc,stride_c,ldd,stride_d,a_type,b_type,c_type,d_type,compute_type,scaleA,scaleB,scaleC,scaleD,amaxD,swizzle_a,swizzle_b,activation_type,bias_vector,bias_type,aux_type,rotating_buffer,flush,...
T,N,0,1,6144,931,5120,1,5120,31457280,0,5120,4766720,6144,5720064,6144,5720064,bf16_r,bf16_r,bf16_r,bf16_r,f32_r,0,0,0,0,0,0,0,none,0,f32_r,bf16_r,0,0,...
--Solution index: 19888
--Solution name: Cijk_Alik_Bljk_BBS_BH_Bias_HA_S_SAV_UserArgs_MT128x192x128_MI16x16x1_SN_...