Skip to content

Fix for type mismatch: AIHPBLAS-1465#6177

Closed
pdhirajkumarprasad wants to merge 6 commits into
developfrom
users/dhirajp/AIHPBLAS-1465
Closed

Fix for type mismatch: AIHPBLAS-1465#6177
pdhirajkumarprasad wants to merge 6 commits into
developfrom
users/dhirajp/AIHPBLAS-1465

Conversation

@pdhirajkumarprasad
Copy link
Copy Markdown
Contributor

@pdhirajkumarprasad pdhirajkumarprasad commented Apr 6, 2026

Motivation

https://amd-hub.atlassian.net/browse/AIHPBLAS-1465

Technical Details

when we are passing boolean type, instead of throwing error, we try to process which lead to error at different stage.

We added check so that if the type is not expected, throwing error.

Tensile::FATAL: UseBias must be an integer (0, 1, 2, or 3), got boolean value: True. Use 0 for no bias, 1 for M direction, 2 for N direction, or 3 for both M and N directions.

Test Plan

N/A

Test Result

reported yaml in issue is working fine

Submission Checklist

Signed-off-by: pdhirajkumarprasad <dhirajp@amd.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (39.95%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #6177       +/-   ##
============================================
- Coverage    72.32%   67.07%    -5.26%     
============================================
  Files         1065     1890      +825     
  Lines       129788   293629   +163841     
  Branches     14376    41375    +26999     
============================================
+ Hits         93869   196935   +103066     
- Misses       32196    79717    +47521     
- Partials      3723    16977    +13254     
Flag Coverage Δ *Carryforward flag
hipBLAS 90.67% <ø> (?) Carriedforward from 2668ceb
hipBLASLt 39.95% <ø> (ø)
hipCUB 82.21% <ø> (?) Carriedforward from 2668ceb
hipDNN 86.35% <ø> (+4.60%) ⬆️ Carriedforward from 2668ceb
hipFFT 54.82% <ø> (?) Carriedforward from 2668ceb
hipRAND 76.12% <ø> (?) Carriedforward from 2668ceb
hipSOLVER 68.73% <ø> (?) Carriedforward from 2668ceb
hipSPARSE 84.70% <ø> (?) Carriedforward from 2668ceb
rocBLAS 48.12% <ø> (?) Carriedforward from 2668ceb
rocFFT 47.54% <ø> (?) Carriedforward from 2668ceb
rocPRIM ?
rocRAND 57.07% <ø> (?) Carriedforward from 2668ceb
rocSOLVER 77.72% <ø> (?) Carriedforward from 2668ceb
rocSPARSE 71.33% <ø> (?) Carriedforward from 2668ceb
rocThrust ?

*This pull request uses carry forward flags. Click here to find out more.
see 2365 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@talumbau talumbau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preferred solution strategy would be "if there is a type mismatch, raise an Exception, tell the user the found type in the file vs. the expected type from the code."

This is what is done for Library Logic YAML files. We should do the same for this case. Please see AIGECORE-189 for additional details.

In that PR chain, we have landed many many changes to reduce type errors to almost nothing and will soon land the change "type errors result in a noisy log message warning" After that lands the logs come up clear in CI, we will change warnings to errors.

For Tensilelite input files, we can just raise an exception with a good error message and error out of the execution.

pdhirajkumarprasad and others added 2 commits April 7, 2026 02:23
…eAlphaVec

The previous fix silently converted boolean values to integers, which
masked the real issue. The correct fix is to validate that UseBias and
UseScaleAlphaVec are integers with valid values (0, 1, 2, or 3) and
reject invalid YAML configurations with clear error messages.

Changes:
- Added validation in ProblemType to check UseBias is an integer (0-3)
- Added validation in ProblemType to check UseScaleAlphaVec is integer (0-3)
- Added printExit import from Tensile.Common.Utilities
- Error messages show the actual invalid value provided for easier debugging

This ensures YAML files with incorrect boolean values like "UseBias: True"
will be rejected with a clear error message instead of being silently
converted or causing downstream issues.

Signed-off-by: pdhirajkumarprasad <dhirajp@amd.com>

Co-Authored-By: Claude (Claude-Sonnet-4.5) <noreply@anthropic.com>
@pdhirajkumarprasad
Copy link
Copy Markdown
Contributor Author

@talumbau I have reverted the previous fix and added a check so that it error out when type is not as expected

Copy link
Copy Markdown
Contributor

@talumbau talumbau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not using validateParameterTypes in Tensile/SolutionStructs/Solution.py. These changes are just checking the types of two parameters out of many. The file you are changing is not for input validation (as you can see, since there is no other input validation happening). If you want to do input yaml type validation, please use the existing infra and solve the problem for all input parameters, not just 1 at a time.

@pdhirajkumarprasad
Copy link
Copy Markdown
Contributor Author

@talumbau there are many input parameter which work for either 1 or True, example

TransposeA": [False, True]

Again, there is some other type like, which are defined as bool and explained as integer

"ExpandPointerSwap": [False, True],
# Schedule global reads and global read increments into LocalRead iterations
# Can reduce pressure on local read instruction dispatch queue
# 0=perform global reads at start of instruction loop
# 1=schedule into the local read instruction iterations

I added following but problem is that all our common test are failing and if it add these check then it will break the production code as well.

+    "UseBias": [0, 1, 2, 3],
+    # Alpha vector scaling support
+    # 0: Disabled
+    # 1: Alpha vector on M direction
+    # 2: Alpha vector on N direction
+    # 3: Alpha vector on both M and N directions
+    "UseScaleAlphaVec": [0, 1, 2, 3],
+    # MX (microscaling) block size for matrix A
+    # 0: Disabled
+    # 16, 32: Valid MX block sizes
+    "MXBlockA": [0, 16, 32],
+    # MX (microscaling) block size for matrix B
+    # 0: Disabled
+    # 16, 32: Valid MX block sizes
+    "MXBlockB": [0, 16, 32],
+    # ProblemType boolean parameters
+    "UseBeta": [False, True],
+    "UseE": [False, True],
+    "Gradient": [False, True],
+    "UseScaleCD": [False, True],
+    "HighPrecisionAccumulate": [False, True],
+    "SilentHighPrecisionAccumulate": [False, True],
+    "ComplexConjugateA": [False, True],
+    "ComplexConjugateB": [False, True],
+    "StochasticRounding": [False, True],
+    "TransposeA": [False, True],
+    "TransposeB": [False, True],
+    "Batched": [False, True],
+    "StridedBatched": [False, True],
+    "GroupedGemm": [False, True],
+    "UseInitialStridesAB": [False, True],
+    "UseInitialStridesCD": [False, True],
+    "AllowNoFreeDims": [False, True],
+    "TileAwareSelection": [False, True],
+    "Activation": [False, True],
+    "ActivationNoGuard": [False, True]

For now, we should add check only for those input parameter which don't work if 0/1 is provided in place of boolean and vice-versa.

Let me know if you have any other thought. Here is current diff but as I said, we should only flag error for those, which don't work and give runtime error.

6177.diff.txt

Signed-off-by: pdhirajkumarprasad <dhirajp@amd.com>
Copy link
Copy Markdown
Contributor

@talumbau talumbau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the best path forward is for us to have a meeting for what you are trying to do and how we can best accomplish it. The path laid out here is not code that we can land.

Comment on lines +962 to +996
"TDMInst": [0, 1, 2, 3],
# Bias support for GEMM operations
# 0: No bias
# 1: Bias vector on M direction
# 2: Bias vector on N direction
# 3: Bias vector on both M and N directions
"UseBias": [0, 1, 2, 3],
# Alpha vector scaling support
# 0: Disabled
# 1: Alpha vector on M direction
# 2: Alpha vector on N direction
# 3: Alpha vector on both M and N directions
"UseScaleAlphaVec": [0, 1, 2, 3],
# MX (microscaling) block size for matrix A
# 0: Disabled
# 16, 32: Valid MX block sizes
"MXBlockA": [0, 16, 32],
# MX (microscaling) block size for matrix B
# 0: Disabled
# 16, 32: Valid MX block sizes
"MXBlockB": [0, 16, 32],
# Enable beta scaling in GEMM operation (D = alpha*A*B + beta*C)
# False: beta is not used (beta = 0)
# True: beta is used
"UseBeta": [False, True],
# Enable auxiliary output matrix E
# False: E output is not used
# True: E output is used
"UseE": [False, True],
# Enable gradient computation
# False: gradient computation is not used
# True: gradient computation is used
"Gradient": [False, True],
# Enable scaling for C and D matrices
# False: scaling for C and D is not used
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this doesn't seem correct to me that this fix requires adding all of these values to ValidParameters.py.

Comment on lines +142 to +172

# For input parameters to check, exit immediately with clear error
if key in inputParamToCheck:
errorMsg = [
"",
"=" * 80,
"ERROR: Invalid type for parameter '{}'".format(key),
]
if srcFile:
errorMsg.append("File: {}".format(srcFile))
errorMsg.extend([
"=" * 80,
" Expected type: {}".format(expectedStr),
" Actual type: {}".format(actualType.__name__),
" Value: {}".format(repr(value)),
"",
])
if key == "UseBias":
errorMsg.extend([
" UseBias must be an integer (0, 1, 2, or 3):",
" 0 = no bias",
" 1 = bias vector on M direction",
" 2 = bias vector on N direction",
" 3 = bias vector on both M and N directions",
"",
])
elif key == "UseScaleAlphaVec":
errorMsg.extend([
" UseScaleAlphaVec must be an integer (0, 1, 2, or 3):",
" 0 = disabled",
" 1 = alpha vector on M direction",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. We cannot have this kind of code here.

@pdhirajkumarprasad
Copy link
Copy Markdown
Contributor Author

we are going to resolve this issue through different path. Already have PR #6607 through which we are giving warning. Plan is to upgrade the severity to ERROR once all the internal yaml is fixed. So closing this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants