Fix Cast node naming collisions and opset 10 Resize in float16 conversion (#27469)
Conversation
Fix two bugs in `convert_float_to_float16`:

1. Cast node naming collision: When `node.name` is empty (common in PyTorch-exported models), generated Cast nodes all get identical names like `"_input_cast_2"`, corrupting the graph. Use unique tensor names (`input_name`/`output`) as the naming base instead.
2. Opset 10 Resize scales protection: `ALWAYS_FLOAT_INPUTS` only protected index 2 (scales in opset 11+). Opset 10 Resize has scales at index 1, which was unprotected. Add index 1 to the protected list.

Also fix a misleading comment in the output Cast section.
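The collision and the fix can be illustrated with a small, self-contained sketch. The helper names here are hypothetical, chosen only to show the naming schemes; this is not the actual float16.py code.

```python
# Illustrative sketch of the naming bug (hypothetical helper names,
# not the actual float16.py implementation).

def cast_name_from_node(node_name: str, input_index: int) -> str:
    # Old scheme: with an empty node.name, every node yields the same
    # "_input_cast_<i>" string, so generated Cast nodes collide.
    return node_name + "_input_cast_" + str(input_index)

def cast_name_from_tensor(tensor_name: str) -> str:
    # New scheme: tensor names must be unique within a valid ONNX graph,
    # so a name derived from the tensor name is unique too.
    return tensor_name + "_cast_to_fp32"

# Two unnamed Resize nodes, each needing a Cast on their scales input:
old_names = {cast_name_from_node("", 2), cast_name_from_node("", 2)}
new_names = {cast_name_from_tensor("resize_a_scales"),
             cast_name_from_tensor("resize_b_scales")}
print(len(old_names), len(new_names))  # 1 2 -> the old scheme collapses to one name
```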
@microsoft-github-policy-service agree

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Pull request overview
This pull request fixes critical issues in the float16 conversion tool that caused graph corruption when processing models with unnamed nodes (common in PyTorch/TensorFlow exports) and incorrectly converted Resize scales in opset 10 models.
Changes:
- Fixed Cast node naming to use unique tensor names instead of potentially-empty node names, preventing naming collisions
- Added protection for Resize scales at input index 1 (opset 10 compatibility)
- Added comprehensive test suite with 8 tests covering naming uniqueness, opset 10/11 Resize handling, and edge cases
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `onnxruntime/python/tools/transformers/float16.py` | Updated `ALWAYS_FLOAT_INPUTS` to protect Resize index 1; changed Cast node naming from `node.name + "_input_cast_" + str(i)` to `input_name + "_cast_to_fp32"`; fixed misleading comment |
| `onnxruntime/test/python/transformers/test_float16.py` | New test file with 8 tests covering naming uniqueness for unnamed nodes, opset 10/11 Resize scales protection, blocked ops, and `force_fp16_initializers` behavior |
The core fix (using tensor names instead of potentially-empty `node.name` for Cast node naming) is correct and well-tested. One concern:

Opset 11+ roi over-protection: Adding index 1 to `ALWAYS_FLOAT_INPUTS["Resize"]` correctly protects opset 10 scales, but also forces opset 11+ roi to stay fp32, even though the ONNX spec allows roi to be fp16 (T2 constraint). Practically harmless since roi is usually empty, but technically imprecise. Suggest at minimum adding a comment, or ideally making the protection opset-aware.

Other minor notes: the shared-input edge case is not fully addressed (not a regression), and the naming convention change could theoretically affect downstream tooling that pattern-matches Cast node names.
Address review feedback: instead of unconditionally protecting both indices 1 and 2, detect the ONNX opset version from the model and adjust accordingly:

- Opset 10 (Resize inputs `[X, scales]`): protect index 1
- Opset 11+ (Resize inputs `[X, roi, scales, sizes]`): protect index 2 only; roi at index 1 is allowed to be fp16 per the ONNX spec

Update the test to reflect that opset 11+ roi is not over-protected.
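The opset-aware selection can be sketched as follows. This is a minimal illustration, assuming the opset version has already been read from the model's default-domain `opset_import` entry; the function name is hypothetical.

```python
def resize_always_float_indices(onnx_opset: int) -> list[int]:
    """Which Resize inputs must stay float32 during fp16 conversion.

    Hypothetical helper sketching the opset-aware logic described in
    the commit message; not the actual float16.py code.
    """
    if onnx_opset < 11:
        # Opset 10 layout: [X, scales] -> scales at index 1
        return [1]
    # Opset 11+ layout: [X, roi, scales, sizes] -> scales at index 2.
    # roi (index 1) is allowed to be fp16 by the ONNX spec (T2 constraint),
    # so it is deliberately left unprotected here.
    return [2]
```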
Thanks for the thorough review @tianleiwu! Great catch on the opset 11+ roi over-protection. I've pushed a fix that makes the protection opset-aware: opset 10 (`[X, scales]`) protects index 1, while opset 11+ (`[X, roi, scales, sizes]`) protects index 2 only, leaving roi free to be fp16 per the ONNX spec.

Updated the test to verify that opset 11+ roi is not over-protected. Re: the shared-input edge case and naming convention: agreed, these are pre-existing behaviors, not regressions from this PR. Happy to address them in a follow-up if you'd like.
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Summary
- Fix Cast node naming collisions in `convert_float_to_float16` when nodes have empty names (common in PyTorch exports)
- Fix `ALWAYS_FLOAT_INPUTS` for opset 10 Resize, where scales at index 1 was unprotected
- Add a test suite (`test_float16.py`, 8 tests)

Motivation
Fixes #14827
When `convert_float_to_float16` processes models with unnamed nodes (empty `node.name`, very common in PyTorch/TensorFlow-exported ONNX models), the generated Cast node names collide. For example, multiple Resize nodes all produce Cast nodes named `"_input_cast_2"` and output tensors named `"_input_cast_2"`, corrupting the graph with duplicate names.

Additionally, the `ALWAYS_FLOAT_INPUTS` dict only protected Resize scales at index 2 (opset 11+ layout: `[X, roi, scales, sizes]`), but opset 10 Resize has scales at index 1 (`[X, scales]`), leaving it unprotected.

Changes
`onnxruntime/python/tools/transformers/float16.py` (11 lines changed):
- Use unique tensor names (`input_name`/`output`) as the base for generated Cast node and output names, instead of potentially-empty `node.name`
- Update `ALWAYS_FLOAT_INPUTS["Resize"]` to protect opset 10 scales

`onnxruntime/test/python/transformers/test_float16.py` (new file, 8 tests):
- `test_resize_opset11_cast_naming_unique` — multiple unnamed Resize nodes produce unique Cast names
- `test_resize_opset11_scales_initializer_stays_fp32` — scales initializer preserved as float32
- `test_resize_opset10_scales_initializer_stays_fp32` — opset 10 scales protected at index 1
- `test_resize_opset10_multiple_unnamed_unique_names` — opset 10 naming uniqueness
- `test_blocked_node_cast_naming_unique` — blocked op nodes (Upsample) also get unique Cast names
- `test_resize_with_op_block_list` — Resize in op_block_list still produces unique names
- `test_data_input_converted_to_fp16` — data tensor correctly converts to fp16
- `test_force_fp16_initializers` — force flag overrides protection

Test Plan
- All 8 new tests pass (`python -m unittest test_float16.TestFloat16Conversion -v`)
- Existing `test_gpt2_past_fp16` test passes (no regression in existing float16 behavior)
- `ruff check` passes on both files
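The core property the naming tests assert can be sketched as a simple duplicate check. The helper below is illustrative only, not the PR's actual test code.

```python
from collections import Counter

def duplicate_names(names):
    """Return names that occur more than once (ignoring empty names)."""
    counts = Counter(n for n in names if n)
    return sorted(n for n, c in counts.items() if c > 1)

# Old node.name-based scheme on two unnamed nodes -> a collision:
print(duplicate_names(["_input_cast_2", "_input_cast_2"]))  # ['_input_cast_2']
# Tensor-name-based scheme -> no duplicates:
print(duplicate_names(["a_scales_cast_to_fp32", "b_scales_cast_to_fp32"]))  # []
```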