add patches to GCCcore 14.2.0 & 14.3.0 to fix ICE with SVE on aarch64#25090

Draft
Thyre wants to merge 1 commit into easybuilders:develop from Thyre:20260119103701_new_pr_GCCcore1420

Thyre commented Jan 19, 2026

(created using eb --new-pr)

Should avoid the following error when building PyTorch in #24926:

/dev/shm/reuter1/easybuild/build/PyTorch/2.9.1/foss-2025b-CUDA-12.9.1/pytorch-v2.9.1/aten/src/ATen/native/cpu/Unfold2d.cpp:225:1: error: unrecognizable insn:
  225 | }
      | ^
(insn 1302 1301 1303 97 (set (reg:VNx16BI 3167)
        (unspec:VNx16BI [
                (reg:VNx16BI 3164)
                (reg:VNx8BI 3166)
                (const_vector:VNx4BI [
                        (const_int 0 [0]) repeated x8
                    ])
            ] UNSPEC_TRN1_CONV)) "/dev/shm/reuter1/easybuild/build/PyTorch/2.9.1/foss-2025b-CUDA-12.9.1/pytorch-v2.9.1/torch/headeronly/util/bit_cast.h":40:14 -1
     (nil))
during RTL pass: vregs
/dev/shm/reuter1/easybuild/build/PyTorch/2.9.1/foss-2025b-CUDA-12.9.1/pytorch-v2.9.1/aten/src/ATen/native/cpu/Unfold2d.cpp:225:1: internal compiler error: in extract_insn, at recog.cc:2812
0x7d30df _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../gcc/rtl-error.cc:108
0x7d3113 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        ../../gcc/rtl-error.cc:116
0xec1d17 extract_insn(rtx_insn*)
        ../../gcc/recog.cc:2812
0xc2a28b instantiate_virtual_regs_in_insn
        ../../gcc/function.cc:1612
0xc2a28b instantiate_virtual_regs
        ../../gcc/function.cc:1995
0xc2a28b execute
        ../../gcc/function.cc:2042
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

See also:

Thyre added the bug fix, aarch64, 2025a, and 2025b labels and removed the change label Jan 19, 2026
Thyre marked this pull request as draft January 19, 2026 09:39

Thyre commented Jan 19, 2026

I'll keep this as a draft until I've been able to check whether I get further in the PyTorch build with a rebuilt GCCcore...


Thyre commented Jan 19, 2026

Looking through pytorch/pytorch#172445, we might need a few more GCC patches to make progress.
See e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123457, which was raised just a few days ago.


Thyre commented Jan 19, 2026

Test report by @Thyre
SUCCESS
Build succeeded for 2 out of 2 (total: 1 hour 9 mins 0 secs) (2 easyconfigs in total)
jrc0900.jureca - Linux Rocky Linux 9.6, AArch64, ARM UNKNOWN (neoverse_v2), 1 x NVIDIA NVIDIA GH200 480GB, 580.95.05, Python 3.9.21
See https://gist.github.com/Thyre/9a756a8f83ce39803686cc1d5be2a5e3 for a full test report.


Thyre commented Jan 19, 2026

Adding these patches unfortunately doesn't change anything:

/dev/shm/reuter1/easybuild/build/PyTorch/2.9.1/foss-2025b-CUDA-12.9.1/pytorch-v2.9.1/aten/src/ATen/native/cpu/Unfold2d.cpp: In function ‘void at::native::{anonymous}::unfolded2d_acc_kernel(c10::ScalarType, void*, void*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, bool)’:
/dev/shm/reuter1/easybuild/build/PyTorch/2.9.1/foss-2025b-CUDA-12.9.1/pytorch-v2.9.1/aten/src/ATen/native/cpu/Unfold2d.cpp:221:1: error: unrecognizable insn:
  221 | }
      | ^
(insn 1302 1301 1303 97 (set (reg:VNx16BI 3167)
        (unspec:VNx16BI [
                (reg:VNx16BI 3164)
                (reg:VNx8BI 3166)
                (const_vector:VNx4BI [
                        (const_int 0 [0]) repeated x8
                    ])
            ] UNSPEC_TRN1_CONV)) "/dev/shm/reuter1/easybuild/build/PyTorch/2.9.1/foss-2025b-CUDA-12.9.1/pytorch-v2.9.1/torch/headeronly/util/bit_cast.h":40:14 -1
     (nil))
during RTL pass: vregs
/dev/shm/reuter1/easybuild/build/PyTorch/2.9.1/foss-2025b-CUDA-12.9.1/pytorch-v2.9.1/aten/src/ATen/native/cpu/Unfold2d.cpp:221:1: internal compiler error: in extract_insn, at recog.cc:2812
0x7d30df _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../gcc/rtl-error.cc:108
0x7d3113 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        ../../gcc/rtl-error.cc:116
0xec1d17 extract_insn(rtx_insn*)
        ../../gcc/recog.cc:2812
0xc2a28b instantiate_virtual_regs_in_insn
        ../../gcc/function.cc:1612
0xc2a28b instantiate_virtual_regs
        ../../gcc/function.cc:1995
0xc2a28b execute
        ../../gcc/function.cc:2042
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

The test case from the bug report works with the patch applied, so maybe there is another GCC 14 patch that fixes this particular failure.
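
To confirm that the failure after patching is the same ICE and not a new one, the identifying fields of each log can be extracted and compared. Below is a minimal Python sketch, assuming abridged excerpts of the logs above with paths shortened. The `ice_signature` helper and its field names are purely illustrative and not part of EasyBuild or GCC.

```python
import re

# Patterns for the parts of a GCC ICE report that identify the failure.
ICE_RE = re.compile(r"internal compiler error: in (\w+), at ([\w./-]+):(\d+)")
PASS_RE = re.compile(r"during (\w+) pass: (\w+)")
UNSPEC_RE = re.compile(r"\b(UNSPEC_\w+)\b")


def ice_signature(log: str) -> dict:
    """Return the fields that identify an ICE, ignoring paths and source line numbers."""
    ice = ICE_RE.search(log)
    compiler_pass = PASS_RE.search(log)
    unspec = UNSPEC_RE.search(log)
    return {
        "function": ice.group(1) if ice else None,
        "location": f"{ice.group(2)}:{ice.group(3)}" if ice else None,
        "pass": compiler_pass.group(2) if compiler_pass else None,
        "unspec": unspec.group(1) if unspec else None,
    }


# Abridged excerpt of the first log (paths shortened for illustration).
before = """\
(insn 1302 1301 1303 97 (set (reg:VNx16BI 3167)
            ] UNSPEC_TRN1_CONV)) "bit_cast.h":40:14 -1
during RTL pass: vregs
Unfold2d.cpp:225:1: internal compiler error: in extract_insn, at recog.cc:2812
"""
# The second log differs only in the reported source line of Unfold2d.cpp.
after = before.replace("Unfold2d.cpp:225:1", "Unfold2d.cpp:221:1")

print(ice_signature(before))
print(ice_signature(before) == ice_signature(after))
```

Both logs yield the same signature (`extract_insn` at `recog.cc:2812`, pass `vregs`, `UNSPEC_TRN1_CONV`), which matches the observation that the rebuilt GCCcore still hits the same unrecognizable insn.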
