mirrored from git://gcc.gnu.org/git/gcc.git
Support Fortran 2015 teams #14
Closed
update teams branch from upstream
also add checksums
Merge upstream gcc-mirror/gcc into sourceryinstitute/gcc
Merge sourceryinstitute/master into sourceryinstitute/teams
Merge master into download-opencoarrays-mpich
…le warning-as-errors error
kraj
pushed a commit
to kraj/gcc
that referenced
this pull request
Oct 12, 2020
Prevents the following UBSAN error:
./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/torture/pr49770.C -O2 -c
/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 2, which is not a valid value for type 'bool'
#0 0x1fdb4d1 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
#1 0x1fcadaa in merge_call_side_effects(modref_summary*, gimple*, modref_summary*, bool) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:511
#2 0x1fcbadd in analyze_call /home/marxin/Programming/gcc2/gcc/ipa-modref.c:642
#3 0x1fcc061 in analyze_stmt /home/marxin/Programming/gcc2/gcc/ipa-modref.c:732
#4 0x1fccf31 in analyze_function /home/marxin/Programming/gcc2/gcc/ipa-modref.c:823
#5 0x1fd17e5 in execute /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1441
#6 0x25cca6e in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
#7 0x25cd39b in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2597
#8 0x25cd450 in execute_pass_list_1 /home/marxin/Programming/gcc2/gcc/passes.c:2598
#9 0x25cd4ee in execute_pass_list(function*, opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2608
#10 0x25c7a5a in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/Programming/gcc2/gcc/passes.c:1726
#11 0x25cfa3f in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2941
#12 0x173572d in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2642
#13 0x17364ee in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
#14 0x17372d9 in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
#15 0x2a1f00a in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
#16 0x2a27dc8 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
#17 0x2a283cc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
#18 0x54f21cd in main /home/marxin/Programming/gcc2/gcc/main.c:39
#19 0x7ffff6f0de09 in __libc_start_main ../csu/libc-start.c:314
#20 0x9eac09 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1plus+0x9eac09)
gcc/ChangeLog:
* ipa-modref.c (merge_call_side_effects): Clear modref_parm_map
fields in the vector.
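The kind of fix described in this ChangeLog entry can be sketched outside of GCC. This is a minimal illustration under stated assumptions, not the actual patch: `parm_map_sketch`, `make_cleared_parm_map` and `all_flags_clear` are hypothetical names, and `std::vector` stands in for GCC's own `vec` template. The point is only that value-initialized elements have every `bool` field set to `false`, so a later read never sees an invalid bit pattern such as 2 or 255.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-parameter map entry; not GCC's real type.
struct parm_map_sketch {
  int parm_index;
  bool parm_offset_known;
  long parm_offset;
};

// std::vector<T>(n) value-initializes each element, which zeroes every
// field of this aggregate, including the bool.
std::vector<parm_map_sketch> make_cleared_parm_map(std::size_t n) {
  return std::vector<parm_map_sketch>(n);
}

// Reading the bool is well defined because every entry was cleared.
bool all_flags_clear(const std::vector<parm_map_sketch> &map) {
  for (const parm_map_sketch &entry : map)
    if (entry.parm_offset_known)
      return false;
  return true;
}
```

Growing a container without clearing it would leave the `bool` bytes indeterminate, which is the situation UBSAN reports in the trace above.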
kraj
pushed a commit
to kraj/gcc
that referenced
this pull request
Oct 19, 2020
It fixes:
/home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482:22: runtime error: load of value 255, which is not a valid value for type 'bool'
#0 0x18e5df3 in modref_tree<int>::merge(modref_tree<int>*, vec<modref_parm_map, va_heap, vl_ptr>*) /home/marxin/Programming/gcc2/gcc/ipa-modref-tree.h:482
#1 0x18dc180 in ipa_merge_modref_summary_after_inlining(cgraph_edge*) /home/marxin/Programming/gcc2/gcc/ipa-modref.c:1779
#2 0x18c1c72 in inline_call(cgraph_edge*, bool, vec<cgraph_edge*, va_heap, vl_ptr>*, int*, bool, bool*) /home/marxin/Programming/gcc2/gcc/ipa-inline-transform.c:492
#3 0x4a3589c in inline_small_functions /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2216
#4 0x4a3b230 in ipa_inline /home/marxin/Programming/gcc2/gcc/ipa-inline.c:2697
#5 0x4a3d902 in execute /home/marxin/Programming/gcc2/gcc/ipa-inline.c:3096
#6 0x1edf831 in execute_one_pass(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2509
#7 0x1ee26af in execute_ipa_pass_list(opt_pass*) /home/marxin/Programming/gcc2/gcc/passes.c:2936
#8 0x103f31b in ipa_passes /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2700
#9 0x103fb40 in symbol_table::compile() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:2777
#10 0x104092b in symbol_table::finalize_compilation_unit() /home/marxin/Programming/gcc2/gcc/cgraphunit.c:3022
#11 0x235723b in compile_file /home/marxin/Programming/gcc2/gcc/toplev.c:485
#12 0x235fff9 in do_compile /home/marxin/Programming/gcc2/gcc/toplev.c:2321
#13 0x23605fc in toplev::main(int, char**) /home/marxin/Programming/gcc2/gcc/toplev.c:2460
#14 0x4e2b93b in main /home/marxin/Programming/gcc2/gcc/main.c:39
#15 0x7ffff6f0ae09 in __libc_start_main ../csu/libc-start.c:314
#16 0x9a0be9 in _start (/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x9a0be9)
gcc/ChangeLog:
* ipa-modref.c (compute_parm_map): Clear vector.
kraj
pushed a commit
to kraj/gcc
that referenced
this pull request
Jan 1, 2021
…-calls
If the target is configured such that -mlong-calls is passed
by default, the function calls these tests are trying to detect
by scanning the assembly file are performed using long calls,
like so:
| foo:
| @ memset-inline-2.c:12: memset (a, -1, 14);
| mov r2, #14 @,
| mvn r1, #0 @,
| ldr r0, .L2 @,
| ldr r3, .L2+4 @ tmp112,
| bx r3 @ tmp112
Looking at .L2 (and in particular at .L2+4):
| .L2:
| .word a
| .word memset <<<---
This change adds -mno-long-calls to the list of compiler options
to make sure we generate short call code, allowing the assembly
matching to pass.
This is added unconditionally to the dg-options (as opposed to using
dg-additional-options) because this test is already specific to ARM
targets, and -mno-long-calls is available on all ARM targets.
for gcc/testsuite/ChangeLog
* gcc.target/arm/memset-inline-2.c: Add -mno-long-calls to
the test's dg-options.
* gcc.target/arm/pr78255-2.c: Likewise.
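For readers unfamiliar with the testsuite directives, the change amounts to extending the option string in the test's `dg-options` comment. The fragment below is only a sketch of what such a test file might look like: the `-O2` flag is a placeholder assumption (the real tests' other options are not shown in this message), and the body simply reuses the `memset (a, -1, 14)` call quoted above.

```c
/* Hypothetical ARM test sketch; only -mno-long-calls comes from the commit. */
/* { dg-do compile } */
/* { dg-options "-O2 -mno-long-calls" } */

#include <string.h>

char a[14];

/* With short calls the compiler can emit a direct branch to memset,
   which is what the assembly scan in the real tests looks for.  */
void
foo (void)
{
  memset (a, -1, 14);
}
```

Because the directives are ordinary C comments, the file still compiles with any C compiler; only the DejaGnu harness interprets them.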
nstester
pushed a commit
to nstester/gcc
that referenced
this pull request
Jun 14, 2021
The fixed error is:
==21166==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60300000d900
#0 0x7367d7 in operator delete(void*, unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:172
#1 0x3b82e6e in pointer_equiv_analyzer::~pointer_equiv_analyzer() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:161
#2 0x3b83387 in hybrid_folder::~hybrid_folder() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:517
#3 0x3b83387 in execute_early_vrp /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:686
#4 0x1790611 in execute_one_pass(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2567
#5 0x1792003 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2656
#6 0x1792029 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2657
#7 0x179209f in execute_pass_list(function*, opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2667
#8 0x178a5f3 in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:1773
#9 0x1792fac in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/plugin.h:191
#10 0x1792fac in execute_ipa_pass_list(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:3001
#11 0xc525fc in ipa_passes /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2154
#12 0xc525fc in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2289
#13 0xc5a096 in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2269
#14 0xc5a096 in symbol_table::finalize_compilation_unit() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2537
#15 0x1a7a17c in compile_file /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:482
#16 0x69c758 in do_compile /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2210
#17 0x69c758 in toplev::main(int, char**) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2349
#18 0x6a932a in main /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/main.c:39
#19 0x7ffff7820b34 in __libc_start_main ../csu/libc-start.c:332
#20 0x6aa5fd in _start (/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/objdir/gcc/cc1+0x6aa5fd)
0x60300000d900 is located 0 bytes inside of 32-byte region [0x60300000d900,0x60300000d920)
allocated by thread T0 here:
#0 0x735ab7 in operator new[](unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:102
#1 0x3b82dac in pointer_equiv_analyzer::pointer_equiv_analyzer(gimple_ranger*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:156
gcc/ChangeLog:
* gimple-ssa-evrp.c (pointer_equiv_analyzer::~pointer_equiv_analyzer): Use delete[].
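The mismatch AddressSanitizer reports here is the general rule that storage obtained with `new[]` must be released with `delete[]`. A minimal sketch of the corrected pattern follows; `equiv_table_sketch` is a hypothetical class used only for illustration and is not the real `pointer_equiv_analyzer`.

```cpp
#include <cstddef>

// Hypothetical class; only the allocation pattern mirrors the report above.
class equiv_table_sketch {
public:
  // Array form of new; the trailing () zero-initializes the slots.
  explicit equiv_table_sketch(std::size_t n)
      : m_size(n), m_slots(new int[n]()) {}

  // The array form of delete matches the new[] in the constructor.
  // Plain "delete m_slots;" here would be the alloc-dealloc mismatch.
  ~equiv_table_sketch() { delete[] m_slots; }

  // Copying is disabled so the array is never freed twice.
  equiv_table_sketch(const equiv_table_sketch &) = delete;
  equiv_table_sketch &operator=(const equiv_table_sketch &) = delete;

  std::size_t size() const { return m_size; }
  int &operator[](std::size_t i) { return m_slots[i]; }

private:
  std::size_t m_size;
  int *m_slots;
};
```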
nstester
pushed a commit
to nstester/gcc
that referenced
this pull request
Apr 23, 2023
Currently on xstormy16 SImode shifts by a single bit require two
instructions, and shifts by other non-zero integer immediate constants
require five instructions. This patch implements the obvious optimization
that shifts by two bits can be done in four instructions, by using two
single-bit sequences.
Hence, ashift_2 was previously generated as:
mov r7,r2 | shl r2,#2 | shl r3,#2 | shr r7,#14 | or r3,r7
ret
and with this patch we now generate:
shl r2,#1 | rlc r3,#1 | shl r2,#1 | rlc r3,#1
ret
2023-04-23 Roger Sayle <[email protected]>
gcc/ChangeLog
* config/stormy16/stormy16.cc (xstormy16_output_shift): Implement
SImode shifts by two by performing a single bit SImode shift twice.
gcc/testsuite/ChangeLog
* gcc.target/xstormy16/shiftsi.c: New test case.
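The new sequence can be modeled in portable C++ to see why two `shl`/`rlc` pairs equal a 32-bit shift by two. This is a sketch under assumptions: `r2` is taken to hold the low 16 bits and `r3` the high 16 bits, `shl` is assumed to leave the shifted-out bit in the carry flag, and `rlc` is assumed to rotate left through that carry. The names `Pair16`, `shift_left_one` and `shift_left_two` are made up for this example.

```cpp
#include <cstdint>

// A 32-bit value split across two 16-bit registers (assumed r2 = lo, r3 = hi).
struct Pair16 {
  std::uint16_t lo;
  std::uint16_t hi;
};

// Models "shl lo,#1 ; rlc hi,#1": a 32-bit left shift by one bit.
inline Pair16 shift_left_one(Pair16 v) {
  unsigned carry = (v.lo >> 15) & 1u;                  // bit shifted out by shl
  v.lo = static_cast<std::uint16_t>(v.lo << 1);        // shl lo,#1
  v.hi = static_cast<std::uint16_t>((v.hi << 1) | carry);  // rlc hi,#1
  return v;
}

// Two single-bit sequences give a shift by two in four instructions.
inline std::uint32_t shift_left_two(std::uint32_t x) {
  Pair16 v{static_cast<std::uint16_t>(x & 0xFFFFu),
           static_cast<std::uint16_t>(x >> 16)};
  v = shift_left_one(shift_left_one(v));
  return (static_cast<std::uint32_t>(v.hi) << 16) | v.lo;
}
```

For any 32-bit input, `shift_left_two(x)` should agree with `x << 2` truncated to 32 bits.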
hubot
pushed a commit
that referenced
this pull request
Mar 21, 2024
Consider
constexpr int VAL = 1;
struct foo {
template <int B>
void bar(typename std::conditional<B==VAL, int, float>::type arg) { }
};
template void foo::bar<1>(int arg);
where we since r11-291 fail to emit the code for the explicit
instantiation. That's because cp_walk_subtrees/TYPENAME_TYPE now
walks TYPE_CONTEXT ('conditional' here) as well, and in a template
finds the B==VAL template argument. VAL is constexpr, which implies const,
which in the global scope implies static. constrain_visibility_for_template
then makes "struct conditional<(B == VAL), int, float>" non-TREE_PUBLIC.
Then symtab_node::needed_p checks TREE_PUBLIC, sees it's 0, and we don't
emit any code.
I thought the fix would be some ODR-esque check to not consider
constexpr variables/fns that are used just for their value. But
it turned out to be tricky. For instance, we can't skip
determine_visibility in a template; we can't even skip it for value-dep
expressions. For example, no-linkage-expr1.C has
using P = struct {}*;
template <int N>
void f(int(*)[((P)0, N)]) {}
where ((P)0, N) is value-dep, but N is not relevant here: we have to
ferret out the anonymous type. When instantiating, it's already gone.
This patch uses decl_constant_var_p. This is to implement (an
approximation of) [basic.def.odr]#14.5.1 and [basic.def.odr]#5.2.
PR c++/110323
gcc/cp/ChangeLog:
* decl2.cc (min_vis_expr_r) <case VAR_DECL>: Do nothing for
decl_constant_var_p VAR_DECLs.
gcc/testsuite/ChangeLog:
* g++.dg/template/explicit-instantiation6.C: New test.
* g++.dg/template/explicit-instantiation7.C: New test.
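A self-contained variant of the example from this message is sketched below. It differs from the original in two labeled ways: the header `<type_traits>` is included explicitly, and `bar` returns a value (an assumption added here so the instantiation can be exercised). Whether the explicit instantiation is actually emitted can only be seen in the object file, for instance by listing its symbols; this sketch only shows that the code is well formed and callable.

```cpp
#include <type_traits>

// constexpr implies const, and a const variable at namespace scope has
// internal linkage; this is what confused the visibility computation.
constexpr int VAL = 1;

struct foo {
  template <int B>
  int bar(typename std::conditional<B == VAL, int, float>::type arg) {
    // Return value added for illustration; the original body is empty.
    return static_cast<int>(arg) + B;
  }
};

// Explicit instantiation: the compiler must emit code for foo::bar<1>.
template int foo::bar<1>(int arg);

// VAL is used only for its value, so the parameter type is simply int.
static_assert(
    std::is_same<std::conditional<1 == VAL, int, float>::type, int>::value,
    "bar<1> takes an int");
```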
hubot
pushed a commit
that referenced
this pull request
May 7, 2024
The local vsetvli elimination considers only the current demand instead of the full demand, and it uses that incomplete information to remove a vsetvli. Consider the following example from PR114747:

vsetvli a5,a1,e8,m4,ta,mu # 57, ratio=2, sew=8, lmul=4
vsetvli zero,a5,e16,m8,ta,ma # 58, ratio=2, sew=16, lmul=8
vle8.v v8,0(a0) # 13, demand ratio=2
vzext.vf2 v24,v8 # 14, demand sew=16 and lmul=8

Insn #58 is removed because #57 already satisfies the demand of #13, but the demand of #14 is not considered.

A more thorough demand analysis would be the proper fix, but this bug is present only on the GCC 13 branch, and we should not change too much on a release branch. The best way is therefore to make the check more conservative: remove the vsetvli only if the target vsetvl_discard_result has the same SEW and LMUL as the source vsetvli.

gcc/ChangeLog:
PR target/114747
* config/riscv/riscv-vsetvl.cc (local_eliminate_vsetvl_insn): Check that the target vsetvl_discard_result and the source vsetvli have the same SEW and LMUL.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr114747.c: New.
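The difference between the two checks can be sketched with plain integers. This is an illustration only, not the code in riscv-vsetvl.cc: `vtype_sketch`, `same_ratio` and `same_sew_lmul` are hypothetical names, and LMUL is modeled as a whole number even though the real ISA also allows fractional values.

```cpp
// Hypothetical model of the two vsetvli settings from the example.
struct vtype_sketch {
  int sew;   // element width in bits
  int lmul;  // register group multiplier (integer values only in this sketch)
};

// The SEW/LMUL ratio, as annotated in the example (ratio=2 for both insns).
constexpr int ratio(vtype_sketch v) { return v.sew / v.lmul; }

// Looser check: enough for an insn that only demands the ratio (#13).
constexpr bool same_ratio(vtype_sketch a, vtype_sketch b) {
  return ratio(a) == ratio(b);
}

// Conservative check described in the message: both SEW and LMUL must match.
constexpr bool same_sew_lmul(vtype_sketch a, vtype_sketch b) {
  return a.sew == b.sew && a.lmul == b.lmul;
}
```

With insn #57 as `{8, 4}` and insn #58 as `{16, 8}`, the ratio check passes while the conservative check fails, so #58 would be kept.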
Frosty515
pushed a commit
to FrostyOS-dev/gcc
that referenced
this pull request
May 23, 2024
The local vsetvli elimination considers only the current demand instead of the full demand, and it uses that incomplete information to remove a vsetvli. Consider the following example from PR114747:

vsetvli a5,a1,e8,m4,ta,mu # 57, ratio=2, sew=8, lmul=4
vsetvli zero,a5,e16,m8,ta,ma # 58, ratio=2, sew=16, lmul=8
vle8.v v8,0(a0) # 13, demand ratio=2
vzext.vf2 v24,v8 # 14, demand sew=16 and lmul=8

Insn #58 is removed because #57 already satisfies the demand of #13, but the demand of #14 is not considered.

A more thorough demand analysis would be the proper fix, but this bug is present only on the GCC 13 branch, and we should not change too much on a release branch. The best way is therefore to make the check more conservative: remove the vsetvli only if the target vsetvl_discard_result has the same SEW and LMUL as the source vsetvli.

gcc/ChangeLog:
PR target/114747
* config/riscv/riscv-vsetvl.cc (local_eliminate_vsetvl_insn): Check that the target vsetvl_discard_result and the source vsetvli have the same SEW and LMUL.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr114747.c: New.
NinaRanns
referenced
this pull request
in NinaRanns/gcc
Jul 9, 2024
update trunk to r15-1804-g2be2145f4f14a7
MBeijer
pushed a commit
to AmigaPorts/gcc
that referenced
this pull request
Jun 6, 2025
The local vsetvli elimination considers only the current demand instead of the full demand, and it uses that incomplete information to remove a vsetvli. Consider the following example from PR114747:

vsetvli a5,a1,e8,m4,ta,mu # 57, ratio=2, sew=8, lmul=4
vsetvli zero,a5,e16,m8,ta,ma # 58, ratio=2, sew=16, lmul=8
vle8.v v8,0(a0) # 13, demand ratio=2
vzext.vf2 v24,v8 # 14, demand sew=16 and lmul=8

Insn #58 is removed because #57 already satisfies the demand of #13, but the demand of #14 is not considered.

A more thorough demand analysis would be the proper fix, but this bug is present only on the GCC 13 branch, and we should not change too much on a release branch. The best way is therefore to make the check more conservative: remove the vsetvli only if the target vsetvl_discard_result has the same SEW and LMUL as the source vsetvli.

gcc/ChangeLog:
PR target/114747
* config/riscv/riscv-vsetvl.cc (local_eliminate_vsetvl_insn): Check that the target vsetvl_discard_result and the source vsetvli have the same SEW and LMUL.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr114747.c: New.
hubot
pushed a commit
that referenced
this pull request
Oct 31, 2025
It looks like during the upstreaming of BF16 we didn't implement the extend
optab for it.
As a result we go through soft-float emulation which results in massive
performance drop in projects using BF16.
As an example, for
float convert(__bf16 value) {
return (float)value;
}
we generate:
convert(__bf16):
stp x29, x30, [sp, -16]!
mov x29, sp
bl __extendbfsf2
ldp x29, x30, [sp], 16
ret
and after this patch
convert:
movi v31.4s, 0
ext v0.16b, v31.16b, v0.16b, #14
ret
We generate an ext with movi because this has the same latency as a shift but
twice the throughput. The zero vector has zero latency, so in real
workloads this codegen is much better than using shifts.
As a reminder, BF16 -> FP32 is just shifting left 16 bits.
The expand pattern has to rely on generating multiple subregs due to a
restriction that subregs can't change floating point size and type at the same
time.
I've tried alternative approaches like using the EXT as SF mode, but the
paradoxical subreg of BF -> SF isn't allowed and using an extend doesn't work
because extend is what we're defining.
gcc/ChangeLog:
PR target/121853
* config/aarch64/aarch64-simd.md (extendbfsf2): New.
gcc/testsuite/ChangeLog:
PR target/121853
* gcc.target/aarch64/pr121853_1.c: New test.
* gcc.target/aarch64/pr121853_2.c: New test.
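The remark that BF16 -> FP32 is just a 16-bit left shift can be checked with a small portable sketch. It assumes the BF16 value is available as its raw 16-bit pattern and that `float` is a 32-bit IEEE 754 binary32 value; `bf16_bits_to_float` is a hypothetical helper name, not something from GCC or libgcc.

```cpp
#include <cstdint>
#include <cstring>

// Widen a raw BF16 bit pattern to float by placing it in the top 16 bits.
// BF16 shares the sign and exponent layout of binary32 and keeps only the
// top 7 mantissa bits, so the low 16 bits of the result are simply zero.
inline float bf16_bits_to_float(std::uint16_t bits) {
  std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float result;
  static_assert(sizeof(result) == sizeof(widened),
                "this sketch assumes a 32-bit float");
  std::memcpy(&result, &widened, sizeof(result));  // bit-exact reinterpretation
  return result;
}
```

The `ext` in the generated code has the same effect on the low lane: taking 16 bytes starting at byte 14 of the zero vector followed by `v0` places two zero bytes below the original 16-bit value.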
hubot
pushed a commit
that referenced
this pull request
Nov 18, 2025
It looks like during the upstreaming of BF16 we didn't implement the extend
optab for it.
As a result we go through soft-float emulation which results in massive
performance drop in projects using BF16.
As an example, for
float convert(__bf16 value) {
return (float)value;
}
we generate:
convert(__bf16):
stp x29, x30, [sp, -16]!
mov x29, sp
bl __extendbfsf2
ldp x29, x30, [sp], 16
ret
and after this patch
convert:
movi v31.4s, 0
ext v0.16b, v31.16b, v0.16b, #14
ret
We generate an ext with movi because this has the same latency as a shift but
twice the throughput. The zero vector has zero latency, so in real
workloads this codegen is much better than using shifts.
As a reminder, BF16 -> FP32 is just shifting left 16 bits.
The expand pattern has to rely on generating multiple subregs due to a
restriction that subregs can't change floating point size and type at the same
time.
I've tried alternative approaches like using the EXT as SF mode, but the
paradoxical subreg of BF -> SF isn't allowed and using an extend doesn't work
because extend is what we're defining.
gcc/ChangeLog:
PR target/121853
* config/aarch64/aarch64-simd.md (extendbfsf2): New.
gcc/testsuite/ChangeLog:
PR target/121853
* gcc.target/aarch64/pr121853_1.c: New test.
* gcc.target/aarch64/pr121853_2.c: New test.
(cherry picked from commit 58ee207)
No description provided.