Releases/gcc 12 #65

jacopobrusini · 2022-06-04T17:20:00Z

Support for Apple Silicon!!!

jwakely · 2024-02-21T00:10:33Z

This is an unofficial mirror that has nothing to do with the GCC project, so submitting pull requests here is a waste of time.

Also, I have no idea what this pull request is trying to do but it would never be accepted even if it was submitted to the right place.

When a function argument or return value has BLKmode, it is put in a PARALLEL with an EXPR_LIST, and the EXPR_LIST contains the real mode and registers. The current ix86_check_avx_upper_register only checked for SSE_REG_P and failed to handle that case. The patch extends the handling to each subrtx. gcc/ChangeLog: PR target/116512 * config/i386/i386.cc (ix86_check_avx_upper_register): Iterate subrtx to scan for avx upper register. (ix86_check_avx_upper_stores): Inline old ix86_check_avx_upper_register. (ix86_avx_u128_mode_needed): Ditto, and replace FOR_EACH_SUBRTX with call to new ix86_check_avx_upper_register. gcc/testsuite/ChangeLog: * gcc.target/i386/pr116512.c: New test. (cherry picked from commit ab214ef)

The non-optimized variant of the intrinsic had a typo in the mask type, which caused the high bits of __mmask32 to be unexpectedly zeroed. The test does not fail under O0 with the current 1b since the testcase is wrong. We need to include avx512-mask-type.h after SIZE is defined, or the mask type will always be __mmask8. That problem also happened in the AVX10.2 testcases. I will write a separate patch to fix that. gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm512_mask_fpclass_ph_mask): Correct mask type to __mmask32. (_mm512_fpclass_ph_mask): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vfpclassph-1c.c: New test.
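To illustrate why the width of the mask type matters, here is a minimal sketch in plain C++ that does not use the real intrinsics. The aliases `mask8_t` and `mask32_t` are hypothetical stand-ins for a narrow mask type such as `__mmask8` and for `__mmask32`, and `classify_lanes` is an invented placeholder for an operation that yields one result bit per 16-bit lane of a 512-bit vector.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the real mask typedefs (assumption, not GCC's).
using mask8_t = std::uint8_t;
using mask32_t = std::uint32_t;

// Invented placeholder: one result bit per lane. A 512-bit vector of
// 16-bit elements has 32 lanes, so a full result needs 32 bits.
inline mask32_t classify_lanes() { return 0xFFFF00FFu; }

// Buggy shape: the result passes through a too-narrow mask type,
// which silently drops bits 8..31.
inline mask32_t through_narrow_mask() {
    mask8_t narrow = static_cast<mask8_t>(classify_lanes());
    return narrow;
}

// Fixed shape: the mask type is as wide as the lane count.
inline mask32_t through_wide_mask() {
    mask32_t wide = classify_lanes();
    return wide;
}
```

With the narrow type only the low eight lanes survive, which is also why a test that always instantiates with an 8-bit mask cannot notice the problem.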

Update analyze_parms not to disable function parameter analysis for -ffat-lto-objects. Tested on x86-64, there are no differences in zstd with "-O2 -flto=auto -g" vs "-O2 -flto=auto -g -ffat-lto-objects". PR ipa/116410 * ipa-modref.cc (analyze_parms): Always analyze function parameter for LTO. Signed-off-by: H.J. Lu <[email protected]> (cherry picked from commit 2f1689e)

Don't use temp for a PARALLEL BLKmode argument of an EXPR_LIST expression in a TImode register. Otherwise, the TImode variable will be put in the GPR save area which guarantees only 8-byte alignment. gcc/ PR target/116621 * config/i386/i386.cc (ix86_gimplify_va_arg): Don't use temp for a PARALLEL BLKmode container of an EXPR_LIST expression in a TImode register. gcc/testsuite/ PR target/116621 * gcc.target/i386/pr116621.c: New test. Signed-off-by: H.J. Lu <[email protected]> (cherry picked from commit fa7bbb0)

Loop distribution does different analysis with -g0/-g due to counting a debug stmt starting a BB against a limit, which will eventually lead to different IVOPTs choices. I've fixed a possible IVOPTs issue on the way even though it doesn't make a difference here. PR tree-optimization/116290 * tree-loop-distribution.cc (determine_reduction_stmt_1): PHIs have no debug variants. Start with first non-debug real stmt. * tree-ssa-loop-ivopts.cc (find_givs_in_bb): Do not analyze debug stmts. * gcc.dg/pr116290.c: New testcase. (cherry picked from commit 5667400)

The following reverts a bogus fix done for PR101009 and instead makes sure we get into the same_access_functions () case when computing the distance vector for g[1] and g[1] where the constants ended up having different types. The generic code doesn't seem to handle loop invariant dependences. The special case gets us both ( 0 ) and ( 1 ) as distance vectors while formerly we got ( 1 ), which the PR101009 fix changed to ( 0 ) with bad effects on other cases as shown in this PR. PR tree-optimization/116768 * tree-data-ref.cc (build_classic_dist_vector_1): Revert PR101009 change. * tree-chrec.cc (eq_evolutions_p): Make sure (sizetype)1 and (int)1 compare equal. * gcc.dg/torture/pr116768.c: New testcase. (cherry picked from commit 5b5a36b)

@2) Transforming -fma (-a, b, -c) to fma (a, b, c) is only valid when not rounding towards -inf or +inf as the sign of the multiplication changes. PR middle-end/116891 * match.pd ((negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)): Only enable for !HONOR_SIGN_DEPENDENT_ROUNDING. (cherry picked from commit c53bd48)
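The rounding argument can be checked numerically. The sketch below is an illustration in standard C++, not code from the patch: the helper names `fma_in_mode` and `neg_fma_neg_in_mode` are hypothetical, and it assumes the platform's `std::fma` honors the dynamic rounding mode set with `std::fesetround`. The `volatile` objects are there to keep the compiler from folding the calls or moving them across the rounding-mode changes.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Evaluate fma (a, b, c) under the given rounding mode, then restore
// round-to-nearest.
inline double fma_in_mode(int mode, double a, double b, double c) {
    volatile double va = a, vb = b, vc = c;
    volatile double result;
    std::fesetround(mode);
    result = std::fma(va, vb, vc);
    std::fesetround(FE_TONEAREST);
    return result;
}

// Evaluate -fma (-a, b, -c) under the given rounding mode, then restore
// round-to-nearest. The negations themselves are exact.
inline double neg_fma_neg_in_mode(int mode, double a, double b, double c) {
    volatile double va = -a, vb = b, vc = -c;
    volatile double result;
    std::fesetround(mode);
    result = -std::fma(va, vb, vc);
    std::fesetround(FE_TONEAREST);
    return result;
}
```

For a = 1, b = 1 and c = 2^-60, both forms give 1.0 in round-to-nearest. Rounding toward -inf, the first form rounds 1 + 2^-60 down to 1.0, while the second rounds -(1 + 2^-60) down to the next value below -1.0 and then negates it, giving a result above 1.0.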

On Mon, Oct 14, 2024 at 08:53:29AM +0200, Jakub Jelinek wrote:
> > PR middle-end/116891
> > * match.pd ((negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)):
> > Only enable for !HONOR_SIGN_DEPENDENT_ROUNDING.
>
> Guess it would be nice to have a testcase which FAILs without the patch and
> PASSes with it, but it can be added later.

I've added such a testcase now, and additionally found the fix only fixed one of the 4 problematic similar cases. Here is a patch which fixes the others too and adds the testcases. fma-pr116891.c FAILed without your patch, FAILs with your patch too (but only due to the bar/baz/qux checks) and PASSes with the patch.

2024-10-15 Jakub Jelinek <[email protected]>

PR middle-end/116891
* match.pd ((negate (fmas@3 @0 @1 @2)) -> (IFN_FNMS @0 @1 @2)): Only enable for !HONOR_SIGN_DEPENDENT_ROUNDING.
((negate (IFN_FMS@3 @0 @1 @2)) -> (IFN_FNMA @0 @1 @2)): Likewise.
((negate (IFN_FNMA@3 @0 @1 @2)) -> (IFN_FMS @0 @1 @2)): Likewise.
* gcc.dg/pr116891.c: New test.
* gcc.target/i386/fma-pr116891.c: New test.

(cherry picked from commit 4366f0c)

…ication For vector types we have to make sure the comparison result is a vector type and the resulting compare operation is supported. As the resulting compare is never an equality compare I didn't bother to check for the cbranch case. PR tree-optimization/117104 * match.pd ((cmp:c (minmax:c @0 @1) @0) -> (out @0 @1)): Properly guard the vector case. * gcc.dg/pr117104.c: New testcase. (cherry picked from commit f54d42e)
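For scalar integers, the kind of rewrite this pattern performs rests on a few simple identities, which the following sketch checks by brute force. It is an illustration only: the function name `minmax_compare_identities_hold` is hypothetical, the list of identities is not claimed to match the exact operator pairs in match.pd, and the vector-specific guards the commit adds (vector result type, supported compare) are not modeled.

```cpp
#include <algorithm>
#include <cassert>

// Check, for every pair in [-range, range], that comparing min/max of
// (a, b) against a is equivalent to a direct comparison of a and b.
inline bool minmax_compare_identities_hold(int range) {
    for (int a = -range; a <= range; ++a) {
        for (int b = -range; b <= range; ++b) {
            const int lo = std::min(a, b);
            const int hi = std::max(a, b);
            if ((lo < a) != (b < a)) return false;   // min (a, b) <  a
            if ((lo >= a) != (b >= a)) return false; // min (a, b) >= a
            if ((hi > a) != (b > a)) return false;   // max (a, b) >  a
            if ((hi <= a) != (b <= a)) return false; // max (a, b) <= a
            // An equality test against a folds to an ordering compare.
            if ((lo == a) != (a <= b)) return false; // min (a, b) == a
            if ((hi == a) != (a >= b)) return false; // max (a, b) == a
        }
    }
    return true;
}
```

In every case the simplified form is an ordering comparison of the two operands, which is consistent with the remark that the resulting compare is never an equality compare.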

The diagnostics code fails to handle non-constant domain max. PR tree-optimization/117254 * gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg): Check the array domain max is constant before using it. * gcc.dg/pr117254.c: New testcase. (cherry picked from commit d464a52)

STMT_VINFO_SLP_VECT_ONLY isn't properly computed as union of all group members and when the group is later split due to duplicates not all sub-groups inherit the flag. PR tree-optimization/117307 * tree-vect-data-refs.cc (vect_analyze_data_ref_accesses): Properly compute STMT_VINFO_SLP_VECT_ONLY. Set it on all parts of a split group. * gcc.dg/vect/pr117307.c: New testcase. (cherry picked from commit 1972230)

When we decompose a complex load only used as real and imaginary parts we fail to honor IL constraints which are that a BIT_FIELD_REF of register type should be outermost in a ref. The following simply avoids the transform when the complex load has such a BIT_FIELD_REF. PR tree-optimization/117417 * tree-ssa-forwprop.cc (pass_forwprop::execute): Avoid decomposing BIT_FIELD_REF complex load. * gcc.dg/torture/pr117417.c: New testcase. (cherry picked from commit d976daa)

This patch removes the (unnecessary) CPP_PRAGMA_EOL case from cp_parser_cache_defarg, which currently has the result that any pragmas in the NSDMI cause an error. PR c++/118147 gcc/cp/ChangeLog: * parser.cc (cp_parser_cache_defarg): Don't error when CPP_PRAGMA_EOL. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/nsdmi-defer7.C: New test. Signed-off-by: Nathaniel Shead <[email protected]> (cherry picked from commit f3ccc57)

We are initializing both the call graph node count and the entry block count of the function with the head_count value from the profile. Count propagation algorithm may refine the entry block count and we may end up with a case where the call graph node count is set to zero but the entry block count is non-zero. That becomes a problem because we have this code in execute_fixup_cfg:

  profile_count num = node->count;
  profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
  bool scale = num.initialized_p () && !(num == den);

Here if num is 0 but den is not 0, scale becomes true and we lose the counts in

  if (scale)
    bb->count = bb->count.apply_scale (num, den);

This is what happened in the issue reported in PR116743 (a 10% regression in MySQL HAMMERDB tests). 3d9e676 made an improvement in AutoFDO count propagation, which caused a mismatch between the call graph node count (zero) and the entry block count (non-zero) and subsequent loss of counts as described above. The fix is to update the call graph node count once we've done count propagation. Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

PR gcov-profile/116743
* auto-profile.cc (afdo_annotate_cfg): Fix mismatch between the call graph node count and the entry block count.

(cherry picked from commit e683c6b)
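The effect of the mismatch can be reproduced with a toy model. The sketch below is not GCC code: `Count` and `fixup_counts` are hypothetical simplifications of `profile_count` and of the check quoted above, using plain integers, and the guard against a zero denominator is an addition of this sketch rather than something taken from the source.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for a profile count: a value plus an "initialized" flag.
struct Count {
    std::uint64_t value;
    bool initialized;
};

// Rescale every block count by num/den when the node count and the entry
// block count disagree. Returns true if scaling happened.
inline bool fixup_counts(const Count &node_count, const Count &entry_count,
                         std::vector<std::uint64_t> &block_counts) {
    const bool scale = node_count.initialized
                       && node_count.value != entry_count.value;
    // Sketch-only guard: avoid dividing by zero.
    if (!scale || entry_count.value == 0) return false;
    for (std::uint64_t &c : block_counts)
        c = c * node_count.value / entry_count.value; // small values assumed
    return true;
}
```

With a node count of zero and an entry count of 100, every block count is scaled to zero. Once the two counts agree, as the fix arranges, no scaling happens and the block counts are preserved.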

…PR118255]

We currently reject the following code

=== code here ===
template <int non_template> struct S { friend class non_template; };
class non_template {};
S<0> s;
=== code here ===

While EDG agrees with the current behaviour, clang and MSVC don't (see https://godbolt.org/z/69TGaabhd), and I believe that this code is valid, since the friend clause does not actually declare a type, so it cannot shadow anything. The fact that we didn't error out if the non_template class was declared before S backs this up as well. This patch fixes this by skipping the call to check_template_shadow for hidden bindings.

PR c++/118255

gcc/cp/ChangeLog:

* name-lookup.cc (pushdecl): Don't call check_template_shadow for hidden bindings.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/pr99116-1.C: Adjust test expectation.
* g++.dg/template/friend84.C: New test.

(cherry picked from commit b5a0692)

…[PR118067] SImode and DImode moves from/to mask registers are valid only with AVX512BW, so mark relevant alternatives in *movsi_internal and *movdi_internal as such. PR target/118067 gcc/ChangeLog: * config/i386/i386.md (*movdi_internal): Disable alternatives from/to mask registers without AVX512BW. (*movsi_internal): Ditto.

The introduction of gdc.test/runnable/test23514.d exposed an incorrect compilation when adding a 64-bit constant to a link-time address. The current cast to size_t causes a loss of precision, which can result in incorrect compilation. PR d/114434 gcc/d/ChangeLog: * expr.cc (ExprVisitor::visit (PtrExp *)): Get the offset as a dinteger_t rather than a size_t. (ExprVisitor::visit (SymOffExp *)): Likewise. gcc/testsuite/ChangeLog: * gdc.test/runnable/test23514.d: New test. (cherry picked from commit 9ab3895)
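The loss of precision can be pictured with a small C++ sketch. It is not the D front-end code: `host_size_t` and `target_int_t` are hypothetical aliases, and the sketch assumes the problematic configuration is one where size_t is narrower than the 64-bit constant, for example a 32-bit host.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 32-bit size_t (assumption of this sketch).
using host_size_t = std::uint32_t;
// Hypothetical stand-in for a 64-bit integer type such as dinteger_t.
using target_int_t = std::uint64_t;

// Buggy shape: the offset is narrowed to the size type before the addition,
// so any bits above bit 31 are lost.
inline target_int_t add_offset_narrowed(target_int_t base, target_int_t offset) {
    return base + static_cast<host_size_t>(offset);
}

// Fixed shape: the offset keeps its full 64-bit width.
inline target_int_t add_offset_full(target_int_t base, target_int_t offset) {
    return base + offset;
}
```

Adding the offset 0x100000000 to a base of 0x1000 leaves the base unchanged in the narrowed version, while the full-width version produces 0x100001000.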

We disable gathers for zen4. It seems that gather has improved a bit compared to zen4, and the Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by shuffles result in a higher load bandwidth." However, the situation seems to be more complicated. Gather is a 5-10% loss on the parest benchmark as well as a 30% loss on sparse dot products in TSVC. Curiously enough, breaking these out into a microbenchmark reversed the situation, and it turns out that the performance depends on how the indices are distributed: gather is a loss if the indices are sequential, neutral if they are random, and a win for some strides (4, 8). This seems to be similar to earlier zens, so I think (especially for backporting znver5 support) that it makes sense to be consistent and disable gather unless we work out a good heuristic for when to use it. Since we typically do not know the indices in advance, I don't see how that can be done. I opened PR116582 with some examples of wins and losses. gcc/ChangeLog: * config/i386/x86-tune.def (X86_TUNE_USE_GATHER_2PARTS): Disable for ZNVER5. (X86_TUNE_USE_SCATTER_2PARTS): Disable for ZNVER5. (X86_TUNE_USE_GATHER_4PARTS): Disable for ZNVER5. (X86_TUNE_USE_SCATTER_4PARTS): Disable for ZNVER5. (X86_TUNE_USE_GATHER_8PARTS): Disable for ZNVER5. (X86_TUNE_USE_SCATTER_8PARTS): Disable for ZNVER5. (cherry picked from commit d82edbe)
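For readers unfamiliar with the access pattern being discussed, here is a minimal sketch of a sparse dot product together with an index generator for the sequential and strided cases. It is not the microbenchmark from PR116582: the names `sparse_dot` and `strided_indices` are hypothetical, no timing is done, and whether a compiler actually emits a gather instruction for the loop depends on the target and tuning flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sparse dot product: the indexed load values[idx[i]] is the access
// pattern a vectorizer may implement with a gather instruction.
inline double sparse_dot(const std::vector<double> &values,
                         const std::vector<std::size_t> &idx,
                         const std::vector<double> &weights) {
    assert(idx.size() == weights.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < idx.size(); ++i)
        sum += values[idx[i]] * weights[i];
    return sum;
}

// Indices 0, stride, 2*stride, ... wrapped to stay below `limit`.
// stride == 1 gives the sequential case; 4 or 8 give strided cases.
inline std::vector<std::size_t> strided_indices(std::size_t count,
                                                std::size_t stride,
                                                std::size_t limit) {
    std::vector<std::size_t> idx(count);
    for (std::size_t i = 0; i < count; ++i)
        idx[i] = (i * stride) % limit;
    return idx;
}
```

Because the index vector is runtime data, the same kernel sees sequential, strided, or random indices without the compiler being able to tell them apart, which is the difficulty the message describes for any compile-time heuristic.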

PR d/111650 gcc/d/ChangeLog: * decl.cc (get_fndecl_arguments): Move generation of frame type to ... (DeclVisitor::visit (FuncDeclaration *)): ... here, after the call to build_closure. gcc/testsuite/ChangeLog: * gdc.dg/pr111650.d: New test. (cherry picked from commit 4d4929f)

2025-01-23 John David Anglin <[email protected]> gcc/ChangeLog: * config/pa/pa32-regs.h (ADDITIONAL_REGISTER_NAMES): Change register 86 name to "%fr31L".

The loop checking for built-in constant operand restrictions was missing some operands due to the loop limit being too small. Fixing that exposed a testsuite failure which is caused by a typo in the pmxvi4ger8pp definition where we had made the PMASK field too small. 2025-01-16 Peter Bergner <[email protected]> gcc/ * config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Use correct array size for the loop limit. * config/rs6000/rs6000-builtins.def: Fix field size for PMASK operand. (cherry picked from commit 1a2d63a)

For invalid constant operand values used in built-in functions, return const0_rtx to signify an error occurred during expansion. 2025-01-16 Peter Bergner <[email protected]> gcc/ * config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Return const0_rtx when there is an error. gcc/testsuite/ * gcc.target/powerpc/mma-builtin-error.c: New test. (cherry picked from commit 0696af7)

…on-r15-7214-g0710024b5bd861 Contracts nonattr rebase on r15 7214 g0710024b5bd861

atahanozbayram approved these changes Apr 2, 2024

GCC Administrator and others added 28 commits August 24, 2024 00:19

Daily bump.

15176ab

Daily bump.

19fedf7

Daily bump.

84fc228

Daily bump.

c2305c8

Daily bump.

9742dbd

Daily bump.

2875f9f

Daily bump.

a9284c5

Daily bump.

bb95e77

Daily bump.

4dc921b

Daily bump.

911eadd

Daily bump.

93e66ca

Daily bump.

87a5641

Daily bump.

71f9ca6

Daily bump.

fc14ff0

Daily bump.

0f053a8

Daily bump.

0dba957

Daily bump.

b48e7c2

Daily bump.

b64a998

Daily bump.

682cc3f

Daily bump.

0344276

Daily bump.

6aceb85

Daily bump.

5c8f84c

Daily bump.

772393c

Daily bump.

46bf97c

rguenth and others added 29 commits January 17, 2025 09:52

Daily bump.

a5ea6f4

Daily bump.

51c1abd

Daily bump.

94fdc54

Daily bump.

8aaddf0

Daily bump.

e24b17e

Daily bump.

5f01025

hppa: Fix typo in ADDITIONAL_REGISTER_NAMES in pa32-regs.h

40eafb7

2025-01-23 John David Anglin <[email protected]> gcc/ChangeLog: * config/pa/pa32-regs.h (ADDITIONAL_REGISTER_NAMES): Change register 86 name to "%fr31L".

Daily bump.

11e11d4

Daily bump.

809bf0c

Daily bump.

5032e62

Daily bump.

71c036f

Daily bump.

88e26ba

NinaRanns pushed a commit to NinaRanns/gcc that referenced this pull request Jan 28, 2025

Merge pull request gcc-mirror#65 from iains/contracts-nonattr-rebase-…

5128291

…on-r15-7214-g0710024b5bd861 Contracts nonattr rebase on r15 7214 g0710024b5bd861
