Commit eddff3a

rsandifo committed
[AArch64] Rework SVE INC/DEC handling
The scalar addition patterns allowed all the VL constants that ADDVL and ADDPL allow, but wrote the instructions as INC or DEC if possible (i.e. adding or subtracting a number of elements * [1, 16] when the source and target registers are the same). That works for the cases that the autovectoriser needs, but there are a few constants that INC and DEC can handle but ADDPL and ADDVL can't. E.g.:

        inch    x0, all, mul #9

is not a multiple of the number of bytes in an SVE register, and so can't use ADDVL. It represents 36 times the number of bytes in an SVE predicate, putting it outside the range of ADDPL.

This patch therefore adds separate alternatives for INC and DEC, tied to a new Uai constraint. It also adds an explicit "scalar" or "vector" to the function names, to avoid a clash with the existing support for vector INC and DEC.

2019-08-15 Richard Sandiford <[email protected]>

gcc/
	* config/aarch64/aarch64-protos.h
	(aarch64_sve_scalar_inc_dec_immediate_p): Declare.
	(aarch64_sve_inc_dec_immediate_p): Rename to...
	(aarch64_sve_vector_inc_dec_immediate_p): ...this.
	(aarch64_output_sve_addvl_addpl): Take a single rtx argument.
	(aarch64_output_sve_scalar_inc_dec): Declare.
	(aarch64_output_sve_inc_dec_immediate): Rename to...
	(aarch64_output_sve_vector_inc_dec): ...this.
	* config/aarch64/aarch64.c (aarch64_sve_scalar_inc_dec_immediate_p)
	(aarch64_output_sve_scalar_inc_dec): New functions.
	(aarch64_output_sve_addvl_addpl): Remove the base and offset
	arguments. Only handle true ADDVL and ADDPL instructions;
	don't emit an INC or DEC.
	(aarch64_sve_inc_dec_immediate_p): Rename to...
	(aarch64_sve_vector_inc_dec_immediate_p): ...this.
	(aarch64_output_sve_inc_dec_immediate): Rename to...
	(aarch64_output_sve_vector_inc_dec): ...this. Update call to
	aarch64_sve_vector_inc_dec_immediate_p.
	* config/aarch64/predicates.md (aarch64_sve_scalar_inc_dec_immediate)
	(aarch64_sve_plus_immediate): New predicates.
	(aarch64_pluslong_operand): Accept aarch64_sve_plus_immediate
	rather than aarch64_sve_addvl_addpl_immediate.
	(aarch64_sve_inc_dec_immediate): Rename to...
	(aarch64_sve_vector_inc_dec_immediate): ...this. Update call to
	aarch64_sve_vector_inc_dec_immediate_p.
	(aarch64_sve_add_operand): Update accordingly.
	* config/aarch64/constraints.md (Uai): New constraint.
	(vsi): Update call to aarch64_sve_vector_inc_dec_immediate_p.
	* config/aarch64/aarch64.md (add<GPI:mode>3): Don't force the second
	operand into a register if it satisfies aarch64_sve_plus_immediate.
	(*add<GPI:mode>3_aarch64, *add<GPI:mode>3_poly_1): Add an alternative
	for Uai. Update calls to aarch64_output_sve_addvl_addpl.
	* config/aarch64/aarch64-sve.md (add<mode>3): Call
	aarch64_output_sve_vector_inc_dec instead of
	aarch64_output_sve_inc_dec_immediate.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@274518 138bc75d-0d04-0410-961f-82ee72b054a4
1 parent f478677 commit eddff3a
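
The arithmetic in the commit message can be checked with a small model. The sketch below is not GCC code. It assumes, as the `coeffs[1]` uses in this patch suggest, that a VL-based constant can be described by a single integer `factor` counted in bytes per 128-bit block of vector length, so that one vector is 16 and one predicate is 2. The function names are hypothetical, and the ranges are the architectural ones: ADDVL and ADDPL take an immediate in [-32, 31], while INC and DEC take an element size (B, H, W or D) and a multiplier in [1, 16].

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model: FACTOR is the number of bytes added per 128-bit
   block of vector length, so a full vector is 16 and a predicate is 2.  */

/* True if FACTOR is a whole number of vectors within ADDVL's range.  */
bool fits_addvl(int factor)
{
    return factor != 0 && factor % 16 == 0
           && factor / 16 >= -32 && factor / 16 <= 31;
}

/* True if FACTOR is a whole number of predicates within ADDPL's range.  */
bool fits_addpl(int factor)
{
    return factor != 0 && factor % 2 == 0
           && factor / 2 >= -32 && factor / 2 <= 31;
}

/* True if |FACTOR| is (elements per block) * [1, 16] for some element
   size: 16, 8, 4 or 2 elements per block for B, H, W and D.  */
bool fits_inc_dec(int factor)
{
    static const int elts_per_block[] = { 16, 8, 4, 2 };
    int magnitude = abs(factor);
    for (int i = 0; i < 4; i++) {
        int n = elts_per_block[i];
        if (magnitude % n == 0 && magnitude / n >= 1 && magnitude / n <= 16)
            return true;
    }
    return false;
}
```

For `inch x0, all, mul #9` the factor is 8 * 9 = 72. `fits_inc_dec(72)` holds, but 72 is not a multiple of 16 and 72 / 2 = 36 exceeds 31, so both `fits_addvl(72)` and `fits_addpl(72)` fail. That is the gap the new alternatives close.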

7 files changed: 116 additions and 42 deletions

gcc/ChangeLog

Lines changed: 38 additions & 0 deletions
@@ -1,3 +1,41 @@
+2019-08-15 Richard Sandiford <[email protected]>
+
+	* config/aarch64/aarch64-protos.h
+	(aarch64_sve_scalar_inc_dec_immediate_p): Declare.
+	(aarch64_sve_inc_dec_immediate_p): Rename to...
+	(aarch64_sve_vector_inc_dec_immediate_p): ...this.
+	(aarch64_output_sve_addvl_addpl): Take a single rtx argument.
+	(aarch64_output_sve_scalar_inc_dec): Declare.
+	(aarch64_output_sve_inc_dec_immediate): Rename to...
+	(aarch64_output_sve_vector_inc_dec): ...this.
+	* config/aarch64/aarch64.c (aarch64_sve_scalar_inc_dec_immediate_p)
+	(aarch64_output_sve_scalar_inc_dec): New functions.
+	(aarch64_output_sve_addvl_addpl): Remove the base and offset
+	arguments. Only handle true ADDVL and ADDPL instructions;
+	don't emit an INC or DEC.
+	(aarch64_sve_inc_dec_immediate_p): Rename to...
+	(aarch64_sve_vector_inc_dec_immediate_p): ...this.
+	(aarch64_output_sve_inc_dec_immediate): Rename to...
+	(aarch64_output_sve_vector_inc_dec): ...this. Update call to
+	aarch64_sve_vector_inc_dec_immediate_p.
+	* config/aarch64/predicates.md (aarch64_sve_scalar_inc_dec_immediate)
+	(aarch64_sve_plus_immediate): New predicates.
+	(aarch64_pluslong_operand): Accept aarch64_sve_plus_immediate
+	rather than aarch64_sve_addvl_addpl_immediate.
+	(aarch64_sve_inc_dec_immediate): Rename to...
+	(aarch64_sve_vector_inc_dec_immediate): ...this. Update call to
+	aarch64_sve_vector_inc_dec_immediate_p.
+	(aarch64_sve_add_operand): Update accordingly.
+	* config/aarch64/constraints.md (Uai): New constraint.
+	(vsi): Update call to aarch64_sve_vector_inc_dec_immediate_p.
+	* config/aarch64/aarch64.md (add<GPI:mode>3): Don't force the second
+	operand into a register if it satisfies aarch64_sve_plus_immediate.
+	(*add<GPI:mode>3_aarch64, *add<GPI:mode>3_poly_1): Add an alternative
+	for Uai. Update calls to aarch64_output_sve_addvl_addpl.
+	* config/aarch64/aarch64-sve.md (add<mode>3): Call
+	aarch64_output_sve_vector_inc_dec instead of
+	aarch64_output_sve_inc_dec_immediate.
+
 2019-08-15 Richard Sandiford <[email protected]>
 
 	* config/aarch64/iterators.md (UNSPEC_REVB, UNSPEC_REVH)

gcc/config/aarch64/aarch64-protos.h

Lines changed: 5 additions & 3 deletions
@@ -476,17 +476,19 @@ bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
 bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
 opt_machine_mode aarch64_sve_pred_mode (unsigned int);
 bool aarch64_sve_cnt_immediate_p (rtx);
+bool aarch64_sve_scalar_inc_dec_immediate_p (rtx);
 bool aarch64_sve_addvl_addpl_immediate_p (rtx);
-bool aarch64_sve_inc_dec_immediate_p (rtx);
+bool aarch64_sve_vector_inc_dec_immediate_p (rtx);
 int aarch64_add_offset_temporaries (rtx);
 void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, rtx, rtx, rtx);
 bool aarch64_mov_operand_p (rtx, machine_mode);
 rtx aarch64_reverse_mask (machine_mode, unsigned int);
 bool aarch64_offset_7bit_signed_scaled_p (machine_mode, poly_int64);
 bool aarch64_offset_9bit_signed_unscaled_p (machine_mode, poly_int64);
 char *aarch64_output_sve_cnt_immediate (const char *, const char *, rtx);
-char *aarch64_output_sve_addvl_addpl (rtx, rtx, rtx);
-char *aarch64_output_sve_inc_dec_immediate (const char *, rtx);
+char *aarch64_output_sve_scalar_inc_dec (rtx);
+char *aarch64_output_sve_addvl_addpl (rtx);
+char *aarch64_output_sve_vector_inc_dec (const char *, rtx);
 char *aarch64_output_scalar_simd_mov_immediate (rtx, scalar_int_mode);
 char *aarch64_output_simd_mov_immediate (rtx, unsigned,
                         enum simd_immediate_check w = AARCH64_CHECK_MOV);

gcc/config/aarch64/aarch64-sve.md

Lines changed: 1 addition & 1 deletion
@@ -1971,7 +1971,7 @@
   "@
    add\t%0.<Vetype>, %0.<Vetype>, #%D2
    sub\t%0.<Vetype>, %0.<Vetype>, #%N2
-   * return aarch64_output_sve_inc_dec_immediate (\"%0.<Vetype>\", operands[2]);
+   * return aarch64_output_sve_vector_inc_dec (\"%0.<Vetype>\", operands[2]);
    movprfx\t%0, %1\;add\t%0.<Vetype>, %0.<Vetype>, #%D2
    movprfx\t%0, %1\;sub\t%0.<Vetype>, %0.<Vetype>, #%N2
    add\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>"

gcc/config/aarch64/aarch64.c

Lines changed: 36 additions & 20 deletions
@@ -2950,6 +2950,33 @@ aarch64_output_sve_cnt_immediate (const char *prefix, const char *operands,
                                    value.coeffs[1], 0);
 }
 
+/* Return true if we can add X using a single SVE INC or DEC instruction. */
+
+bool
+aarch64_sve_scalar_inc_dec_immediate_p (rtx x)
+{
+  poly_int64 value;
+  return (poly_int_rtx_p (x, &value)
+          && (aarch64_sve_cnt_immediate_p (value)
+              || aarch64_sve_cnt_immediate_p (-value)));
+}
+
+/* Return the asm string for adding SVE INC/DEC immediate OFFSET to
+   operand 0. */
+
+char *
+aarch64_output_sve_scalar_inc_dec (rtx offset)
+{
+  poly_int64 offset_value = rtx_to_poly_int64 (offset);
+  gcc_assert (offset_value.coeffs[0] == offset_value.coeffs[1]);
+  if (offset_value.coeffs[1] > 0)
+    return aarch64_output_sve_cnt_immediate ("inc", "%x0",
+                                             offset_value.coeffs[1], 0);
+  else
+    return aarch64_output_sve_cnt_immediate ("dec", "%x0",
+                                             -offset_value.coeffs[1], 0);
+}
+
 /* Return true if we can add VALUE to a register using a single ADDVL
    or ADDPL instruction. */
 
@@ -2975,27 +3002,16 @@ aarch64_sve_addvl_addpl_immediate_p (rtx x)
           && aarch64_sve_addvl_addpl_immediate_p (value));
 }
 
-/* Return the asm string for adding ADDVL or ADDPL immediate X to operand 1
-   and storing the result in operand 0. */
+/* Return the asm string for adding ADDVL or ADDPL immediate OFFSET
+   to operand 1 and storing the result in operand 0. */
 
 char *
-aarch64_output_sve_addvl_addpl (rtx dest, rtx base, rtx offset)
+aarch64_output_sve_addvl_addpl (rtx offset)
 {
   static char buffer[sizeof ("addpl\t%x0, %x1, #-") + 3 * sizeof (int)];
   poly_int64 offset_value = rtx_to_poly_int64 (offset);
   gcc_assert (aarch64_sve_addvl_addpl_immediate_p (offset_value));
 
-  /* Use INC or DEC if possible. */
-  if (rtx_equal_p (dest, base) && GP_REGNUM_P (REGNO (dest)))
-    {
-      if (aarch64_sve_cnt_immediate_p (offset_value))
-        return aarch64_output_sve_cnt_immediate ("inc", "%x0",
-                                                 offset_value.coeffs[1], 0);
-      if (aarch64_sve_cnt_immediate_p (-offset_value))
-        return aarch64_output_sve_cnt_immediate ("dec", "%x0",
-                                                 -offset_value.coeffs[1], 0);
-    }
-
   int factor = offset_value.coeffs[1];
   if ((factor & 15) == 0)
     snprintf (buffer, sizeof (buffer), "addvl\t%%x0, %%x1, #%d", factor / 16);
@@ -3010,8 +3026,8 @@ aarch64_output_sve_addvl_addpl (rtx dest, rtx base, rtx offset)
    factor in *FACTOR_OUT (if nonnull). */
 
 bool
-aarch64_sve_inc_dec_immediate_p (rtx x, int *factor_out,
-                                 unsigned int *nelts_per_vq_out)
+aarch64_sve_vector_inc_dec_immediate_p (rtx x, int *factor_out,
+                                        unsigned int *nelts_per_vq_out)
 {
   rtx elt;
   poly_int64 value;
@@ -3045,21 +3061,21 @@ aarch64_sve_inc_dec_immediate_p (rtx x, int *factor_out,
    instruction. */
 
 bool
-aarch64_sve_inc_dec_immediate_p (rtx x)
+aarch64_sve_vector_inc_dec_immediate_p (rtx x)
 {
-  return aarch64_sve_inc_dec_immediate_p (x, NULL, NULL);
+  return aarch64_sve_vector_inc_dec_immediate_p (x, NULL, NULL);
 }
 
 /* Return the asm template for an SVE vector INC or DEC instruction.
    OPERANDS gives the operands before the vector count and X is the
    value of the vector count operand itself. */
 
 char *
-aarch64_output_sve_inc_dec_immediate (const char *operands, rtx x)
+aarch64_output_sve_vector_inc_dec (const char *operands, rtx x)
 {
   int factor;
   unsigned int nelts_per_vq;
-  if (!aarch64_sve_inc_dec_immediate_p (x, &factor, &nelts_per_vq))
+  if (!aarch64_sve_vector_inc_dec_immediate_p (x, &factor, &nelts_per_vq))
     gcc_unreachable ();
   if (factor < 0)
     return aarch64_output_sve_cnt_immediate ("dec", operands, -factor,
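
The new `aarch64_output_sve_scalar_inc_dec` only chooses between `inc` and `dec` and passes the magnitude of `coeffs[1]` to `aarch64_output_sve_cnt_immediate`, whose body is outside this diff. As a rough illustration of the text that could come out, the sketch below formats a scalar INC or DEC from the same kind of factor. The helper name `format_scalar_inc_dec`, the fixed `x0` operand, and the rule of picking the smallest element size that divides the factor are all assumptions made for the example, not a copy of GCC's helper.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not GCC's: write a scalar INC or DEC of register
   x0 for FACTOR into BUF.  FACTOR is in bytes per 128-bit block of
   vector length.  Returns 0 on success, or -1 if FACTOR cannot be
   written as (elements per block) * [1, 16].  */
int format_scalar_inc_dec(int factor, char *buf, size_t size)
{
    /* Elements per 128-bit block for B, H, W and D, largest count first.  */
    static const int elts[] = { 16, 8, 4, 2 };
    static const char suffix[] = { 'b', 'h', 'w', 'd' };

    const char *op = factor > 0 ? "inc" : "dec";
    int magnitude = abs(factor);

    for (int i = 0; i < 4; i++) {
        if (magnitude % elts[i] != 0)
            continue;
        int mul = magnitude / elts[i];
        if (mul < 1 || mul > 16)
            return -1;  /* Smaller element counts only make MUL larger.  */
        if (mul == 1)
            snprintf(buf, size, "%s%c\tx0", op, suffix[i]);
        else
            snprintf(buf, size, "%s%c\tx0, all, mul #%d", op, suffix[i], mul);
        return 0;
    }
    return -1;
}
```

With a factor of 72 this writes `inch x0, all, mul #9` (tab-separated), the instruction from the commit message; a factor of -32 gives `decb x0, all, mul #2`.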

gcc/config/aarch64/aarch64.md

Lines changed: 16 additions & 13 deletions
@@ -1753,6 +1753,7 @@
   /* If the constant is too large for a single instruction and isn't frame
      based, split off the immediate so it is available for CSE. */
   if (!aarch64_plus_immediate (operands[2], <MODE>mode)
+      && !(TARGET_SVE && aarch64_sve_plus_immediate (operands[2], <MODE>mode))
       && can_create_pseudo_p ()
       && (!REG_P (op1)
           || !REGNO_PTR_FRAME_P (REGNO (op1))))
@@ -1770,21 +1771,22 @@
 
 (define_insn "*add<mode>3_aarch64"
   [(set
-    (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r,rk")
+    (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r,r,rk")
     (plus:GPI
-     (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk,rk")
-     (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Uaa,Uav")))]
+     (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk,0,rk")
+     (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Uaa,Uai,Uav")))]
   ""
   "@
   add\\t%<w>0, %<w>1, %2
   add\\t%<w>0, %<w>1, %<w>2
   add\\t%<rtn>0<vas>, %<rtn>1<vas>, %<rtn>2<vas>
   sub\\t%<w>0, %<w>1, #%n2
   #
-  * return aarch64_output_sve_addvl_addpl (operands[0], operands[1], operands[2]);"
-  ;; The "alu_imm" type for ADDVL/ADDPL is just a placeholder.
-  [(set_attr "type" "alu_imm,alu_sreg,neon_add,alu_imm,multiple,alu_imm")
-   (set_attr "arch" "*,*,simd,*,*,*")]
+  * return aarch64_output_sve_scalar_inc_dec (operands[2]);
+  * return aarch64_output_sve_addvl_addpl (operands[2]);"
+  ;; The "alu_imm" types for INC/DEC and ADDVL/ADDPL are just placeholders.
+  [(set_attr "type" "alu_imm,alu_sreg,neon_add,alu_imm,multiple,alu_imm,alu_imm")
+   (set_attr "arch" "*,*,simd,*,*,sve,sve")]
 )
 
 ;; zero_extend version of above
@@ -1863,17 +1865,18 @@
 ;; this pattern.
 (define_insn_and_split "*add<mode>3_poly_1"
   [(set
-    (match_operand:GPI 0 "register_operand" "=r,r,r,r,r,&r")
+    (match_operand:GPI 0 "register_operand" "=r,r,r,r,r,r,&r")
     (plus:GPI
-     (match_operand:GPI 1 "register_operand" "%rk,rk,rk,rk,rk,rk")
-     (match_operand:GPI 2 "aarch64_pluslong_or_poly_operand" "I,r,J,Uaa,Uav,Uat")))]
+     (match_operand:GPI 1 "register_operand" "%rk,rk,rk,rk,rk,0,rk")
+     (match_operand:GPI 2 "aarch64_pluslong_or_poly_operand" "I,r,J,Uaa,Uav,Uai,Uat")))]
   "TARGET_SVE && operands[0] != stack_pointer_rtx"
   "@
   add\\t%<w>0, %<w>1, %2
   add\\t%<w>0, %<w>1, %<w>2
   sub\\t%<w>0, %<w>1, #%n2
   #
-  * return aarch64_output_sve_addvl_addpl (operands[0], operands[1], operands[2]);
+  * return aarch64_output_sve_scalar_inc_dec (operands[2]);
+  * return aarch64_output_sve_addvl_addpl (operands[2]);
   #"
   "&& epilogue_completed
    && !reg_overlap_mentioned_p (operands[0], operands[1])
@@ -1884,8 +1887,8 @@
                                 operands[2], operands[0], NULL_RTX);
     DONE;
   }
-  ;; The "alu_imm" type for ADDVL/ADDPL is just a placeholder.
-  [(set_attr "type" "alu_imm,alu_sreg,alu_imm,multiple,alu_imm,multiple")]
+  ;; The "alu_imm" types for INC/DEC and ADDVL/ADDPL are just placeholders.
+  [(set_attr "type" "alu_imm,alu_sreg,alu_imm,multiple,alu_imm,alu_imm,multiple")]
 )
 
 (define_split
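
In `*add<mode>3_aarch64` the `Uai` alternative pairs with a `0` constraint on operand 1, so INC/DEC is offered only when the addition updates a register in place, while the `Uav` alternative keeps separate source and destination registers. The sketch below models that choice for a single VL-based factor (bytes per 128-bit block). It is a simplification under stated assumptions: the enum, `choose_sve_add` and the two range checks are hypothetical, and the real register allocator weighs alternatives by cost and can insert a copy to satisfy a tied operand rather than following a fixed if-chain. When both forms apply, the sketch prefers INC/DEC, following the removed "Use INC or DEC if possible" comment in the old output routine.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical classification of how a VL-based add could be emitted.  */
enum sve_add_kind {
    SVE_ADD_NEEDS_OTHER,   /* needs a copy or a longer sequence */
    SVE_ADD_INC_DEC,       /* single INC or DEC (Uai-style) */
    SVE_ADD_ADDVL_ADDPL    /* single ADDVL or ADDPL (Uav-style) */
};

/* |FACTOR| must be even, at most 16 * 16, and at most 16 times its
   lowest set bit, i.e. (2, 4, 8 or 16) * [1, 16].  */
static bool inc_dec_ok(int factor)
{
    int m = abs(factor);
    return m >= 2 && m <= 256 && (m & 1) == 0 && m <= 16 * (m & -m);
}

/* FACTOR must be a multiple of a vector (16) or a predicate (2), with
   the multiple in [-32, 31].  */
static bool addvl_addpl_ok(int factor)
{
    if (factor == 0)
        return false;
    return (factor % 16 == 0 && factor >= -32 * 16 && factor <= 31 * 16)
        || (factor % 2 == 0 && factor >= -32 * 2 && factor <= 31 * 2);
}

/* Pick a form for "dest = src + factor * (128-bit blocks in VL)".  */
enum sve_add_kind choose_sve_add(bool dest_is_src, int factor)
{
    if (dest_is_src && inc_dec_ok(factor))
        return SVE_ADD_INC_DEC;
    if (addvl_addpl_ok(factor))
        return SVE_ADD_ADDVL_ADDPL;
    return SVE_ADD_NEEDS_OTHER;
}
```

For the factor 72 from the commit message, `choose_sve_add(true, 72)` yields `SVE_ADD_INC_DEC`, whereas `choose_sve_add(false, 72)` falls through to `SVE_ADD_NEEDS_OTHER` because neither ADDVL nor ADDPL can encode it.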

gcc/config/aarch64/constraints.md

Lines changed: 7 additions & 1 deletion
@@ -49,6 +49,12 @@
   (and (match_code "const_int")
        (match_test "aarch64_pluslong_strict_immedate (op, VOIDmode)")))
 
+(define_constraint "Uai"
+  "@internal
+   A constraint that matches a VG-based constant that can be added by
+   a single INC or DEC."
+  (match_operand 0 "aarch64_sve_scalar_inc_dec_immediate"))
+
 (define_constraint "Uav"
   "@internal
    A constraint that matches a VG-based constant that can be added by
@@ -416,7 +422,7 @@
   "@internal
    A constraint that matches a vector count operand valid for SVE INC and
    DEC instructions."
-  (match_operand 0 "aarch64_sve_inc_dec_immediate"))
+  (match_operand 0 "aarch64_sve_vector_inc_dec_immediate"))
 
 (define_constraint "vsn"
   "@internal

gcc/config/aarch64/predicates.md

Lines changed: 13 additions & 4 deletions
@@ -144,18 +144,27 @@
   (and (match_operand 0 "aarch64_pluslong_immediate")
        (not (match_operand 0 "aarch64_plus_immediate"))))
 
+(define_predicate "aarch64_sve_scalar_inc_dec_immediate"
+  (and (match_code "const_poly_int")
+       (match_test "aarch64_sve_scalar_inc_dec_immediate_p (op)")))
+
 (define_predicate "aarch64_sve_addvl_addpl_immediate"
   (and (match_code "const_poly_int")
        (match_test "aarch64_sve_addvl_addpl_immediate_p (op)")))
 
+(define_predicate "aarch64_sve_plus_immediate"
+  (ior (match_operand 0 "aarch64_sve_scalar_inc_dec_immediate")
+       (match_operand 0 "aarch64_sve_addvl_addpl_immediate")))
+
 (define_predicate "aarch64_split_add_offset_immediate"
   (and (match_code "const_poly_int")
        (match_test "aarch64_add_offset_temporaries (op) == 1")))
 
 (define_predicate "aarch64_pluslong_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "aarch64_pluslong_immediate")
-       (match_operand 0 "aarch64_sve_addvl_addpl_immediate")))
+       (and (match_test "TARGET_SVE")
+            (match_operand 0 "aarch64_sve_plus_immediate"))))
 
 (define_predicate "aarch64_pluslong_or_poly_operand"
   (ior (match_operand 0 "aarch64_pluslong_operand")
@@ -602,9 +611,9 @@
   (and (match_code "const,const_vector")
        (match_test "aarch64_sve_arith_immediate_p (op, true)")))
 
-(define_predicate "aarch64_sve_inc_dec_immediate"
+(define_predicate "aarch64_sve_vector_inc_dec_immediate"
   (and (match_code "const,const_vector")
-       (match_test "aarch64_sve_inc_dec_immediate_p (op)")))
+       (match_test "aarch64_sve_vector_inc_dec_immediate_p (op)")))
 
 (define_predicate "aarch64_sve_uxtb_immediate"
   (and (match_code "const_vector")
@@ -687,7 +696,7 @@
 (define_predicate "aarch64_sve_add_operand"
   (ior (match_operand 0 "aarch64_sve_arith_operand")
        (match_operand 0 "aarch64_sve_sub_arith_immediate")
-       (match_operand 0 "aarch64_sve_inc_dec_immediate")))
+       (match_operand 0 "aarch64_sve_vector_inc_dec_immediate")))
 
 (define_predicate "aarch64_sve_pred_and_operand"
   (ior (match_operand 0 "register_operand")
