Commit e5569a2
aarch64: Use SVE ASRD instruction with Neon modes.
The ASRD instruction on SVE performs an arithmetic shift right by an immediate
for divide.
This patch enables the use of ASRD with Neon modes.
For example:
  int in[N], out[N];
  void
  foo (void)
  {
    for (int i = 0; i < N; i++)
      out[i] = in[i] / 4;
  }
compiles to:
  ldr     q31, [x1, x0]
  cmlt    v30.16b, v31.16b, #0
  and     z30.b, z30.b, 3
  add     v30.16b, v30.16b, v31.16b
  sshr    v30.16b, v30.16b, 2
  str     q30, [x0, x2]
  add     x0, x0, 16
  cmp     x0, 1024
but can just be:
  ldp     q30, q31, [x0], 32
  asrd    z31.b, p7/m, z31.b, #2
  asrd    z30.b, p7/m, z30.b, #2
  stp     q30, q31, [x1], 32
  cmp     x0, x2
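
The first sequence above exists because signed division in C rounds toward zero, while a plain arithmetic shift right rounds toward negative infinity. The following is a minimal scalar sketch of that bias-and-shift idea for one byte lane. It is not taken from the patch, and the names asr, div_pow2_model and check_all_int8 are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Portable arithmetic shift right: rounds toward negative infinity.
   Avoids relying on implementation-defined >> of negative values.  */
static int
asr (int x, unsigned shift)
{
  return x < 0 ? ~(~x >> shift) : x >> shift;
}

/* Hypothetical scalar model of the cmlt/and/add/sshr sequence:
   add (2^shift - 1) to negative inputs, then shift right.  */
static int8_t
div_pow2_model (int8_t x, unsigned shift)
{
  int bias = x < 0 ? (1 << shift) - 1 : 0;   /* cmlt + and */
  return (int8_t) asr (x + bias, shift);     /* add + sshr */
}

/* Compare the model against C division for every 8-bit input.  */
static void
check_all_int8 (unsigned shift)
{
  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    assert (div_pow2_model ((int8_t) x, shift) == x / (1 << shift));
}
```

For x = -5 and a shift of 2, asr alone gives -2, whereas the biased form gives -1, matching -5 / 4. A single ASRD per vector replaces the four-instruction sequence that computes this result.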
This patch also adds the following overload:
aarch64_ptrue_reg (machine_mode pred_mode, machine_mode data_mode)
Depending on the data mode, the function returns a predicate with the
appropriate bits set.
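
As a conceptual illustration only, a predicate can be modelled as a bit mask with one bit per vector byte. The sketch below assumes that a Neon data mode needs only the bits covering its own bytes and that an SVE data mode needs every bit. That reading of "appropriate bits" is an assumption, not the patch's code, and model_ptrue_mask and MODEL_VECTOR_BYTES are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical vector length for the model: a 256-bit SVE register.  */
#define MODEL_VECTOR_BYTES 32

/* Model a predicate as a mask with one bit per vector byte.
   Assumption: non-SVE data of DATA_BYTES bytes activates only the
   low DATA_BYTES bits; SVE data activates every bit.  */
static uint64_t
model_ptrue_mask (bool data_is_sve, unsigned data_bytes)
{
  assert (data_bytes <= MODEL_VECTOR_BYTES);
  unsigned active = data_is_sve ? MODEL_VECTOR_BYTES : data_bytes;
  return (UINT64_C (1) << active) - 1;
}
```

Under this model a 128-bit Neon mode yields 0xFFFF and a 64-bit Neon mode yields 0xFF, so a predicated instruction such as ASRD would leave the bytes above the Neon data inactive.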
The patch was bootstrapped and regtested on aarch64-linux-gnu with no regressions.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_ptrue_reg): New overload.
* config/aarch64/aarch64-protos.h (aarch64_ptrue_reg): Likewise.
* config/aarch64/aarch64-sve.md: Extended sdiv_pow2<mode>3
and *sdiv_pow2<mode>3 to support Neon modes.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/sve-asrd.c: New test.
Co-authored-by: Richard Sandiford <[email protected]>
Signed-off-by: Soumya AR <[email protected]>
1 parent 65b7c8d commit e5569a2
4 files changed, +115 -12 lines changed
- gcc
  - config/aarch64
  - testsuite/gcc.target/aarch64/sve