Commit 4607799
arm: Use uxtb rN, rM, ror #8 to implement zero_extract on armv6.
Examining the code generated for the following C snippet on a
Raspberry Pi:
int popcount_lut8(unsigned *buf, int n)
{
  int cnt=0;
  unsigned int i;
  do {
    i = *buf;
    cnt += lut[i&255];
    cnt += lut[i>>8&255];
    cnt += lut[i>>16&255];
    cnt += lut[i>>24];
    buf++;
  } while(--n);
  return cnt;
}
I was surprised to see the following instruction sequence generated by
the compiler:
	mov	r5, r2, lsr #8
	uxtb	r5, r5
This sequence can be performed by a single ARM instruction:
	uxtb	r5, r2, ror #8
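The equivalence of the two forms can be checked in plain C. The sketch below is illustrative only and is not part of the patch: `ror32`, `shift_then_extend`, `extend_with_rotate` and `check_equivalence` are hypothetical helper names that model the two-instruction sequence and the single `uxtb ... ror #8` form on a 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value right by n bits (0 < n < 32). */
static uint32_t ror32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* Models:  mov r5, r2, lsr #8  followed by  uxtb r5, r5  */
static uint32_t shift_then_extend(uint32_t x)
{
    uint32_t t = x >> 8;   /* lsr #8 */
    return t & 0xffu;      /* uxtb: keep the low byte, zero-extend */
}

/* Models:  uxtb r5, r2, ror #8  */
static uint32_t extend_with_rotate(uint32_t x)
{
    return ror32(x, 8) & 0xffu;
}

/* Both forms select bits 8..15 of x, so they must always agree. */
void check_equivalence(uint32_t x)
{
    assert(shift_then_extend(x) == extend_with_rotate(x));
    assert(shift_then_extend(x) == ((x >> 8) & 0xffu));
}
```

The rotate brings in bits from the low byte at the top of the word, but the byte extension discards everything above bit 7, which is why the rotate can stand in for the logical shift here.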
The attached patch allows GCC's combine pass to take advantage of ARM's
uxtb with rotate functionality to implement the above zero_extract, and
likewise to use the sxtb with rotate to implement sign_extract. ARM's
uxtb and sxtb can only be used with rotates of 0, 8, 16 and 24, and of
these only the 8 and 16 are useful [ror #0 is a nop, and extends with
ror #24 can be implemented using regular shifts], so the approach here
is to add the six missing but useful instructions as six separate
define_insn patterns in arm.md, rather than try to be clever with new predicates.
Later ARM hardware has advanced bit field instructions, and earlier
ARM cores didn't support extend-with-rotate, so this appears to only
benefit ARMv6-era CPUs (e.g. the Raspberry Pi).
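As a rough illustration of the six cases, the sketch below models each extract in portable C. It rests on assumptions not stated in the message: the pattern name suffixes are read as width_position (so `_8_16` means 8 bits starting at bit 16), and the 16-bit cases are assumed to map to `uxth`/`sxth` with `ror #8`. All function names are hypothetical and do not appear in arm.md.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate right by n bits (0 < n < 32). */
static uint32_t ror32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* Portable sign extension of the low 8 or 16 bits of v. */
static int32_t sext8(uint32_t v)  { return (int32_t)((v & 0xffu) ^ 0x80u) - 0x80; }
static int32_t sext16(uint32_t v) { return (int32_t)((v & 0xffffu) ^ 0x8000u) - 0x8000; }

/* Zero extracts (assumed: uxtb ror #8, uxtb ror #16, uxth ror #8). */
uint32_t zext_8_8(uint32_t x)  { return ror32(x, 8)  & 0xffu; }
uint32_t zext_8_16(uint32_t x) { return ror32(x, 16) & 0xffu; }
uint32_t zext_16_8(uint32_t x) { return ror32(x, 8)  & 0xffffu; }

/* Sign extracts (assumed: sxtb ror #8, sxtb ror #16, sxth ror #8). */
int32_t sext_8_8(uint32_t x)  { return sext8(ror32(x, 8)); }
int32_t sext_8_16(uint32_t x) { return sext8(ror32(x, 16)); }
int32_t sext_16_8(uint32_t x) { return sext16(ror32(x, 8)); }

/* Each rotate-based form agrees with the shift-and-mask definition.
   The ror #24 case needs no new pattern: it is a plain shift by 24. */
void check_extracts(uint32_t x)
{
    assert(zext_8_8(x)  == ((x >> 8)  & 0xffu));
    assert(zext_8_16(x) == ((x >> 16) & 0xffu));
    assert(zext_16_8(x) == ((x >> 8)  & 0xffffu));
    assert((ror32(x, 24) & 0xffu) == (x >> 24));
}
```

For example, with `x = 0x80F1C234` the byte at bit 8 is `0xC2`, so the zero extract yields `0xC2` while the sign extract yields -62.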
Patch posted:
https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01339.html
Approved by Kyrill Tkachov:
https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01881.html
2024-05-12  Roger Sayle  <[email protected]>
	    Kyrill Tkachov  <[email protected]>

	* config/arm/arm.md (*arm_zeroextractsi2_8_8, *arm_signextractsi2_8_8,
	*arm_zeroextractsi2_8_16, *arm_signextractsi2_8_16,
	*arm_zeroextractsi2_16_8, *arm_signextractsi2_16_8): New.

2024-05-12  Roger Sayle  <[email protected]>
	    Kyrill Tkachov  <[email protected]>

	* gcc.target/arm/extend-ror.c: New test.

1 parent 83fb5e6, commit 4607799
2 files changed: 104 additions, 0 deletions
[Diff content not preserved: 66 lines added to config/arm/arm.md (new lines 12650-12715) and 38 lines added in the new file gcc.target/arm/extend-ror.c.]