Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

riscv64: Introduce Compressed Instructions #6989

Merged
merged 12 commits into from
Sep 12, 2023

Conversation

afonso360
Copy link
Contributor

👋 Hey,

This PR makes introduces compressed instructions to the RISC-V backend as part of the C extension.

C extension introduction

The C Extension adds encodings for 16 bit instructions. These are allowed to be freely interleaved with uncompressed instructions (i.e. we don't need a mode switch like Thumb2).

Some C instruction formats cannot encode all registers, so they may only be emitted if they get assigned a certain physical register. Additionally C instructions (almost) always have an equivalent uncompressed instruction.

Implementation

I've implemented this as an encoding only change. So our VCode instructions remain the same, but if we happen to get the correct registers and all the stars line up we emit a compressed instruction instead of a full sized instruction.

Inst::emit is now split into 3 stages:

  1. First we run the instruction through Inst::allocate which just transforms a Inst with virtual registers, into a Inst with physical registers
  2. We then try to emit the instruction as a compressed instruction
    • This may fail, and that's ok
    • Since we now have the final physical registers we can check if it's possible to emit it as a compressed instruction
  3. If the compressed emission fails we default to the uncompressed emission

This PR only enables 2 instructions c.add and c.mv. Both of these use the CR format which allows the full range of registers. However they are enough to get us into trouble. c.mv is emitted quite often, and is enough to "unalign" most of the rest of the code.

Issues with capstone

Capstone supports decoding C instructions with a separate "Extra Mode".

Capstone currently cannot decode a number of instructions that we emit (i.e Zca, Zcb, Zcs or Vector Instructions). This is normally ok since it emits a .byte ... directive and we can still read the V-Code to figure out what is going on.

If I enable this "Extra Mode" by default it tries to decode these unknown instructions as C instructions and makes the whole thing unreadable.

My solution for this was to make C a non default extension, and only enable the Extra Mode when C is enabled. This lets us keep the normal disassembly for all current test cases. Mixing C instructions and Vector instructions still makes the whole thing unreadable, but at least its not the common case.

Runtests

I've made pretty much all runtests run with C as well as without. Even if the runtest will never directly emits a C instruction, it can emit for example a c.mv and emit the rest of the instructions "unaligned" which is also a configuration worth testing.


This is built on top of #6988 so it's worth waiting for that to be merged before looking into this. Additionally, it's probably a good idea to review this commit by commit instead of all at once.

@afonso360 afonso360 added the cranelift:area:riscv64 Issues related to the RISC-V 64 backend. label Sep 10, 2023
@afonso360 afonso360 requested review from a team as code owners September 10, 2023 13:32
@afonso360 afonso360 requested review from abrown and removed request for a team September 10, 2023 13:32
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:meta Everything related to the meta-language. isle Related to the ISLE domain-specific language labels Sep 10, 2023
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:riscv64", "cranelift:meta", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks great to me, and thanks as always for the detail explanations!

Would it be possible to add detection of these features to cranelift/native?

@afonso360
Copy link
Contributor Author

Would it be possible to add detection of these features to cranelift/native?

They should already be detected. has_c was already previously detected via HWCAP (although we didn't do too much with that information).

The rest of them (Zca, Zcd and Zcb) are detected via parsing /proc/cpuinfo. We don't have to do anything special to detect them because we currently just grab all of the extension names in the ISA string and try to enable the equivalent has_{extname} flag.

The C extension has been split a little bit since its introduction
let's support the finer grained instructions.

This also disables compressed instruction support by default.
Conditionally enable compressed instruction disassembly.
This commit reorganizes our emit function to first
try a special case for compressed instruction, before
falling back to uncompressed instructions.

Currently the compressed case does nothing.
We can only emit compressed instructions if they have a
specific physical register. For example `c.mv` does not support
using the `x0` register.

Thus, move the allocation function that converts from
virtual registers to physical registers into a separate step
before trying to emit the instruction.

This allows us to know the real register when emitting
compressed instructions.
`br_table` computes physical offsets from a certain instruction.
Thus we need to force these instructions to be uncompressed
in order to not jump into the wrong target.
This reverts commit 212dec4.

It looks like the version of QEMU that CI uses does not yet
support Zcb.
@afonso360 afonso360 added this pull request to the merge queue Sep 12, 2023
Merged via the queue into bytecodealliance:main with commit 87ab0de Sep 12, 2023
25 checks passed
@afonso360 afonso360 deleted the riscv-c branch September 12, 2023 11:12
eduardomourar pushed a commit to eduardomourar/wasmtime that referenced this pull request Sep 13, 2023
* riscv64: Add compressed extension instructions

The C extension has been split a little bit since its introduction
let's support the finer grained instructions.

This also disables compressed instruction support by default.

* ci: Enable zcb extension on RISC-V QEMU

* riscv64: Enable RVC mode in capstone

Conditionally enable compressed instruction disassembly.

* riscv64: Prepare to emit compressed instructions

This commit reorganizes our emit function to first
try a special case for compressed instruction, before
falling back to uncompressed instructions.

Currently the compressed case does nothing.

* riscv64: Move emit register allocation to separate function

We can only emit compressed instructions if they have a
specific physical register. For example `c.mv` does not support
using the `x0` register.

Thus, move the allocation function that converts from
virtual registers to physical registers into a separate step
before trying to emit the instruction.

This allows us to know the real register when emitting
compressed instructions.

* riscv64: Emit BrTable as uncompressed

`br_table` computes physical offsets from a certain instruction.
Thus we need to force these instructions to be uncompressed
in order to not jump into the wrong target.

* riscv64: Mark DWARF CIE code alignment as 2 bytes

* riscv64: Make minimum function alignment 2 bytes

* riscv64: Emit `c.add`

* riscv64: Add `c.mv`

* riscv64: Add c runtests

* Revert "ci: Enable zcb extension on RISC-V QEMU"

This reverts commit 212dec4.

It looks like the version of QEMU that CI uses does not yet
support Zcb.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:riscv64 Issues related to the RISC-V 64 backend. cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants