Skip to content

Use capstone to validate precise-output tests#5780

Merged
elliottt merged 9 commits intobytecodealliance:mainfrom
elliottt:trevor/capstone-output-filetests
Feb 16, 2023
Merged

Use capstone to validate precise-output tests#5780
elliottt merged 9 commits intobytecodealliance:mainfrom
elliottt:trevor/capstone-output-filetests

Conversation

@elliottt
Copy link
Member

@elliottt elliottt commented Feb 14, 2023

Use capstone when checking precise-output test expectations. This lets us continue to rely on printing backend-specific pseudoinstructions in the VCode output for debuggign purposes (for instance br_table on x64), while also checking the final output of the machinst buffer in filetests.

@elliottt elliottt changed the title trevor/capstone output filetests Add disassembly to existing precise-output tests Feb 14, 2023
@elliottt elliottt marked this pull request as ready for review February 14, 2023 20:57
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts below -- this is overall a good change, I think, but perhaps we can factor out more of the disassembly logic. Also I think we should try to find a way to test metadata somehow -- perhaps it's as simple as dumping relocs and trap records separately (their Debug textual form) and whatever other bits of CompileResult are relevant and including that in the golden output?

To state it here, @elliottt and I were discussing this earlier and the overall motivation is to be able to test the ultimate expansion/lowering of pseudoinstructions that only happens at emission time. The testing of the VCode pretty-printing instead of an actual disassembly is mostly an accident of path-dependence: we hadn't plumbed in Capstone, we had a textual form already, so we started testing that. But using an externally-defined source of truth for the machine code to text translation offers us a little more reassurance in our testing, IMHO; otherwise we might not catch a mis-encoding in our "assembler" layer unless/until an execution test hits it.

; Disassembly:
; 0: 55 push rbp
; 1: 48 89 e5 mov rbp, rsp
; 4: e8 00 00 00 00 call 9
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few cases, like here, where the disassembly loses some info because it doesn't carry the metadata -- in this case, a relocation (the call target). Not so important for this particular test but it would be good to audit for this somehow -- any thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be pretty easy to dump them out if they're present. I'll try that and we can see if that seems like enough information.

Copy link
Member Author

@elliottt elliottt Feb 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've inlined relocations and traps in the disassembled output. Here's the updated version you originally referenced:

;   push        rbp                                                              
;   mov rbp, rsp                                                                 
;   call        9 ; reloc_external CallPCRel4 u0:521 -4                          
;   mov rsp, rbp                                                                 
;   pop rbp                                                                      
;   ret  

Copy link
Contributor

@jameysharp jameysharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Is it possible, and is it easy, to tell Capstone about relocations in function call instructions so we can have more useful output there? Doesn't have to be in this PR but it'd be nice.

@github-actions github-actions bot added the cranelift Issues related to the Cranelift code generator label Feb 14, 2023
@elliottt elliottt force-pushed the trevor/capstone-output-filetests branch from ba5a9c9 to d4bb840 Compare February 14, 2023 21:30
@github-actions github-actions bot added cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen labels Feb 14, 2023
@elliottt elliottt changed the title Add disassembly to existing precise-output tests Use capstone to validate precise-output tests Feb 14, 2023
@elliottt elliottt force-pushed the trevor/capstone-output-filetests branch from 7b20b52 to c4b448c Compare February 15, 2023 00:17
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- thanks!

A thought below on sharing code with clif-util disasm but we can save that for later if you want to get this in first.

Copy link
Contributor

@jameysharp jameysharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. I'm glad you dug into this!

I have a few comments below.

The only additional feature I'd like to suggest is that I think we can get the blockN: labels back, and maybe add a comment with their byte offsets so we can more easily see where jumps refer to. MachBuffer::label_offsets is, if I'm reading this correctly, an array indexed by block number and holding the code offset of the first instruction of that block. Since branches may have been removed there can be multiple labels at the same offset, so before each instruction you can scan that array for all blocks at that offset.

@elliottt elliottt force-pushed the trevor/capstone-output-filetests branch 2 times, most recently from e81d4c3 to 5114dba Compare February 15, 2023 01:32
@elliottt
Copy link
Member Author

The only additional feature I'd like to suggest is that I think we can get the blockN: labels back, and maybe add a comment with their byte offsets so we can more easily see where jumps refer to. MachBuffer::label_offsets is, if I'm reading this correctly, an array indexed by block number and holding the code offset of the first instruction of that block. Since branches may have been removed there can be multiple labels at the same offset, so before each instruction you can scan that array for all blocks at that offset.

As we were discussing, we don't have access to label_offsets in MachBufferFinalized, which is unfortunate. Perhaps we can plumb that through in a future PR, as it would be really nice to get the block labels back.

Copy link
Contributor

@jameysharp jameysharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic: we can actually see potentially important differences between the two disassembly printers now. I'm a little concerned about a couple of the differences I saw while spot-checking filetests. There's no way I can review all the changes but it looks mostly innocuous.

I especially love being able to see which trap can occur at an instruction now!

; ret
; pushq %rbp
; movq %rsp, %rbp
; pextrb $1, %xmm0, %eax
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For pextrb/w/d we've been pretty-printing the destination register as the 64-bit %rax but it appears that it should have been the 32-bit %eax. Does that mean we've been leaving the high 32-bits uninitialized, and if so, is that an issue? I don't remember the x86-64 rules for when registers get silently zero-extended for you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general rule on x86-64 is that instructions that write the low 32 bits of a 64-bit GPR clear the high 32 bits; I suspect this difference here is just one of notational convention (i.e., there's not actually a bit selecting rax vs eax that we're flipping).

@elliottt elliottt force-pushed the trevor/capstone-output-filetests branch from 5114dba to 2764f63 Compare February 15, 2023 02:51
@uweigand
Copy link
Member

I've been going through the remaining s390x changes. Looking at broad categories I see:

  • One actual encoding bug in cranelift: the last byte of ahy is 0x7a, not 0x4a! This is wrong code-gen, I'll work on a fix.
  • A few places where the cranelift disassembly output doesn't match the official format (operands swapped etc.). Does not affect code-gen, but should still be fixed as long as we have our own disassembler.
  • capstone apparently does not support any z15 or later instructions right now - this will need to be fixed there.
  • Even with all the wrong and missing encodings fixed, we still have inline constants. I think it would be better to use an out-of-line constant pool. This used not to be possible, but I think now we have the required infrastructure. The one thing I'm not certain is whether we can put constants with relocations in the pool right now.
  • As mentioned above already, branch targets are now shown as plain offsets, but it is basically impossible to find out which instruction this refers to.
  • capstore seems to print all immediates in hex, while the current disassembler uses decimal ...
  • We are no longer seeing any of cranelift-specific "virtual" instructions like virtual_sp_offset_adjust. Not sure if this is problem.
  • For some of the jump sequences, the new disassembler appears to display an optimized form, where the old one would show an intermediate form, e.g.
 ;   clgr %r2, %r3
-;   jge label1 ; jg label2
-; block1:
-;   jg label3
-; block2:
-;   jg label3
-; block3:
 ;   lghi %r2, 1
 ;   br %r14

I guess my primary question would be to what extent we're willing to rely on the quality of the capstone library here. I'm not sure exactly who is writing / maintaining this -- looking at the web site, the last official release is as of nearly three years ago, and even in the github repo (assuming I'm looking at the correct one?) the last significant change to the SystemZ directory is five years old, which presumably explains why it doesn't support z15 yet. Even X86 doesn't appear to be much more recent.

The actual disassembly implementation in capstone seems to be done primarily via files copied from LLVM, so that should likely be correct as far as it goes. But if this isn't regularly updated, we could run into issues with using recent instructions on other architectures as well ...

@bjorn3
Copy link
Contributor

bjorn3 commented Feb 15, 2023

Zydis is a disassembler for x86 with a focus on correctness (even knows the difference between how Intel and AMD cpus decode insts) and performance. It has official rust bindings: https://github.com/zyantific/zydis-rs (licensed under MIT)

BinaryNinja has an AArch64 disassembler generated from the actual ISA manual: https://binary.ninja/2021/04/05/groundup-aarch64.html, https://github.com/Vector35/arch-arm64/tree/master/disassembler (licensed under Apache-2.0) Not sure how easy it is reusable outside of BinaryNinja though.

@elliottt elliottt force-pushed the trevor/capstone-output-filetests branch 2 times, most recently from 681114b to 691f477 Compare February 15, 2023 19:18
@elliottt elliottt force-pushed the trevor/capstone-output-filetests branch from 691f477 to 9a40d83 Compare February 15, 2023 23:51
@elliottt elliottt force-pushed the trevor/capstone-output-filetests branch from 9a40d83 to fbdb307 Compare February 15, 2023 23:52
@elliottt
Copy link
Member Author

After the discussion today, I've added a section to each test output that also includes the printed vcode. This will help spot differences between what we're assuming we're producing and what we're actually producing, as well as help identify where capstone is falling short.

@elliottt elliottt merged commit f04decc into bytecodealliance:main Feb 16, 2023
elliottt added a commit that referenced this pull request Feb 16, 2023
As a follow-up to #5780, disassemble the regions identified by bb_starts, falling back on disassembling the whole buffer. This ensures that instructions like br_table that introduce a lot of constants don't throw off capstone for the remainder of the function.

---------

Co-authored-by: Jamey Sharp <jamey@minilop.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants