Use capstone to validate precise-output tests #5780
elliottt merged 9 commits into bytecodealliance:main
cfallin
left a comment
Some thoughts below -- this is overall a good change, I think, but perhaps we can factor out more of the disassembly logic. Also I think we should try to find a way to test metadata somehow -- perhaps it's as simple as dumping relocs and trap records separately (their Debug textual form) and whatever other bits of CompileResult are relevant and including that in the golden output?
To state it here, @elliottt and I were discussing this earlier and the overall motivation is to be able to test the ultimate expansion/lowering of pseudoinstructions that only happens at emission time. The testing of the VCode pretty-printing instead of an actual disassembly is mostly an accident of path-dependence: we hadn't plumbed in Capstone, we had a textual form already, so we started testing that. But using an externally-defined source of truth for the machine code to text translation offers us a little more reassurance in our testing, IMHO; otherwise we might not catch a mis-encoding in our "assembler" layer unless/until an execution test hits it.
; Disassembly:
; 0: 55 push rbp
; 1: 48 89 e5 mov rbp, rsp
; 4: e8 00 00 00 00 call 9
There are a few cases, like here, where the disassembly loses some info because it doesn't carry the metadata -- in this case, a relocation (the call target). Not so important for this particular test but it would be good to audit for this somehow -- any thoughts?
I think it would be pretty easy to dump them out if they're present. I'll try that and we can see if that seems like enough information.
I've inlined relocations and traps in the disassembled output. Here's the updated version you originally referenced:
; push rbp
; mov rbp, rsp
; call 9 ; reloc_external CallPCRel4 u0:521 -4
; mov rsp, rbp
; pop rbp
; ret
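A minimal sketch of how metadata might be inlined next to disassembled instructions, as in the output above. The `Inst` and `Note` types are hypothetical stand-ins for illustration, not Cranelift's actual relocation or trap records; the only idea shown is that a record is attached to the instruction whose byte range contains its code offset.

```rust
/// Hypothetical stand-in for one disassembled instruction.
struct Inst {
    offset: u32,
    len: u32,
    text: String,
}

/// Hypothetical metadata record (relocation or trap): a code offset
/// plus the already-formatted text to print for it.
struct Note {
    offset: u32,
    text: String,
}

/// Append every note whose offset falls inside an instruction's byte
/// range to that instruction's line, as a trailing `; ...` comment.
fn annotate(insts: &[Inst], notes: &[Note]) -> Vec<String> {
    insts
        .iter()
        .map(|inst| {
            let mut line = format!("; {}", inst.text);
            for note in notes {
                let in_range =
                    note.offset >= inst.offset && note.offset < inst.offset + inst.len;
                if in_range {
                    line.push_str(" ; ");
                    line.push_str(&note.text);
                }
            }
            line
        })
        .collect()
}
```

Matching on the byte range rather than on the instruction's start offset matters for relocations, which typically point at an immediate field inside the instruction and not at its first byte.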
jameysharp
left a comment
Nice!
Is it possible, and is it easy, to tell Capstone about relocations in function call instructions so we can have more useful output there? Doesn't have to be in this PR but it'd be nice.
ba5a9c9 to d4bb840
7b20b52 to c4b448c
cfallin
left a comment
LGTM -- thanks!
A thought below on sharing code with clif-util disasm but we can save that for later if you want to get this in first.
jameysharp
left a comment
This is great. I'm glad you dug into this!
I have a few comments below.
The only additional feature I'd like to suggest is that I think we can get the blockN: labels back, and maybe add a comment with their byte offsets so we can more easily see where jumps refer to. MachBuffer::label_offsets is, if I'm reading this correctly, an array indexed by block number and holding the code offset of the first instruction of that block. Since branches may have been removed there can be multiple labels at the same offset, so before each instruction you can scan that array for all blocks at that offset.
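The scan described above could look roughly like this. It is a sketch under the assumption stated in the comment, namely that the label table is a slice indexed by block number holding each block's starting code offset; the function names and the exact output format are hypothetical.

```rust
/// Return the indices of all blocks whose label sits at `offset`.
/// `label_offsets[n]` is assumed to hold the code offset of block `n`.
fn blocks_at(label_offsets: &[u32], offset: u32) -> Vec<usize> {
    label_offsets
        .iter()
        .enumerate()
        .filter(|&(_, &off)| off == offset)
        .map(|(block, _)| block)
        .collect()
}

/// Render the label lines to print before the instruction at `offset`,
/// each with a trailing comment giving the byte offset.
fn label_lines(label_offsets: &[u32], offset: u32) -> Vec<String> {
    blocks_at(label_offsets, offset)
        .into_iter()
        .map(|block| format!("; block{}: ; offset 0x{:x}", block, offset))
        .collect()
}
```

Returning a list instead of a single block handles the case where removed branches leave several labels at the same offset.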
e81d4c3 to 5114dba
As we were discussing, we don't have access to
jameysharp
left a comment
Fantastic: we can actually see potentially important differences between the two disassembly printers now. I'm a little concerned about a couple of the differences I saw while spot-checking filetests. There's no way I can review all the changes but it looks mostly innocuous.
I especially love being able to see which trap can occur at an instruction now!
; ret
; pushq %rbp
; movq %rsp, %rbp
; pextrb $1, %xmm0, %eax
For pextrb/w/d we've been pretty-printing the destination register as the 64-bit %rax, but it appears that it should have been the 32-bit %eax. Does that mean we've been leaving the high 32 bits uninitialized, and if so, is that an issue? I don't remember the x86-64 rules for when registers get silently zero-extended for you.
The general rule on x86-64 is that instructions that write the low 32 bits of a 64-bit GPR clear the high 32 bits; I suspect this difference here is just one of notational convention (i.e., there's not actually a bit selecting rax vs eax that we're flipping).
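As an illustration of that rule, here is a small model of register writes in plain Rust. It is not Cranelift code and the function names are made up; it only contrasts a 32-bit write, which clears the upper half of the 64-bit register, with a 16-bit write, which leaves the upper bits untouched.

```rust
/// Model a write to the low 32 bits of a 64-bit GPR (e.g. a write to eax):
/// the upper 32 bits of the full register are cleared, so the old value
/// does not influence the result.
fn write32(_old: u64, value: u32) -> u64 {
    u64::from(value)
}

/// Model a write to the low 16 bits (e.g. a write to ax):
/// the upper 48 bits of the old value are preserved.
fn write16(old: u64, value: u16) -> u64 {
    (old & !0xffff_u64) | u64::from(value)
}
```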
5114dba to 2764f63
I've been going through the remaining s390x changes. Looking at broad categories I see:
I guess my primary question would be to what extent we're willing to rely on the quality of the The actual disassembly implementation in |
Zydis is a disassembler for x86 with a focus on correctness (it even knows the difference between how Intel and AMD CPUs decode instructions) and performance. It has official Rust bindings: https://github.com/zyantific/zydis-rs (licensed under MIT).

BinaryNinja has an AArch64 disassembler generated from the actual ISA manual: https://binary.ninja/2021/04/05/groundup-aarch64.html, https://github.com/Vector35/arch-arm64/tree/master/disassembler (licensed under Apache-2.0). Not sure how easily it can be reused outside of BinaryNinja, though.
681114b to 691f477
691f477 to 9a40d83
9a40d83 to fbdb307
After the discussion today, I've added a section to each test output that also includes the printed vcode. This will help spot differences between what we're assuming we're producing and what we're actually producing, as well as help identify where |
As a follow-up to #5780, disassemble the regions identified by bb_starts, falling back on disassembling the whole buffer. This ensures that instructions like br_table that introduce a lot of constants don't throw off capstone for the remainder of the function.

Co-authored-by: Jamey Sharp <jamey@minilop.net>
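A sketch of the region-splitting idea, assuming `bb_starts` is a sorted list of block start offsets into the code buffer. The helper name and signature are hypothetical and the real follow-up may differ; the point is that each region is handed to the disassembler separately, so data misread as instructions in one block cannot desynchronize decoding of the next.

```rust
use std::ops::Range;

/// Split a code buffer of `len` bytes into per-block byte ranges.
/// `bb_starts` is assumed to hold sorted block start offsets; when it is
/// empty, fall back to a single range covering the whole buffer.
fn regions(bb_starts: &[usize], len: usize) -> Vec<Range<usize>> {
    if bb_starts.is_empty() {
        return vec![0..len];
    }
    let mut out = Vec::with_capacity(bb_starts.len());
    for (i, &start) in bb_starts.iter().enumerate() {
        // A region ends where the next block starts, or at the end of the buffer.
        let end = bb_starts.get(i + 1).copied().unwrap_or(len);
        out.push(start..end);
    }
    out
}
```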
Use capstone when checking precise-output test expectations. This lets us continue to rely on printing backend-specific pseudoinstructions in the VCode output for debugging purposes (for instance, br_table on x64), while also checking the final output of the machinst buffer in filetests.