Skip to content

Remove module-level code generation tests#5870

Merged
elliottt merged 3 commits intobytecodealliance:mainfrom
elliottt:trevor/remove-module-emit-tests
Feb 24, 2023
Merged

Remove module-level code generation tests#5870
elliottt merged 3 commits intobytecodealliance:mainfrom
elliottt:trevor/remove-module-emit-tests

Conversation

@elliottt
Copy link
Member

We have good coverage for these test cases in the per-isa precise-output tests, and after #5780 we can see both the VCode and disassembled output of those tests. These tests have historically been difficult to keep up to date, see the comment about dumping bytes out to a binary file and running objdump on each test, so removing them will help simplify the process of updating test expectations.

@elliottt elliottt marked this pull request as ready for review February 23, 2023 22:04
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen labels Feb 23, 2023
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- good riddance to test tests! They made sense at the time but indeed, our infrastructure is better now.

/// two is that cold blocks are sunk in the latter.) We might as
/// well do the test here, where we have a backend to use.
#[test]
fn test_cold_blocks() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a generic cold-blocks filetest? It looks like we have a few related to bugs (atomic-cas-bug.clif, br_table.clif, and a few legalizer/verifier tests too) but none that try to generally exercise the feature like this test does.

Copy link
Contributor

@jameysharp jameysharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I was just trying to decide what to do about this for #5850.

Have you double-checked the precise-output tests for these backends to make sure that each of these cases is covered? I suspect they are all covered but it'd be nice to make sure while we're thinking about it.

@elliottt
Copy link
Member Author

Thank you, I was just trying to decide what to do about this for #5850.

Have you double-checked the precise-output tests for these backends to make sure that each of these cases is covered? I suspect they are all covered but it'd be nice to make sure while we're thinking about it.

I had a quick look through the tests I was removing, and @cfallin is right about adding generic cold-block tests. I'll add one to each backend before merging this 👍

@elliottt elliottt force-pushed the trevor/remove-module-emit-tests branch from 21cf390 to 894ac26 Compare February 23, 2023 23:23
@jameysharp
Copy link
Contributor

Huh. On all backends, this new test produces the same lowered block order, with or without the cold annotation.

In fact as best I can tell, the only difference in generated code between the two cases is that if the cold annotation is used, all backends insert an unreachable "jump" instruction after the "return" and before the cold block. That seems like a bug?

But I think it would also help to have a test which actually produces a different lowering order with the annotation than without.

@elliottt
Copy link
Member Author

Huh. On all backends, this new test produces the same lowered block order, with or without the cold annotation.

In fact as best I can tell, the only difference in generated code between the two cases is that if the cold annotation is used, all backends insert an unreachable "jump" instruction after the "return" and before the cold block. That seems like a bug?

But I think it would also help to have a test which actually produces a different lowering order with the annotation than without.

Good call. I've replaced the one I ported from the x64 test with one that is simpler, and easier to show the block order changing with.

@elliottt
Copy link
Member Author

elliottt commented Feb 24, 2023

In fact as best I can tell, the only difference in generated code between the two cases is that if the cold annotation is used, all backends insert an unreachable "jump" instruction after the "return" and before the cold block. That seems like a bug?

It's unreachable code, that you can see really well when comparing the VCode and disassembly. VCode block8 is disassembled block5, and the unconditional jump in block6 in the VCode to block8 got inlined, the original condition got inverted, and the final jump went back to block6 again.

; VCode:                            ; Disassembled:                              
;   pushq   %rbp                    ; block0: ; offset 0x0                       
;   movq    %rsp, %rbp              ;   pushq %rbp                               
; block0:                           ;   movq %rsp, %rbp                          
;   movq    %rdi, %r11              ; block1: ; offset 0x4                       
;   addl    %r11d, $4660, %r11d     ;   movq %rdi, %r11                          
;   testl   %r11d, %r11d            ;   addl $0x1234, %r11d                      
;   jnz     label1; j label2        ;   testl %r11d, %r11d                       
; block1:                           ;   je 0x39                                  
;   movq    %r11, %r8               ; block2: ; offset 0x17                      
;   jmp     label3                  ;   movq %r11, %r8                           
; block3:                           ; block3: ; offset 0x1a                      
;   movq    %r11, %rax              ;   movq %r11, %rax                          
;   subl    %eax, $4660, %eax       ;   subl $0x1234, %eax                       
;   addl    %eax, %r8d, %eax        ;   addl %r8d, %eax                          
;   testl   %r11d, %r11d            ;   testl %r11d, %r11d                       
;   jnz     label4; j label5        ;   jne 0x39                                 
; block5:                           ; block4: ; offset 0x2f                      
;   movq    %rbp, %rsp              ;   movq %rbp, %rsp                          
;   popq    %rbp                    ;   popq %rbp                                
;   ret                             ;   retq                                     
; block8:                           ; block5: ; offset 0x34                      
;   jmp     label3                  ;   jmp 0x1a                                 
; block2:                           ; block6: ; offset 0x39                      
;   jmp     label6                  ;   movq %r11, %r8                           
; block4:                           ;   addl $0x1234, %r8d                       
;   jmp     label6                  ;   testl %r8d, %r8d                         
; block6:                           ;   je 0x1a                                  
;   movq    %r11, %r8               ; block7: ; offset 0x4c                      
;   addl    %r8d, $4660, %r8d       ;   jmp 0x39                                 
;   testl   %r8d, %r8d                                                           
;   jnz     label7; j label8                                                     
; block7:                                                                        
;   jmp     label6   

I think we could fix this pretty easily in the machinst buffer, but we might need to track a little bit of additional state to know when blocks are unreachable.

@jameysharp
Copy link
Contributor

Yes, this version of the cold-block tests looks great! I think "if-then triangle" CFG is perfect for this.

@elliottt elliottt added this pull request to the merge queue Feb 24, 2023
@cfallin
Copy link
Member

cfallin commented Feb 24, 2023

I think we could fix this pretty easily in the machinst buffer, but we might need to track a little bit of additional state to know when blocks are unreachable.

To add a little more detail to that: removing the dead block at the MachBuffer level would require the MachBuffer to know about instructions other than branches (here, a return instruction that makes the following block unreachable); in the original MachBuffer design that was an explicit anti-goal, for simplicity.

Now that we're at more of a "polish the mature code" stage of development, I think it'd be totally reasonable to give the MachBuffer a notion of explicit "informed unreachability": the emitter could invoke a method like .mark_unreachable() after a return, similar in spirit to how it informs the MachBuffer about branches.

The tricky part is weaving that into the existing mechanism, and updating the correctness proof accordingly. It probably implies a new kind of record in the "latest branch" list. Unreachability is currently implicitly recognized as a property that follows an unconditional branch instead.

All that said, I don't think this is the highest-priority change: these blocks could impact icache pressure by increasing code size a little, but are dead, i.e. not actually executed. Still, the above is the place to start if someone wants to tackle it!

Merged via the queue into bytecodealliance:main with commit c5d9d5b Feb 24, 2023
@elliottt elliottt deleted the trevor/remove-module-emit-tests branch February 24, 2023 02:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants