Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISC-V] Enable On Stack Replacement #96558

Merged
merged 10 commits into from
Jan 22, 2024
Merged

[RISC-V] Enable On Stack Replacement #96558

merged 10 commits into from
Jan 22, 2024

Conversation

sirntar
Copy link
Member

@sirntar sirntar commented Jan 5, 2024

IMPORTANT: Before this PR could be merge, it requires changes from #96136 changes have been merged

This pull request:

  • redesign PROLOG/EPILOG in order to enable OSR (now they are more similar to LA64 architecture)
  • implements genAllocLclFrame with genStackProbe to be use in order to probe stack I disabled genAllocLclFrame method for now
  • fixes the cast of floating-point types to uint32 in volatileLocal1.dll
  • fixes the register conflict for large offset addresses of OSR root frames when DOTNET_JitStressRegs=2 in more_tailcalls.sh
  • fixes an assertion failure in debug mode with verbose switch while removing conditional jump block moved to separate PR (Replace call to GetFalseTarget method with GetTarget inside fgRemoveConditionalJump #96560)

Part of #84834
cc @wscho77 @HJLeee @clamp03 @JongHeonChoi @t-mustafin @gbalykov @viewizard @ashaurtaev @brucehoult @tomeksowi @yurai007 @Bajtazar

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jan 5, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 5, 2024
@ghost
Copy link

ghost commented Jan 5, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

IMPORTANT: Before this PR could be merge, it requires changes from #96136

This pull request:

  • redesign PROLOG/EPILOG in order to enable OSR (now they are more similar to LA64 architecture)
  • implements genAllocLclFrame with genSmallStackProbe and genStackProbe to be use in order to probe stack
  • fixes the cast of floating-point types to uint32 in volatileLocal1.dll
  • fixes the register conflict for large offset addresses of OSR root frames when DOTNET_JitStressRegs=2 in more_tailcalls.sh
  • fixes an assertion failure in debug mode with verbose switch while removing conditional jump block

Part of #84834
cc @wscho77 @HJLeee @clamp03 @JongHeonChoi @t-mustafin @gbalykov @viewizard @ashaurtaev @brucehoult @tomeksowi @yurai007 @Bajtazar

Author: sirntar
Assignees: -
Labels:

area-CodeGen-coreclr, community-contribution

Milestone: -

@sirntar
Copy link
Member Author

sirntar commented Jan 5, 2024

fixes an assertion failure in debug mode with verbose switch while removing conditional jump block

I'm referring here to #95773. After this PR I had a problem to get pass through mainlooptry3 test. Back then I've been using debug compilation with verbose option turned on.

Bug details
// A fragment of method from /runtime/src/coreclr/jit/fgopt.cpp
void Compiler::fgRemoveConditionalJump(BasicBlock* block)
{
    noway_assert(block->KindIs(BBJ_COND) && block->TrueTargetIs(block->GetFalseTarget()));
    assert(compRationalIRForm == block->IsLIR());

    FlowEdge* flow = fgGetPredForBlock(block->GetFalseTarget(), block);
    noway_assert(flow->getDupCount() == 2);

    // Change the BBJ_COND to BBJ_ALWAYS, and adjust the refCount and dupCount.
    --block->GetFalseTarget()->bbRefs;
    block->SetKind(BBJ_ALWAYS);
    block->SetFlags(BBF_NONE_QUIRK);
    flow->decrementDupCount();

#ifdef DEBUG
    if (verbose)
    {
      // BUG HERE:  block is now BBJ_ALWAYS, not BBJ_COND, 
     //  so assertion inside  GetFalseTarget method will fail; correct method to call should be GetTarget
        printf("Block " FMT_BB " becoming a BBJ_ALWAYS to " FMT_BB " (jump target is the same whether the condition"
               " is true or false)\n",
               block->bbNum, block->GetFalseTarget()->bbNum); 
    }
#endif
//...

@jakobbotsch
Copy link
Member

               block->bbNum, block->GetFalseTarget()->bbNum); 

Do you mind submitting a separate PR for this fix?

@sirntar
Copy link
Member Author

sirntar commented Jan 5, 2024

               block->bbNum, block->GetFalseTarget()->bbNum); 

Do you mind submitting a separate PR for this fix?

Not at all, will do that in a minute

src/coreclr/jit/codegenriscv64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegenriscv64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegenriscv64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegenriscv64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegenriscv64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegenriscv64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegenriscv64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegenriscv64.cpp Outdated Show resolved Hide resolved
@sirntar sirntar marked this pull request as ready for review January 11, 2024 11:44
@sirntar
Copy link
Member Author

sirntar commented Jan 11, 2024

Remove stack probing (genAllocLclFrame)

After discussion with @tomeksowi, I decided to run a lot of tests on both the qemu and the VisionFive2 board. I did not find any problems that could be solved with this method, nor any reason to continue using it. So I decided to disable it for now. I left its implementation in this PR just in case.

About tests

On both: the qemu and the VisionFive2 board all OSR tests are passing with success.

emit->emitIns_R_R_R(INS_sub, EA_PTRSIZE, regCnt, REG_SPBASE, regCnt);

// Overflow, set regCnt to lowest possible value
emit->emitIns_R_R_I(INS_beq, EA_PTRSIZE, tempReg, REG_R0, 2 << 2);
emit->emitIns_R_R_I(INS_addi, EA_PTRSIZE, regCnt, REG_R0, 0);

assert(compiler->eeGetPageSize() == ((compiler->eeGetPageSize() >> 12) << 12));
emit->emitIns_R_I(INS_lui, EA_PTRSIZE, regTmp, compiler->eeGetPageSize() >> 12);
GetEmitter()->emitIns_R_I(INS_lui, EA_PTRSIZE, rPageSize, pageSize >> 12);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update GetEmitter() to emit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using emit instead of GetEmitter() is a good practice, even if GetEmitter() is marked as inline. I discussed this with the team and I decided to replace GetEmitter() with emit in all my changes.


// each instr is 4 bytes
// if (rOffset >= rLimit) goto Loop;
GetEmitter()->emitIns_R_R_I(INS_bge, EA_PTRSIZE, rOffset, rLimit, -2 << 2);
Copy link
Member

@clamp03 clamp03 Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, rLimit should be -frameSize + REG_SPBASE and initial value of rOffset should be REG_SPBASE - rPageSize (and exchange INS_sub and INS_lw in the loop)? Could you please check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, rLimit should be -frameSize + REG_SPBASE and initial value of rOffset should be REG_SPBASE - rPageSize

Nice catch! Thanks, of course it should be -frameSize + REG_SPBASE. After the update, the assembly should look more or less like this:

; load   rLimit, -frameSize
add  rLimit, rLimit, REG_SPBASE
; rLimit <-- (-frameSize + REG_SPBASE)
lui  rPageSize, pageSize >> 12
; rPageSize <-- 0x1000

sub  rOffset, REG_SPBASE, rPageSize
; rOffset <-- (REG_SPBASE - rPageSize)

; loop:
  lw   R0, rOffset ; tickle the page
  sub  rOffset, rOffset, rPageSize
  bge  rOffset, rLimit, -2 << 2

@sirntar
Copy link
Member Author

sirntar commented Jan 15, 2024

@jakobbotsch I think it's ready for review. As for the test failures, I don't think they are related to the changes. Can you check it?

@jakobbotsch
Copy link
Member

cc @dotnet/jit-contrib. @AndyAyersMS can you please take a look?

@jakobbotsch
Copy link
Member

for the test failures, I don't think they are related to the changes. Can you check it?

Yep, they all look like known issues, #94728, #96876, dotnet/dnceng#1423, #96904

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool to see this. I did not look at the codegen in detail.

Have you tried running any form of OSR stress (eg setting the various config values very low (see here for details)? We do this in the jit-experimental CI run and it has been useful for finding bugs.

@sirntar
Copy link
Member Author

sirntar commented Jan 17, 2024

@AndyAyersMS

Have you tried running any form of OSR stress (eg setting the various config values very low (see here for details)? We do this in the jit-experimental CI run and it has been useful for finding bugs.

Since morning I've been playing with stress tests by setting different values for DOTNET_OSR_HitCount, DOTNET_OSR_CounterBump and DOTNET_TC_OnStackReplacement_InitialCounter. In each configuration, the OSR test pass rate is 100% on both the QEMU and VisionFive2 board.

@jakobbotsch jakobbotsch merged commit 99b621f into dotnet:main Jan 22, 2024
123 of 129 checks passed
tmds pushed a commit to tmds/runtime that referenced this pull request Jan 23, 2024
* [RISC-V] Implement On Stack Replacement

Note: Pinned local test is failing.

* [RISC-V] Apply suggestions from code review

Co-authored-by: Tomasz Sowiński <[email protected]>

* [RISC-V] apply jit-format

* [RISC-V] Cosmetic changes after code review

* [RISC-V] Changes assuming memory page is always equal 4KiB

* [RISC-V] Remove stack probing

* [RISC-V] Replace GetEmitter() with emit

* [RISC-V] Sync frame type 1 in genPushCalleeSavedRegisters with genPopCalleeSavedRegisters

* [RISC-V] Fix assembly emmited by genStackProbe

* [RISC-V] Apply jit-formatter

---------

Co-authored-by: Tomasz Sowiński <[email protected]>
@github-actions github-actions bot locked and limited conversation to collaborators Feb 22, 2024
@sirntar sirntar deleted the rv64-osr branch October 9, 2024 10:02
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-riscv Related to the RISC-V architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants