[FastIsel] Get the right register type for call instruction#164565
[FastIsel] Get the right register type for call instruction#164565
Conversation
When switch from fast isel to dag isel the input value is from llvm IR instruction. If the instruction is call we should get the calling convention of the callee and pass it to RegsForValue::getCopyFromRegs, so that it can deduce the right RegisterVT of the returned value of the callee.
|
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-selectiondag Author: Luo Yuanke (LuoYuanke) ChangesWhen switch from fast isel to dag isel the input value is from llvm IR instruction. Full diff: https://github.com/llvm/llvm-project/pull/164565.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 20a0efd3afa1c..14f9fdfddf658 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -1977,8 +1977,13 @@ SDValue SelectionDAGBuilder::getValueImpl(const Value *V) {
if (const Instruction *Inst = dyn_cast<Instruction>(V)) {
Register InReg = FuncInfo.InitializeRegForValue(Inst);
+ std::optional<CallingConv::ID> CallConv = std::nullopt;
+ auto *Callee = dyn_cast<CallInst>(Inst);
+ if (Callee && !Callee->isInlineAsm())
+ CallConv = Callee->getCallingConv();
+
RegsForValue RFV(*DAG.getContext(), TLI, DAG.getDataLayout(), InReg,
- Inst->getType(), std::nullopt);
+ Inst->getType(), CallConv);
SDValue Chain = DAG.getEntryNode();
return RFV.getCopyFromRegs(DAG, FuncInfo, getCurSDLoc(), Chain, nullptr, V);
}
diff --git a/llvm/test/CodeGen/X86/bf16-fast-isel.ll b/llvm/test/CodeGen/X86/bf16-fast-isel.ll
new file mode 100644
index 0000000000000..43622566d2974
--- /dev/null
+++ b/llvm/test/CodeGen/X86/bf16-fast-isel.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s
+
+define i8 @test(ptr %f) nounwind {
+; CHECK-LABEL: test:
+; CHECK: # %bb.0: # %entry
+; CHECK-NEXT: pushq %rax
+; CHECK-NEXT: callq foo@PLT
+; CHECK-NEXT: callq bar@PLT
+; CHECK-NEXT: popq %rcx
+; CHECK-NEXT: retq
+entry:
+ %call = call bfloat @foo(ptr %f)
+ %call2 = call noundef zeroext i8 @bar(bfloat %call)
+ ret i8 %call2
+}
+
+declare bfloat @foo(ptr %f)
+declare zeroext i8 @bar(bfloat)
|
| %call2 = call noundef zeroext i8 @bar(bfloat %call) | ||
| ret i8 %call2 | ||
| } | ||
|
|
There was a problem hiding this comment.
Can you add the same testcase wit Han indirect call. Can you also test each case with a non-default calling convention
There was a problem hiding this comment.
Added test case for indirect call and fast calling convention.
| if (const Instruction *Inst = dyn_cast<Instruction>(V)) { | ||
| Register InReg = FuncInfo.InitializeRegForValue(Inst); | ||
|
|
||
| std::optional<CallingConv::ID> CallConv = std::nullopt; |
There was a problem hiding this comment.
| std::optional<CallingConv::ID> CallConv = std::nullopt; | |
| std::optional<CallingConv::ID> CallConv; |
| if (Callee && !Callee->isInlineAsm()) | ||
| CallConv = Callee->getCallingConv(); |
There was a problem hiding this comment.
This is broken for indirect calls. You should take the calling convention directly from the callsite, not the callee
There was a problem hiding this comment.
This is the callsite instruction. Let me change the variable name to CI.
|
Does this solve #134222? |
| @@ -0,0 +1,66 @@ | |||
| ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6 | |||
| ; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s | |||
There was a problem hiding this comment.
Are you missing -O0 here?
There was a problem hiding this comment.
Thank you! Let me add "--fast-isel" option for the test case. The issue happens when switching from fast isel to dag isel where fast isel fails to select call instruction.
Solve the same issue of https://godbolt.org/z/15K5Pb7GG which generate incorrect instruction "call __truncsfbf2@PLT". The callee return bf16 value through xmm0, so don't need to convert from f32 to bf16. |
Great! I spent some time on it, but didn't find a concise solution. |
|
@LuoYuanke Would you help verify the initial problems reported in #134222 and #151692 and close them if they are solved? |
Sure, I'll verify the two issues. |
| Register InReg = FuncInfo.InitializeRegForValue(Inst); | ||
|
|
||
| std::optional<CallingConv::ID> CallConv; | ||
| auto *CI = dyn_cast<CallInst>(Inst); |
There was a problem hiding this comment.
Do you mean CallBase can cover invoke, callbr instructions which also have cconv, but CallInst only cover call instruction?
There was a problem hiding this comment.
I try to create a test case for callbr, but I got llc crash with below case. It seems x86 don't support half or bfloat type in inline assembly.
define half @test1(float %x) {
entry:
%ret = callbr half asm "vxorps $1, $1, $0 jmp ${2:l}", "=x,x,!i"(float %x)
to label %normal [label %abnormal]
normal:
ret half %ret
abnormal:
ret half 1.0
}
error: couldn't allocate output register for constraint 'x'
I also got llc crash when running below test case for invoke.
declare bfloat @pers(...)
declare bfloat @foo();
define bfloat @test_lp() personality ptr @pers {
entry:
%1 = invoke bfloat @foo()
to label %then unwind label %hotlp
then:
ret bfloat %1
hotlp:
%2 = landingpad { ptr, bfloat }
cleanup
br label %lpret
lpret:
%3 = extractvalue { ptr, bfloat } %2, 1
ret bfloat %3
}
So I'd like to keep it as CallInst for now. When bfloat is better supported in callbr and invoke, we can review CallInst and support more call-alike instruction.
There was a problem hiding this comment.
callbr is special but invoke is straightforward. You should also fix whatever the crash is
There was a problem hiding this comment.
This absolutely needs to be CallBase, leaving this like this is just waiting for someone else to waste time debugging this in the future
There was a problem hiding this comment.
OK, let me create a PR to change it.
There was a problem hiding this comment.
I created a PR #164769, but it is difficult to create a test case. We need to have a call instruction to use the output of invoke instruction, besides, the call and invoke instruction should be in the same basic block to trigger the issue. Below code can be a base code for the test case.
declare i32 @pers(...)
declare bfloat @foo();
define i32 @test_lp() personality ptr @pers {
entry:
%1 = invoke bfloat @foo()
to label %then unwind label %hotlp
then:
%call = call i32 @bar2(bfloat %1)
ret i32 %call
hotlp:
%2 = landingpad { ptr, i32 }
cleanup
br label %lpret
lpret:
%3 = extractvalue { ptr, i32 } %2, 1
ret i32 %3
}
declare i32 @bar2(bfloat)
This is to follow the discussion in #164565 CallBase can cover more call-like instructions which carry caling convention flag. Co-authored-by: Yuanke Luo <ykluo@birentech.com>
This is to follow the discussion in llvm/llvm-project#164565 CallBase can cover more call-like instructions which carry caling convention flag. Co-authored-by: Yuanke Luo <ykluo@birentech.com>
) When switch from fast isel to dag isel the input value is from llvm IR instruction. If the instruction is call we should get the calling convention of the callee and pass it to RegsForValue::getCopyFromRegs, so that it can deduce the right RegisterVT of the returned value of the callee. --------- Co-authored-by: Yuanke Luo <ykluo@birentech.com>
This is to follow the discussion in llvm#164565 CallBase can cover more call-like instructions which carry caling convention flag. Co-authored-by: Yuanke Luo <ykluo@birentech.com>
) When switch from fast isel to dag isel the input value is from llvm IR instruction. If the instruction is call we should get the calling convention of the callee and pass it to RegsForValue::getCopyFromRegs, so that it can deduce the right RegisterVT of the returned value of the callee. --------- Co-authored-by: Yuanke Luo <ykluo@birentech.com>
This is to follow the discussion in llvm#164565 CallBase can cover more call-like instructions which carry caling convention flag. Co-authored-by: Yuanke Luo <ykluo@birentech.com>
When switch from fast isel to dag isel the input value is from llvm IR instruction.
If the instruction is call we should get the calling convention of the callee and
pass it to RegsForValue::getCopyFromRegs, so that it can deduce the right RegisterVT
of the returned value of the callee.