Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Coro] [async] Disable inlining in async coroutine splitting #80904

Merged
merged 1 commit into from
Feb 7, 2024

Conversation

aschwaighofer
Copy link
Collaborator

@aschwaighofer aschwaighofer commented Feb 6, 2024

The call to the inlining utility does not update the call graph. Leading to assertion failures when calling the call graph utility to update the call graph.

Instead rely on an inline pass to run after coro splitting and use alwaysinline annotations.

github.com/swiftlang/swift/issues/68708

@llvmbot
Copy link
Member

llvmbot commented Feb 6, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-coroutines

Author: Arnold Schwaighofer (aschwaighofer)

Changes

The call to the inlining utility does not update the call graph. Leading to assertion failures when calling the utility to update the call graph.

Instead rely on an inline pass to run after coro splitting and use alwaysinline annotations.

github.com/swiftlang/swift/issues/68708


Patch is 32.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80904.diff

8 Files Affected:

  • (modified) llvm/lib/Transforms/Coroutines/CoroSplit.cpp (+1-11)
  • (modified) llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-infinite-loop-bug.ll (+4-4)
  • (modified) llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-start-bug.ll (+3-3)
  • (modified) llvm/test/Transforms/Coroutines/coro-async-dyn-align.ll (+5-5)
  • (added) llvm/test/Transforms/Coroutines/coro-async-mutal-recursive.ll (+158)
  • (modified) llvm/test/Transforms/Coroutines/coro-async-unreachable.ll (+5-5)
  • (modified) llvm/test/Transforms/Coroutines/coro-async.ll (+38-38)
  • (modified) llvm/test/Transforms/Coroutines/swift-async-dbg.ll (+15-15)
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index a552b3ac20085c..aed4cd027d0338 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -208,17 +208,12 @@ static bool replaceCoroEndAsync(AnyCoroEndInst *End) {
   // Insert the return instruction.
   Builder.SetInsertPoint(End);
   Builder.CreateRetVoid();
-  InlineFunctionInfo FnInfo;
 
   // Remove the rest of the block, by splitting it into an unreachable block.
   auto *BB = End->getParent();
   BB->splitBasicBlock(End);
   BB->getTerminator()->eraseFromParent();
 
-  auto InlineRes = InlineFunction(*MustTailCall, FnInfo);
-  assert(InlineRes.isSuccess() && "Expected inlining to succeed");
-  (void)InlineRes;
-
   // We have cleaned up the coro.end block above.
   return false;
 }
@@ -1842,13 +1837,8 @@ static void splitAsyncCoroutine(Function &F, coro::Shape &Shape,
     SmallVector<Value *, 8> Args(Suspend->args());
     auto FnArgs = ArrayRef<Value *>(Args).drop_front(
         CoroSuspendAsyncInst::MustTailCallFuncArg + 1);
-    auto *TailCall =
-        coro::createMustTailCall(Suspend->getDebugLoc(), Fn, FnArgs, Builder);
+    coro::createMustTailCall(Suspend->getDebugLoc(), Fn, FnArgs, Builder);
     Builder.CreateRetVoid();
-    InlineFunctionInfo FnInfo;
-    auto InlineRes = InlineFunction(*TailCall, FnInfo);
-    assert(InlineRes.isSuccess() && "Expected inlining to succeed");
-    (void)InlineRes;
 
     // Replace the lvm.coro.async.resume intrisic call.
     replaceAsyncResumeFunction(Suspend, Continuation);
diff --git a/llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-infinite-loop-bug.ll b/llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-infinite-loop-bug.ll
index 07b3bd8fa94ac0..4960709932948c 100644
--- a/llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-infinite-loop-bug.ll
+++ b/llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-infinite-loop-bug.ll
@@ -22,8 +22,8 @@ declare void @my_other_async_function(ptr %async.ctxt)
      i32 128    ; Initial async context size without space for frame
 }>
 
-define swiftcc void @my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt) {
-  tail call swiftcc void %fnPtr(ptr %async.ctxt)
+define swifttailcc void @my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt) alwaysinline {
+  tail call swifttailcc void %fnPtr(ptr %async.ctxt)
   ret void
 }
 
@@ -37,12 +37,12 @@ entry:
 
 ; The address of alloca escapes but the analysis based on lifetimes fails to see
 ; that it can't localize this alloca.
-; CHECK: define swiftcc void @my_async_function(ptr swiftasync %async.ctxt) {
+; CHECK: define swifttailcc void @my_async_function(ptr swiftasync %async.ctxt) {
 ; CHECK: entry:
 ; CHECK-NOT: ret
 ; CHECK-NOT:   [[ESCAPED_ADDR:%.*]] = alloca i64, align 8
 ; CHECK: ret
-define swiftcc void @my_async_function(ptr swiftasync %async.ctxt) {
+define swifttailcc void @my_async_function(ptr swiftasync %async.ctxt) {
 entry:
   %escaped_addr = alloca i64
 
diff --git a/llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-start-bug.ll b/llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-start-bug.ll
index 2306b72a0055fe..42377285f77ca9 100644
--- a/llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-start-bug.ll
+++ b/llvm/test/Transforms/Coroutines/coro-async-addr-lifetime-start-bug.ll
@@ -22,8 +22,8 @@ declare void @my_other_async_function(ptr %async.ctxt)
      i32 128    ; Initial async context size without space for frame
 }>
 
-define swiftcc void @my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt) {
-  tail call swiftcc void %fnPtr(ptr %async.ctxt)
+define swifttailcc void @my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt) alwaysinline {
+  tail call swifttailcc void %fnPtr(ptr %async.ctxt)
   ret void
 }
 
@@ -36,7 +36,7 @@ entry:
   ret ptr %resume_ctxt
 }
 
-define swiftcc void @my_async_function(ptr swiftasync %async.ctxt) {
+define swifttailcc void @my_async_function(ptr swiftasync %async.ctxt) {
 entry:
   %escaped_addr = alloca i64
 
diff --git a/llvm/test/Transforms/Coroutines/coro-async-dyn-align.ll b/llvm/test/Transforms/Coroutines/coro-async-dyn-align.ll
index 040c9881c1ab30..567977ea1476d7 100644
--- a/llvm/test/Transforms/Coroutines/coro-async-dyn-align.ll
+++ b/llvm/test/Transforms/Coroutines/coro-async-dyn-align.ll
@@ -33,12 +33,12 @@ declare swiftcc void @asyncReturn(ptr)
 declare swiftcc void @asyncSuspend(ptr)
 declare {ptr} @llvm.coro.suspend.async(i32, ptr, ptr, ...)
 
-define swiftcc void @my_async_function.my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt) {
-  tail call swiftcc void %fnPtr(ptr %async.ctxt)
+define swifttailcc void @my_async_function.my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt) alwaysinline {
+  musttail call swifttailcc void %fnPtr(ptr %async.ctxt)
   ret void
 }
 
-define ptr @__swift_async_resume_project_context(ptr %ctxt) {
+define ptr @__swift_async_resume_project_context(ptr %ctxt) alwaysinline {
 entry:
   %resume_ctxt = load ptr, ptr %ctxt, align 8
   ret ptr %resume_ctxt
@@ -46,7 +46,7 @@ entry:
 
 
 ; CHECK: %my_async_function.Frame = type { i64, [48 x i8], i64, i64, [16 x i8], ptr, i64, ptr }
-; CHECK: define swiftcc void @my_async_function
+; CHECK: define swifttailcc void @my_async_function
 ; CHECK:  [[T0:%.*]] = getelementptr inbounds %my_async_function.Frame, ptr %async.ctx.frameptr, i32 0, i32 3
 ; CHECK:  [[T1:%.*]] = ptrtoint ptr [[T0]] to i64
 ; CHECK:  [[T2:%.*]] = add i64 [[T1]], 31
@@ -60,7 +60,7 @@ entry:
 ; CHECK:  store i64 2, ptr [[T4]]
 ; CHECK:  store i64 3, ptr [[T9]]
 
-define swiftcc void @my_async_function(ptr swiftasync %async.ctxt) presplitcoroutine {
+define swifttailcc void @my_async_function(ptr swiftasync %async.ctxt) presplitcoroutine {
 entry:
   %tmp = alloca i64, align 8
   %tmp2 = alloca i64, align 16
diff --git a/llvm/test/Transforms/Coroutines/coro-async-mutal-recursive.ll b/llvm/test/Transforms/Coroutines/coro-async-mutal-recursive.ll
new file mode 100644
index 00000000000000..4931fe998daa60
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/coro-async-mutal-recursive.ll
@@ -0,0 +1,158 @@
+; RUN: opt < %s -passes='default<O2>' -S | FileCheck --check-prefixes=CHECK %s
+; RUN: opt < %s -O0 -S | FileCheck --check-prefixes=CHECK-O0 %s
+
+
+; CHECK-NOT: llvm.coro.suspend.async
+; CHECK-O0-NOT: llvm.coro.suspend.async
+
+; This test used to crash during updating the call graph in coro splitting.
+
+target datalayout = "p:64:64:64"
+
+%swift.async_func_pointer = type <{ i32, i32 }>
+
+@"$s1d3fooyySbYaFTu" = hidden global %swift.async_func_pointer <{ i32 trunc (i64 sub (i64 ptrtoint (ptr @"$s1d3fooyySbYaF" to i64), i64 ptrtoint (ptr @"$s1d3fooyySbYaFTu" to i64)) to i32), i32 16 }>
+@"$s1d3baryySbYaFTu" = hidden global %swift.async_func_pointer <{ i32 trunc (i64 sub (i64 ptrtoint (ptr @"$s1d3baryySbYaF" to i64), i64 ptrtoint (ptr @"$s1d3baryySbYaFTu" to i64)) to i32), i32 16 }>
+
+define swifttailcc void @"$s1d3fooyySbYaF"(ptr swiftasync %0, i1 %1) {
+entry:
+  %2 = alloca ptr, align 8
+  %c.debug = alloca i1, align 8
+  %3 = call token @llvm.coro.id.async(i32 16, i32 16, i32 0, ptr @"$s1d3fooyySbYaFTu")
+  %4 = call ptr @llvm.coro.begin(token %3, ptr null)
+  store ptr %0, ptr %2, align 8
+  call void @llvm.memset.p0.i64(ptr align 8 %c.debug, i8 0, i64 1, i1 false)
+  store i1 %1, ptr %c.debug, align 8
+  call void asm sideeffect "", "r"(ptr %c.debug)
+  %5 = load i32, ptr getelementptr inbounds (%swift.async_func_pointer, ptr @"$s1d3baryySbYaFTu", i32 0, i32 1), align 8
+  %6 = zext i32 %5 to i64
+  %7 = call swiftcc ptr @swift_task_alloc(i64 %6) #4
+  call void @llvm.lifetime.start.p0(i64 -1, ptr %7)
+  %8 = load ptr, ptr %2, align 8
+  %9 = getelementptr inbounds <{ ptr, ptr }>, ptr %7, i32 0, i32 0
+  store ptr %8, ptr %9, align 8
+  %10 = call ptr @llvm.coro.async.resume()
+  %11 = getelementptr inbounds <{ ptr, ptr }>, ptr %7, i32 0, i32 1
+  store ptr %10, ptr %11, align 8
+  %12 = call { ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0s(i32 0, ptr %10, ptr @__swift_async_resume_project_context, ptr @"$s1d3fooyySbYaF.0", ptr @"$s1d3baryySbYaF", ptr %7, i1 %1)
+  %13 = extractvalue { ptr } %12, 0
+  %14 = call ptr @__swift_async_resume_project_context(ptr %13)
+  store ptr %14, ptr %2, align 8
+  call swiftcc void @swift_task_dealloc(ptr %7) #4
+  call void @llvm.lifetime.end.p0(i64 -1, ptr %7)
+  %15 = load ptr, ptr %2, align 8
+  %16 = getelementptr inbounds <{ ptr, ptr }>, ptr %15, i32 0, i32 1
+  %17 = load ptr, ptr %16, align 8
+  %18 = load ptr, ptr %2, align 8
+  %19 = call i1 (ptr, i1, ...) @llvm.coro.end.async(ptr %4, i1 false, ptr @"$s1d3fooyySbYaF.0.1", ptr %17, ptr %18)
+  unreachable
+}
+
+declare token @llvm.coro.id.async(i32, i32, i32, ptr) #1
+
+declare void @llvm.trap() #2
+
+declare ptr @llvm.coro.begin(token, ptr) #1
+
+declare void @llvm.memset.p0.i64(ptr nocapture, i8, i64, i1 immarg) #3
+
+define hidden swifttailcc void @"$s1d3baryySbYaF"(ptr swiftasync %0, i1 %1) {
+entry:
+  %2 = alloca ptr, align 8
+  %c.debug = alloca i1, align 8
+  %3 = call token @llvm.coro.id.async(i32 16, i32 16, i32 0, ptr @"$s1d3baryySbYaFTu")
+  %4 = call ptr @llvm.coro.begin(token %3, ptr null)
+  store ptr %0, ptr %2, align 8
+  call void @llvm.memset.p0.i64(ptr align 8 %c.debug, i8 0, i64 1, i1 false)
+  store i1 %1, ptr %c.debug, align 8
+  call void asm sideeffect "", "r"(ptr %c.debug)
+  br i1 %1, label %5, label %17
+
+5:                                                ; preds = %entry
+  %6 = xor i1 %1, true
+  %7 = load i32, ptr getelementptr inbounds (%swift.async_func_pointer, ptr @"$s1d3fooyySbYaFTu", i32 0, i32 1), align 8
+  %8 = zext i32 %7 to i64
+  %9 = call swiftcc ptr @swift_task_alloc(i64 %8) #4
+  call void @llvm.lifetime.start.p0(i64 -1, ptr %9)
+  %10 = load ptr, ptr %2, align 8
+  %11 = getelementptr inbounds <{ ptr, ptr }>, ptr %9, i32 0, i32 0
+  store ptr %10, ptr %11, align 8
+  %12 = call ptr @llvm.coro.async.resume()
+  %13 = getelementptr inbounds <{ ptr, ptr }>, ptr %9, i32 0, i32 1
+  store ptr %12, ptr %13, align 8
+  %14 = call { ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0s(i32 0, ptr %12, ptr @__swift_async_resume_project_context, ptr @"$s1d3baryySbYaF.0.2", ptr @"$s1d3fooyySbYaF", ptr %9, i1 %6)
+  %15 = extractvalue { ptr } %14, 0
+  %16 = call ptr @__swift_async_resume_project_context(ptr %15)
+  store ptr %16, ptr %2, align 8
+  call swiftcc void @swift_task_dealloc(ptr %9) #4
+  call void @llvm.lifetime.end.p0(i64 -1, ptr %9)
+  br label %18
+
+17:                                               ; preds = %entry
+  br label %18
+
+18:                                               ; preds = %5, %17
+  %19 = load ptr, ptr %2, align 8
+  %20 = getelementptr inbounds <{ ptr, ptr }>, ptr %19, i32 0, i32 1
+  %21 = load ptr, ptr %20, align 8
+  %22 = load ptr, ptr %2, align 8
+  %23 = call i1 (ptr, i1, ...) @llvm.coro.end.async(ptr %4, i1 false, ptr @"$s1d3baryySbYaF.0", ptr %21, ptr %22)
+  unreachable
+}
+
+declare swiftcc ptr @swift_task_alloc(i64) #4
+
+declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #5
+
+declare ptr @llvm.coro.async.resume() #6
+
+define linkonce_odr hidden ptr @__swift_async_resume_project_context(ptr %0) #7 {
+entry:
+  %1 = load ptr, ptr %0, align 8
+  %2 = call ptr @llvm.swift.async.context.addr()
+  store ptr %1, ptr %2, align 8
+  ret ptr %1
+}
+
+declare ptr @llvm.swift.async.context.addr() #1
+
+define internal swifttailcc void @"$s1d3fooyySbYaF.0"(ptr %0, ptr %1, i1 %2) #8 {
+entry:
+  musttail call swifttailcc void %0(ptr swiftasync %1, i1 %2)
+  ret void
+}
+
+declare { ptr } @llvm.coro.suspend.async.sl_p0s(i32, ptr, ptr, ...) #6
+
+declare swiftcc void @swift_task_dealloc(ptr) #4
+
+declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #5
+
+define internal swifttailcc void @"$s1d3fooyySbYaF.0.1"(ptr %0, ptr %1) #8 {
+entry:
+  musttail call swifttailcc void %0(ptr swiftasync %1)
+  ret void
+}
+
+declare i1 @llvm.coro.end.async(ptr, i1, ...) #1
+
+define internal swifttailcc void @"$s1d3baryySbYaF.0"(ptr %0, ptr %1) #8 {
+entry:
+  musttail call swifttailcc void %0(ptr swiftasync %1)
+  ret void
+}
+
+define internal swifttailcc void @"$s1d3baryySbYaF.0.2"(ptr %0, ptr %1, i1 %2) #8 {
+entry:
+  musttail call swifttailcc void %0(ptr swiftasync %1, i1 %2)
+  ret void
+}
+
+attributes #1 = { nounwind }
+attributes #2 = { cold noreturn nounwind }
+attributes #3 = { nocallback nofree nounwind willreturn}
+attributes #4 = { nounwind }
+attributes #5 = { nocallback nofree nosync nounwind willreturn }
+attributes #6 = { nomerge nounwind }
+attributes #7 = { alwaysinline nounwind }
+attributes #8 = { alwaysinline nounwind }
diff --git a/llvm/test/Transforms/Coroutines/coro-async-unreachable.ll b/llvm/test/Transforms/Coroutines/coro-async-unreachable.ll
index 79ef8939b0eccf..ed4f526b8ed986 100644
--- a/llvm/test/Transforms/Coroutines/coro-async-unreachable.ll
+++ b/llvm/test/Transforms/Coroutines/coro-async-unreachable.ll
@@ -13,8 +13,8 @@ target datalayout = "p:64:64:64"
 declare void @my_other_async_function(ptr %async.ctxt)
 
 ; Function that implements the dispatch to the callee function.
-define swiftcc void @my_async_function.my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt, ptr %task, ptr %actor) {
-  tail call swiftcc void %fnPtr(ptr %async.ctxt, ptr %task, ptr %actor)
+define swifttailcc void @my_async_function.my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt, ptr %task, ptr %actor) alwaysinline {
+  musttail call swifttailcc void %fnPtr(ptr %async.ctxt, ptr %task, ptr %actor)
   ret void
 }
 
@@ -38,7 +38,7 @@ entry:
      i32 128    ; Initial async context size without space for frame
 }>
 
-define swiftcc void @unreachable(ptr %async.ctxt, ptr %task, ptr %actor)  {
+define swifttailcc void @unreachable(ptr %async.ctxt, ptr %task, ptr %actor)  {
 entry:
   %tmp = alloca { i64, i64 }, align 8
   %proj.1 = getelementptr inbounds { i64, i64 }, ptr %tmp, i64 0, i32 0
@@ -77,11 +77,11 @@ entry:
   unreachable
 }
 
-; CHECK: define swiftcc void @unreachable
+; CHECK: define swifttailcc void @unreachable
 ; CHECK-NOT: @llvm.coro.suspend.async
 ; CHECK: return
 
-; CHECK: define internal swiftcc void @unreachable.resume.0
+; CHECK: define internal swifttailcc void @unreachable.resume.0
 ; CHECK: unreachable
 
 declare ptr @llvm.coro.prepare.async(ptr)
diff --git a/llvm/test/Transforms/Coroutines/coro-async.ll b/llvm/test/Transforms/Coroutines/coro-async.ll
index 3740c3d1d83871..8ead304fe29887 100644
--- a/llvm/test/Transforms/Coroutines/coro-async.ll
+++ b/llvm/test/Transforms/Coroutines/coro-async.ll
@@ -37,28 +37,28 @@ declare void @my_other_async_function(ptr %async.ctxt)
 }>
 
 ; Function that implements the dispatch to the callee function.
-define swiftcc void @my_async_function.my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt, ptr %task, ptr %actor) {
-  tail call swiftcc void %fnPtr(ptr %async.ctxt, ptr %task, ptr %actor)
+define swifttailcc void @my_async_function.my_other_async_function_fp.apply(ptr %fnPtr, ptr %async.ctxt, ptr %task, ptr %actor) alwaysinline {
+  musttail call swifttailcc void %fnPtr(ptr %async.ctxt, ptr %task, ptr %actor)
   ret void
 }
 
 declare void @some_user(i64)
 declare void @some_may_write(ptr)
 
-define ptr @__swift_async_resume_project_context(ptr %ctxt) {
+define ptr @__swift_async_resume_project_context(ptr %ctxt) alwaysinline {
 entry:
   %resume_ctxt = load ptr, ptr %ctxt, align 8
   ret ptr %resume_ctxt
 }
 
-define ptr @resume_context_projection(ptr %ctxt) {
+define ptr @resume_context_projection(ptr %ctxt) alwaysinline {
 entry:
   %resume_ctxt = load ptr, ptr %ctxt, align 8
   ret ptr %resume_ctxt
 }
 
 
-define swiftcc void @my_async_function(ptr swiftasync %async.ctxt, ptr %task, ptr %actor) presplitcoroutine !dbg !1 {
+define swifttailcc void @my_async_function(ptr swiftasync %async.ctxt, ptr %task, ptr %actor) presplitcoroutine !dbg !1 {
 entry:
   %tmp = alloca { i64, i64 }, align 8
   %vector = alloca <4 x double>, align 16
@@ -100,7 +100,7 @@ entry:
   %val.2 = load i64, ptr %proj.2
   call void @some_user(i64 %val.2)
   store <4 x double> %vector_spill, ptr %vector, align 16
-  tail call swiftcc void @asyncReturn(ptr %async.ctxt, ptr %continuation_task_arg, ptr %actor)
+  tail call swifttailcc void @asyncReturn(ptr %async.ctxt, ptr %continuation_task_arg, ptr %actor)
   call i1 (ptr, i1, ...) @llvm.coro.end.async(ptr %hdl, i1 0)
   unreachable
 }
@@ -116,8 +116,8 @@ define void @my_async_function_pa(ptr %ctxt, ptr %task, ptr %actor) {
 ; CHECK: @my_async_function_pa_fp = constant <{ i32, i32 }> <{ {{.*}}, i32 176 }
 ; CHECK: @my_async_function2_fp = constant <{ i32, i32 }> <{ {{.*}}, i32 176 }
 
-; CHECK-LABEL: define swiftcc void @my_async_function(ptr swiftasync %async.ctxt, ptr %task, ptr %actor)
-; CHECK-O0-LABEL: define swiftcc void @my_async_function(ptr swiftasync %async.ctxt, ptr %task, ptr %actor)
+; CHECK-LABEL: define swifttailcc void @my_async_function(ptr swiftasync %async.ctxt, ptr %task, ptr %actor)
+; CHECK-O0-LABEL: define swifttailcc void @my_async_function(ptr swiftasync %async.ctxt, ptr %task, ptr %actor)
 ; CHECK-SAME: !dbg ![[SP1:[0-9]+]] {
 ; CHECK: coro.return:
 ; CHECK:   [[FRAMEPTR:%.*]] = getelementptr inbounds i8, ptr %async.ctxt, i64 128
@@ -139,12 +139,12 @@ define void @my_async_function_pa(ptr %ctxt, ptr %task, ptr %actor) {
 ; CHECK-O0:   [[VECTOR_SPILL:%.*]] = load <4 x double>, ptr {{.*}}
 ; CHECK-O0:   [[VECTOR_SPILL_ADDR:%.*]] = getelementptr inbounds %my_async_function.Frame, ptr {{.*}}, i32 0, i32 1
 ; CHECK-O0:   store <4 x double> [[VECTOR_SPILL]], ptr [[VECTOR_SPILL_ADDR]], align 16
-; CHECK:   tail call swiftcc void @asyncSuspend(ptr nonnull [[CALLEE_CTXT]], ptr %task, ptr %actor)
+; CHECK:   tail call swifttailcc void @asyncSuspend(ptr nonnull [[CALLEE_CTXT]], ptr %task, ptr %actor)
 ; CHECK:   ret void
 ; CHECK: }
 
-; CHECK-LABEL: define internal swiftcc void @my_async_functionTQ0_(ptr nocapture readonly swiftasync %0, ptr %1, ptr nocapture readnone %2)
-; CHECK-O0-LABEL: define internal swiftcc void @my_async_functionTQ0_(ptr swiftasync %0, ptr %1, ptr %2)
+; CHECK-LABEL: define internal swifttailcc void @my_async_functionTQ0_(ptr nocapture readonly swiftasync %0, ptr %1, ptr nocapture readnone %2)
+; CHECK-O0-LABEL: define internal swifttailcc void @my_async_functionTQ0_(ptr swiftasync %0, ptr %1, ptr %2)
 ; CHECK-SAME: !dbg ![[SP2:[0-9]+]] {
 ; CHECK: entryresume.0:
 ; CHECK:   [[CALLER_CONTEXT:%.*]] = load ptr, ptr %0
@@ -163,7 +163,7 @@ define void @my_async_function_pa(ptr %ctxt, ptr %task, ptr %actor) {
 ; CHECK:   tail call void @some_user(i64 [[VAL1]])
 ; CHECK:   [[VAL2:%.*]] = load i64, ptr [[ALLOCA_PRJ2]]
 ; CHECK:   tail call void @some_user(i64 [[VAL2]])
-; CHECK:   tail call swiftcc void @asyncReturn(ptr [[ASYNC_CTXT_RELOAD]], ptr %1, ptr [[ACTOR_RELOAD]])
+; CHECK:   tail call swifttailcc void @asyncReturn(ptr [[ASYNC_CTXT_RELOAD]], ptr %1, ptr [[ACTOR_RELOAD]])
 ; CHECK:   ret void
 ; CHECK: }
 
@@ -177,7 +177,7 @@ define void @my_async_function_pa(ptr %ctxt, ptr %task, ptr %actor) {
      i32 128    ; Initial async context size without space for frame
   }>
 
-define swiftcc void @my_async_function2(ptr %task, ptr %actor, ptr %async.ctxt) presplitcoroutine "frame-pointer"="all" !dbg !6 {
+define swifttailcc void @my_async_function2(ptr %task, ptr %actor, ptr %async.ctxt) presplitcoroutine "frame-pointer"="all" !dbg !6 {
 entry:
 
   %id = call token @llvm.coro.id.async(i32 128, i32 16, i32 2, ptr @my_async_function2_fp)
@@ -210,12 +210,12 @@ entry:
   call void @llvm.coro.async.context.dealloc(ptr %callee_context)
   %continuation_actor_arg = extractvalue {ptr, ptr, ptr} %res.2, 1
 
-  tail call swiftcc void @asyncReturn(ptr %async.ctxt, ptr %continuation_task_arg, ptr %continuation_actor_arg)
+  tail call swifttailcc void @asyncReturn(ptr %async.ctxt, ptr %continuation_task_arg, ptr %continuation_actor_arg)
   call i1 @llvm.coro.end(ptr %hdl, i1 0, token none)
   unreachable
 }
 
-; CHECK-LABEL: define swiftcc void @my_async_function2(ptr %task, ptr %actor, ptr %async.ctxt)
+; CHECK-LABEL: define swifttailcc void @my_async_function2(ptr %task, ptr %actor, ptr %async.ctxt)
 ; CHECK-SAME: #[[FRAMEPOINTER:[0-9]+]]
 ; CHECK-SAME: !dbg ![[SP3:[0-9]+]]
 ; CHECK: store ptr %async.ctxt,
@@ -225,22 +225,22 @@ entry:
 ; CHECK: sto...
[truncated]

The call to the inlining utility does not update the call graph.
Leading to assertion failures when calling the utility to update the
call graph.

Instead rely on an inline pass to run after coro splitting and use
alwaysinline annotations.

github.com/swiftlang/swift/issues/68708
@aschwaighofer aschwaighofer force-pushed the corosplit_noinline_llvm.org branch from 9b69b77 to 5c9b5ba Compare February 7, 2024 16:03
Copy link
Contributor

@rjmccall rjmccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aschwaighofer aschwaighofer merged commit b1ac052 into llvm:main Feb 7, 2024
4 checks passed
@Mogball
Copy link
Contributor

Mogball commented Feb 14, 2024

❤️

@Mogball
Copy link
Contributor

Mogball commented Feb 15, 2024

Hmm. It seems this is breaking compilation of async coroutines when the calling convention is not swifttailcc. musttail for functions that aren't swifttailcc are required to match in prototype the callee (which is assumed to be the surrounding function). However, the coroutine splitting is generating a musttail call to another function. We worked around this by tagging our functions as swifttailcc, but this doesn't seem right?

EDIT: tagging our functions as swifttailcc seems to lead to lots of other problems..

Mogball pushed a commit that referenced this pull request Feb 21, 2024
…80904)"

This reverts commit b1ac052.

This commit breaks coroutine splitting for non-swift calling convention
functions. In this example:

```ll
; ModuleID = 'repro.ll'
source_filename = "stdlib/test/runtime/test_llcl.mojo"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@0 = internal constant { i32, i32 } { i32 trunc (i64 sub (i64 ptrtoint (ptr @craSH to i64), i64 ptrtoint (ptr getelementptr inbounds ({ i32, i32 }, ptr @0, i32 0, i32 1) to i64)) to i32), i32 64 }

define dso_local void @af_suspend_fn(ptr %0, i64 %1, ptr %2) #0 {
  ret void
}

define dso_local void @craSH(ptr %0) #0 {
  %2 = call token @llvm.coro.id.async(i32 64, i32 8, i32 0, ptr @0)
  %3 = call ptr @llvm.coro.begin(token %2, ptr null)
  %4 = getelementptr inbounds { ptr, { ptr, ptr }, i64, { ptr, i1 }, i64, i64 }, ptr poison, i32 0, i32 0
  %5 = call ptr @llvm.coro.async.resume()
  store ptr %5, ptr %4, align 8
  %6 = call { ptr, ptr, ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0p0p0s(i32 0, ptr %5, ptr @ctxt_proj_fn, ptr @af_suspend_fn, ptr poison, i64 -1, ptr poison)
  ret void
}

define dso_local ptr @ctxt_proj_fn(ptr %0) #0 {
  ret ptr %0
}

; Function Attrs: nomerge nounwind
declare { ptr, ptr, ptr } @llvm.coro.suspend.async.sl_p0p0p0s(i32, ptr, ptr, ...) #1

; Function Attrs: nounwind
declare token @llvm.coro.id.async(i32, i32, i32, ptr) #2

; Function Attrs: nounwind
declare ptr @llvm.coro.begin(token, ptr writeonly) #2

; Function Attrs: nomerge nounwind
declare ptr @llvm.coro.async.resume() #1

attributes #0 = { "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+invpcid,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+pku,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+vaes,+vpclmulqdq,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
attributes #1 = { nomerge nounwind }
attributes #2 = { nounwind }
```

This verifier crashes after the `coro-split` pass with

```
cannot guarantee tail call due to mismatched parameter counts
  musttail call void @af_suspend_fn(ptr poison, i64 -1, ptr poison)
LLVM ERROR: Broken function
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: opt ../../../reduced.ll -O0
 #0 0x00007f1d89645c0e __interceptor_backtrace.part.0 /build/gcc-11-XeT9lY/gcc-11-11.4.0/build/x86_64-linux-gnu/libsanitizer/asan/../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4193:28
 #1 0x0000556d94d254f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22
 #2 0x0000556d94d19a2f llvm::sys::RunSignalHandlers() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Signals.cpp:105:20
 #3 0x0000556d94d1aa42 SignalHandler(int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:371:36
 #4 0x00007f1d88e42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007f1d88e969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #6 0x00007f1d88e969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #7 0x00007f1d88e969fc pthread_kill ./nptl/pthread_kill.c:89:10
 #8 0x00007f1d88e42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #9 0x00007f1d88e287f3 abort ./stdlib/abort.c:81:7
 #10 0x0000556d8944be01 std::vector<llvm::json::Value, std::allocator<llvm::json::Value>>::size() const /usr/include/c++/11/bits/stl_vector.h:919:40
 #11 0x0000556d8944be01 bool std::operator==<llvm::json::Value, std::allocator<llvm::json::Value>>(std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&, std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&) /usr/include/c++/11/bits/stl_vector.h:1893:23
 #12 0x0000556d8944be01 llvm::json::operator==(llvm::json::Array const&, llvm::json::Array const&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Support/JSON.h:572:69
 #13 0x0000556d8944be01 llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/JSON.cpp:204:28
 #14 0x0000556d949ed2bd llvm::report_fatal_error(char const*, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/ErrorHandling.cpp:82:70
 #15 0x0000556d8e37e876 llvm::SmallVectorBase<unsigned int>::size() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:91:32
 #16 0x0000556d8e37e876 llvm::SmallVectorTemplateCommon<llvm::DiagnosticInfoOptimizationBase::Argument, void>::end() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:282:41
 #17 0x0000556d8e37e876 llvm::SmallVector<llvm::DiagnosticInfoOptimizationBase::Argument, 4u>::~SmallVector() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1215:24
 #18 0x0000556d8e37e876 llvm::DiagnosticInfoOptimizationBase::~DiagnosticInfoOptimizationBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:413:7
 #19 0x0000556d8e37e876 llvm::DiagnosticInfoIROptimization::~DiagnosticInfoIROptimization() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:622:7
 #20 0x0000556d8e37e876 llvm::OptimizationRemark::~OptimizationRemark() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:689:7
 #21 0x0000556d8e37e876 operator() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2213:14
 #22 0x0000556d8e37e876 emit<llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::CGSCCAnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&)::<lambda()> > /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h:83:12
 #23 0x0000556d8e37e876 llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2212:13
 #24 0x0000556d8c36ecb1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::CoroSplitPass, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #25 0x0000556d91c1a84f llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:90:12
 #26 0x0000556d8c3690d1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #27 0x0000556d91c2162d llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:278:18
 #28 0x0000556d8c369035 llvm::detail::PassModel<llvm::Module, llvm::ModuleToPostOrderCGSCCPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #29 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20
 #30 0x0000556d8e30979e llvm::CoroConditionalWrapper::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroConditionalWrapper.cpp:19:74
 #31 0x0000556d8c365755 llvm::detail::PassModel<llvm::Module, llvm::CoroConditionalWrapper, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 #32 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20
 #33 0x0000556d89818556 llvm::SmallPtrSetImplBase::isSmall() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33
 #34 0x0000556d89818556 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:17
 #35 0x0000556d89818556 llvm::SmallPtrSetImpl<llvm::AnalysisKey*>::~SmallPtrSetImpl() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:321:7
 #36 0x0000556d89818556 llvm::SmallPtrSet<llvm::AnalysisKey*, 2u>::~SmallPtrSet() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:427:7
 #37 0x0000556d89818556 llvm::PreservedAnalyses::~PreservedAnalyses() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/Analysis.h:109:7
 #38 0x0000556d89818556 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/NewPMDriver.cpp:532:10
 #39 0x0000556d897e3939 optMain /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/optdriver.cpp:737:27
 #40 0x0000556d89455461 main /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/opt.cpp:25:33
 #41 0x00007f1d88e29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
 #42 0x00007f1d88e29e40 call_init ./csu/../csu/libc-start.c:128:20
 #43 0x00007f1d88e29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
 #44 0x0000556d897b6335 _start (/home/ubuntu/modular/.derived/third-party/llvm-project/build-relwithdebinfo-asan/bin/opt+0x150c335)
Aborted (core dumped)
aschwaighofer added a commit to aschwaighofer/llvm-project that referenced this pull request Apr 9, 2024
…ttailcc thunks

The call to the inlining utility does not update the call graph. Leading
to assertion failures when calling the call graph utility to update the
call graph.

Instead rely on an inline pass to run after coro splitting and use
alwaysinline annotations.

We can only do this if the calling convention used for the thunks is
`swifttailcc` otherwise we would break clients that use other
calling conventions.

Previous instance of this PR was llvm#80904.

github.com/swiftlang/swift/issues/68708
aschwaighofer added a commit to aschwaighofer/llvm-project that referenced this pull request Apr 16, 2024
…ttailcc thunks

The call to the inlining utility does not update the call graph. Leading
to assertion failures when calling the call graph utility to update the
call graph.

Instead rely on an inline pass to run after coro splitting and use
alwaysinline annotations.

We can only do this if the calling convention used for the thunks is
`swifttailcc` otherwise we would break clients that use other
calling conventions.

Previous instance of this PR was llvm#80904.

github.com/swiftlang/swift/issues/68708
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants