Split up the one big codegen lock into per-function locks and dependency edge tracking #56179

vtjnash · 2024-10-15T17:12:32Z

Disjoint content can now be LLVM-optimized in parallel, since codegen no longer has any ability to handle recursion, and compilation should even be able to run in parallel with the GC. Individual commits have no particular separate meaning here and should be squashed.
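For readers unfamiliar with the idea, here is a minimal sketch of what replacing one global lock with per-function locks can look like. It is not the code from this PR: `PerFunctionLocks`, `FakeCompiler`, and `compile_in_parallel` are hypothetical names, and "compiling" is reduced to bumping a counter.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical registry handing out one mutex per function name.
class PerFunctionLocks {
    std::mutex table_lock; // guards only the map itself and is held briefly
    std::unordered_map<std::string, std::shared_ptr<std::mutex>> locks;

public:
    std::shared_ptr<std::mutex> get(const std::string &name) {
        std::lock_guard<std::mutex> guard(table_lock);
        auto &slot = locks[name];
        if (!slot)
            slot = std::make_shared<std::mutex>();
        return slot;
    }
};

// Stand-in for a compiler: "compiling" just bumps a per-function counter.
struct FakeCompiler {
    PerFunctionLocks locks;
    // Keys are inserted before any thread starts, so the map's structure
    // is never modified concurrently; only the mapped values change.
    std::unordered_map<std::string, int> compiled;

    void compile(const std::string &name) {
        std::shared_ptr<std::mutex> m = locks.get(name);
        std::unique_lock<std::mutex> lock(*m); // blocks only work on the same function
        ++compiled.at(name);
    }
};

// Runs eight workers, four per function, and returns the total amount of work done.
int compile_in_parallel() {
    FakeCompiler c;
    c.compiled = {{"f", 0}, {"g", 0}};
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) {
        workers.emplace_back([&c] { for (int k = 0; k < 1000; ++k) c.compile("f"); });
        workers.emplace_back([&c] { for (int k = 0; k < 1000; ++k) c.compile("g"); });
    }
    for (std::thread &t : workers)
        t.join();
    assert(c.compiled.at("f") == c.compiled.at("g"));
    return c.compiled.at("f") + c.compiled.at("g");
}
```

Workers compiling `f` never wait on workers compiling `g`; the only shared lock is the short-lived one protecting the lock table.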

topolarity · 2024-10-15T19:38:58Z

src/codegen.cpp

+    }
+    if (ci == NULL || (jl_value_t*)ci == jl_nothing || ci->rettype != rettype || !jl_egal(sigtype, mi->specTypes)) { // TODO: correctly handle the ABI conversion if rettype != ci->rettype
+        JL_GC_POP();
+        return std::make_pair((Function*)NULL, (Function*)NULL);


Out of curiosity, what does it mean when codegen fails to produce a result like this?

Codegen is always allowed to decline to generate optimized code, in which case the runtime must find some other way to execute it (usually in the interpreter).
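A rough sketch of that contract, under the assumption that "declining" is signalled by an empty result (much like the pair of NULL pointers in the hunk above). `try_codegen`, `interpret`, and `execute` are hypothetical names, not Julia runtime functions, and types are plain strings for brevity.

```cpp
#include <cassert>
#include <functional>
#include <string>

using NativeFn = std::function<int(int)>;

// Hypothetical codegen entry point. An empty std::function means
// "codegen declined", e.g. because the return types do not agree.
NativeFn try_codegen(const std::string &expected_rettype,
                     const std::string &inferred_rettype) {
    if (expected_rettype != inferred_rettype)
        return NativeFn(); // decline rather than emit code with the wrong ABI
    return NativeFn([](int x) { return 2 * x; });
}

// Slow path with the same semantics as the compiled code.
int interpret(int x) { return x + x; }

struct Result {
    int value;
    bool used_native;
};

// The caller has to cope with either outcome.
Result execute(const std::string &expected, const std::string &inferred, int arg) {
    assert(!expected.empty()); // the caller must state which ABI it wants
    if (NativeFn fn = try_codegen(expected, inferred))
        return Result{fn(arg), true};
    return Result{interpret(arg), false};
}
```

The answer is the same on both paths; only the speed differs, which is why declining is always a legal outcome.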

topolarity · 2024-10-16T01:27:56Z

src/codegen.cpp

-            }
-        }
-        JL_GC_POP();
+        abort(); // this code path is unsound, unsafe, and probably bad


Fair enough

Time to open that inference change soon, or this will regress our --trim support

That is the hope. I already started some related changes here with how it tracks the content needing to go into the workqueue

topolarity · 2024-10-16T01:38:01Z

src/aotcompile.cpp

+                proto.decl->setName(preal_decl);
+            }
+        }
+        if (proto.oc) { // additionally, if we are dealing with an oc, then we might also need to fix up the fptr1 reference too


Just for my personal edification, and to get a sense of some of the affected interface here, could you explain what the invalid part looked like here:

> The invalid way of doing this became much harder to express, which exposes a lot of bugs (hits more assertion errors and causes more crashes).

in terms of code?

The ABI is expressed as the tuple (mi.specTypes => codeinst.rettype), but the old code tried to use the wrong value for rettype by reimplementing a buggier version of the workqueue here. With the workqueue gone, that was no longer feasible.
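To illustrate the point about the ABI being the whole tuple, here is a small sketch in which a symbol can only be found by its full `(specTypes, rettype)` key. Everything in it is hypothetical (`Abi`, `SymbolTable`, `resolve_call`, the symbol name `julia_f_1`), and types are plain strings purely for brevity.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical ABI key: (argument signature, return type).
using Abi = std::pair<std::string, std::string>;

class SymbolTable {
    std::map<Abi, std::string> symbols;

public:
    void define(const Abi &abi, const std::string &symbol) {
        bool inserted = symbols.emplace(abi, symbol).second;
        assert(inserted && "each ABI should map to exactly one symbol");
        (void)inserted; // silences the unused warning when NDEBUG is set
    }

    // Returns the empty string when no entry matches the full ABI.
    std::string lookup(const Abi &abi) const {
        auto it = symbols.find(abi);
        return it == symbols.end() ? std::string() : it->second;
    }
};

// Looks up a callee by the ABI the caller expects.
std::string resolve_call(const std::string &spec_types, const std::string &rettype) {
    SymbolTable table;
    table.define({"Tuple{typeof(f), Int}", "Int"}, "julia_f_1");
    return table.lookup({spec_types, rettype});
}
```

Because the return type is part of the key, asking with the wrong `rettype` finds nothing instead of silently handing back a function with a mismatched calling convention.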

topolarity · 2024-10-16T01:38:51Z

src/aotcompile.cpp

+                // method body. See #34993
+                if ((policy != CompilationPolicy::Default || params.params->trim) &&
+                    jl_atomic_load_relaxed(&codeinst->inferred) == jl_nothing) {
+                    // XXX: SOURCE_MODE_FORCE_SOURCE is wrong here (neither sufficient nor necessary)


Any hint as to how this will be tightened up, or what the consequences are?

I don't know yet, but I think it will involve moving more of this function into inference

Disjoint content can be compiled in parallel now, and compilation can run in parallel with the GC. Adds a C++ shim for concurrent gc support in conjunction with using a `std::unique_lock` to DRY code.
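One way such a shim could be shaped, as a sketch only: wrap the mutex in a type that satisfies BasicLockable, so `std::unique_lock` can drive it, and flip the thread into a GC-safe state while it waits. `GcState`, `FakeThreadState`, `current_thread`, and `GcSafeMutex` are hypothetical stand-ins and not the names used in the Julia sources.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

enum class GcState { Unsafe, Safe };

// Hypothetical per-thread runtime state. The history vector exists only
// so that the transitions can be inspected afterwards.
struct FakeThreadState {
    GcState state = GcState::Unsafe;
    std::vector<GcState> history;
    void set(GcState s) {
        state = s;
        history.push_back(s);
    }
};
thread_local FakeThreadState current_thread;

// BasicLockable shim: lock() and unlock() are all std::unique_lock needs.
class GcSafeMutex {
    std::mutex inner;

public:
    void lock() {
        assert(current_thread.state == GcState::Unsafe);
        current_thread.set(GcState::Safe);   // a collector may run while we wait
        inner.lock();
        current_thread.set(GcState::Unsafe); // back to touching managed data
    }
    void unlock() { inner.unlock(); }
};

// Callers use the familiar RAII guard instead of hand-written enter/leave pairs.
int guarded_increment(GcSafeMutex &m, int &counter) {
    std::unique_lock<GcSafeMutex> guard(m);
    return ++counter;
}
```

The state toggling lives in one place, and every call site gets it through the standard guard type, which is the DRY benefit the commit message refers to.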

Since we use the ForwardingMemoryManager instead of making a new RTDyldMemoryManager object every time, we need to reference count the finalizeMemory calls so that it is only called once, at the end, after everything has been relocated and is ready. We already happen to conveniently have a shared_ptr here, so just use that instead of inventing a duplicate counter.
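The reference-counting idea can be sketched as follows. This is an illustration under simplifying assumptions, not the actual jitlayers code: `SharedManager`, `Forwarder`, and `finalize_calls_for` are hypothetical, and the shared object's destructor stands in for the real finalization step.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for the shared memory manager. Its destructor plays the role
// of the one-time finalization that must happen after all relocation.
struct SharedManager {
    int &finalize_calls;
    explicit SharedManager(int &calls) : finalize_calls(calls) {}
    ~SharedManager() { ++finalize_calls; }
};

// Each object being linked holds one forwarder onto the shared manager.
class Forwarder {
    std::shared_ptr<SharedManager> mgr;

public:
    explicit Forwarder(std::shared_ptr<SharedManager> m) : mgr(std::move(m)) {}
    // Dropping our reference is all a single forwarder does; the real
    // work runs only when the shared_ptr's count reaches zero.
    void finalize() { mgr.reset(); }
};

// Returns how many times finalization ran for n_objects forwarders.
int finalize_calls_for(std::size_t n_objects) {
    assert(n_objects >= 1);
    int calls = 0;
    auto shared = std::make_shared<SharedManager>(calls);
    std::vector<Forwarder> forwarders;
    for (std::size_t i = 0; i < n_objects; ++i)
        forwarders.emplace_back(shared);
    shared.reset(); // only the forwarders hold references now
    for (std::size_t i = 0; i + 1 < n_objects; ++i) {
        forwarders[i].finalize();
        assert(calls == 0); // still waiting on the remaining objects
    }
    forwarders.back().finalize(); // last reference dropped, finalization runs
    return calls;
}
```

The `shared_ptr` control block already is the counter, so no second one has to be kept in sync with it.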

The invalid way of doing this became much harder to express, which exposes a lot of bugs (hits more assertion errors and causes more crashes). Mostly fixes #55035, since this bug is just that much harder to express in the more constrained API.

vtjnash · 2024-10-18T15:17:16Z

Absent any more comments, I will merge later today so that we can get started on experimenting with the next pieces after it.

…ncy edge tracking (#56179)

Disjoint content can be LLVM optimized in parallel now, since codegen no longer has any ability to handle recursion, and compilation should even be able to run in parallel with the GC also. Removes any remaining global state, since that is unsafe. Adds a C++ shim for concurrent gc support in conjunction with using a `std::unique_lock` to DRY code.

Fix RuntimeDyld implementation: Since we use the ForwardingMemoryManager instead of making a new RTDyldMemoryManager object every time, we need to reference count the finalizeMemory calls so that we only call that at the end of relocating everything when everything is ready. We already happen to conveniently have a shared_ptr here, so just use that instead of inventing a duplicate counter.

Fixes many OC bugs, including mostly fixing #55035, since this bug is just that much harder to express in the more constrained API.

This call resolution code was deleted in JuliaLang#56179 (rightfully so - this code really should never have been here in the first place) but it should be a no-op not an abort, until we implement this in inference.

vtjnash added the compiler:codegen Generation of LLVM IR and native code label Oct 15, 2024

topolarity reviewed Oct 15, 2024

topolarity reviewed Oct 16, 2024

vtjnash force-pushed the jn/codegen-unlock branch 2 times, most recently from 846b10f to c93fc22 Compare October 16, 2024 18:55

vtjnash requested a review from vchuravy October 16, 2024 21:14

vtjnash added 6 commits October 18, 2024 15:07

Split up the one big codegen lock into per-function locks

ddd8439

Disjoint content can be compiled in parallel now, and compilation can run in parallel with the GC. Adds a C++ shim for concurrent gc support in conjunction with using a `std::unique_lock` to DRY code.

fix a lot of OC bugs

8612b9d

The invalid way of doing this became much harder to express, which exposes a lot of bugs (hits more assertion errors and causes more crashes). Mostly fixes #55035, since this bug is just that much harder to express in the more constrained API.

fix analyzer issues

d4956ef

fix ASAN issues

557f16f

cleanup and review

9885eaf

vtjnash force-pushed the jn/codegen-unlock branch from c93fc22 to 9885eaf Compare October 18, 2024 15:16

vtjnash merged commit cd99cfc into master Oct 19, 2024
5 of 7 checks passed

vtjnash deleted the jn/codegen-unlock branch October 19, 2024 02:04

topolarity mentioned this pull request Oct 21, 2024

trimming: don't abort where we used to resolve dynamic calls #56271

Merged

topolarity added a commit that referenced this pull request Oct 21, 2024

trimming: don't abort where we used to resolve dynamic calls (#56271)

1ba035d

This call resolution code was deleted in #56179 (rightfully so), but it should be a no-op until we implement this in inference.

giordano mentioned this pull request Nov 27, 2024

Symbol lookup error: undefined symbol: jl_fptr_sparam #56701

Closed

maleadt mentioned this pull request Dec 12, 2024

ORC/JLJIT: Segfault on master JuliaLLVM/LLVM.jl#496

Closed

maleadt mentioned this pull request Jan 7, 2025

Updates for Julia 1.12 JuliaGPU/GPUCompiler.jl#656

Merged
