Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split up the one big codegen lock into per-function locks and dependency edge tracking #56179

Merged
merged 6 commits into from
Oct 19, 2024

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Oct 15, 2024

Disjoint content can be LLVM optimized in parallel now, since codegen no longer has any ability to handle recursion, and compilation should even be able to run in parallel with the GC also. Individual commits have no particular separate meaning here and should be squashed.

@vtjnash vtjnash added the compiler:codegen Generation of LLVM IR and native code label Oct 15, 2024
}
if (ci == NULL || (jl_value_t*)ci == jl_nothing || ci->rettype != rettype || !jl_egal(sigtype, mi->specTypes)) { // TODO: correctly handle the ABI conversion if rettype != ci->rettype
JL_GC_POP();
return std::make_pair((Function*)NULL, (Function*)NULL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, what does it mean when codegen fails to produce a result like this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codegen is always allowed to decline to generate optimized code, and the runtime must find some other way to execute it (usually in the interpreter)

}
}
JL_GC_POP();
abort(); // this code path is unsound, unsafe, and probably bad
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough

Time to open that inference change soon, or this will regress our --trim support

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the hope. I already started some related changes here with how it tracks the content needing to go into the workqueue

proto.decl->setName(preal_decl);
}
}
if (proto.oc) { // additionally, if we are dealing with an oc, then we might also need to fix up the fptr1 reference too
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my personal edification, and to get a sense of some of the affected interface here, could you explain what the invalid part looked like here:

The invalid way of doing this became much harder to express, which
exposes a lot of bugs (hits more assertion errors and causes more crashes).

in terms of code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ABI is expressed as the tuple (mi.specTypes => codeinst.rettype), but the old code tried to use the wrong value for rettype by reimplementing a buggier version of the workqueue here. With the workqueue gone, that was no longer feasible.

// method body. See #34993
if ((policy != CompilationPolicy::Default || params.params->trim) &&
jl_atomic_load_relaxed(&codeinst->inferred) == jl_nothing) {
// XXX: SOURCE_MODE_FORCE_SOURCE is wrong here (neither sufficient nor necessary)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any hint w.r.t. to how this will be tightened up, or what the consequences are?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know yet, but I think it will involve moving more of this function into inference

@vtjnash vtjnash force-pushed the jn/codegen-unlock branch 2 times, most recently from 846b10f to c93fc22 Compare October 16, 2024 18:55
@vtjnash vtjnash requested a review from vchuravy October 16, 2024 21:14
Disjoint content can be compiled in parallel now, and compilation can
run in parallel with the GC.

Adds a C++ shim for concurrent gc support in conjunction with
using a `std::unique_lock` to DRY code.
Since we use the ForwardingMemoryManger instead of making a new
RTDyldMemoryManager object every time, we need to reference count the
finalizeMemory calls so that we only call that at the end of relocating
everything when everything is ready. We already happen to conveniently
have a shared_ptr here, so just use that instead of inventing a
duplicate counter.
The invalid way of doing this became much harder to express, which
exposes a lot of bugs (hits more assertion errors and causes more crashes).

Mostly fixes #55035, since this bug is just that much harder to express
in the more constrained API.
@vtjnash
Copy link
Member Author

vtjnash commented Oct 18, 2024

Absent any more comments, I will merge later today so that we can get started on experimenting with the next pieces after it.

@vtjnash vtjnash merged commit cd99cfc into master Oct 19, 2024
5 of 7 checks passed
@vtjnash vtjnash deleted the jn/codegen-unlock branch October 19, 2024 02:04
KristofferC pushed a commit that referenced this pull request Oct 21, 2024
…ncy edge tracking (#56179)

Disjoint content can be LLVM optimized in parallel now, since codegen
no longer has any ability to handle recursion, and compilation should
even be able to run in parallel with the GC also. Removes any remaining
global state, since that is unsafe.

Adds a C++ shim for concurrent gc support in conjunction with
using a `std::unique_lock` to DRY code.

Fix RuntimeDyld implementation:
Since we use the ForwardingMemoryManger instead of making a new
RTDyldMemoryManager object every time, we need to reference count the
finalizeMemory calls so that we only call that at the end of relocating
everything when everything is ready. We already happen to conveniently
have a shared_ptr here, so just use that instead of inventing a
duplicate counter.

Fixes many OC bugs, including mostly fixing #55035, since this bug is
just that much harder to express in the more constrained API.
topolarity added a commit to topolarity/julia that referenced this pull request Oct 21, 2024
This call resolution code was deleted in JuliaLang#56179 (rightfully so -
this code really should never have been here in the first place) but it
should be a no-op not an abort, until we implement this in inference.
topolarity added a commit that referenced this pull request Oct 21, 2024
This call resolution code was deleted in #56179 (rightfully so), but it
should be a no-op until we implement this in inference.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants