Introduce at-dispose to replace do-block constructors. #309
I noticed, once again, that our use of do-block constructors (which we use for scoped resource clean-up) resulted in bad code, presumably because of the closures involved (JuliaLang/julia#15276). GPUCompiler already had a bunch of explicit calls to LLVM.dispose to avoid the closure creation in hot code, but that doesn't scale and looks bad. So I explored two possibilities:

I started out with 1., and a WIP branch can be found here: https://github.com/maleadt/LLVM.jl/tree/tb/finalizers. Although initial results were promising, I ditched the effort after realizing two major issues:
LLVM.Module object can also be created by calling parent(::BasicBlock). That would then require handle-based refcounting, or an object factory like CUDA.jl's unique Context constructors (which suffer from the same problem, as you can look up the current context using API calls), both of which are messy.

parent, as instructions can be deleted from their parent BasicBlock without their memory being reclaimed. In that scenario, we'd end up with a Function object that doesn't root its parent Module, potentially resulting in early frees.

Given this complexity, I decided to go with 2. and leave the responsibility of tying object lifetimes to their respective owners with the user. This has also been working fine, in part because we dispose early, so bugs are discovered relatively early, whereas bugs in a finalizer-based approach may lurk for a long time.
The result is a @dispose macro which replaces the do-block syntax. The effect is pretty dramatic: on the LLVM.jl test-suite, perf reports that the total executed instruction count drops by 20%!

The effect on GPUCompiler is also impressive:
So a 10% improvement on fairly realistic use of LLVM.jl.
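
To illustrate the difference between the two styles, here is a minimal, self-contained sketch. It does not use LLVM.jl and is not the implementation from this PR: the Resource type, its dispose function, and the @scoped macro are hypothetical stand-ins, assuming the macro simply expands to an assignment followed by try/finally. The point is only that the do-block form passes a closure to the constructor, while the macro form inlines the body and so creates no closure.

```julia
# Hypothetical stand-in for an object that needs explicit clean-up.
mutable struct Resource
    name::String
    disposed::Bool
end
Resource(name) = Resource(name, false)

# Hypothetical clean-up function, playing the role of an explicit dispose call.
dispose(r::Resource) = (r.disposed = true; nothing)

# Do-block constructor: the body arrives as a closure `f`.
function Resource(f::Function, name::String)
    r = Resource(name)
    try
        f(r)
    finally
        dispose(r)
    end
end

# Macro alternative (hypothetical name): expands to an assignment followed by
# try/finally in the caller's scope, so the body is not wrapped in a closure.
macro scoped(ex, body)
    Meta.isexpr(ex, :(=)) || error("expected `name=constructor(...)`")
    var = esc(ex.args[1])
    quote
        $(esc(ex))
        try
            $(esc(body))
        finally
            dispose($var)
        end
    end
end

# Closure-based style.
function with_do_block()
    Resource("a") do r
        length(r.name)
    end
end

# Macro-based style; returns the resource so its state can be inspected.
function with_macro()
    @scoped r=Resource("bb") begin
        @assert !r.disposed   # still alive inside the body
    end
    return r
end
```

In both functions the resource is disposed when the body finishes, including when it throws; only the macro form avoids allocating and calling a closure.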