Migrate codegen to operate on orc::ThreadSafeModule #44440

pchintalapudi · 2022-03-04T02:45:02Z

orc::ThreadSafeModule and orc::ThreadSafeContext use reference-counted contexts to allow multiple contexts to be created and destroyed safely during codegen. Moving TSModule to be our API boundary gets us closer to parallelizing codegen without leaking contexts.
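To illustrate the reference-counting idea described above, here is a minimal toy model. It deliberately does not use LLVM headers: `ToyContext`, `ToyTSContext`, and `ToyTSModule` are hypothetical stand-ins, and the method names only loosely echo the ORC API. It is a sketch of the ownership pattern under those assumptions, not the code in this PR.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Toy stand-in for an LLVM context; NOT the real llvm::LLVMContext.
struct ToyContext {
    std::mutex lock;  // guards everything that lives in this context
};

// Toy stand-in for a thread-safe context: a shared, reference-counted handle.
class ToyTSContext {
public:
    ToyTSContext() : state_(std::make_shared<ToyContext>()) {}

    // Hold the context lock for the lifetime of the returned guard.
    std::unique_lock<std::mutex> getLock() const {
        return std::unique_lock<std::mutex>(state_->lock);
    }

    // Number of handles currently keeping the context alive.
    long useCount() const { return state_.use_count(); }

private:
    std::shared_ptr<ToyContext> state_;
};

// Toy stand-in for a thread-safe module: module data plus a context handle.
class ToyTSModule {
public:
    ToyTSModule(std::string name, ToyTSContext ctx)
        : name_(std::move(name)), ctx_(std::move(ctx)) {}

    // Run f on the module data while holding the context lock.
    template <typename F>
    auto withModuleDo(F &&f) {
        auto guard = ctx_.getLock();
        return f(name_);
    }

private:
    std::string name_;
    ToyTSContext ctx_;  // keeps the context alive as long as the module exists
};
```

In this model the context is freed only when the last module or handle referring to it goes away, which is the property that lets modules cross an API boundary without leaking or prematurely destroying their context.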

Depends on ~~#43770 to remove the global debug information with associated global context~~ #44454 to refactor a few globals.

pchintalapudi · 2022-03-04T21:28:07Z

There are a few indentation TODOs left in this PR. I've deliberately left them out for now, as they cause large whitespace diffs that confuse the diff viewer.

pchintalapudi · 2022-03-29T22:10:28Z

Per an offline discussion with Valentin, we will change the interface of jl_create_native to accept a TSModule to codegen into rather than creating its own TSModule with a provided TSContext. This allows users of jl_create_native to specify their own data layout and targets for the destination module.
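The interface change can be sketched with toy types. Everything here is hypothetical: `ToyModule`, `createNativeOwningModule`, and `createNativeInto` are illustrative names, not the real `jl_create_native` signature, and the layout and triple strings are placeholders.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an LLVM module; NOT the real llvm::Module.
struct ToyModule {
    std::string dataLayout;
    std::string targetTriple;
    std::vector<std::string> functions;
};

// Hypothetical "before" shape: codegen builds its own module with fixed
// defaults, so the caller has no say in the data layout or target.
ToyModule createNativeOwningModule(const std::vector<std::string> &methods) {
    ToyModule m{"host-default-layout", "host-default-triple", {}};
    for (const auto &name : methods)
        m.functions.push_back(name);
    return m;
}

// Hypothetical "after" shape: the caller supplies the destination module,
// and codegen emits into it without touching its layout or target.
void createNativeInto(ToyModule &dest, const std::vector<std::string> &methods) {
    for (const auto &name : methods)
        dest.functions.push_back(name);
}
```

With the second shape, a caller that needs a non-host target configures the module first and then hands it to codegen, rather than patching the module after the fact.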

src/aotcompile.cpp

vchuravy · 2022-03-30T14:18:39Z

src/aotcompile.cpp

@@ -132,16 +132,6 @@ GlobalValue* jl_get_llvm_function_impl(void *native_code, uint32_t idx)
        return NULL;
 }

-extern "C" JL_DLLEXPORT
-LLVMContext* jl_get_llvm_context_impl(void *native_code)


src/aotcompile.cpp

aviatesk · 2022-05-14T03:21:52Z

@gbaraldi bisected that this commit introduced the error reported at #45302.

pchintalapudi · 2022-05-14T06:29:10Z

That's interesting. I wouldn't be surprised if there's a hidden bug given the size of the changeset, but the changes to jl_merge_module aren't actually meaningful.

pchintalapudi · 2022-05-14T06:55:42Z

Looks like at least one error in jl_merge_module during the LoopVectorization tests is:

Attempting to replace ; Function Attrs: alwaysinline
declare linkonce_odr i32 @"julia_readraw_turbo!_14765u14814"(i32, i32) #4

with global function  ; Function Attrs: alwaysinline
define linkonce_odr i32 @"julia_readraw_turbo!_14765u14814"(i32 %0, i32 %1, i32 %2) #0 {
top:
  %res = call i32 @llvm.x86.bmi.bzhi.32(i32 %0, i32 %1)
  ret i32 %res
}

pchintalapudi · 2022-05-14T07:55:50Z

And another one:

Attempting to replace ; Function Attrs: alwaysinline
declare linkonce_odr i32 @"julia_rshift_i_avx!_47472u47499"(i32, i32) #3

with global function  ; Function Attrs: alwaysinline
define linkonce_odr i32 @"julia_rshift_i_avx!_47472u47499"(i32 %0, i32 %1, i32 %2) #0 {
top:
  %res = call i32 @llvm.x86.bmi.bzhi.32(i32 %0, i32 %1)
  ret i32 %res
}

Looks like both failures involve the x86.bmi.bzhi intrinsic, with a mismatch between function declaration and definition: the declaration takes two i32 arguments while the definition takes three.
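The kind of mismatch seen in the two error messages can be modelled with a small check. This is a toy sketch with hypothetical names (`ToyFunction`, `canReplaceDeclaration`); it is not how jl_merge_module is implemented, only an illustration of why a two-argument declaration cannot be replaced by a three-argument definition.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a function in a module; NOT the real llvm::Function.
struct ToyFunction {
    std::string name;
    std::vector<std::string> paramTypes;
    bool isDeclaration;  // true: no body; false: has a body
};

// Two functions agree when both the name and the parameter list match.
bool signaturesMatch(const ToyFunction &a, const ToyFunction &b) {
    return a.name == b.name && a.paramTypes == b.paramTypes;
}

// A declaration may be replaced by a definition only if the signatures agree.
bool canReplaceDeclaration(const ToyFunction &decl, const ToyFunction &def) {
    return decl.isDeclaration && !def.isDeclaration && signaturesMatch(decl, def);
}
```

Applied to the errors above, the declaration would carry the parameter list `{i32, i32}` and the definition `{i32, i32, i32}`, so the check rejects the replacement.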

pchintalapudi · 2022-05-14T08:14:50Z

Reduced to

using LoopVectorization
img1 = Matrix{UInt8}(undef, 10, 10)
raw = rand(UInt8, (30 * 10) ÷ 4)
function readraw_turbo!(img, raw)
    npack = length(raw) ÷ 3
    @turbo for i = 0:npack-1
        img[1+4i] = raw[2+3i] << 4
        img[2+4i] = raw[1+3i]
        img[3+4i] = raw[2+3i]
        img[4+4i] = raw[3+3i]
    end
    img
end
readraw_turbo!(img1, raw)

pchintalapudi · 2022-05-14T08:18:38Z

And again to

using LoopVectorization
img1 = Matrix{UInt8}(undef, 10, 10)
raw = rand(UInt8, (30 * 10) ÷ 4)
function readraw_turbo!(img, raw)
    npack = length(raw) ÷ 3
    @turbo for i = 0:npack-1
        img[1+4i] = raw[2+3i] << 4
    end
    img
end
readraw_turbo!(img1, raw)

pchintalapudi · 2022-05-14T14:05:57Z

I am now slightly suspicious of this line: https://github.com/JuliaSIMD/VectorizationBase.jl/blob/9ae47c5c12c7bec9b57c132c24f67c4f14090c9e/src/llvm_intrin/masks.jl#L336

llvmcall_expr(decl, instrs, T, :(Tuple{$T,$T}), typ, [typ, typ, typ], [:a, :b])

I'm not sure, but I'm guessing that the list should only have 2 typ entries, not 3, since only two arguments (:a and :b) are passed.

pchintalapudi marked this pull request as draft March 4, 2022 02:45

pchintalapudi added the compiler:codegen Generation of LLVM IR and native code label Mar 4, 2022

pchintalapudi force-pushed the pc/tsm branch 2 times, most recently from b88d995 to 87366a5 Compare March 4, 2022 21:25

pchintalapudi force-pushed the pc/tsm branch from 87366a5 to 6cb6726 Compare March 4, 2022 21:42

pchintalapudi mentioned this pull request Mar 12, 2022

WIP: Add a compile-on-demand layer #44575

Closed

pchintalapudi force-pushed the pc/tsm branch 2 times, most recently from b86ac64 to 5383db9 Compare March 13, 2022 04:34

pchintalapudi mentioned this pull request Mar 13, 2022

Remove the imaging_mode global and thread it through codegen_params #44600

Merged

pchintalapudi marked this pull request as ready for review March 29, 2022 21:20

pchintalapudi requested review from vtjnash, vchuravy and maleadt March 29, 2022 21:21

Move to TSModule, round 2

c054a3b

pchintalapudi force-pushed the pc/tsm branch from 5383db9 to c054a3b Compare March 29, 2022 22:13

Pass in modules to codegen

b4a0e75

vchuravy reviewed Mar 30, 2022

pchintalapudi added 3 commits March 30, 2022 10:31

Rename jl_create_datalayout

97a00ba

Get unlocked modules once

9f83ce6

Pass along module data layout and target more frequently

e2bd1a5

pchintalapudi force-pushed the pc/tsm branch from 51e208c to e2bd1a5 Compare March 30, 2022 15:13

pchintalapudi added 2 commits March 30, 2022 11:38

Remove jl_get_ee_context

8180118

Add note about context locks

8329dca

vtjnash approved these changes Mar 31, 2022

vtjnash added the status:merge me PR is reviewed. Merge when all tests are passing label Mar 31, 2022

giordano merged commit 4422a1d into JuliaLang:master Apr 1, 2022

giordano removed the status:merge me PR is reviewed. Merge when all tests are passing label Apr 1, 2022

This was referenced Jan 2, 2023

1.9 compatibility JuliaGPU/CUDA.jl#1710

Closed

Change in llvmcall module merging breaks GPU codegen relying on globals #48093

Closed

pchintalapudi mentioned this pull request Jan 3, 2023

Switch back to LLVM's IR linker #48106

Merged
