Actually setup jit targets when compiling packageimages instead of targeting only one #54471

Merged
merged 9 commits into from
Jul 11, 2024

Conversation

gbaraldi
Member

@gbaraldi gbaraldi commented May 14, 2024

I haven't put together a test for this yet, but this works.
Fixes #54464

@KristofferC KristofferC added backport 1.11 Change should be backported to release-1.11 backport 1.10 Change should be backported to the 1.10 release labels May 14, 2024
@@ -1084,16 +1082,14 @@ jl_image_t jl_init_processor_pkgimg(void *hdl)
{
if (jit_targets.empty())
jl_error("JIT targets not initialized");
if (jit_targets.size() > 1)
Member

I think we need to test that we still pick the right target, and the same one for all images.

I have an unsubstantiated fear about loading images with mismatched ABI. Previously this code said to pick one and then ensured that everyone is consistent.

Member Author

So what guarantees the right one is sysimg_init_cb. We unconditionally look for the first one.

@vchuravy
Member

vchuravy commented May 14, 2024

Let me check that my understanding is correct. The issue is that during sysimg creation we don't yet have a native cache, so we don't fill the JIT targets from it. Instead, we parse the -C string.

But for pkgimages we already have an image loaded, so ensure_jit_targets exits early and the multiversioning infrastructure doesn't see the rest?

Maybe we should split parsing for output and choosing for loading/jit?
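
For readers unfamiliar with the format under discussion: the string passed via `-C` or `JULIA_CPU_TARGET` is a `;`-separated list of targets, each a CPU name followed by `,`-separated options (feature toggles such as `-rdrnd`, or flags such as `clone_all` and `base(1)`). The following is a minimal sketch of splitting such a string, assuming only that grammar. `TargetSpec`, `split`, and `parse_cpu_targets` are hypothetical names for illustration and are not the functions in `src/processor.cpp`.

```cpp
#include <string>
#include <vector>

// Hypothetical type for illustration; not Julia's internal representation.
struct TargetSpec {
    std::string cpu;                   // e.g. "haswell"
    std::vector<std::string> options;  // e.g. "-rdrnd", "clone_all", "base(1)"
};

// Split `s` on `sep`, dropping empty pieces.
static std::vector<std::string> split(const std::string &s, char sep)
{
    std::vector<std::string> out;
    std::string::size_type start = 0;
    while (start <= s.size()) {
        std::string::size_type end = s.find(sep, start);
        if (end == std::string::npos)
            end = s.size();
        if (end > start)
            out.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    return out;
}

// Parse "cpu[,opt...];cpu[,opt...]" into one TargetSpec per ';'-separated entry.
// Options are kept as raw strings; this sketch does not validate them.
std::vector<TargetSpec> parse_cpu_targets(const std::string &spec)
{
    std::vector<TargetSpec> targets;
    for (const std::string &entry : split(spec, ';')) {
        std::vector<std::string> fields = split(entry, ',');
        if (fields.empty())
            continue;
        TargetSpec t;
        t.cpu = fields[0];
        t.options.assign(fields.begin() + 1, fields.end());
        targets.push_back(t);
    }
    return targets;
}
```

Keeping the parsing step separate from the step that chooses a target for loading, as suggested above, would let the output path see every entry in the list even when an image is already loaded.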

@gbaraldi
Member Author

So the code in ensure_jit_targets only seems to actually do anything when building an image, so maybe we should move it around?

@giordano
Contributor

giordano commented May 14, 2024

With the script at #54464 (comment) I get

% JULIA_CPU_TARGET='sandybridge,-xsaveopt;generic,clone_all;haswell,-rdrnd,base(1);x86-64-v4,-rdrnd,base(1)' julia-master example_targets.jl 
ERROR: LoadError: Base.PkgId(Base.UUID("7876af07-990d-54b4-ab0e-23690620f79a"), "Example") has not yet been precompiled for julia 1.12.0-DEV.531
Stacktrace:
 [1] error(::Base.PkgId, ::String, ::VersionNumber)
   @ Base ./error.jl:53
 [2] top-level scope
   @ /tmp/example_targets.jl:13
 [3] include
   @ ./Base.jl:559 [inlined]
 [4] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:325
 [5] _start()
   @ Base ./client.jl:533
in expression starting at /tmp/example_targets.jl:13

The .julia/compiled/v1.12/Example directory is empty. Without JULIA_CPU_TARGET:

% julia-master example_targets.jl                  
targets = Base.ImageTarget[haswell; flags=0; features_en=(sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, fsgsbase, bmi, avx2, bmi2, sahf, lzcnt), haswell; flags=0; features_en=(sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, fsgsbase, bmi, avx2, bmi2, sahf, lzcnt)]

Note that the target haswell is repeated twice.

@giordano giordano added the needs tests Unit tests are required for this change label May 14, 2024
@gbaraldi
Member Author

This isn't working :(

@giordano giordano added compiler:precompilation Precompilation of modules pkgimage labels May 15, 2024
@giordano
Contributor

giordano commented May 15, 2024

Small progress! Without JULIA_CPU_TARGET:

% julia-master example_targets.jl                     
targets = Base.ImageTarget[haswell; flags=0; features_en=(sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, fsgsbase, bmi, avx2, bmi2, sahf, lzcnt)]

Now there's a single target, rather than two. But whenever JULIA_CPU_TARGET is set:

% JULIA_CPU_TARGET='generic' julia-master example_targets.jl              
ERROR: LoadError: Base.PkgId(Base.UUID("7876af07-990d-54b4-ab0e-23690620f79a"), "Example") has not yet been precompiled for julia 1.12.0-DEV.531
Stacktrace:
 [1] error(::Base.PkgId, ::String, ::VersionNumber)
   @ Base ./error.jl:53
 [2] top-level scope
   @ /tmp/example_targets.jl:13
 [3] include
   @ ./Base.jl:559 [inlined]
 [4] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:325
 [5] _start()
   @ Base ./client.jl:533
in expression starting at /tmp/example_targets.jl:13

@gbaraldi gbaraldi removed the needs tests Unit tests are required for this change label May 16, 2024
@IanButterworth
Member

Recording this here so it isn't lost on Slack.

There's a ~27% increase in pkgimage size with this PR.

master (via juliaup)

julia> using PkgCacheInspector 

julia> info_cachefile("Pkg")
Contents of /Users/ian/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/Pkg/tUTdb_myMUk.dylib:
  modules: Any[Pkg.MiniProgressBars, Pkg.GitTools, Pkg.PlatformEngines, Pkg.Versions, Pkg.Registry, Pkg.Resolve, Pkg.Types.FuzzySorting, Pkg.Types, Pkg.BinaryPlatforms, Pkg.Artifacts, Pkg.Operations, Pkg.API, Pkg.REPLMode, Pkg]
  init order: Any[Pkg]
  529 external methods
  16850 new specializations of external methods (Base 79.9%, Base.Broadcast 7.6%, Base.Sort 3.7%, ...)
  1585 external methods with new roots
  33049 external targets
  24615 edges
  file size:   47139456 (44.956 MiB)
  Segment sizes (bytes):
    system:      19914892 ( 47.10%)
    isbits:      19778248 ( 46.78%)
    symbols:        80497 (  0.19%)
    tags:          336949 (  0.80%)
    relocations:  2082887 (  4.93%)
    gvars:          41200 (  0.10%)
    fptrs:          44632 (  0.11%)
  Image targets: 
    generic; flags=0; features_en=()

This PR, built locally on macOS with the same targets as Buildkite:

JULIA_CPU_TARGET=generic;cortex-a57;thunderx2t99;carmel,clone_all;apple-m1,base(3);neoverse-512tvb,base(3)
julia> info_cachefile("Pkg")
Contents of /Users/ian/Documents/GitHub/julia/usr/share/julia/compiled/v1.12/Pkg/tUTdb_p5Nph.dylib:
  modules: Any[Pkg.MiniProgressBars, Pkg.GitTools, Pkg.PlatformEngines, Pkg.Versions, Pkg.Registry, Pkg.Resolve, Pkg.Types.FuzzySorting, Pkg.Types, Pkg.BinaryPlatforms, Pkg.Artifacts, Pkg.Operations, Pkg.API, Pkg.REPLMode, Pkg]
  init order: Any[Pkg]
  529 external methods
  16819 new specializations of external methods (Base 79.9%, Base.Broadcast 7.6%, Base.Sort 3.7%, ...)
  1581 external methods with new roots
  32968 external targets
  24583 edges
  file size:   59591872 (56.831 MiB)
  Segment sizes (bytes):
    system:      19918588 ( 47.19%)
    isbits:      19712700 ( 46.70%)
    symbols:        76445 (  0.18%)
    tags:          336483 (  0.80%)
    relocations:  2080035 (  4.93%)
    gvars:          41216 (  0.10%)
    fptrs:          44520 (  0.11%)
  Image targets: 
    generic; flags=0; features_en=()
    cortex-a57; flags=0; features_en=(crc)
    thunderx2t99; flags=0; features_en=(aes, sha2, crc, lse, rdm, v8_1a)
    carmel; flags=0; features_en=(aes, sha2, crc, lse, fullfp16, rdm, ccpp, v8_1a, v8_2a)
    apple-m1; flags=0; features_en=(aes, sha2, crc, lse, fullfp16, rdm, jsconv, complxnum, rcpc, ccpp, sha3, dotprod, fp16fml, dit, rcpc-immo, flagm, sb, ccdp, altnzcv, fptoint, v8_1a, v8_2a, v8_3a, v8_4a, v8_5a)
    neoverse-512tvb; flags=32; features_en=()

@StefanKarpinski
Member

That's not bad for having efficient code for multiple platforms! This is definitely something we'll want for serving pkgimgs.

@gbaraldi
Member Author

There is a Windows issue somehow :\

@IanButterworth
Member

If there's no clear reason for the Windows issue, is it worth disabling this on Windows and merging it to get it tested?

@gbaraldi
Member Author

I'm not super comfortable merging it, because that Windows issue doesn't seem to be simple.

KristofferC added a commit that referenced this pull request May 28, 2024
Backported PRs:
- [x] #53665 <!-- use afoldl instead of tail recursion for tuples -->
- [x] #53976 <!-- LinearAlgebra: LazyString in interpolated error
messages -->
- [x] #54005 <!-- make `view(::Memory, ::Colon)` produce a Vector -->
- [x] #54010 <!-- Overload `Base.literal_pow` for `AbstractQ` -->
- [x] #54069 <!-- Allow PrecompileTools to see MI's inferred by foreign
abstract interpreters -->
- [x] #53750 <!-- inference correctness: fields and globals can revert
to undef -->
- [x] #53984 <!-- Profile: fix heap snapshot is valid char check -->
- [x] #54102 <!-- Explicitly compute stride in unaliascopy for SubArray
-->
- [x] #54070 <!-- Fix integer overflow in `skip(s::IOBuffer,
typemax(Int64))` -->
- [x] #54013 <!-- Support case-changes to Annotated{String,Char}s -->
- [x] #53941 <!-- Fix writing of AnnotatedChars to AnnotatedIOBuffer -->
- [x] #54137 <!-- Fix typo in docs for `partialsortperm` -->
- [x] #54129 <!-- use correct size when creating output data from an
IOBuffer -->
- [x] #54153 <!-- Fixup IdSet docstring -->
- [x] #54143 <!-- Fix `make install` from tarballs -->
- [x] #54151 <!-- LinearAlgebra: Correct zero element in
`_generic_matvecmul!` for block adj/trans -->
- [x] #54213 <!-- Add `public` statement to `Base.GC` -->
- [x] #54222 <!-- Utilize correct tbaa when emitting stores of unions.
-->
- [x] #54233 <!-- set MAX_OS_WRITE on unix -->
- [x] #54255 <!-- fix `_checked_mul_dims` in the presence of 0s and
overflow. -->
- [x] #54259 <!-- Fix typo in `readuntil` -->
- [x] #54251 <!-- fix typo in gc_mark_memory8 when chunking a large
array -->
- [x] #54276 <!-- Fix solve for complex `Hermitian` with non-vanishing
imaginary part on diagonal -->
- [x] #54248 <!-- ensure package callbacks are invoked when no valid
precompile file exists for an "auto loaded" stdlib -->
- [x] #54308 <!-- Implement eval-able AnnotatedString 2-arg show -->
- [x] #54302 <!-- Specialised substring equality for annotated strs -->
- [x] #54243 <!-- prevent `package_callbacks` to run multiple time for a
single package -->
- [x] #54350 <!-- add a precompile signature to Artifacts code that is
used by JLLs -->
- [x] #54331 <!-- correctly track freed bytes in
jl_genericmemory_to_string -->
- [x] #53509 <!-- revert moving "creating packages" from Pkg.jl -->
- [x] #54335 <!-- When accessing the data pointer for an array, first
decay it to a Derived Pointer -->
- [x] #54239 <!-- Make sure `fieldcount` constant-folds for `Tuple{...}`
-->
- [x] #54288
- [x] #54067
- [x] #53715 <!-- Add read/write specialisation for IOContext{AnnIO} -->
- [x] #54289 <!-- Rework annotation ordering/optimisations -->
- [x] #53815 <!-- create phantom task for GC threads -->
- [x] #54130 <!-- inference: handle `LimitedAccuracy` in
`handle_global_assignment!` -->
- [x] #54428 <!-- Move ConsoleLogging.jl into Base -->
- [x] #54332 <!-- Revert "add unsetindex support to more copyto methods
(#51760)" -->
- [x] #53826 <!-- Make all command-line options documented in all
related files -->
- [x] #54465 <!-- typeintersect: conservative typevar subtitution during
`finish_unionall` -->
- [x] #54514 <!-- typeintersect: followup cleanup for the nothrow path
of type instantiation -->
- [x] #54499 <!-- make `@doc x` work without REPL loaded -->
- [x] #54210 <!-- attach finalizer in `mmap` to the correct object -->
- [x] #54359 <!-- Pkg REPL: cache `pkg_mode` lookup -->

Non-merged PRs with backport label:
- [ ] #54471 <!-- Actually setup jit targets when compiling
packageimages instead of targeting only one -->
- [ ] #54457 <!-- Make `String(::Memory)` copy -->
- [ ] #54323 <!-- inference: fix too conservative effects for recursive
cycles -->
- [ ] #54322 <!-- effects: add new `@consistent_overlay` macro -->
- [ ] #54191 <!-- make `AbstractPipe` public -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #53882 <!-- Warn about cycles in extension precompilation -->
- [ ] #53707 <!-- Make ScopedValue public -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #53286 <!-- Raise an error when using `include_dependency` with
non-existent file or directory -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC mentioned this pull request May 29, 2024
60 tasks
KristofferC added a commit that referenced this pull request May 30, 2024
Backported PRs:
- [x] #54010 <!-- Overload `Base.literal_pow` for `AbstractQ` -->
- [x] #54143 <!-- Fix `make install` from tarballs -->
- [x] #54151 <!-- LinearAlgebra: Correct zero element in
`_generic_matvecmul!` for block adj/trans -->
- [x] #54233 <!-- set MAX_OS_WRITE on unix -->
- [x] #54251 <!-- fix typo in gc_mark_memory8 when chunking a large
array -->
- [x] #54363 <!-- typeintersect: fix another stack overflow caused by
circular constraints -->
- [x] #54497 <!-- Make TestLogger thread-safe (introduce a lock) -->
- [x] #53796 <!-- Add a missing doc -->
- [x] #54465 <!-- typeintersect: conservative typevar subtitution during
`finish_unionall` -->
- [x] #54514 <!-- typeintersect: followup cleanup for the nothrow path
of type instantiation -->

Need manual backport:
- [ ] #52505 <!-- fix alignment of emit_unbox_store copy -->
- [ ] #53373 <!-- fix sysimage-native-code=no option with pkgimages -->
- [ ] #53815 <!-- create phantom task for GC threads -->
- [ ] #53984 <!-- Profile: fix heap snapshot is valid char check -->
- [ ] #54276 <!-- Fix solve for complex `Hermitian` with non-vanishing
imaginary part on diagonal -->

Contains multiple commits, manual intervention needed:
- [ ] #52854 <!-- Change to streaming out the heap snapshot data -->
- [ ] #53218 <!-- Fix interpreter_exec.jl test -->
- [ ] #53833 <!-- Profile: make heap snapshots viewable in vscode viewer
-->
- [ ] #54303 <!-- LinearAlgebra: improve type-inference in
Symmetric/Hermitian matmul -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->

Non-merged PRs with backport label:
- [ ] #54471 <!-- Actually setup jit targets when compiling
packageimages instead of targeting only one -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC removed backport 1.10 Change should be backported to the 1.10 release backport 1.11 Change should be backported to release-1.11 labels Jun 7, 2024
@giordano
Contributor

What's the status of this PR? I had a look a few days ago. My understanding is that a local native build of Julia which doesn't set JULIA_CPU_TARGET during the compilation of the sysimage can't generate pkgimages for different (or smaller than "native" for the host?) ISAs, while setting JULIA_CPU_TARGET during sysimage building allows compiling pkgimages for targets compatible with those included in the sysimage. Is this an accurate description? If so, I feel like this should be better documented, because it isn't very obvious.

@gbaraldi
Member Author

Yes, basically pkgimages can only be more specific than the base system image, so if the base sysimage is native, then the pkgimages are also native.
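
One way to read "more specific" is in terms of enabled feature sets: a pkgimage target is acceptable if it enables at least every feature of some sysimage target. The sketch below illustrates that reading only. It is an assumption about the rule, not the check Julia performs, and `FeatureSet`, `is_at_least_as_specific`, and `pkgimage_target_allowed` are hypothetical names.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: the names of the features a target enables.
using FeatureSet = std::set<std::string>;

// True when `candidate` enables every feature in `base` (it may enable more).
// std::includes needs sorted ranges, which std::set iteration provides.
bool is_at_least_as_specific(const FeatureSet &candidate, const FeatureSet &base)
{
    return std::includes(candidate.begin(), candidate.end(),
                         base.begin(), base.end());
}

// True when the pkgimage target refines at least one of the sysimage targets.
bool pkgimage_target_allowed(const FeatureSet &pkg_target,
                             const std::vector<FeatureSet> &sysimg_targets)
{
    return std::any_of(sysimg_targets.begin(), sysimg_targets.end(),
                       [&](const FeatureSet &base) {
                           return is_at_least_as_specific(pkg_target, base);
                       });
}
```

With the feature lists printed earlier in this thread, `thunderx2t99` (aes, sha2, crc, lse, rdm, v8_1a) would refine `cortex-a57` (crc) under this reading, but not the other way around. A sysimage built only for a feature-rich native target would therefore rule out a plainer pkgimage target.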

@vchuravy vchuravy added the merge me PR is reviewed. Merge when all tests are passing label Jul 10, 2024
@IanButterworth IanButterworth merged commit ad407a6 into master Jul 11, 2024
7 checks passed
@IanButterworth IanButterworth deleted the gb/multiversioning-pkimg branch July 11, 2024 08:06
@Octogonapus
Contributor

Great to have this improved but I agree with Mose that this is not clear:

basically pkgimages can only be more specific than the base system image

Is an error returned if the user tries to make a package image for a totally different target than the base image? I don't fully understand this code but it seems that there is no check for that case. Would it be feasible to implement such a check? Otherwise this feels like a footgun. I could also update the docs here and here but I don't think that would be enough.

@giordano
Contributor

@KristofferC why did you remove the backport-1.11 label without a comment? There's only a merge conflict in the tests, but it's trivial to fix it. I'm happy to push it to #56228 if that's ok for you.

@KristofferC
Member

Sorry, I should have written something. It was discussed that this was too big of a thing to backport as late as it was in the release cycle. Now that it has been on master for a while, perhaps that can be reconsidered.

@giordano
Contributor

Since backporting this PR would fix #56177, I'm inclined to do it.

IanButterworth pushed a commit to IanButterworth/julia that referenced this pull request Oct 18, 2024
maleadt pushed a commit that referenced this pull request Oct 21, 2024
…rgeting only one (#54471)

Co-authored-by: Gabriel Baraldi <[email protected]>
Co-authored-by: Dilum Aluthge <[email protected]>
@maleadt maleadt mentioned this pull request Oct 21, 2024
43 tasks
KristofferC pushed a commit that referenced this pull request Oct 24, 2024
…rgeting only one (#54471)

Co-authored-by: Gabriel Baraldi <[email protected]>
Co-authored-by: Dilum Aluthge <[email protected]>
KristofferC added a commit that referenced this pull request Nov 21, 2024
Backported PRs:
- [x] #55886 <!-- irrationals: restrict assume effects annotations to
known types -->
- [x] #55867 <!-- update `hash` doc string: `widen` not required any
more -->
- [x] #56084 <!-- slightly improve inference in precompilation code -->
- [x] #56088 <!-- make `Base.ANSIIterator` have a concrete field -->
- [x] #54093 <!-- Fix `JULIA_CPU_TARGET` being propagated to workers
precompiling stdlib pkgimages -->
- [x] #56165 <!-- Fix markdown list in installation.md -->
- [x] #56148 <!-- Make loading work when stdlib deps are missing in the
manifest -->
- [x] #56174 <!-- Fix implicit `convert(String, ...)` in several places
-->
- [x] #56159 <!-- Add invalidation barriers for `displaysize` and
`implicit_typeinfo` -->
- [x] #56089 <!-- Call `MulAddMul` instead of multiplication in
_generic_matmatmul! -->
- [x] #56195 <!-- Include default user depot when JULIA_DEPOT_PATH has
leading empty entry -->
- [x] #56215 <!-- [REPL] fix lock ordering mistake in load_pkg -->
- [x] #56251 <!-- REPL: run repl hint generation for modeswitch chars
when not switching -->
- [x] #56092 <!-- stream: fix reading LibuvStream into array -->
- [x] #55870 <!-- fix infinite recursion in `promote_type` for
`Irrational` -->
- [x] #56227 <!-- Do not call `rand` during sysimage precompilation -->
- [x] #55741 <!-- Change annotations to use a NamedTuple -->
- [x] #56149 <!-- Specialize adding/subtracting mixed
Upper/LowerTriangular -->
- [x] #56214 <!-- fix precompile process flags -->
- [x] #54471
- [x] #55622
- [x] #55704
- [x] #55764
Labels
compiler:precompilation Precompilation of modules pkgimage
Successfully merging this pull request may close these issues.

Stdlib pkgimages seems to be compiled with generic target
8 participants