Incorrect RPATH for libjulia #26830

Closed
lucianolorenti opened this issue Apr 17, 2018 · 18 comments
Labels
building Build system, or building Julia or its dependencies

Comments

@lucianolorenti

ARPACK raises an exception when eigs is used. I found this issue reported to Arch Linux. I don't know what other information I can provide.

Example:

julia> eigs(spdiagm(1:15))
ERROR: Base.LinAlg.ARPACKException("unexpected behavior")
Stacktrace:
 [1] aupd_wrapper(::Type{T} where T, ::Base.LinAlg.#matvecA!#114{SparseMatrixCSC{Float64,Int64}}, ::Base.LinAlg.##108#115, ::Base.LinAlg.##109#116, ::Int64, ::Bool, ::Bool, ::String, ::Int64, ::Int32, ::String, ::Float64, ::Int64, ::Int64, ::Array{Float64,1}) at ./linalg/arpack.jl:63
 [2] #_eigs#107(::Int64, ::Int64, ::Symbol, ::Float64, ::Int64, ::Void, ::Array{Float64,1}, ::Bool, ::Base.LinAlg.#_eigs, ::SparseMatrixCSC{Float64,Int64}, ::UniformScaling{Int64}) at ./linalg/arnoldi.jl:285
 [3] _eigs(::SparseMatrixCSC{Float64,Int64}, ::UniformScaling{Int64}) at ./linalg/arnoldi.jl:175
 [4] #eigs#100 at ./linalg/arnoldi.jl:91 [inlined]
 [5] eigs(::SparseMatrixCSC{Float64,Int64}, ::UniformScaling{Int64}) at ./linalg/arnoldi.jl:91
 [6] #eigs#104(::Array{Any,1}, ::Function, ::SparseMatrixCSC{Int64,Int64}, ::UniformScaling{Int64}) at ./linalg/arnoldi.jl:98
 [7] #eigs#99 at ./linalg/arnoldi.jl:90 [inlined]
 [8] eigs(::SparseMatrixCSC{Int64,Int64}) at ./linalg/arnoldi.jl:90
 [9] macro expansion at /home/luciano/.julia/v0.6/Revise/src/Revise.jl:775 [inlined]
 [10] (::Revise.##17#18{Base.REPL.REPLBackend})() at ./event.jl:73
julia> versioninfo()
Julia Version 0.6.2
Commit d386e40 (2017-12-13 18:08 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i3-6100 CPU @ 3.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libm
  LLVM: libLLVM-3.9.1 (ORCJIT, skylake)
pacman -Q arpack julia
arpack 3.5.0-3
julia 2:0.6.2-5
@andreasnoack
Member

I cannot reproduce this on Mac or Linux with the official binaries so please report this to the Arch package maintainer.

@ViralBShah
Member

Can you try the official linux binaries on Julialang.org?

@lucianolorenti
Author

lucianolorenti commented Apr 17, 2018

It works with the official binaries. I reported it in the Arch Linux issue tracker. It was working a few days ago, so something must have changed.

Sorry for bothering you.

@eli-schwartz
Contributor

readelf -d /usr/lib/libjulia.so|grep RPATH
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN:$ORIGIN/julia]

Not sure how this is supposed to work, ever. It's prioritizing system libs over the ones installed to its private libdir.
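To make the ordering concern concrete, here is a minimal Python sketch that pulls the RPATH entries out of a `readelf -d` line like the one above and lists the entries that are searched before the private libdir. The helper names `parse_rpath` and `shadowing_entries` are hypothetical, and treating `$ORIGIN/julia` as the private libdir is an assumption taken from this thread, not a general rule.

```python
import re

def parse_rpath(readelf_output):
    """Return RPATH/RUNPATH entries from `readelf -d` output, in search order."""
    match = re.search(
        r"\((?:RPATH|RUNPATH)\)\s+Library r(?:un)?path: \[(.*?)\]",
        readelf_output,
    )
    return match.group(1).split(":") if match else []

def shadowing_entries(entries, private="$ORIGIN/julia"):
    """Entries searched before the private libdir; libraries found there win."""
    if private not in entries:
        return list(entries)
    return entries[: entries.index(private)]

# The line reported by readelf in this comment.
line = " 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN:$ORIGIN/julia]"
entries = parse_rpath(line)
print(entries)                     # ['$ORIGIN', '$ORIGIN/julia']
print(shadowing_entries(entries))  # ['$ORIGIN']
```

With libjulia.so installed in /usr/lib, `$ORIGIN` expands to /usr/lib, so any library of the same name there is found before the bundled copy in /usr/lib/julia.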

[root@nspawn ~]# julia -e 'print(eigs(spdiagm(1:15)))'
ERROR: Base.LinAlg.ARPACKException("unexpected behavior")
Stacktrace:
 [1] aupd_wrapper(::Type{T} where T, ::Base.LinAlg.#matvecA!#114{SparseMatrixCSC{Float64,Int64}}, ::Base.LinAlg.##108#115, ::Base.LinAlg.##109#116, ::Int64, ::Bool, ::Bool, ::String, ::Int64, ::Int32, ::String, ::Float64, ::Int64, ::Int64, ::Array{Float64,1}) at ./linalg/arpack.jl:63
 [2] #_eigs#107(::Int64, ::Int64, ::Symbol, ::Float64, ::Int64, ::Void, ::Array{Float64,1}, ::Bool, ::Base.LinAlg.#_eigs, ::SparseMatrixCSC{Float64,Int64}, ::UniformScaling{Int64}) at ./linalg/arnoldi.jl:285
 [3] _eigs(::SparseMatrixCSC{Float64,Int64}, ::UniformScaling{Int64}) at ./linalg/arnoldi.jl:175
 [4] #eigs#100 at ./linalg/arnoldi.jl:91 [inlined]
 [5] eigs(::SparseMatrixCSC{Float64,Int64}, ::UniformScaling{Int64}) at ./linalg/arnoldi.jl:91
 [6] #eigs#104(::Array{Any,1}, ::Function, ::SparseMatrixCSC{Int64,Int64}, ::UniformScaling{Int64}) at ./linalg/arnoldi.jl:98
 [7] #eigs#99 at ./linalg/arnoldi.jl:90 [inlined]
 [8] eigs(::SparseMatrixCSC{Int64,Int64}) at ./linalg/arnoldi.jl:90
[root@nspawn ~]# chrpath -r '$ORIGIN/julia' /usr/lib/libjulia.so.0.6.2
/usr/lib/libjulia.so.0.6.2: RPATH=$ORIGIN:$ORIGIN/julia
/usr/lib/libjulia.so.0.6.2: new RPATH: $ORIGIN/julia
[root@nspawn ~]# julia -e 'print(eigs(spdiagm(1:15)))'
([15.0, 14.0, 13.0, 12.0, 11.0, 10.0], [-4.71845e-16 0.0 -1.11022e-16 5.55112e-17 -5.55112e-17 -5.55112e-17; -2.92735e-17 2.61835e-17 -1.80194e-16 5.3874e-16 -2.72135e-17 -1.12757e-16; 3.33067e-16 4.44089e-16 4.44089e-16 -2.77556e-17 -1.11022e-16 -5.55112e-17; -1.78677e-16 2.22045e-16 -4.44089e-16 -6.93889e-18 -3.33067e-16 0.0; -1.38778e-17 1.21431e-16 -5.55112e-17 2.15106e-16 3.1225e-16 -1.11022e-16; 1.11022e-16 -2.15106e-16 -2.77556e-17 -5.55112e-17 2.77556e-16 -8.32667e-17; -3.33067e-16 -1.11022e-16 3.88578e-16 -1.66533e-16 -3.05311e-16 5.55112e-17; 1.52656e-16 1.11022e-16 1.11022e-16 -1.11022e-16 6.07153e-17 2.63678e-16; -2.08167e-16 -8.32667e-17 -9.36751e-17 2.22045e-16 2.74086e-16 6.66134e-16; -2.01885e-17 -1.8462e-17 -5.44493e-17 2.2129e-16 3.71588e-16 1.0; -3.80426e-16 -2.15176e-18 6.55683e-16 2.97829e-16 -1.0 2.22045e-16; -5.98552e-16 -4.77987e-16 4.12563e-16 1.0 2.77556e-16 -1.66533e-16; 3.51431e-16 -8.06966e-17 -1.0 5.27356e-16 -6.66134e-16 -5.55112e-17; -2.92016e-15 -1.0 1.66533e-16 -6.66134e-16 1.11022e-16 -5.55112e-17; 1.0 -2.69229e-15 4.44089e-16 4.996e-16 -3.33067e-16 8.32667e-17], 6, 1, 15, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]) 

@andreasnoack
Member

cc: @nalimilan @staticfloat

@staticfloat
Member

I don’t know who maintains the Arch Linux package, but it sounds like they are changing the RPATH of libjulia.so when they shouldn’t.

@eli-schwartz
Contributor

You can see the build script here: https://git.archlinux.org/svntogit/community.git/tree/trunk/PKGBUILD?h=packages/julia

I have no idea why @xyproto is modifying /usr/bin/julia to use an rpath of /usr/lib, but libjulia.so itself does not appear to be getting touched by us.

@xyproto

xyproto commented Apr 18, 2018

@staticfloat the RPATH of libjulia.so was not modified.

@eli-schwartz This issue still appears when not modifying the rpath of /usr/bin/julia with patchelf.

Antonio Rojas commented the following in the Arch bug tracker:

The RPATH of libjulia.so is wrong

> chrpath /usr/lib/libjulia.so
/usr/lib/libjulia.so: RPATH=$ORIGIN:$ORIGIN/julia

so julia 0.6.2-5 is still using system arpack if you have it installed. Workaround:

chrpath -r '/usr/lib/julia/' /lib/libjulia.so.0.6.2

I'm currently testing this chrpath workaround in the PKGBUILD for Julia.

Suggestions are warmly welcome if this is the wrong way to make julia find the arpack library that comes with Julia (as configured with USE_SYSTEM_ARPACK=0).

@andreasnoack
Member

@staticfloat Shouldn't the order here be reversed?

julia/Make.inc

Line 949 in ca7e837

RPATH_LIB := -Wl,-rpath,'$$ORIGIN' -Wl,-rpath,'$$ORIGIN/julia' -Wl,-z,origin

I can reproduce the issue if I symlink a system libarpack into $libdir and the issue goes away if I reverse the order.
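The effect of the entry order can be sketched with a simplified model of the lookup. This is not the real dynamic loader: it only walks the RPATH entries in order and expands `$ORIGIN`, ignoring everything else a loader consults. The `resolve` helper, the temporary directory layout, and the file name `libarpack.so.2` are all illustrative assumptions.

```python
import os
import tempfile

def resolve(libname, rpath, origin):
    """Return the first match for libname along rpath, expanding $ORIGIN."""
    for entry in rpath.split(":"):
        candidate = os.path.join(entry.replace("$ORIGIN", origin), libname)
        if os.path.exists(candidate):
            return candidate
    return None

with tempfile.TemporaryDirectory() as root:
    # Stand-ins for /usr/lib (system copy) and /usr/lib/julia (bundled copy).
    libdir = os.path.join(root, "usr", "lib")
    private = os.path.join(libdir, "julia")
    os.makedirs(private)
    for directory in (libdir, private):
        open(os.path.join(directory, "libarpack.so.2"), "w").close()

    # Order currently produced by RPATH_LIB, then the reversed order.
    current = resolve("libarpack.so.2", "$ORIGIN:$ORIGIN/julia", libdir)
    swapped = resolve("libarpack.so.2", "$ORIGIN/julia:$ORIGIN", libdir)

    print(os.path.dirname(current) == libdir)   # True: system copy wins
    print(os.path.dirname(swapped) == private)  # True: bundled copy wins
```

Under this model, reversing the two entries is enough for the bundled library to take precedence whenever both copies exist.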

andreasnoack reopened this Apr 18, 2018
andreasnoack changed the title from ARPACKException("unexpected behavior") to Incorrect RPATH for libjulia Apr 18, 2018
andreasnoack added the building (Build system, or building Julia or its dependencies) label Apr 18, 2018
@antonio-rojas
Contributor

FYI, with respect to the original arpack issue (not sure why this was changed to discuss a completely different issue), since I suppose you will have to eventually fix it when you update your internal arpack copy: it is caused by this change: opencollab/arpack-ng@d04bdf1

@andreasnoack
Member

The issue here is that we end up linking the wrong version of a library. Why do you think the reason is that commit? It looks like it was merged fairly recently while version 3.5.0 of arpack-ng is from last year. I suspect that the reason for the error when linking to the system library is that it is built for 32-bit integers while Julia usually builds arpack for 64-bit integers.

@nalimilan
Member

In the Arch issue, it is mentioned that the original bug comes from including that patch in the Arch ARPACK package. The rpath issue only appears when trying to use a bundled ARPACK to avoid that problem.

@eli-schwartz
Contributor

eli-schwartz commented Apr 18, 2018

> issue here is that we end up linking the wrong version of a library.

Well, this is technically wrong. The initial problem was that Arch Linux used system arpack, and our julia build was intended to link the version it did in fact link.

When the reporter of this issue reported it to us, we ended up switching to julia's vendored arpack, but because of some weird rpath we couldn't even work around this anyway.

...

This does not change the fact that the initial issue is julia failing to work when the system arpack is used. This may not be a high priority, given Julia's preference for controlling vendored dependencies, but a system arpack update breaking everything will eventually be something which affects you anyway so it's better to at least be aware of the issue (and hopefully fix it to give redistributors the option of debundling)...

@eli-schwartz
Contributor

Well, in all fairness we are using the unreleased git master due to https://bugs.archlinux.org/task/58123

So our package is pinned to opencollab/arpack-ng@edce634
It is quite understandable that julia does not work with the development version of arpack, but on the plus side if you can fix it now you'll have no problems updating once it is released!

@staticfloat
Member

> This does not change the fact that the initial issue is julia failing to work when the system arpack is used. This may not be a high priority, given Julia's preference for controlling vendored dependencies, but a system arpack update breaking everything will eventually be something which affects you anyway so it's better to at least be aware of the issue (and hopefully fix it to give redistributors the option of debundling)...

We can't; the ARPACK we build is fundamentally incompatible with other packages, because we made the design decision to use ARPACK in ILP64 mode, which means that the datatype used as indices into matrices are 64-bit integers; many BLAS and LAPACK style libraries have an "ILP64" mode that you can put them into, and when you do so client applications that attempt to use them will not function correctly because their indices will be misinterpreted. Unfortunately for the scientific world, 32-bit indices have become the standard practice, causing no end of problems for applications that wish to work with matrices larger than about 2.1 billion elements (2^31 - 1) along a single axis. We have come up with a few clever ways to get around these problems (such as renaming the ILP64 symbols in OpenBLAS to have 64_ on the end, to avoid breaking other libraries that get linked into Julia's address space and expect ILP32 symbols to be available), but for Julia itself, we require an ILP64 BLAS and ARPACK by default. You can turn that off, but you will be unable to perform basic linear algebra upon matrices with axes that are larger than 2^31 - 1. See DISTRIBUTING.md for a little more information on this and the build options for Julia to select the interface you want.
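As a rough illustration of what "indices will be misinterpreted" means, the Python sketch below packs an integer array with one width and reads the same bytes back with the other. It assumes a little-endian layout (as on x86_64) and uses arbitrary example values; it does not call ARPACK or any BLAS library.

```python
import struct

# Three 64-bit integers, as an ILP64 caller would lay out an integer array.
ilp64_buffer = struct.pack("<3q", 1, 300, 6)

# A library built with 32-bit integers reads the same bytes as six values.
as_ilp32 = struct.unpack("<6i", ilp64_buffer)
print(as_ilp32)  # (1, 0, 300, 0, 6, 0)

# In the other direction, four 32-bit values written by such a library
# are read back by the ILP64 caller as two very different 64-bit values.
ilp32_buffer = struct.pack("<4i", 1, 2, 3, 4)
as_ilp64 = struct.unpack("<2q", ilp32_buffer)
print(as_ilp64)  # (8589934593, 17179869187)
```

Neither side sees an error at the call boundary; the values are simply wrong, which fits a failure that only surfaces later as an unexpected status from the library.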

@staticfloat
Member

> @staticfloat Shouldn't the order here be reversed?

Yes, I agree.

@ViralBShah
Member

Yes, even if users use our provided libraries - other packages on the system will bring in the same dependencies. So, at the very least, we should make sure that Julia uses our own bundled libraries correctly, picking the right one even when alternate versions are system installed.

@nalimilan
Member

> We can't; the ARPACK we build is fundamentally incompatible with other packages, because we made the design decision to use ARPACK in ILP64 mode, which means that the datatype used as indices into matrices are 64-bit integers; many BLAS and LAPACK style libraries have an "ILP64" mode that you can put them into, and when you do so client applications that attempt to use them will not function correctly because their indices will be misinterpreted.

Yet that doesn't mean distributions cannot use system libraries: they just need to provide ILP64 variants with suffixed symbols. I have worked upstream (opencollab/arpack-ng#30) to ensure that it's possible to build such libraries, and it's now included in Fedora (though Julia doesn't use it yet because I experienced crashes).

Also it's not the end of the world if distributions keep using 32-bit indices for some time. Julia is the only scientific app to use ILP64 in distributions currently, so obviously people don't need it very often.

andreasnoack added a commit that referenced this issue May 4, 2018
ararslan pushed a commit that referenced this issue May 7, 2018
ararslan pushed a commit that referenced this issue May 8, 2018
ararslan pushed a commit that referenced this issue May 27, 2018