Interfacing with other packages #4

Open
pshashk opened this issue Sep 2, 2017 · 23 comments

Comments

@pshashk

pshashk commented Sep 2, 2017

Very nice package. Thank you for fixing this significant drawback of the R language. In my opinion, these two features would make float matrices even more useful:

  1. A wrapper for the RcppArmadillo Mat<float> class. This would make float matrices suitable for in-place manipulation from the Rcpp side (e.g., SGD matrix decomposition).

  2. The ability to convert a float matrix to its NumPy equivalent, to communicate with Python libraries via reticulate (e.g., to reduce memory requirements when preprocessing data for the TensorFlow or Keras deep learning frameworks).

@wrathematics
Owner

Thanks!

I've been thinking about RcppArmadillo, maybe creating some kind of ArmaFloat package. I may need Dirk's help to pull that off, and he's extremely busy, so no promises. But it's something I want to pursue.

I hadn't thought about interfacing with python, but that's a great idea. Actually, deep learning's push into ever lower precisions was one of my primary motivations for creating the package. I'm not very familiar with how reticulate works, but it should be possible to add some kind of shim. I'll look into it.

@dselivanov
Contributor

I have a project where I also use R's integer matrices to store floats. The actual number crunching is done in C++ with Armadillo. I will try to adapt my code to use the float package, so this could become an example of interfacing float with C++ code.
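
As a rough illustration of the approach described above (float values kept in 32-bit integer storage and processed in C++), here is a minimal sketch in plain C++. It is not the actual code of the float package or of the project mentioned: the helper names `pack_floats`, `unpack_floats`, and `scale_in_place` are hypothetical, and `std::vector<std::int32_t>` merely stands in for the data of an R integer matrix.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// The trick relies on float and a 32-bit int having the same size.
static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits");

// Copy float values into integer storage bit for bit (no numeric conversion).
// Hypothetical helper; the vector stands in for an R integer matrix's data.
std::vector<std::int32_t> pack_floats(const std::vector<float>& values) {
    std::vector<std::int32_t> storage(values.size());
    if (!values.empty())
        std::memcpy(storage.data(), values.data(), values.size() * sizeof(float));
    return storage;
}

// Recover the floats from integer storage, again bit for bit.
std::vector<float> unpack_floats(const std::vector<std::int32_t>& storage) {
    std::vector<float> values(storage.size());
    if (!storage.empty())
        std::memcpy(values.data(), storage.data(), storage.size() * sizeof(float));
    return values;
}

// Hypothetical in-place update: scale every stored float by `factor`
// without allocating a second buffer.
void scale_in_place(std::vector<std::int32_t>& storage, float factor) {
    for (std::int32_t& slot : storage) {
        float x;
        std::memcpy(&x, &slot, sizeof x);  // read the bits as a float
        x *= factor;
        std::memcpy(&slot, &x, sizeof x);  // write the updated bits back
    }
}
```

The sketch uses `std::memcpy` rather than a pointer cast so that it stays within the C++ aliasing rules. Seen from the integer side the stored values look like arbitrary integers, so any arithmetic has to go through the float view.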

@wrathematics
Owner

Cool! Does RcppArmadillo ship with the float part of Armadillo? My guess was they didn't because of the lack of single precision blas/lapack.

@dselivanov
Contributor

Everything works out of the box. RcppArmadillo ships everything from Armadillo, and Armadillo itself optionally uses external BLAS/LAPACK (it contains its own reference implementation, I guess).
Here is a proof-of-concept branch of another package: https://github.com/dselivanov/reco/tree/float. It is more than 2x faster than double precision and uses half the RAM.
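
The memory half of that claim follows directly from the element sizes. A small sketch of the arithmetic, assuming dense storage and the usual 4-byte `float` and 8-byte `double` (the helper `dense_bytes` is hypothetical, and the speed figure is a measurement this sketch does not reproduce):

```cpp
#include <cstddef>

// Bytes needed to hold a dense rows x cols matrix with element type T.
// Hypothetical helper for illustration only.
template <typename T>
constexpr std::size_t dense_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(T);
}
```

For example, on a platform with those element sizes, `dense_bytes<float>(1000, 100)` is 400000 bytes, while `dense_bytes<double>(1000, 100)` is 800000 bytes.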

@wrathematics
Owner

wrathematics commented Dec 8, 2017

I think it's because you're using a high performance BLAS/LAPACK, so the symbol resolution is taking place from your $(LAPACK_LIBS) and $(BLAS_LIBS) lines in Makevars. When I try to build it with an R version linked with R's reference BLAS (which doesn't include the single precision functions), the build fails with undefined symbol: sgesvx_.

Obviously the best thing to do is link with a high performance BLAS implementation that already has all of the symbols. The CRAN-acceptable solution is to link with float/libs/float.so, for the case where float built the reference BLAS/LAPACK itself (the LinkingTo field does not do this, sadly). I'll make this easier to do this weekend.

@dselivanov
Contributor

Hm. Maybe this needs a more sophisticated configure script (which I don't know how to write yet). Won't it work after removing $(LAPACK_LIBS) and $(BLAS_LIBS) from Makevars? In that case I believe Armadillo should use its reference implementation.

@wrathematics
Owner

I think Armadillo's non-blas/lapack implementation is controlled by c/c++ preprocessor stuff, which would be painful to handle. But I'm exporting LAPACK (plus some other minor things) to a static library, so no autoconf necessary (thankfully). I have it in a local version that I'll push soon when I'm sure it's working, but basically the reco Makevars would look like:

SLAPACK_LIB = `${R_HOME}/bin/Rscript -e "float:::ldflags()"`

PKG_CXXFLAGS = $(SHLIB_OPENMP_CFLAGS) -DARMA_64BIT_WORD
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(SLAPACK_LIB)
CXX_STD = CXX11

I'll hopefully finish it up tomorrow.

@wrathematics
Owner

I've created float:::ldflags() which when used as in the above example Makevars file will set the caller package to link with float.so from the float package. Originally I was going to use a static library, but a shared library really makes more sense.

Basically, if float BLAS/LAPACK get compiled when building the float package (e.g. for binary CRAN distributions), then those symbols will be there and the linker can resolve the lookup for any package using, for example, armadillo. Otherwise (in the case where high performance BLAS/LAPACK are used) those symbols will never get built in the first place and the linking may even be unnecessary. The advantages are that it's the same process regardless of what BLAS/LAPACK libraries are used, and there are some extra things in float.so, like R_NaNf and NA_FLOAT (analogues of R_NaN and NA_REAL).

I plan to write this up more carefully in the package vignette soon. I have an example package that uses this here. I've also tested it with reco and it appears to be working in my setup that doesn't have high performance BLAS/LAPACK.

@dselivanov
Contributor

Thank you very much for the detailed investigation. I will try it and report back.

@dselivanov
Contributor

@wrathematics I've finally tried the Makevars you proposed. It works great with one minimal adjustment: -L needs to be added to the beginning of SLAPACK_LIB:

SLAPACK_LIB = "-L"`${R_HOME}/bin/Rscript -e "float:::ldflags()"`

PKG_CXXFLAGS = $(SHLIB_OPENMP_CFLAGS) -DARMA_64BIT_WORD
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(SLAPACK_LIB)
CXX_STD = CXX11

Tested on OS X and Ubuntu 16.04

@cdeterman

I managed to get the compiler to build a new package DLL on my Windows system, but when it tries to load the new package it crashes with the system error:

The program can't start because float.dll is missing from your computer. Try reinstalling the program to fix this problem.

Thoughts?

@cdeterman

I thought this could be resolved using the -Wl,-rpath but I guess rpath is not supported on Windows.

@jamespr615

I had exactly the same problem on Windows 7.

Sequence:
rsparse would not build.
I did a devtools install of float:

devtools::install_github("dselivanov/float")

It built, and the .dll files are present on the file system.

I could not build rsparse using the devtools GitHub install:

devtools::install_github("dselivanov/rsparse")

I downloaded the zip and tried install_local:

install_local("C:/Users/reefej/R/win-library/3.4/rsparse", force=TRUE)

No good. It choked on many things.

  • Changed the name of Makevars.in to Makevars.win
  • Updated Makevars.win to

SLAPACK_LIB = `${R_HOME}/bin/Rscript -e "float:::ldflags()"`

PKG_CXXFLAGS = -std=c++11 $(SHLIB_OPENMP_CFLAGS) -DARMA_64BIT_WORD -O3 -fopenmp -ffast-math -march=native -mavx
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(SLAPACK_LIB)
CXX_STD = CXX11

It built but could not find float.dll.

  • I dropped both the i386 and x64 versions of float.dll into a directory on the Windows path.
    • It found them, completed the build, and I could do

library(rsparse)
model = WRMF$new(rank = 8)

After the Makevars updates, I suspect the problem is that the float install and build does not register the .dlls with the R environment, so you have to 'show' R and Windows where to find them. Since the .dlls have the same name, I dropped the i386 and x64 versions into separate directories. I do not know the implications at run time, but Windows will find the first one in the search path. A Windows tool can tell you which one gets loaded. The build knew the difference, though. I will check how to 'register' the float dll directories with the R environment.

@wrathematics
Owner

wrathematics commented Jul 17, 2018

I get a compiler error when I try to install rsparse, but it's unrelated to float (but the windows environment that I have (limited) access to is really weird, so it could easily be a configuration problem on my end). I don't think there's any way around using a static lib for windows and possibly mac as well. I'm working on that as we speak.

@wrathematics
Owner

The latest rsparse doesn't link with float, so I think you're trying an older version. You might try the latest version.

I was able to build an older version (this one) on linux by linking against a static library generated in the float package. I get an ld error on Windows that I don't understand though. I've added the changes to a new branch.

I could use some windows expertise if you have a minute @snoweye.

@snoweye
Contributor

snoweye commented Jul 18, 2018

@dselivanov @wrathematics Is the change (not linking with float) temporary or permanent? I need to decide whether to invest time in it, because the work is neither easy nor trivial, success is not guaranteed (there is a high risk of failure), and it could take a very long time.

@wrathematics
Owner

Well regardless of rsparse, I need to get something reliable that works on windows.

What I have now appears to work on mac (still running some tests), and Linux continues to be problem free. I just pushed an update for windows, but it's still broken. At first I was trying this and this (which you wrote), but couldn't get it to work. Now I basically just copied what I do on *nix and it appears to work, but it has some missing symbol problems. I really don't know how to do this on windows.

@wrathematics
Owner

wrathematics commented Jul 18, 2018

You could also try testing with this (needs this) or kazaam. But I can't get openmp or MPI to work in my busted windows vm :[

@jamespr615

jamespr615 commented Jul 18, 2018 via email

@dselivanov
Contributor

@snoweye this is temporary. Once we figure out a reliable way to link to float and update float on CRAN, I will restore the dependency.

@snoweye
Contributor

snoweye commented Aug 7, 2018

@wrathematics @dselivanov Please check my changes to float, kazaam, and rsparse. They install on my native Windows. Thanks.

wrathematics pushed a commit that referenced this issue Aug 16, 2018
@dselivanov
Contributor

https://github.com/dselivanov/rsparse/releases/tag/v0.3.3.1 is on CRAN, so it can serve as a showcase of how to link to float (essentially everything is done by float).

@wrathematics
Owner

Cool!

I'd eventually like to crack dynamic linking on all platforms. This would avoid file size notes for downstream packages, but it would also simplify the use of high performance BLAS in some situations. It shouldn't change anything for a package author if I can ever get it right. It's on the list, anyway.


6 participants