[R-package] make package installable with CRAN toolchain (fixes #2960) #3188

Merged
merged 50 commits into from
Jul 29, 2020
3d91a63
[R-package] make package installable with CRAN toolchain (fixes #2960)
jameslamb Jun 23, 2020
f1c56b7
Apply suggestions from code review
jameslamb Jul 2, 2020
8345d38
merge master
jameslamb Jul 4, 2020
ebe1d47
remove GPU stuff
jameslamb Jul 4, 2020
c7c27ed
Merge branch 'feat/cran-install' of github.com:jameslamb/LightGBM int…
jameslamb Jul 4, 2020
53108bd
use wildcard to find objects to build
jameslamb Jul 4, 2020
782209a
use -lomp
jameslamb Jul 4, 2020
e9cb829
build configure before moving files
jameslamb Jul 4, 2020
0e23f4a
merge master
jameslamb Jul 7, 2020
17632cc
using wildcard for objects
jameslamb Jul 7, 2020
ae3bc72
Update .github/workflows/main.yml
jameslamb Jul 7, 2020
fe4776c
add explicit objects back
jameslamb Jul 7, 2020
cb4b39a
reduce allowed R CMD check NOTEs and catch stderr from build-cran-pac…
jameslamb Jul 7, 2020
d78932e
fixing things
jameslamb Jul 8, 2020
012f876
pin autoconf version
jameslamb Jul 8, 2020
39160a8
show diff
jameslamb Jul 8, 2020
3301a79
add automake back
jameslamb Jul 8, 2020
b843040
run less checks
jameslamb Jul 8, 2020
0dd5365
command was in the wrong place
jameslamb Jul 8, 2020
9e29038
fix autoconf version
jameslamb Jul 8, 2020
f7d9cc4
Merge branch 'master' into feat/cran-install
jameslamb Jul 9, 2020
c673b4f
change strategy for handling configure
jameslamb Jul 9, 2020
99bea32
fix Rbuildignore
jameslamb Jul 10, 2020
4b9a425
Merge branch 'master' into feat/cran-install
jameslamb Jul 11, 2020
c009369
fix NOTEs
jameslamb Jul 11, 2020
752ad5f
fix notes about unrecognized files
jameslamb Jul 11, 2020
0eee9d4
fixing extra files
jameslamb Jul 12, 2020
9d5504f
remove USE_R35
jameslamb Jul 12, 2020
a062ef3
remove USE_R35
jameslamb Jul 12, 2020
56fe9d8
add OpenMP check for Mac CRAN build
jameslamb Jul 12, 2020
3962c25
Merge branch 'master' into feat/cran-install
jameslamb Jul 15, 2020
ef4ed2d
run all checks
jameslamb Jul 15, 2020
8ba2b0e
Merge branch 'master' into feat/cran-install
jameslamb Jul 18, 2020
f934785
merge master
jameslamb Jul 21, 2020
04906c5
Apply suggestions from code review
jameslamb Jul 24, 2020
de114e1
Merge branch 'master' into feat/cran-install
jameslamb Jul 24, 2020
1800820
suggestions from code review
jameslamb Jul 24, 2020
9b74925
undo indenting
jameslamb Jul 24, 2020
8703a90
remove 03 from Makevars.win.in
jameslamb Jul 24, 2020
9c3a4ae
Merge branch 'master' into feat/cran-install
jameslamb Jul 27, 2020
239ee3f
update language about OpenMP in configure script
jameslamb Jul 28, 2020
62b24da
Merge branch 'master' into feat/cran-install
jameslamb Jul 28, 2020
856b80e
checking if configure.ac check works
jameslamb Jul 28, 2020
c9bcc46
add autoconf back
jameslamb Jul 28, 2020
eb93b5c
remove testing code in configure.ac
jameslamb Jul 28, 2020
050a689
more fixes for CI on configure script
jameslamb Jul 28, 2020
0c38702
print git diff
jameslamb Jul 28, 2020
ff6da50
add VERSION.txt when checking configure
jameslamb Jul 28, 2020
9d08776
fix relative paths
jameslamb Jul 28, 2020
9012752
remove git diff
jameslamb Jul 29, 2020
78 changes: 75 additions & 3 deletions .ci/test_r_package.sh
Original file line number Diff line number Diff line change
@@ -50,10 +50,25 @@ if [[ $AZURE != "true" ]] && [[ $OS_NAME == "linux" ]]; then
texlive-fonts-extra \
qpdf \
|| exit -1

# https://github.com/r-lib/actions/issues/111
if [[ $R_BUILD_TYPE == "cran" ]]; then
sudo apt-get install \
--no-install-recommends \
-y \
autoconf=$(cat R-package/AUTOCONF_UBUNTU_VERSION) \
devscripts \
|| exit -1
fi
fi

# Installing R precompiled for Mac OS 10.11 or higher
if [[ $OS_NAME == "macos" ]]; then
if [[ $R_BUILD_TYPE == "cran" ]]; then
brew install \
automake \
checkbashisms
fi
brew install qpdf
brew cask install basictex
export PATH="/Library/TeX/texbin:$PATH"
@@ -109,10 +124,48 @@ if [[ $TASK == "r-package-check-docs" ]]; then
fi

cd ${BUILD_DIRECTORY}
Rscript build_r.R --skip-install || exit -1

PKG_TARBALL="lightgbm_${LGB_VER}.tar.gz"
LOG_FILE_NAME="lightgbm.Rcheck/00check.log"
if [[ $R_BUILD_TYPE == "cmake" ]]; then
Rscript build_r.R --skip-install || exit -1
elif [[ $R_BUILD_TYPE == "cran" ]]; then

# on Linux, we recreate configure in CI to test if
# a change in a PR has changed configure.ac
if [[ $OS_NAME == "linux" ]]; then
cp VERSION.txt R-package/src/VERSION.txt
cd ${BUILD_DIRECTORY}/R-package
autoconf \
--output configure \
configure.ac \
|| exit -1
cd ${BUILD_DIRECTORY}

num_files_changed=$(
git diff --name-only | wc -l
)
if [[ ${num_files_changed} -gt 0 ]]; then
echo "'configure' in the R package has changed. Please recreate it and commit the changes."
echo "Changed files:"
git diff --compact-summary
echo "See R-package/README.md for details on how to recreate this script."
echo ""
exit -1
fi
fi

./build-cran-package.sh || exit -1

# Test CRAN source .tar.gz in a directory that is not this repo or below it.
# When people install.packages('lightgbm'), they won't have the LightGBM
# git repo around. This is to protect against the use of relative paths
# like ../../CMakeLists.txt that would only work if you are in the repo
R_CMD_CHECK_DIR="${HOME}/tmp-r-cmd-check/"
mkdir -p ${R_CMD_CHECK_DIR}
mv ${PKG_TARBALL} ${R_CMD_CHECK_DIR}
cd ${R_CMD_CHECK_DIR}
fi

# fails tests if either ERRORs or WARNINGs are thrown by
# R CMD CHECK
@@ -134,7 +187,8 @@ while kill -0 ${CHECK_PID} >/dev/null 2>&1; do
done

echo "R CMD check build logs:"
cat ${BUILD_DIRECTORY}/lightgbm.Rcheck/00install.out
BUILD_LOG_FILE=lightgbm.Rcheck/00install.out
cat ${BUILD_LOG_FILE}

if [[ $check_succeeded == "no" ]]; then
exit -1
@@ -145,7 +199,11 @@ if grep -q -R "WARNING" "$LOG_FILE_NAME"; then
exit -1
fi

ALLOWED_CHECK_NOTES=2
if [[ $OS_NAME == "linux" ]] && [[ $R_BUILD_TYPE == "cran" ]]; then
ALLOWED_CHECK_NOTES=2
else
ALLOWED_CHECK_NOTES=1
fi
NUM_CHECK_NOTES=$(
cat ${LOG_FILE_NAME} \
| grep -e '^Status: .* NOTE.*' \
@@ -155,3 +213,17 @@ if [[ ${NUM_CHECK_NOTES} -gt ${ALLOWED_CHECK_NOTES} ]]; then
echo "Found ${NUM_CHECK_NOTES} NOTEs from R CMD check. Only ${ALLOWED_CHECK_NOTES} are allowed"
exit -1
fi

# this check makes sure that CI builds of the CRAN package on Mac
# actually use OpenMP
if [[ $OS_NAME == "macos" ]] && [[ $R_BUILD_TYPE == "cran" ]]; then
omp_working=$(
cat $BUILD_LOG_FILE \
| grep -E "checking whether OpenMP will work .*yes" \
| wc -l
)
if [[ $omp_working -ne 1 ]]; then
echo "OpenMP was not found, and should be when testing the CRAN package on macOS"
exit -1
fi
fi
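
Reviewer note: the macOS OpenMP check above depends on one line of `configure` output appearing in the install log. The following is a minimal sketch of that log check in isolation, assuming a log in the same format. The helper name `openmp_detected` and the demo log files are made up for illustration and are not part of this PR. Unlike the CI script, which counts matching lines and requires exactly one, this sketch only tests whether at least one match exists.

```shell
#!/bin/sh
# Sketch only: 'openmp_detected' is a hypothetical helper, not part of this PR.
openmp_detected() {
    # $1: path to an install log such as lightgbm.Rcheck/00install.out
    grep -E -q "checking whether OpenMP will work .*yes" "$1"
}

demo_dir=$(mktemp -d)
# The log lines below are made up for the demo; they imitate configure output.
printf 'checking whether OpenMP will work in a package... yes\n' > "${demo_dir}/with-omp.log"
printf 'checking whether OpenMP will work in a package... no\n' > "${demo_dir}/without-omp.log"

for log in "${demo_dir}/with-omp.log" "${demo_dir}/without-omp.log"; do
    if openmp_detected "${log}"; then
        echo "$(basename "${log}"): OpenMP found"
    else
        echo "$(basename "${log}"): OpenMP not found"
    fi
done
```

Using the exit status of `grep -q` avoids the `cat | grep | wc -l` pipeline, at the cost of not noticing a duplicated message.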
99 changes: 61 additions & 38 deletions .ci/test_r_package_windows.ps1
@@ -44,54 +44,61 @@ function Run-R-Code-Redirect-Stderr {
Rscript --vanilla -e $decorated_code
}

$env:R_LIB_PATH = "$env:BUILD_SOURCESDIRECTORY/RLibrary" -replace '[\\]', '/'
$env:R_LIBS = "$env:R_LIB_PATH"
$env:PATH = "$env:R_LIB_PATH/Rtools/bin;" + "$env:R_LIB_PATH/Rtools/usr/bin;" + "$env:R_LIB_PATH/R/bin/x64;" + "$env:R_LIB_PATH/miktex/texmfs/install/miktex/bin/x64;" + $env:PATH
$env:CRAN_MIRROR = "https://cloud.r-project.org/"
$env:CTAN_MIRROR = "https://ctan.math.illinois.edu/systems/win32/miktex"
$env:CTAN_MIKTEX_ARCHIVE = "$env:CTAN_MIRROR/setup/windows-x64/"
$env:CTAN_PACKAGE_ARCHIVE = "$env:CTAN_MIRROR/tm/packages/"

# Get details needed for installing R components
#
# NOTES:
# * some paths and file names are different on R4.0
$env:R_MAJOR_VERSION = $env:R_VERSION.split('.')[0]
if ($env:R_MAJOR_VERSION -eq "3") {
$env:RTOOLS_MINGW_BIN = "$env:R_LIB_PATH/Rtools/mingw_64/bin"
# Rtools 3.x has to be installed at C:\Rtools\
# * https://stackoverflow.com/a/46619260/3986677
$RTOOLS_INSTALL_PATH = "C:\Rtools"
$env:RTOOLS_MINGW_BIN = "$RTOOLS_INSTALL_PATH/mingw_64/bin"
$env:RTOOLS_EXE_FILE = "Rtools35.exe"
$env:R_WINDOWS_VERSION = "3.6.3"
} elseif ($env:R_MAJOR_VERSION -eq "4") {
$env:RTOOLS_MINGW_BIN = "$env:R_LIB_PATH/Rtools/mingw64/bin"
$RTOOLS_INSTALL_PATH = "C:\rtools40"
$env:RTOOLS_MINGW_BIN = "$RTOOLS_INSTALL_PATH/mingw64/bin"
$env:RTOOLS_EXE_FILE = "rtools40-x86_64.exe"
$env:R_WINDOWS_VERSION = "4.0.2"
} else {
Write-Output "[ERROR] Unrecognized R version: $env:R_VERSION"
Check-Output $false
}

if ($env:COMPILER -eq "MINGW") {
$env:R_LIB_PATH = "$env:BUILD_SOURCESDIRECTORY/RLibrary" -replace '[\\]', '/'
$env:R_LIBS = "$env:R_LIB_PATH"
$env:PATH = "$RTOOLS_INSTALL_PATH/bin;" + "$RTOOLS_INSTALL_PATH/usr/bin;" + "$env:R_LIB_PATH/R/bin/x64;" + "$env:R_LIB_PATH/miktex/texmfs/install/miktex/bin/x64;" + $env:PATH
$env:CRAN_MIRROR = "https://cloud.r-project.org/"
$env:CTAN_MIRROR = "https://ctan.math.illinois.edu/systems/win32/miktex"
$env:CTAN_MIKTEX_ARCHIVE = "$env:CTAN_MIRROR/setup/windows-x64/"
$env:CTAN_PACKAGE_ARCHIVE = "$env:CTAN_MIRROR/tm/packages/"

if (($env:COMPILER -eq "MINGW") -and ($env:R_BUILD_TYPE -eq "cmake")) {
$env:CXX = "$env:RTOOLS_MINGW_BIN/g++.exe"
$env:CC = "$env:RTOOLS_MINGW_BIN/gcc.exe"
}

cd $env:BUILD_SOURCESDIRECTORY
tzutil /s "GMT Standard Time"
[Void][System.IO.Directory]::CreateDirectory($env:R_LIB_PATH)

if ($env:TOOLCHAIN -eq "MINGW") {
Write-Output "Telling R to use MinGW"
$install_libs = "$env:BUILD_SOURCESDIRECTORY/R-package/src/install.libs.R"
((Get-Content -Path $install_libs -Raw) -Replace 'use_mingw <- FALSE','use_mingw <- TRUE') | Set-Content -Path $install_libs
} elseif ($env:TOOLCHAIN -eq "MSYS") {
Write-Output "Telling R to use MSYS"
$install_libs = "$env:BUILD_SOURCESDIRECTORY/R-package/src/install.libs.R"
((Get-Content -Path $install_libs -Raw) -Replace 'use_msys2 <- FALSE','use_msys2 <- TRUE') | Set-Content -Path $install_libs
} elseif ($env:TOOLCHAIN -eq "MSVC") {
# no customization for MSVC
} else {
Write-Output "[ERROR] Unrecognized compiler: $env:TOOLCHAIN"
Check-Output $false
$env:LGB_VER = Get-Content -Path VERSION.txt -TotalCount 1

if ($env:R_BUILD_TYPE -eq "cmake") {
if ($env:TOOLCHAIN -eq "MINGW") {
Write-Output "Telling R to use MinGW"
$install_libs = "$env:BUILD_SOURCESDIRECTORY/R-package/src/install.libs.R"
((Get-Content -Path $install_libs -Raw) -Replace 'use_mingw <- FALSE','use_mingw <- TRUE') | Set-Content -Path $install_libs
} elseif ($env:TOOLCHAIN -eq "MSYS") {
Write-Output "Telling R to use MSYS"
$install_libs = "$env:BUILD_SOURCESDIRECTORY/R-package/src/install.libs.R"
((Get-Content -Path $install_libs -Raw) -Replace 'use_msys2 <- FALSE','use_msys2 <- TRUE') | Set-Content -Path $install_libs
} elseif ($env:TOOLCHAIN -eq "MSVC") {
# no customization for MSVC
} else {
Write-Output "[ERROR] Unrecognized toolchain: $env:TOOLCHAIN"
Check-Output $false
}
}

# download R and RTools
@@ -105,16 +112,18 @@ Start-Process -FilePath R-win.exe -NoNewWindow -Wait -ArgumentList "/VERYSILENT
Write-Output "Done installing R"

Write-Output "Installing Rtools"
Start-Process -FilePath Rtools.exe -NoNewWindow -Wait -ArgumentList "/VERYSILENT /DIR=$env:R_LIB_PATH/Rtools" ; Check-Output $?
Start-Process -FilePath Rtools.exe -NoNewWindow -Wait -ArgumentList "/VERYSILENT /DIR=$RTOOLS_INSTALL_PATH" ; Check-Output $?
Write-Output "Done installing Rtools"

Write-Output "Installing dependencies"
$packages = "c('data.table', 'jsonlite', 'Matrix', 'processx', 'R6', 'testthat'), dependencies = c('Imports', 'Depends', 'LinkingTo')"
Run-R-Code-Redirect-Stderr "options(install.packages.check.source = 'no'); install.packages($packages, repos = '$env:CRAN_MIRROR', type = 'binary', lib = '$env:R_LIB_PATH')" ; Check-Output $?

# MiKTeX and pandoc can be skipped on MSVC builds, since we don't
# build the package documentation for those
if ($env:COMPILER -ne "MSVC") {
# MiKTeX and pandoc can be skipped on non-MinGW builds, since we don't
# build the package documentation for those.
#
# MiKTeX always needs to be built to test a CRAN package.
if (($env:COMPILER -eq "MINGW") -or ($env:R_BUILD_TYPE -eq "cran")) {
Download-Miktex-Setup "$env:CTAN_MIKTEX_ARCHIVE" "miktexsetup-x64.zip"
Add-Type -AssemblyName System.IO.Compression.FileSystem
[System.IO.Compression.ZipFile]::ExtractToDirectory("miktexsetup-x64.zip", "miktex")
@@ -132,17 +141,29 @@ Write-Output "Building R package"

# R CMD check is not used for MSVC builds
if ($env:COMPILER -ne "MSVC") {
Run-R-Code-Redirect-Stderr "commandArgs <- function(...){'--skip-install'}; source('build_r.R')"; Check-Output $?

$PKG_FILE_NAME = Get-Item *.tar.gz
$PKG_FILE_NAME = $PKG_FILE_NAME -replace '[\\]', '/'
$PKG_FILE_NAME = "lightgbm_$env:LGB_VER.tar.gz"
$LOG_FILE_NAME = "lightgbm.Rcheck/00check.log"

if ($env:R_BUILD_TYPE -eq "cmake") {
Run-R-Code-Redirect-Stderr "commandArgs <- function(...){'--skip-install'}; source('build_r.R')"; Check-Output $?
} elseif ($env:R_BUILD_TYPE -eq "cran") {
Run-R-Code-Redirect-Stderr "result <- processx::run(command = 'sh', args = 'build-cran-package.sh', echo = TRUE, windows_verbatim_args = FALSE)" ; Check-Output $?
# Test CRAN source .tar.gz in a directory that is not this repo or below it.
# When people install.packages('lightgbm'), they won't have the LightGBM
# git repo around. This is to protect against the use of relative paths
# like ../../CMakeLists.txt that would only work if you are in the repo
$R_CMD_CHECK_DIR = "tmp-r-cmd-check"
New-Item -Path "C:\" -Name $R_CMD_CHECK_DIR -ItemType "directory" > $null
Move-Item -Path "$PKG_FILE_NAME" -Destination "C:\$R_CMD_CHECK_DIR\" > $null
cd "C:\$R_CMD_CHECK_DIR\"
}

Write-Output "Running R CMD check as CRAN"
Run-R-Code-Redirect-Stderr "result <- processx::run(command = 'R.exe', args = c('CMD', 'check', '--no-multiarch', '--as-cran', '$PKG_FILE_NAME'), echo = TRUE, windows_verbatim_args = FALSE)" ; $check_succeeded = $?

Write-Output "R CMD check build logs:"
$INSTALL_LOG_FILE_NAME = "$env:BUILD_SOURCESDIRECTORY\lightgbm.Rcheck\00install.out"
$INSTALL_LOG_FILE_NAME = "lightgbm.Rcheck\00install.out"
Get-Content -Path "$INSTALL_LOG_FILE_NAME"

Check-Output $check_succeeded
@@ -174,15 +195,17 @@ if ($env:COMPILER -ne "MSVC") {
# Checking that we actually got the expected compiler. The R package has some logic
# to fall back to MinGW if MSVC fails, but for CI builds we need to check that the correct
# compiler was used.
$checks = Select-String -Path "${INSTALL_LOG_FILE_NAME}" -Pattern "Check for working CXX compiler.*$env:COMPILER"
if ($checks.Matches.length -eq 0) {
Write-Output "The wrong compiler was used. Check the build logs."
Check-Output $False
if ($env:R_BUILD_TYPE -eq "cmake") {
$checks = Select-String -Path "${INSTALL_LOG_FILE_NAME}" -Pattern "Check for working CXX compiler.*$env:COMPILER"
if ($checks.Matches.length -eq 0) {
Write-Output "The wrong compiler was used. Check the build logs."
Check-Output $False
}
}

# Checking that we got the right toolchain for MinGW. If using MinGW, both
# MinGW and MSYS toolchains are supported
if ($env:COMPILER -eq "MINGW") {
if (($env:COMPILER -eq "MINGW") -and ($env:R_BUILD_TYPE -eq "cmake")) {
$checks = Select-String -Path "${INSTALL_LOG_FILE_NAME}" -Pattern "Trying to build with.*$env:TOOLCHAIN"
if ($checks.Matches.length -eq 0) {
Write-Output "The wrong toolchain was used. Check the build logs."
43 changes: 42 additions & 1 deletion .github/workflows/main.yml
@@ -10,71 +10,110 @@ on:

jobs:
test:
name: ${{ matrix.task }} (${{ matrix.os }}, ${{ matrix.compiler }}, R ${{ matrix.r_version }})
name: ${{ matrix.task }} (${{ matrix.os }}, ${{ matrix.compiler }}, R ${{ matrix.r_version }}, ${{ matrix.build_type }})
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
include:
################
# CMake builds #
################
- os: ubuntu-latest
task: r-package
compiler: gcc
r_version: 3.6
build_type: cmake
- os: ubuntu-latest
task: r-package
compiler: gcc
r_version: 4.0
build_type: cmake
- os: ubuntu-latest
task: r-package
compiler: clang
r_version: 3.6
build_type: cmake
- os: ubuntu-latest
task: r-package
compiler: clang
r_version: 4.0
build_type: cmake
- os: ubuntu-latest
task: r-package-check-docs
compiler: gcc
r_version: 4.0
build_type: cmake
- os: macOS-latest
task: r-package
compiler: gcc
r_version: 3.6
build_type: cmake
- os: macOS-latest
task: r-package
compiler: gcc
r_version: 4.0
build_type: cmake
- os: macOS-latest
task: r-package
compiler: clang
r_version: 3.6
build_type: cmake
- os: macOS-latest
task: r-package
compiler: clang
r_version: 4.0
build_type: cmake
- os: windows-latest
task: r-package
compiler: MINGW
toolchain: MINGW
r_version: 3.6
build_type: cmake
- os: windows-latest
task: r-package
compiler: MINGW
toolchain: MSYS
r_version: 4.0
build_type: cmake
# Visual Studio 2017
- os: windows-2016
task: r-package
compiler: MSVC
toolchain: MSVC
r_version: 3.6
build_type: cmake
# Visual Studio 2019
- os: windows-2019
task: r-package
compiler: MSVC
toolchain: MSVC
r_version: 4.0
build_type: cmake
###############
# CRAN builds #
###############
- os: windows-latest
task: r-package
compiler: MINGW
toolchain: MSYS
r_version: 4.0
build_type: cran
- os: ubuntu-latest
task: r-package
compiler: gcc
r_version: 4.0
build_type: cran
- os: macOS-latest
task: r-package
compiler: clang
r_version: 4.0
build_type: cran
steps:
- name: Prevent conversion of line endings on Windows
if: startsWith(matrix.os, 'windows')
shell: pwsh
run: git config --global core.autocrlf false
- name: Checkout repository
uses: actions/checkout@v1
with:
@@ -100,6 +139,7 @@ jobs:
export PATH="$CONDA/bin:${HOME}/.local/bin:$PATH"
export LGB_VER=$(head -n 1 VERSION.txt)
export R_VERSION="${{ matrix.r_version }}"
export R_BUILD_TYPE="${{ matrix.build_type }}"
$GITHUB_WORKSPACE/.ci/setup.sh
$GITHUB_WORKSPACE/.ci/test.sh
- name: Use conda on Windows
@@ -114,6 +154,7 @@ jobs:
$env:BUILD_SOURCESDIRECTORY = $env:GITHUB_WORKSPACE
$env:TOOLCHAIN = "${{ matrix.toolchain }}"
$env:R_VERSION = "${{ matrix.r_version }}"
$env:R_BUILD_TYPE = "${{ matrix.build_type }}"
$env:COMPILER = "${{ matrix.compiler }}"
$env:GITHUB_ACTIONS = "true"
$env:TASK = "${{ matrix.task }}"
3 changes: 3 additions & 0 deletions .gitignore
@@ -399,12 +399,15 @@ python-package/compile/
python-package/lightgbm/VERSION.txt

# R build artefacts
**/autom4te.cache/
R-package/docs
R-package/src/CMakeLists.txt
R-package/src/lib_lightgbm.so.dSYM/
R-package/src/src/
R-package/src-x64
R-package/src-i386
R-package/**/VERSION.txt
**/Makevars.win
lightgbm_r/*
lightgbm*.tar.gz
lightgbm.Rcheck/
5 changes: 5 additions & 0 deletions R-package/.Rbuildignore
@@ -14,6 +14,11 @@
^src/CMakeLists.txt$
^Makefile$
^src/build/.*$
^autom4te.cache/.*$

# files only used during development
AUTOCONF_UBUNTU_VERSION
^recreate-configure\.sh$

# unnecessary files from submodules
^src/compute/.appveyor.yml$
1 change: 1 addition & 0 deletions R-package/AUTOCONF_UBUNTU_VERSION
@@ -0,0 +1 @@
2.69-11
78 changes: 78 additions & 0 deletions R-package/README.md
@@ -6,6 +6,7 @@ LightGBM R-package
* [Installation](#installation)
* [Examples](#examples)
* [Testing](#testing)
* [Preparing a CRAN Package and Installing It](#preparing-a-cran-package-and-installing-it)
* [External Repositories](#external-unofficial-repositories)
* [Known Issues](#known-issues)

@@ -143,6 +144,83 @@ Rscript -e " \
"
```

Preparing a CRAN Package and Installing It
------------------------------------------

This section is primarily for maintainers, but may help users and contributors to understand the structure of the R package.

Most of `LightGBM` uses `CMake` to handle tasks like setting compiler and linker flags, including header file locations, and linking to other libraries. Because CRAN packages typically do not assume the presence of `CMake`, the R package uses an alternative method that is in the CRAN-supported toolchain for building R packages with C++ code: `Autoconf`.

For more information on this approach, see ["Writing R Extensions"](https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Configure-and-cleanup).

### Build a CRAN Package

From the root of the repository, run the following.

```shell
sh build-cran-package.sh
```

This will create a file `lightgbm_${VERSION}.tar.gz`, where `VERSION` is the version of `LightGBM`.
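
As a rough illustration of how that file name is formed, the sketch below reads the first line of a version file and builds the tarball name from it, the same way the CI scripts in this repository derive `LGB_VER`. The helper `tarball_name` and the version string `9.9.9` are assumptions made for the example, not part of the package.

```shell
#!/bin/sh
# Sketch only: 'tarball_name' is a hypothetical helper, not part of the repository.
tarball_name() {
    # $1: path to a VERSION.txt-style file; only the first line is used
    version=$(head -n 1 "$1")
    printf 'lightgbm_%s.tar.gz\n' "${version}"
}

# Demo with a throwaway version file (the version string is made up).
demo_dir=$(mktemp -d)
printf '9.9.9\n' > "${demo_dir}/VERSION.txt"
tarball_name "${demo_dir}/VERSION.txt"
```

From the root of the repository, `tarball_name VERSION.txt` should give the name of the file produced by the build script, which is handy when a later command needs an exact path rather than a wildcard.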

### Standard Installation from CRAN Package

After building the package, install it with a command like the following:

```shell
R CMD INSTALL lightgbm_*.tar.gz
```

#### Custom Installation (Linux, Mac)

To change the compiler used when installing the package, you can create a file `~/.R/Makevars` which overrides `CC` (`C` compiler) and `CXX` (`C++` compiler). For example, to use `gcc` instead of `clang` on Mac, you could use something like the following:

```make
# ~/.R/Makevars
CC=gcc-8
CXX=g++-8
CXX11=g++-8
```
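
If you prefer to script this, the following sketch writes such a file. It is only an illustration: the helper `write_makevars` is hypothetical, and the demo writes into a temporary directory instead of your real home directory so that an existing `~/.R/Makevars` is not overwritten.

```shell
#!/bin/sh
# Sketch only: 'write_makevars' is a hypothetical helper, not part of the repository.
write_makevars() {
    # $1: target directory, $2: C compiler, $3: C++ compiler
    mkdir -p "$1"
    cat > "$1/Makevars" <<EOF
CC=$2
CXX=$3
CXX11=$3
EOF
}

# Demo: use a scratch directory in place of the real home directory.
demo_home=$(mktemp -d)
write_makevars "${demo_home}/.R" gcc-8 g++-8
cat "${demo_home}/.R/Makevars"
```

To apply it for real, pass `"${HOME}/.R"` as the first argument, after backing up any `Makevars` file that is already there.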

### Changing the CRAN Package

A lot of details are handled automatically by `R CMD build` and `R CMD INSTALL`, so it can be difficult to understand how the files in the R package are related to each other. An extensive treatment of those details is available in ["Writing R Extensions"](https://cran.r-project.org/doc/manuals/r-release/R-exts.html).

This section briefly explains the key files for building a CRAN package. To update the package, edit the files relevant to your change and re-run the steps in [Build a CRAN Package](#build-a-cran-package).

**Linux or Mac**

At build time, `configure` will be run and used to create a file `Makevars`, using `Makevars.in` as a template.

1. Edit `configure.ac`
2. Create `configure` with `autoconf`. Do not edit it by hand. This file must be generated on Ubuntu 18.04.

If you have an Ubuntu 18.04 environment available, run the provided script from the root of the `LightGBM` repository.

```shell
./R-package/recreate-configure.sh
```

If you do not have easy access to an Ubuntu 18.04 environment, the `configure` script can be generated using Docker.

```shell
docker run \
-v $(pwd):/opt/LightGBM \
-t ubuntu:18.04 \
/bin/bash -c "cd /opt/LightGBM && ./R-package/recreate-configure.sh"
```

The version of `autoconf` used by this project is stored in `R-package/AUTOCONF_UBUNTU_VERSION`. To update that version, update that file and run the commands above. To see available versions, see https://packages.ubuntu.com/search?keywords=autoconf.

3. Edit `src/Makevars.in`
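
After regenerating `configure`, it can be useful to confirm whether the committed copy is still current. The sketch below shows one possible local check that compares two files byte for byte. It is an assumption-laden illustration: the helper `configure_is_current` and the demo file names are made up, and the project's CI uses `git diff` for the same purpose.

```shell
#!/bin/sh
# Sketch only: 'configure_is_current' is a hypothetical helper, not part of the repository.
configure_is_current() {
    # $1: committed configure script, $2: freshly generated configure script
    cmp -s "$1" "$2"
}

# Demo with stand-in files; real configure scripts are thousands of lines long.
demo_dir=$(mktemp -d)
printf 'echo "checking for R"\n' > "${demo_dir}/configure.committed"
printf 'echo "checking for R"\n' > "${demo_dir}/configure.fresh"
printf 'echo "checking for OpenMP"\n' > "${demo_dir}/configure.stale"

if configure_is_current "${demo_dir}/configure.committed" "${demo_dir}/configure.fresh"; then
    echo "configure is up to date"
fi
if ! configure_is_current "${demo_dir}/configure.committed" "${demo_dir}/configure.stale"; then
    echo "configure has changed; recreate it and commit the result"
fi
```

A byte-for-byte comparison is strict on purpose: different `autoconf` versions can emit different output for the same `configure.ac`, which is why the version is pinned.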

**Configuring for Windows**

At build time, `configure.win` will be run and used to create a file `Makevars.win`, using `Makevars.win.in` as a template.

1. Edit `configure.win` directly
2. Edit `src/Makevars.win.in`
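
To make the template step concrete, here is a small sketch of the kind of placeholder substitution that turns a `.in` template into a `Makevars` file. It is a simplified stand-in, not the project's script: the helper `render_template` is hypothetical, the template is reduced to a single line, and `|` is used as the `sed` delimiter so that values containing `/` do not break the expression.

```shell
#!/bin/sh
# Sketch only: 'render_template' is a hypothetical helper, not part of the repository.
render_template() {
    # $1: template path, $2: output path, $3: value for the @LGB_CPPFLAGS@ placeholder
    # Note: a value containing '|' or '&' would still need escaping.
    sed -e "s|@LGB_CPPFLAGS@|$3|" < "$1" > "$2"
}

# Demo with a one-line template (the real template has many more lines).
demo_dir=$(mktemp -d)
cat > "${demo_dir}/Makevars.win.in" <<'EOF'
PKG_CPPFLAGS = @LGB_CPPFLAGS@ -DUSE_SOCKET
EOF

render_template \
    "${demo_dir}/Makevars.win.in" \
    "${demo_dir}/Makevars.win" \
    "-DMM_PREFETCH=1 -DMM_MALLOC=1"
cat "${demo_dir}/Makevars.win"
```

On Linux and Mac the same idea is carried out by the generated `configure` script, which replaces each `@NAME@` placeholder declared with `AC_SUBST` in `configure.ac`.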

External (Unofficial) Repositories
----------------------------------

2,992 changes: 2,992 additions & 0 deletions R-package/configure

Large diffs are not rendered by default.

137 changes: 137 additions & 0 deletions R-package/configure.ac
@@ -0,0 +1,137 @@
### configure.ac -*- Autoconf -*-
# Template used by Autoconf to generate 'configure' script. For more see:
# * https://unconj.ca/blog/an-autoconf-primer-for-r-package-authors.html
# * https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Configure-and-cleanup

AC_PREREQ(2.69)
AC_INIT([lightgbm], [m4_esyscmd_s([cat src/VERSION.txt])], [], [lightgbm], [])

###########################
# find compiler and flags #
###########################

AC_MSG_CHECKING([location of R])
AC_MSG_RESULT([${R_HOME}])

# set up CPP flags
# find the compiler and compiler flags used by R.
: ${R_HOME=`R HOME`}
if test -z "${R_HOME}"; then
echo "could not determine R_HOME"
exit 1
fi
CC=`"${R_HOME}/bin/R" CMD config CC`
CXX=`"${R_HOME}/bin/R" CMD config CXX11`
CFLAGS=`"${R_HOME}/bin/R" CMD config CFLAGS`
CPPFLAGS=`"${R_HOME}/bin/R" CMD config CPPFLAGS`

# LightGBM-specific flags
LGB_CPPFLAGS=""

###############
# MM_PREFETCH #
###############

AC_MSG_CHECKING([whether MM_PREFETCH works])
ac_mmprefetch=no
AC_LANG_CONFTEST(
[
AC_LANG_PROGRAM(
[[
#include <xmmintrin.h>
]],
[[
int main() {
int a = 0;
_mm_prefetch(&a, _MM_HINT_NTA);
return 0;
}
]]
)
]
)
${CC} -o conftest conftest.c 2>/dev/null && ./conftest && ac_mmprefetch=yes
AC_MSG_RESULT([${ac_mmprefetch}])
if test "${ac_mmprefetch}" = yes; then
LGB_CPPFLAGS="${LGB_CPPFLAGS} -DMM_PREFETCH=1"
fi

#############
# MM_MALLOC #
#############

AC_MSG_CHECKING([whether MM_MALLOC works])
ac_mm_malloc=no
AC_LANG_CONFTEST(
[
AC_LANG_PROGRAM(
[[
#include <mm_malloc.h>
]],
[[
int main() {
char *a = (char*)_mm_malloc(8, 16);
_mm_free(a);
return 0;
}
]]
)
]
)
${CC} -o conftest conftest.c 2>/dev/null && ./conftest && ac_mm_malloc=yes
AC_MSG_RESULT([${ac_mm_malloc}])
if test "${ac_mm_malloc}" = yes; then
LGB_CPPFLAGS="${LGB_CPPFLAGS} -DMM_MALLOC=1"
fi

##########
# OpenMP #
##########

OPENMP_CXXFLAGS=""

if test `uname -s` = "Linux"
then
OPENMP_CXXFLAGS="\$(SHLIB_OPENMP_CXXFLAGS)"
fi

if test `uname -s` = "Darwin"
then
OPENMP_CXXFLAGS='-Xclang -fopenmp'
OPENMP_LIB='-lomp'
ac_pkg_openmp=no
AC_MSG_CHECKING([whether OpenMP will work in a package])
AC_LANG_CONFTEST(
[
AC_LANG_PROGRAM(
[[
#include <omp.h>
]],
[[
return (omp_get_max_threads() <= 1);
]]
)
]
)
${CC} -o conftest conftest.c ${OPENMP_LIB} ${OPENMP_CXXFLAGS} 2>/dev/null && ./conftest && ac_pkg_openmp=yes
AC_MSG_RESULT([${ac_pkg_openmp}])
if test "${ac_pkg_openmp}" = no; then
OPENMP_CXXFLAGS=''
OPENMP_LIB=''
echo '***********************************************************************************************'
echo ' OpenMP is unavailable on this macOS system. LightGBM code will run single-threaded as a result.'
echo ' To use all CPU cores for training jobs, you should install OpenMP by running'
echo ''
echo ' brew install libomp'
echo '***********************************************************************************************'
fi
fi

# substitute variables from this script into Makevars.in
AC_SUBST(OPENMP_CXXFLAGS)
AC_SUBST(OPENMP_LIB)
AC_SUBST(LGB_CPPFLAGS)
AC_CONFIG_FILES([src/Makevars])

# write out Autoconf output
AC_OUTPUT
63 changes: 63 additions & 0 deletions R-package/configure.win
@@ -0,0 +1,63 @@
# Script used to generate `Makevars.win` from `Makevars.win.in`
# on Windows

###########################
# find compiler and flags #
###########################

R_SCRIPT="${R_HOME}/bin${R_ARCH_BIN}/Rscript"
R_EXE="${R_HOME}/bin${R_ARCH_BIN}/R"
CC=`"${R_EXE}" CMD config CC`

# LightGBM-specific flags
LGB_CPPFLAGS=""

###############
# MM_PREFETCH #
###############

ac_mm_prefetch="no"

cat > conftest.c <<EOL
#include <xmmintrin.h>
int main() {
int a = 0;
_mm_prefetch(&a, _MM_HINT_NTA);
return 0;
}
EOL

${CC} -o conftest conftest.c 2>/dev/null && ./conftest && ac_mm_prefetch="yes"
echo "checking whether MM_PREFETCH works...${ac_mm_prefetch}"

if test "${ac_mm_prefetch}" = "yes";
then
LGB_CPPFLAGS="${LGB_CPPFLAGS} -DMM_PREFETCH=1"
fi

#############
# MM_MALLOC #
#############
ac_mm_malloc="no"

cat > conftest.c <<EOL
#include <mm_malloc.h>
int main() {
char *a = (char*)_mm_malloc(8, 16);
_mm_free(a);
return 0;
}
EOL

${CC} -o conftest conftest.c 2>/dev/null && ./conftest && ac_mm_malloc="yes"
echo "checking whether MM_MALLOC works...${ac_mm_malloc}"

if test "${ac_mm_malloc}" = "yes";
then
LGB_CPPFLAGS="${LGB_CPPFLAGS} -DMM_MALLOC=1"
fi

# Generate Makevars.win from Makevars.win.in
sed -e \
"s/@LGB_CPPFLAGS@/$LGB_CPPFLAGS/" \
< src/Makevars.win.in > src/Makevars.win
File renamed without changes.
File renamed without changes.
24 changes: 24 additions & 0 deletions R-package/recreate-configure.sh
@@ -0,0 +1,24 @@
#!/bin/bash

# recreates 'configure' from 'configure.ac'
# this script should run on Ubuntu 18.04
AUTOCONF_VERSION=$(cat R-package/AUTOCONF_UBUNTU_VERSION)

echo "Creating 'configure' script with Autoconf ${AUTOCONF_VERSION}"

apt update
apt-get install \
--no-install-recommends \
-y \
autoconf=${AUTOCONF_VERSION}

cp VERSION.txt R-package/src/
cd R-package
autoconf \
--output configure \
configure.ac \
|| exit -1

rm -r autom4te.cache || echo "no autoconf cache found"

echo "done creating 'configure' script"
56 changes: 56 additions & 0 deletions R-package/src/Makevars.in
@@ -0,0 +1,56 @@
CXX_STD = CXX11

PKGROOT=.

LGB_CPPFLAGS = \
@LGB_CPPFLAGS@ \
-DUSE_SOCKET \
-DLGB_R_BUILD

CPICFLAGS = -fPIC

PKG_CPPFLAGS = \
-I$(PKGROOT)/include \
$(LGB_CPPFLAGS)

PKG_CXXFLAGS = \
@OPENMP_CXXFLAGS@ \
-pthread

PKG_LIBS = \
@OPENMP_CXXFLAGS@ \
@OPENMP_LIB@ \
-pthread

OBJECTS = \
application/application.o \
boosting/boosting.o \
boosting/gbdt.o \
boosting/gbdt_model_text.o \
boosting/gbdt_prediction.o \
boosting/prediction_early_stop.o \
io/bin.o \
io/config.o \
io/config_auto.o \
io/dataset.o \
io/dataset_loader.o \
io/file_io.o \
io/json11.o \
io/metadata.o \
io/parser.o \
io/tree.o \
metric/dcg_calculator.o \
metric/metric.o \
objective/objective_function.o \
network/linker_topo.o \
network/linkers_mpi.o \
network/linkers_socket.o \
network/network.o \
treelearner/data_parallel_tree_learner.o \
treelearner/feature_parallel_tree_learner.o \
treelearner/gpu_tree_learner.o \
treelearner/serial_tree_learner.o \
treelearner/tree_learner.o \
treelearner/voting_parallel_tree_learner.o \
c_api.o \
lightgbm_R.o
57 changes: 57 additions & 0 deletions R-package/src/Makevars.win.in
@@ -0,0 +1,57 @@
CXX_STD = CXX11

PKGROOT=.

LGB_CPPFLAGS = \
@LGB_CPPFLAGS@ \
-DUSE_SOCKET \
-DLGB_R_BUILD

CPICFLAGS = -fPIC

PKG_CPPFLAGS = \
-I$(PKGROOT)/include \
$(LGB_CPPFLAGS)

PKG_CXXFLAGS = \
${SHLIB_OPENMP_CXXFLAGS} \
${SHLIB_PTHREAD_FLAGS}

PKG_LIBS = \
${SHLIB_OPENMP_CXXFLAGS} \
${SHLIB_PTHREAD_FLAGS} \
-lws2_32 \
-lIphlpapi

OBJECTS = \
application/application.o \
boosting/boosting.o \
boosting/gbdt.o \
boosting/gbdt_model_text.o \
boosting/gbdt_prediction.o \
boosting/prediction_early_stop.o \
io/bin.o \
io/config.o \
io/config_auto.o \
io/dataset.o \
io/dataset_loader.o \
io/file_io.o \
io/json11.o \
io/metadata.o \
io/parser.o \
io/tree.o \
metric/dcg_calculator.o \
metric/metric.o \
objective/objective_function.o \
network/linker_topo.o \
network/linkers_mpi.o \
network/linkers_socket.o \
network/network.o \
treelearner/data_parallel_tree_learner.o \
treelearner/feature_parallel_tree_learner.o \
treelearner/gpu_tree_learner.o \
treelearner/serial_tree_learner.o \
treelearner/tree_learner.o \
treelearner/voting_parallel_tree_learner.o \
c_api.o \
lightgbm_R.o
90 changes: 90 additions & 0 deletions build-cran-package.sh
@@ -0,0 +1,90 @@
#!/bin/sh

# [description]
# Prepare a source distribution of the R package
# to be submitted to CRAN.
#
# [usage]
# sh build-cran-package.sh

set -e

ORIG_WD=$(pwd)
TEMP_R_DIR=$(pwd)/lightgbm_r

if test -d ${TEMP_R_DIR}; then
rm -r ${TEMP_R_DIR}
fi
mkdir -p ${TEMP_R_DIR}

# move relevant files
cp -R R-package/* ${TEMP_R_DIR}
cp -R include ${TEMP_R_DIR}/src/
cp -R src/* ${TEMP_R_DIR}/src/
cp VERSION.txt ${TEMP_R_DIR}/src/

cd ${TEMP_R_DIR}

# Remove files not needed for CRAN
echo "Removing files not needed for CRAN"
rm src/install.libs.R
rm -r src/cmake/
rm -r inst/
rm -r pkgdown/
rm AUTOCONF_UBUNTU_VERSION
rm recreate-configure.sh

# main.cpp is used to make the lightgbm CLI, unnecessary
# for the R package
rm src/main.cpp

# Remove 'region' and 'endregion' pragmas. This won't change
# the correctness of the code. CRAN does not allow you
# to use compiler flag '-Wno-unknown-pragmas' or
# pragmas that suppress warnings.
echo "Removing unknown pragmas in headers"
for file in src/include/LightGBM/*.h; do
sed \
-i.bak \
-e 's/^.*#pragma region.*$//' \
-e 's/^.*#pragma endregion.*$//' \
"${file}"
done
rm src/include/LightGBM/*.h.bak

# When building an R package with 'configure', it seems
# you're guaranteed to get a shared library called
# <packagename>.so/dll. The package source code expects
# 'lib_lightgbm.so', not 'lightgbm.so', to comply with the way
# this project has historically handled installation
echo "Changing lib_lightgbm to lightgbm"
for file in R/*.R; do
sed \
-i.bak \
-e 's/lib_lightgbm/lightgbm/' \
"${file}"
done
sed \
-i.bak \
-e 's/lib_lightgbm/lightgbm/' \
NAMESPACE

# 'processx' is listed as a 'Suggests' dependency in DESCRIPTION
# because it is used in install.libs.R, a file that is not
# included in the CRAN distribution of the package
sed \
-i.bak \
'/processx/d' \
DESCRIPTION

echo "Cleaning sed backup files"
rm R/*.R.bak
rm NAMESPACE.bak

cd ${ORIG_WD}

R CMD build \
--keep-empty-dirs \
lightgbm_r

echo "Done building R package"
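
Reviewer note: the pragma-removal step in this script can be tried in isolation. The following is a minimal sketch that applies the same two `sed` expressions to a throwaway header. The helper `strip_region_pragmas` and the demo header are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch only: 'strip_region_pragmas' is a hypothetical helper, not part of this PR.
strip_region_pragmas() {
    # $1: header file to edit in place
    # '-i.bak' (suffix attached, no space) is accepted by both GNU and BSD sed.
    sed \
        -i.bak \
        -e 's/^.*#pragma region.*$//' \
        -e 's/^.*#pragma endregion.*$//' \
        "$1"
    rm -f "$1.bak"
}

# Demo header; the contents are made up for illustration.
demo_dir=$(mktemp -d)
cat > "${demo_dir}/demo.h" <<'EOF'
#pragma region Parameters
int num_leaves = 31;
#pragma endregion
EOF

strip_region_pragmas "${demo_dir}/demo.h"
cat "${demo_dir}/demo.h"
```

The expressions blank the matching lines instead of deleting them, so line numbers in compiler diagnostics still match the original headers.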
26 changes: 25 additions & 1 deletion build_r.R
@@ -13,7 +13,7 @@ TEMP_SOURCE_DIR <- file.path(TEMP_R_DIR, "src")
# R returns FALSE (not a non-zero exit code) if a file copy operation
# breaks. Let's fix that
.handle_result <- function(res) {
if (!res) {
if (!all(res)) {
stop("Copying files failed!")
}
}
@@ -70,6 +70,20 @@ result <- file.copy(
)
.handle_result(result)

# Add blank Makevars files
result <- file.copy(
from = file.path(TEMP_R_DIR, "inst", "Makevars")
, to = file.path(TEMP_SOURCE_DIR, "Makevars")
, overwrite = TRUE
)
.handle_result(result)
result <- file.copy(
from = file.path(TEMP_R_DIR, "inst", "Makevars.win")
, to = file.path(TEMP_SOURCE_DIR, "Makevars.win")
, overwrite = TRUE
)
.handle_result(result)

result <- file.copy(
from = "include/"
, to = sprintf("%s/", TEMP_SOURCE_DIR)
@@ -101,6 +115,16 @@ result <- file.copy(
)
.handle_result(result)

# remove CRAN-specific files
result <- file.remove(
file.path(TEMP_R_DIR, "configure")
, file.path(TEMP_R_DIR, "configure.ac")
, file.path(TEMP_R_DIR, "configure.win")
, file.path(TEMP_SOURCE_DIR, "Makevars.in")
, file.path(TEMP_SOURCE_DIR, "Makevars.win.in")
)
.handle_result(result)

# copy files into the place CMake expects
for (src_file in c("lightgbm_R.cpp", "lightgbm_R.h", "R_object_helper.h")) {
result <- file.copy(
10 changes: 6 additions & 4 deletions include/LightGBM/utils/common.h
@@ -35,10 +35,12 @@
#include <malloc.h>
#elif MM_MALLOC
#include <mm_malloc.h>
#elif defined(__GNUC__)
#include <malloc.h>
#define _mm_malloc(a, b) memalign(b, a)
#define _mm_free(a) free(a)
// https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html
// https://www.oreilly.com/library/view/mac-os-x/0596003560/ch05s01s02.html
#elif defined(__GNUC__) && defined(HAVE_MALLOC_H)
#include <malloc.h>
#define _mm_malloc(a, b) memalign(b, a)
#define _mm_free(a) free(a)
#else
#include <stdlib.h>
#define _mm_malloc(a, b) malloc(a)