Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] [docs] add intro vignette #3946

Merged
merged 102 commits into from
Nov 6, 2021
Merged

[R-package] [docs] add intro vignette #3946

merged 102 commits into from
Nov 6, 2021

Conversation

mayer79
Copy link
Contributor

@mayer79 mayer79 commented Feb 12, 2021

First step towards #1944.

This initial vignette shows the very basic workflow of LightGBM. As such, it would not replace the demo with the corresponding name as that seems rather a guide how to deal with sparse data.

The html looks as in this zip here:

basic_walkthrough.zip

Some notes on R vignettes in general

  • CRAN does not rebuild the vignette html, but it still runs its R code.
  • While it is possible to use a logo, it is non-trivial.
  • Would you stick to the same code style as in the package code? I am asking because LightGBM is the only project that use R code with comma first.
  • The vignette html will be created automatically when building the package. Since the whole building process of LGB is complex, I have no idea if this works out as I hope... still, our colleagues here https://cran.r-project.org/web/packages/xgboost/index.html were successful.

@mayer79 mayer79 requested a review from StrikerRUS as a code owner February 12, 2021 10:38
@mayer79

This comment has been minimized.

@StrikerRUS StrikerRUS removed their request for review February 12, 2021 12:02
@StrikerRUS StrikerRUS added the doc label Feb 12, 2021
@jameslamb jameslamb changed the title added intro vignette [R-package] add intro vignette Feb 12, 2021
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for taking the time to set this up!! I left a first round of comments for your consideration. Later tonight I'll try pulling this and knitting the vignettes myself, and I also try pkgdown::build_site() to see what that output looks like.

R-package/vignettes/.gitignore Outdated Show resolved Hide resolved
R-package/vignettes/basic_walkthrough.Rmd Outdated Show resolved Hide resolved
R-package/vignettes/basic_walkthrough.Rmd Outdated Show resolved Hide resolved
R-package/vignettes/basic_walkthrough.Rmd Outdated Show resolved Hide resolved
R-package/vignettes/basic_walkthrough.Rmd Outdated Show resolved Hide resolved
R-package/DESCRIPTION Outdated Show resolved Hide resolved
R-package/vignettes/basic_walkthrough.Rmd Outdated Show resolved Hide resolved
R-package/vignettes/basic_walkthrough.Rmd Outdated Show resolved Hide resolved
R-package/vignettes/basic_walkthrough.Rmd Outdated Show resolved Hide resolved
.github/workflows/r_package.yml Outdated Show resolved Hide resolved
@jameslamb

This comment has been minimized.

mayer79 and others added 19 commits February 12, 2021 17:45
@jameslamb jameslamb changed the title WIP: [R-package] add intro vignette [R-package] add intro vignette Nov 4, 2021
@jameslamb jameslamb changed the title [R-package] add intro vignette [R-package] [docs] add intro vignette Nov 4, 2021
@jameslamb jameslamb marked this pull request as ready for review November 4, 2021 03:40
@jameslamb jameslamb marked this pull request as draft November 4, 2021 03:45
@jameslamb
Copy link
Collaborator

Sorry, I know I just marked this "Ready for Review" but I want to move it back to Draft while I investigate one other thing. I think that I might be able to make this work with CMake-based builds, which would simplify this PR a lot.

@jameslamb
Copy link
Collaborator

jameslamb commented Nov 6, 2021

/gha run r-valgrind

Workflow R valgrind tests has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/1428180191

Status: failure ❌.

@jameslamb
Copy link
Collaborator

jameslamb commented Nov 6, 2021

/gha run r-solaris

Workflow Solaris CRAN check has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/1428180333

solaris-x86-patched: https://builder.r-hub.io/status/lightgbm_3.3.1.99.tar.gz-7206a1d9c07147dbb811520a9b1e43a3
solaris-x86-patched-ods: https://builder.r-hub.io/status/lightgbm_3.3.1.99.tar.gz-e601a57043db4b51b5f090b508baa703
Reports also have been sent to LightGBM public e-mail: https://yopmail.com?lightgbm_rhub_checks
Status: success ✔️.

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think this is ready for review! Since so much of the code on this branch is now mine, I don't think my approval should count toward a merge.

@StrikerRUS @Laurae2 when you have time, could you please take a look?

@mayer79 I'd also welcome any suggestions you have on the changes I've pushed.


There has been a lot of discussion and comments on this PR as we worked through it, so I'll try to summarize the key points here.

There is also a section in "Writing R Extensions" dedicated to building vignettes, which could be helpful for reviewing: https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Writing-package-vignettes.

1. Building {lightgbm} now requires installing it

details (click me)

The package has to now be installed during R CMD build in order to build vignettes.

From "Writing R Extensions":

R CMD build will automatically create the (PDF or HTML versions of the) vignettes in inst/doc for distribution with the package sources. By including the vignette outputs in the package sources it is not necessary that these can be re-built at install time, i.e., the package author can use private R packages, screen snapshots and LaTeX extensions which are only available on their machine.

That means that code like this:

sh build-cran-package.sh
R CMD INSTALL lightgbm_*.tar.gz

will now end up compiling LightGBM twice.

This makes our CI jobs take a bit longer, but the effect on users should be minimal. Vignettes won't be re-built when anyone runs install.packages() to install from CRAN. This PR introduces the option --no-build-vignettes to build-cran-package.sh and build_r.R to allow people building from source to opt out of rebuilding the vignettes.

It also means that all of {lightgbm}'s dependencies have to be installed before running R CMD build.

2. {lightgbm} now depends on {knitr} and {rmarkdown}

details (click me)

I'm proposing adding these as Suggests dependencies. They're not necessary to install the package so users shouldn't be impacted. You can see from the list of "Reverse Suggests" at https://cran.r-project.org/web/packages/knitr/index.html and https://cran.r-project.org/web/packages/rmarkdown/index.html that this is a common pattern.

3. build-cran-package.sh now needs to carefully remove object files from the package tarball

details (click me)

See this comment from early in the development on this PR: #3946 (comment).

The installation of the package during R CMD build leaves behind .o files, and these cannot be distributed with the package. Adding rules to .Rbuildignore did not prevent this issue. Neither did adding a clean: target to src/Makevars.in.

I believe the issue is that R's own code automatically cleans up .o files during this stage, but only if they're in the current working directory at build time.

Notice that the objects listed in #3946 (comment) are only those nested under src/... lightgbm_R.o and c_api.o are remmoved correctly.

R CMD build calls an internal function cleanup_pkg() here:

https://github.com/wch/r-source/blob/b2498c9792278f68aa56bc65fab45b0ff9444c53/src/library/tools/R/build.R#L436

But that function's logic will only clean up *.o / *.so files one level below the current working directory!

https://github.com/wch/r-source/blob/b2498c9792278f68aa56bc65fab45b0ff9444c53/src/library/tools/R/build.R#L465-L468

I'm planning to open a bug report with R, but even if that goes well it'll be a long time (like on the order of years) until all versions of R that LightGBM supports have a fix.

So for now, I'm proposing adding code in build-cran-package.sh that untars the package, removes those .o files, and re-tars it. That code could be removed in the future if #3249 is implemented and the R package uses a unity build instead (as @hcho3 suggested in #3188 (comment)).

Thanks very much!

@jameslamb jameslamb marked this pull request as ready for review November 6, 2021 04:34
@jameslamb jameslamb self-requested a review November 6, 2021 04:35
@jameslamb jameslamb dismissed their stale review November 6, 2021 04:35

dismissing request-changes review to unblock eventual merging

Comment on lines +276 to +279
r-knitr=1.35=r41hc72bb7e_0 \
r-matrix=1.3_4=r41he454529_0 \
r-pkgdown=1.6.1=r41hc72bb7e_0 \
r-rmarkdown=2.11=r41hc72bb7e_0 \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't check these changes at RTD because this PR is made from a fork. I suggest re-target this PR's branch to new let's say dev branch and merge it instantly. Then I'll be able to activate RTD test builds for that branch. After all checks passed we merge dev into master via new PR. WDYT about this plan?

Also, I guess we need some changes in our pkgdown config file to make vignettes publicly available and rendered at our docs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest re-target this PR's branch to new let's say dev branch and merge it instantly

Yep I agree! Good idea. I'll make that dev branch right now and re-target this.

I guess we need some changes in our pkgdown config file

Ah you're right! According to https://pkgdown.r-lib.org/articles/pkgdown.html#articles, {pkgdown} always automatically builds the HTML for vignettes, but I think since {lightgbm}'s config explicitly sets the navigation, we need to explicitly add a navbar item for it.

Added in ddd9a62.

Tested locally with

sh build-cran-package.sh
R CMD INSTALL --with-keep.source lightgbm_3.3.1.99.tar.gz
cd R-package
Rscript -e "pkgdown::build_site( \
            lazy = FALSE \
            , install = FALSE \
            , devel = FALSE \
            , examples = TRUE \
            , run_dont_run = TRUE \
            , seed = 42L \
            , preview = FALSE \
            , new_process = TRUE \
        )
        "
open docs/index.html

Vignettes end up on an "Articles" page like this:

image

And then clicking through the link there, I see

image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pointed this at a new LightGBM branch, dev-vignettes. Decided not to literally use "dev", just because that has a special meaning in some git workflows, and I didn't want anyone coming to this repo to misinterpret and start building from that branch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this over to #4775

@jameslamb jameslamb changed the base branch from master to dev-vignettes November 6, 2021 15:01
@jameslamb jameslamb merged commit 5026ea7 into microsoft:dev-vignettes Nov 6, 2021
jameslamb added a commit that referenced this pull request Nov 18, 2021
* [R-package] [docs] add intro vignette (#3946)

* add 10 test vignettes

* Revert "add 10 test vignettes"

This reverts commit 40fb2e2.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <[email protected]>

Co-authored-by: Michael Mayer <[email protected]>
Co-authored-by: Nikita Titov <[email protected]>
@StrikerRUS StrikerRUS mentioned this pull request Jan 6, 2022
13 tasks
@jameslamb jameslamb mentioned this pull request Oct 7, 2022
40 tasks
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants