diff --git a/docs/README.md b/docs/README.md
index 6bb83d8953057..7c406fb9cd766 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -26,13 +26,20 @@ Read on to learn more about viewing documentation in plain text (i.e., markdown)
 documentation yourself. Why build it yourself? So that you have the docs that correspond to
 whichever version of Spark you currently have checked out of revision control.
 
-## Prerequisites
+## Building Documentation
+There are two ways to build the Spark documentation: complete and partial. A complete build produces a site similar
+to the main documentation site at https://spark.apache.org/documentation.html. A partial build covers only
+a specific language or API.
+
+### Prerequisites
 
 The Spark documentation build uses a number of tools to build HTML docs and API docs in Scala, Java,
 Python, R and SQL.
 
-You need to have [Ruby](https://www.ruby-lang.org/en/documentation/installation/) and
-[Python](https://docs.python.org/2/using/unix.html#getting-and-installing-the-latest-version-of-python)
+For a complete documentation build, all tools below must be installed, **including those marked Optional**.
+
+You need to have the JDK, Scala, [Ruby](https://www.ruby-lang.org/en/documentation/installation/) and
+[Python](https://docs.python.org/3.8/using/unix.html#getting-and-installing-the-latest-version-of-python)
 installed. Make sure the `bundle` command is available, if not install the Gem containing it:
 
 ```sh
@@ -66,11 +73,15 @@ $ sudo pip install 'sphinx<3.1.0' mkdocs numpy pydata_sphinx_theme ipython nbsph
 
 ### R API Documentation (Optional)
 
-If you'd like to generate R API documentation, you'll need to [install Pandoc](https://pandoc.org/installing.html)
-and install these libraries:
+If you'd like to generate R API documentation, you'll need to install these packages and libraries:
+
+```sh
+$ sudo apt install libssl-dev libcurl4-openssl-dev pandoc libfontconfig1-dev libharfbuzz-dev \
+ libfribidi-dev libfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev libxml2-dev
+```
 
 ```sh
-$ sudo Rscript -e 'install.packages(c("knitr", "devtools", "testthat", "rmarkdown"), repos="https://cloud.r-project.org/")'
+$ sudo Rscript -e 'install.packages(c("curl", "knitr", "devtools", "testthat", "rmarkdown", "markdown", "e1071"), repos="https://cloud.r-project.org/")'
 $ sudo Rscript -e 'devtools::install_version("roxygen2", version = "7.1.2", repos="https://cloud.r-project.org/")'
 $ sudo Rscript -e "devtools::install_version('pkgdown', version='2.0.1', repos='https://cloud.r-project.org')"
 $ sudo Rscript -e "devtools::install_version('preferably', version='0.4', repos='https://cloud.r-project.org')"
@@ -89,17 +100,7 @@ you have checked out or downloaded.
 In this directory you will find text files formatted using Markdown, with an ".md" suffix. You can
 read those text files directly if you want. Start with `index.md`.
 
-Execute `SKIP_API=1 bundle exec jekyll build` from the `docs/` directory to compile the site. Compiling the site with
-Jekyll will create a directory called `_site` containing `index.html` as well as the rest of the
-compiled files.
-
-```sh
-$ cd docs
-# Skip generating API docs (which takes a while)
-$ SKIP_API=1 bundle exec jekyll build
-```
-
-You can also generate the default Jekyll build with API Docs as follows:
+You can generate the complete website from the `docs/` directory as follows:
 
 ```sh
 $ bundle exec jekyll build
@@ -111,7 +112,7 @@ $ bundle exec jekyll serve --watch
 $ PRODUCTION=1 bundle exec jekyll build
 ```
 
-## API Docs (Scaladoc, Javadoc, Sphinx, roxygen2, MkDocs)
+## Generating individual API Docs (Scaladoc, Javadoc, Sphinx, roxygen2, MkDocs)
 
 You can build just the Spark scaladoc and javadoc by running `./build/sbt unidoc` from the `$SPARK_HOME` directory.
 
@@ -129,6 +130,14 @@ The jekyll plugin also generates the PySpark docs using [Sphinx](http://sphinx-d
 using [roxygen2](https://cran.r-project.org/web/packages/roxygen2/index.html) and SQL docs
 using [MkDocs](https://www.mkdocs.org/).
 
-NOTE: To skip the step of building and copying over the Scala, Java, Python, R and SQL API docs, run `SKIP_API=1
-bundle exec jekyll build`. In addition, `SKIP_SCALADOC=1`, `SKIP_PYTHONDOC=1`, `SKIP_RDOC=1` and `SKIP_SQLDOC=1` can be used
+NOTE: To skip the step of building and copying over the Scala, Java, Python, R and SQL API docs, see the example below.
+In addition, `SKIP_SCALADOC=1`, `SKIP_PYTHONDOC=1`, `SKIP_RDOC=1` and `SKIP_SQLDOC=1` can be used
 to skip a single step of the corresponding language. `SKIP_SCALADOC` indicates skipping both the Scala and Java docs.
+
+For example:
+
+```sh
+$ cd docs
+# Skip generating API docs (which takes a while)
+$ SKIP_API=1 bundle exec jekyll build
+```
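
Review note: the NOTE in the last hunk lists one `SKIP_*` variable per language, which suggests that a partial build can be expressed by setting the variable for every language that is not needed. The sketch below is a dry run under that assumption. `partial_docs_cmd` is a hypothetical helper, not part of the Spark repository; it only prints the resulting `jekyll` command and never runs it. That several `SKIP_*` variables can be combined in one invocation is an assumption, not something this patch states.

```sh
#!/bin/sh
# Hypothetical helper (not in the Spark repo): print the jekyll command that
# would build only the API docs named as arguments. Accepted names, chosen
# for this sketch: scala, python, r, sql.
# Per the README, SKIP_SCALADOC covers both the Scala and Java docs.
partial_docs_cmd() {
  keep=" $* "   # pad with spaces so each name can be matched as a whole word
  cmd=""
  for pair in scala:SKIP_SCALADOC python:SKIP_PYTHONDOC r:SKIP_RDOC sql:SKIP_SQLDOC; do
    lang=${pair%%:*}   # text before the colon
    var=${pair#*:}     # text after the colon
    case "$keep" in
      *" $lang "*) ;;            # requested: leave this step enabled
      *) cmd="$cmd$var=1 " ;;    # not requested: skip it
    esac
  done
  printf '%s\n' "${cmd}bundle exec jekyll build"
}

# Dry run: show the command for a Python-only API docs build.
partial_docs_cmd python
```

The printed command is meant to be run from the `docs/` directory, like the `SKIP_API=1` example in the patch. Printing instead of executing keeps the sketch safe to try without Ruby or Jekyll installed.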