2 changes: 1 addition & 1 deletion .github/workflows/jekyll-build.yml
@@ -13,4 +13,4 @@ jobs:
ruby-version: '3.4.5'
bundler-cache: true
- run: |
JEKYLL_FATAL_LINK_CHECKER=internal bundle exec jekyll build --future
JEKYLL_FATAL_LINK_CHECKER=internal bundle exec jekyll build --future
6 changes: 6 additions & 0 deletions Gemfile
@@ -47,6 +47,12 @@ gem 'typhoeus'
gem 'activesupport', '~> 7'
gem 'mustache', '~> 1'

# PDF Generator (optional - requires Node.js)
# Install with: ENABLE_PDF_GENERATION=true bundle install
if ENV['ENABLE_PDF_GENERATION'] == 'true'
  gem 'grover', '~> 1.3'
end

group :development, :test do
  gem 'rspec'
  gem 'rubocop', '~> 1.44', require: false
25 changes: 25 additions & 0 deletions _config.yml
@@ -329,11 +329,36 @@ plugins:
- jekyll-redirect-from
- jekyll-sitemap
- jekyll-spec-insert
- pdf_generator_loader

# This format has to conform to RFC822
last-modified-at:
  date-format: '%a, %d %b %Y %H:%M:%S %z'

# PDF Generator Configuration
pdf_generator:
  enabled: true
  # Generate PDFs for entire collections
  collections:
    - getting-started
    - install-and-configure
    - api-reference
    - query-dsl
    - aggregations
    - mappings
    - analyzers
  # Generate PDFs for specific guides (more granular control)
  guides:
    - name: "Getting Started Guide"
      collection: getting-started
      filename: "getting-started-guide.pdf"
    - name: "Installation Guide"
      collection: install-and-configure
      filename: "installation-guide.pdf"
    - name: "API Reference"
      collection: api-reference
      filename: "api-reference.pdf"

# Exclude from processing.
# The following items will not be processed, by default. Create a custom list
# to override the default setting.
88 changes: 88 additions & 0 deletions _pdf_generator/README.md
@@ -0,0 +1,88 @@
# PDF Generator for OpenSearch Documentation

Check failure on line 1 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'PDF Generator for OpenSearch Documentation' is a heading and should be in sentence case.

This plugin generates PDF versions of documentation collections during the Jekyll build process.

## File Structure

Check failure on line 5 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'File Structure' is a heading and should be in sentence case.

All PDF generator code is contained in the `_pdf_generator/` directory:
- `pdf_generator.rb` - Main plugin implementation
- `README.md` - This documentation file

A minimal loader file exists in `_plugins/pdf_generator_loader.rb` to ensure Jekyll loads the plugin (Jekyll requires plugins to be in `_plugins` or be gems).

## Overview

The PDF generator creates downloadable PDF files for documentation collections and guides. PDFs are generated automatically during the Jekyll build and are saved to the `pdfs/` directory in the site destination.

## Configuration

PDF generation is configured in `_config.yml` under the `pdf_generator` section:

```yaml
pdf_generator:
  enabled: true
  # Generate PDFs for entire collections
  collections:
    - getting-started
    - install-and-configure
    - api-reference
  # Generate PDFs for specific guides (more granular control)
  guides:
    - name: "Getting Started Guide"
      collection: getting-started
      filename: "getting-started-guide.pdf"
    - name: "Installation Guide"
      collection: install-and-configure
      filename: "installation-guide.pdf"
```

### Configuration Options

Check failure on line 39 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'Configuration Options' is a heading and should be in sentence case.

- `enabled`: Set to `true` to enable PDF generation, `false` to disable
- `collections`: Array of collection names to generate PDFs for (PDF filename will be `{collection-name}.pdf`)
- `guides`: Array of guide configurations with:
  - `name`: Display name for the guide
  - `collection`: Collection name to generate PDF from
  - `filename`: Output PDF filename (optional, defaults to `{name}.pdf`)
  - `start_page`: Optional URL or path to start from (for partial guides)
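
As a rough illustration of the naming rules above, the following Ruby sketch lists the filenames a given configuration would produce. The helper name `planned_pdf_filenames` is hypothetical and is not part of the plugin. It also assumes the `{name}.pdf` fallback uses the guide name verbatim, which the plugin may handle differently (for example, by slugifying it).

```ruby
require 'yaml'

# Hypothetical helper (not part of the plugin): lists the PDF filenames
# that a pdf_generator configuration would be expected to produce.
def planned_pdf_filenames(config)
  settings = config.fetch('pdf_generator', {})
  return [] unless settings['enabled']

  # Whole collections use "{collection-name}.pdf".
  from_collections = Array(settings['collections']).map { |name| "#{name}.pdf" }

  # Guides use the explicit filename, else fall back to "{name}.pdf".
  # Assumption: the name is used verbatim in the fallback.
  from_guides = Array(settings['guides']).map do |guide|
    guide['filename'] || "#{guide['name']}.pdf"
  end

  from_collections + from_guides
end

config = YAML.safe_load(<<~CONFIG)
  pdf_generator:
    enabled: true
    collections:
      - getting-started
    guides:
      - name: "Installation Guide"
        collection: install-and-configure
CONFIG

puts planned_pdf_filenames(config)
```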

## How It Works

Check failure on line 49 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'How It Works' is a heading and should be in sentence case.

1. During Jekyll build, the PDF generator plugin identifies configured collections/guides
2. After all pages are rendered, the plugin collects the rendered HTML content
3. HTML is cleaned and formatted for PDF output
4. PDFs are generated using Grover (Puppeteer-based PDF generation)
5. PDFs are saved to `_site/pdfs/` directory
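
The steps above can be sketched in plain Ruby as follows. This is a simplified illustration, not the plugin's code: `clean_html` and `generate_pdf` are hypothetical names, the regex-based cleanup stands in for whatever the plugin actually does, and the `renderer` argument is an assumed seam where a call such as `Grover.new(html).to_pdf(path)` would go. A stub renderer is used so the sketch runs without Node.js or Chrome.

```ruby
require 'tmpdir'
require 'fileutils'

# Step 3 (simplified): drop <nav> blocks so only the main content remains.
# Assumption: the real plugin's cleanup is more thorough than this regex.
def clean_html(html)
  html.gsub(%r{<nav\b.*?</nav>}m, '')
end

# Steps 2-5: combine rendered pages and hand the result to a renderer.
# `renderer` is any callable taking (html, path); in the plugin this is
# where Grover would be invoked, e.g. Grover.new(html).to_pdf(path).
def generate_pdf(pages, dest_dir, filename, renderer:)
  body = pages.map { |page| clean_html(page) }.join("\n")
  html = "<html><body>#{body}</body></html>"

  pdf_dir = File.join(dest_dir, 'pdfs')
  FileUtils.mkdir_p(pdf_dir)
  path = File.join(pdf_dir, filename)
  renderer.call(html, path)
  path
end

# Stub renderer: writes the HTML as-is instead of producing a real PDF.
stub = ->(html, path) { File.write(path, html) }

Dir.mktmpdir do |site_dest|
  pages = ['<nav>menu</nav><h1>Install</h1>', '<h1>Configure</h1>']
  out = generate_pdf(pages, site_dest, 'installation-guide.pdf', renderer: stub)
  puts File.read(out)
end
```

Passing the renderer in as an argument keeps the collection and cleanup logic testable on machines where Puppeteer is not installed.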

## Dependencies

- `grover` gem: Ruby wrapper for Puppeteer (requires Node.js and Chrome/Chromium)
- `puppeteer`: Node.js package (installed automatically by grover)

Check failure on line 60 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: grover. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.

## Accessing Generated PDFs

Check failure on line 62 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'Accessing Generated PDFs' is a heading and should be in sentence case.

Generated PDFs are available at:
- Local build: `http://localhost:4000/pdfs/{filename}.pdf`
- Production: `https://docs.opensearch.org/pdfs/{filename}.pdf`

## Troubleshooting

### PDF Generation Fails

Check failure on line 70 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'PDF Generation Fails' is a heading and should be in sentence case.

Check failure on line 70 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.StackedHeadings] Do not stack headings. Insert an introductory sentence between headings.

1. Ensure `grover` gem is installed: `bundle install`
2. Ensure Node.js is installed (required for Puppeteer)
3. Check Jekyll build logs for error messages
4. Verify collection names in configuration match actual collection names

### PDF Content Issues

Check failure on line 77 in _pdf_generator/README.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'PDF Content Issues' is a heading and should be in sentence case.

- The plugin automatically extracts main content and removes navigation elements
- If content is missing, check that documents have `title` and are not excluded with `nav_exclude: true`
- Documents are sorted by `nav_order` if available
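
The selection rules above might look roughly like this in Ruby. `Doc` is a hypothetical stand-in for a Jekyll document that carries only front matter, and `printable_docs` is an illustrative name. Placing documents without `nav_order` last is an assumption, because the README only says sorting applies "if available".

```ruby
# Hypothetical stand-in for a Jekyll document: just its front matter hash.
Doc = Struct.new(:data)

# Keeps documents that have a title and are not nav-excluded, then orders
# them by nav_order. Assumption: documents without nav_order go last.
def printable_docs(docs)
  docs
    .select { |doc| doc.data['title'] && doc.data['nav_exclude'] != true }
    .sort_by { |doc| doc.data['nav_order'] || Float::INFINITY }
end

docs = [
  Doc.new({ 'title' => 'Configure', 'nav_order' => 2 }),
  Doc.new({ 'title' => 'Hidden', 'nav_order' => 1, 'nav_exclude' => true }),
  Doc.new({ 'nav_order' => 3 }),
  Doc.new({ 'title' => 'Install', 'nav_order' => 1 })
]

printable_docs(docs).each { |doc| puts doc.data['title'] }
```

When a page is missing from a PDF, checking its front matter against these two conditions is a reasonable first step.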

## Customization

PDF styling can be customized by modifying the `pdf_styles` method in `pdf_generator.rb`.

PDF options (page size, margins, headers/footers) can be customized in the `pdf_options` method.
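
For orientation, a `pdf_options` method could return a hash along these lines. The values are illustrative assumptions, not the plugin's actual settings. The keys are snake_case forms of Puppeteer PDF options, which Grover accepts as keyword options, and the `pageNumber` and `totalPages` classes are placeholders that Puppeteer fills in for header and footer templates.

```ruby
# Hypothetical shape of a pdf_options method; the plugin's real values
# are not shown in this README.
def pdf_options
  {
    format: 'A4',                       # page size
    margin: { top: '2cm', bottom: '2cm', left: '1.5cm', right: '1.5cm' },
    print_background: true,             # keep CSS background colors
    display_header_footer: true,
    header_template: '<span></span>',   # empty header
    footer_template: '<div style="font-size:9px;width:100%;text-align:center;">' \
                     '<span class="pageNumber"></span> / ' \
                     '<span class="totalPages"></span></div>'
  }
end

# With Grover loaded, the options would typically be passed as:
#   Grover.new(html, **pdf_options).to_pdf(path)
puts pdf_options[:format]
```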
