(Stack: JavaScript, Node.js, Notion API, Makefile, Pandoc)
Minimal example to
- fetch a set of Notion pages (from a database table) via the official Notion API + JS client
- for each Notion page
  - (with notion-to-md) parse page as markdown (with additional metadata parsing)
  - output markdown as file (with fs) to output/*
- with pandoc convert each page to PDF & Docx in output/*
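The "output markdown as file" step can be sketched roughly as follows. This is a minimal sketch, not the repository's actual code: `slugify`, `frontMatter` and `writePages` are hypothetical helpers, and writing the parsed metadata as a YAML front matter block is an assumption about how it might be handed to Pandoc. The `markdown` string is assumed to come from notion-to-md.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Turn a page title into a file-system-friendly name (hypothetical helper).
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '') || 'untitled';
}

// Render metadata as a YAML front matter block (assumed format, not from the repo).
function frontMatter(meta) {
  const lines = Object.entries(meta).map(
    ([key, value]) => `${key}: ${JSON.stringify(value)}`
  );
  return ['---', ...lines, '---', ''].join('\n');
}

// Write one markdown file per page into outDir and return the file paths.
function writePages(pages, outDir = 'output') {
  fs.mkdirSync(outDir, { recursive: true });
  return pages.map(({ title, meta, markdown }) => {
    const file = path.join(outDir, `${slugify(title)}.md`);
    fs.writeFileSync(file, frontMatter(meta) + '\n' + markdown, 'utf8');
    return file;
  });
}
```

Calling `writePages([{ title: 'My Page', meta: { title: 'My Page' }, markdown: '# My Page' }])` would create `output/my-page.md`.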
Installation of these dependencies is OS-specific, so it won't be covered by npm install:
- pandoc
  - pandoc needs pdflatex or a different PDF engine (specify with --pdf-engine)
  - FWIW, tinytex is a minimal solution
- FWIW, puppeteer on Ubuntu depends on libgbm.so.1, which might be absent
  - run sudo apt-get install -y libgbm-dev
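Since these tools live outside npm, it may help to check for them before running the conversion. The following is a small sketch under the assumption that each tool answers `--version` with exit code 0 (true for pandoc and pdflatex as far as I know); `isAvailable` and `missingTools` are hypothetical names and not part of the repository.

```javascript
const { spawnSync } = require('node:child_process');

// Return true if `command --version` can be started and exits successfully.
function isAvailable(command) {
  const result = spawnSync(command, ['--version'], { stdio: 'ignore' });
  return !result.error && result.status === 0;
}

// Report which of the external tools cannot be found.
function missingTools(tools = ['pandoc', 'pdflatex']) {
  return tools.filter((tool) => !isAvailable(tool));
}
```

An empty array from `missingTools()` suggests that both pandoc and the default PDF engine are on the PATH.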
Heads up: Locally, a meta.json and a <page>.json file are produced for each page. They are kept out of the repository via .gitignore. These files store raw Notion data and leak the page IDs, among other things. Please consider this before you put your outputs in a public repo.
Notion Source DB: https://fubits.notion.site/DB-Multiple-Pages-1890d20f20d04793a50bdaec3bd8200d
Create a Notion database table and add a few pages, or create a single Notion page (see .env file). Share the database/page with your integration account.
Add to your local .env file:
NOTION_API_KEY=<your-api-token>
SINGLE_PAGE=<your-page-id>
DB_TABLE=<your-database-table-id>
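A script could validate these variables before calling the Notion API. This is a minimal sketch with a hypothetical `readConfig` helper; it assumes the .env file has already been loaded into `process.env` (for example via the dotenv package or Node's `--env-file` flag), and that at least one of SINGLE_PAGE and DB_TABLE is needed.

```javascript
// Read the three variables named in the .env file and fail early if they are missing.
function readConfig(env = process.env) {
  const { NOTION_API_KEY, SINGLE_PAGE, DB_TABLE } = env;
  if (!NOTION_API_KEY) {
    throw new Error('NOTION_API_KEY is not set');
  }
  if (!SINGLE_PAGE && !DB_TABLE) {
    throw new Error('Set SINGLE_PAGE or DB_TABLE (or both)');
  }
  return { apiKey: NOTION_API_KEY, singlePage: SINGLE_PAGE, dbTable: DB_TABLE };
}
```

Failing at startup gives a clearer message than an authorization error from the API later on.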
To parse a single page (SINGLE_PAGE, without further processing) run:
npm run export-md-single
To parse all pages in a database table (DB_TABLE) run:
npm run export-md-all
Pandoc needs a PDF processing library (e.g. texlive on Ubuntu); setup is OS-specific.
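For the database export, note that Notion returns query results in pages, using the `results`, `has_more` and `next_cursor` response fields. The sketch below shows one way to collect them all; `fetchAllPages` and the injected `queryPage` function are hypothetical and may differ from what the export script actually does.

```javascript
// Collect every result from a paginated Notion query.
// `queryPage` is injected so the sketch does not depend on a client version:
// it receives an optional cursor and resolves to { results, has_more, next_cursor }.
async function fetchAllPages(queryPage) {
  const pages = [];
  let cursor;
  do {
    const response = await queryPage(cursor);
    pages.push(...response.results);
    // Continue only while the API reports more results.
    cursor = response.has_more ? response.next_cursor : undefined;
  } while (cursor);
  return pages;
}
```

With the official JS client, `queryPage` might wrap a call such as `notion.databases.query({ database_id, start_cursor })`. Whether that method is available depends on the client version, so treat it as an assumption.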
To convert the single page .md file (path is hardcoded in the makefile) to PDF & Docx run:
make bake-single
To convert all .md files in output/* to PDF & Docx run:
make bake-all
The PDF make task now demonstrates both pandoc and pagedjs/pagedjs-cli to produce the PDFs. pagedjs renders the PDF from html (+css).
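To illustrate what such a conversion amounts to, here is a Node sketch of the Pandoc calls. It is not the content of the makefile: `pandocArgs` and `bake` are hypothetical helpers, and pdflatex as the default engine is an assumption. It relies on Pandoc inferring the output format from the file extension and on the `--pdf-engine` option.

```javascript
const path = require('node:path');
const { execFileSync } = require('node:child_process');

// Build the pandoc argument list for one markdown file.
// `pdfEngine` is only applied to PDF output.
function pandocArgs(mdFile, format, pdfEngine) {
  const outFile = path.join(
    path.dirname(mdFile),
    `${path.basename(mdFile, '.md')}.${format}`
  );
  const args = [mdFile, '-o', outFile];
  if (format === 'pdf' && pdfEngine) {
    args.push(`--pdf-engine=${pdfEngine}`);
  }
  return args;
}

// Convert one file to PDF and Docx (requires pandoc on the PATH).
function bake(mdFile, pdfEngine = 'pdflatex') {
  for (const format of ['pdf', 'docx']) {
    execFileSync('pandoc', pandocArgs(mdFile, format, pdfEngine), { stdio: 'inherit' });
  }
}
```

Passing the arguments as an array to `execFileSync` avoids shell quoting problems with file names that contain spaces.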
- add parsing step to download external linked resources (images, videos, etc.); download files and set relative links
- make Node's exec(makefile) run with relative filepaths
- use pagedjs-cli as PDF engine for Pandoc
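The first item above (external resources with relative links) could start from something like the sketch below. It is only one possible approach under stated assumptions: `localizeImages` and the `assets` directory are hypothetical, only markdown image syntax with http(s) URLs is handled, and the actual download of each file is left out.

```javascript
const path = require('node:path');

// Find markdown images with http(s) URLs, plan a local file name for each,
// and return the rewritten markdown plus the list of downloads still to do.
function localizeImages(markdown, assetDir = 'assets') {
  const downloads = [];
  const rewritten = markdown.replace(
    /!\[([^\]]*)\]\((https?:\/\/[^)\s]+)\)/g,
    (match, alt, url) => {
      // Use the last path segment of the URL (without query string) as file name.
      const name = path.posix.basename(new URL(url).pathname) || 'file';
      // Prefix with a counter so equal file names do not collide.
      const file = `${assetDir}/${downloads.length + 1}-${name}`;
      downloads.push({ url, file });
      return `![${alt}](${file})`;
    }
  );
  return { markdown: rewritten, downloads };
}
```

The returned `downloads` list pairs each remote URL with its planned relative path, so a later step can fetch the files before Pandoc runs.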