Pandoc filters need to know the document's path relative to the working directory #8492

Open
rnwst opened this issue Dec 17, 2022 · 3 comments

rnwst (Sponsor, Contributor) commented Dec 17, 2022

Problem
The Pandoc filter documentation includes an example of how to include external files in a document. I have a similar problem: I'm writing a filter that inlines SVGs for HTML output (it checks whether images in the markdown source have a .svg extension and replaces each such image with a RawInline containing the SVG's contents). This requires finding and loading the SVG file. When the markdown source file is in the working directory, everything works fine:
file.md:

---
title: A Test
---

![Caption](test.svg)

$ pandoc file.md -f markdown -t html --filter my-filter -o dir/output.html

However, if the markdown file is not in the working directory, the filter cannot find the SVG file. This is expected: the image path is relative, not absolute, and the filter is executed from the working directory, not the document's directory:

$ pandoc dir/file.md -f markdown -t html --filter my-filter -o output.html
Error printed by my-filter: Cannot find file test.svg!

The same issue would occur in the example in the documentation if the markdown file to be processed was in another directory.
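A minimal Python sketch of a JSON filter of the kind described above may make the failure easier to see. It is not the actual my-filter code: the function names `inline_svgs`, `read_text`, and `main` are made up for illustration, and the only thing assumed about pandoc is the JSON shape of `Image` and `RawInline` nodes.

```python
import json
import sys


def inline_svgs(node, read_file):
    """Recursively replace Image nodes that point at .svg files with raw HTML."""
    if isinstance(node, list):
        return [inline_svgs(child, read_file) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Image":
            # Image content is [attr, caption inlines, [url, title]].
            _attr, _caption, (url, _title) = node["c"]
            if url.lower().endswith(".svg"):
                # `url` is exactly what the document author wrote, so a
                # relative path is looked up from the working directory.
                return {"t": "RawInline", "c": ["html", read_file(url)]}
        return {key: inline_svgs(value, read_file) for key, value in node.items()}
    return node


def read_text(path):
    """Read a file; relative paths resolve against the working directory."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def main():
    """Filter entry point: JSON AST on stdin, transformed JSON AST on stdout."""
    doc = json.load(sys.stdin)
    json.dump(inline_svgs(doc, read_text), sys.stdout)

# Call main() at the bottom of the script when installing this as a filter.
```

With `pandoc dir/file.md`, the `url` seen by the filter is still `test.svg`, so `read_text` looks for `./test.svg` instead of `dir/test.svg` and fails.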

Describe your proposed improvement
There should be a mechanism that lets filters know the document's path relative to the working directory. That way, if a file referenced in file.md uses a relative path, the filter can compute the correct path relative to the working directory. I propose adding an environment variable, named something like RELATIVE_PATH_TO_DOC, to the environment variables that Pandoc sets before calling a filter.
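As a rough sketch of how a filter might consume such a variable: the code below assumes that the proposed RELATIVE_PATH_TO_DOC would hold the path of the input document relative to the working directory (for example `dir/file.md`). Pandoc does not set this variable; both that assumption and the helper name `resolve_from_doc` are hypothetical.

```python
import os
import posixpath

# Hypothetical: this is the variable proposed in this issue, not one pandoc sets.
ENV_VAR = "RELATIVE_PATH_TO_DOC"


def resolve_from_doc(target, environ=os.environ):
    """Rewrite a document-relative path so it is relative to the working directory."""
    doc_path = environ.get(ENV_VAR)
    # Leave the target alone if the variable is missing, or if the target is
    # already absolute or looks like a URL.
    if doc_path is None or posixpath.isabs(target) or "://" in target:
        return target
    doc_dir = posixpath.dirname(doc_path)
    return posixpath.normpath(posixpath.join(doc_dir, target))
```

A filter would call `resolve_from_doc(url)` before opening the file, while leaving the path in the output document unchanged.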

Describe alternatives you've considered
I tried the rebase_relative_paths extension, but it does not solve the problem: any paths that the filter leaves untouched end up incorrect in the output when pandoc is run as follows:

$ pandoc dir/file.md -f markdown+rebase_relative_paths -t html --filter my-filter -o dir/output.html

If dir/file.md contains an image that is not an SVG (![Caption](test.jpg)), it will have an incorrect path in the HTML output (dir/test.jpg instead of test.jpg).
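The path arithmetic behind this can be sketched in a few lines of Python. The `rebase` helper only approximates what rebase_relative_paths does (prepending the document's directory to a relative path), and `browser_resolved` is a hypothetical stand-in for how a browser resolves a relative `src` against the location of the HTML file.

```python
import posixpath


def rebase(doc_file, target):
    """Approximation of rebase_relative_paths: prepend the document's directory."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(doc_file), target))


def browser_resolved(output_file, src):
    """File (relative to the working directory) that the HTML page would load."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(output_file), src))


rebased = rebase("dir/file.md", "test.jpg")
print(rebased)                                          # dir/test.jpg
print(browser_resolved("dir/output.html", rebased))     # dir/dir/test.jpg
print(browser_resolved("dir/output.html", "test.jpg"))  # dir/test.jpg
```

The rebased path is what the filter needs to open the file from the working directory, but it points one directory too deep once it is written into dir/output.html.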

jgm (Owner) commented Dec 17, 2022

PANDOC_STATE.input_files should give you the input files passed as arguments on the command line. That might be sufficient?

rnwst (Sponsor, Contributor, Author) commented Dec 17, 2022

Many thanks @jgm for the suggestion. Unfortunately, my filter is not a Lua filter, and using Lua is not an option because of the libraries I need (the filter is a little more complex than what I described above).

As a test, I printed all the environment variables available to the filter, and PANDOC_STATE is not among them (the documentation also doesn't suggest that PANDOC_STATE is available as an environment variable for non-Lua filters).
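A check like this can be reproduced with a few lines of Python inside a filter. This is only a sketch, and `pandoc_variables` is a made-up helper name. The pandoc filter documentation describes PANDOC_VERSION and PANDOC_READER_OPTIONS as being set for JSON filters; treat that list as an assumption to verify against your pandoc version.

```python
import os
import sys


def pandoc_variables(environ=os.environ):
    """Return only the environment variables whose names start with PANDOC_."""
    return {name: value for name, value in environ.items()
            if name.startswith("PANDOC_")}


# Print to stderr: a JSON filter's stdout is reserved for the transformed document.
for name, value in sorted(pandoc_variables().items()):
    print(f"{name}={value}", file=sys.stderr)
```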

It seems Lua filters have many more global variables set than there are environment variables set for non-Lua filters. Would it be possible to harmonize the set of Lua global variables with the non-Lua environment variables?

$ pandoc --version
pandoc 2.19.2
Compiled with pandoc-types 1.22.2.1, texmath 0.12.5.4, skylighting 0.13.1.1,
citeproc 0.8.0.2, ipynb 0.2, hslua 2.2.1
Scripting engine: Lua 5.4

tarleb (Collaborator) commented May 6, 2023

Workaround: use a Lua filter to fix the image paths, then call your filter after that. It's also possible to call external filters from within Lua with pandoc.utils.run_json_filter.
