This repository has been archived by the owner on Apr 25, 2023. It is now read-only.

Pipe HTML directly via stdin instead of local file or remote URL #13

Closed
bsssshhhhhhh opened this issue Apr 11, 2016 · 2 comments

@bsssshhhhhhh
Contributor

Hi, I really like how this looks so far, but I've come across a rather major limitation that prevents easy integration for me: Athenapdf cannot load HTML directly from stdin.

Currently Athenapdf only supports loading local files and remote URLs over HTTP. I want to skip the 'file' step entirely and pipe the HTML source directly into Athenapdf. wkhtmltopdf supports this by accepting - in place of the input URL, but Athenapdf does not. I tried passing /dev/stdin instead, but it seems Athenapdf (or perhaps Electron) does not handle it.

/dev/stdin definitely points to the data I want:

$ echo '<html><head></head><body><h1>hi</h1></body></html>' | sudo docker run --rm -i arachnysdocker/athenapdf cat /dev/stdin
<html><head></head><body><h1>hi</h1></body></html>

However, passing /dev/stdin to athenapdf doesn't work; it renders an empty PDF:

$ echo '<html><head></head><body><h1>hi</h1></body></html>' | sudo docker run --rm -i arachnysdocker/athenapdf athenapdf -S /dev/stdin
Xlib:  extension "RANDR" missing on display ":99".
Xlib:  extension "RANDR" missing on display ":99".
%PDF-1.4
%▒▒▒▒
1 0 obj
<</Creator (Chromium)
/Producer (Skia/PDF)
/CreationDate (D:20160411201956+00'00')
/ModDate (D:20160411201956+00'00')>>
endobj
2 0 obj
<</Type /Catalog
/Pages 3 0 R>>
endobj
3 0 obj
<</Type /Pages
/Count 1
/Kids [4 0 R]>>
endobj
4 0 obj
<</Type /Page
/Resources <</ProcSets [/PDF /Text /ImageB /ImageC /ImageI]>>
/MediaBox [0 0 596 843]
/Contents 5 0 R
/Parent 3 0 R>>
endobj
5 0 obj
<</Length 18>> stream
1 0 0 -1 0 843 cm

endstream
endobj
xref
0 6
0000000000 65535 f
0000000015 00000 n
0000000150 00000 n
0000000197 00000 n
0000000252 00000 n
0000000399 00000 n
trailer
<</Size 6
/Root 2 0 R
/Info 1 0 R>>
startxref
465
%%EOFPDF Conversion: 340.725ms

In the meantime, here's a simple bash script that uses a temporary file to achieve the effect I want: read HTML from stdin and write the PDF to stdout.
It is used like this:
echo '<html><head></head><body><h1>Hello, world!</h1></body></html>' | bash ~/athenapdf.sh

athenapdf.sh

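#!/bin/bash
# Read HTML from stdin, convert it with athenapdf, and write the PDF to stdout.

# mktemp normally creates the file under /tmp, which is bind-mounted into the container below.
tmp=$(mktemp) || exit 1
# Remove the temp file whenever the script exits, including after a signal.
trap 'rm -f "$tmp"' EXIT
trap 'exit 1' HUP INT QUIT PIPE TERM

# Save stdin to the temp file, then convert it; -S sends the PDF to stdout.
cat > "$tmp"
docker run --rm -v /tmp:/tmp/ -v "$(pwd)":/converted/ arachnysdocker/athenapdf athenapdf -S "$tmp" 2> /dev/null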
@MrSaints
Collaborator

Hey @bsssshhhhhhh, thank you for raising this.

I'm personally supportive of this feature 👍, and I'll be sure to prioritise it.

Meanwhile, if anyone would like to start on it, and raise a PR, I'm more than happy to look into it.

@MrSaints
Collaborator

MrSaints commented Apr 27, 2016

Thanks again @bsssshhhhhhh for resolving this.


For the record:

You can now run stdin conversions, e.g.

echo "<h1>stdin test</h1>" | docker run -i --rm -v $(pwd):/converted/ arachnysdocker/athenapdf athenapdf -

(Be sure to include the -i flag; without it, docker run does not pass stdin through to the container.)
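
For anyone who wants to reuse the stdin form from their own scripts, here is a minimal sketch of a wrapper function around the exact command above. The function name html_to_pdf and the DOCKER override variable are hypothetical and not part of athenapdf; the docker arguments are copied from the example and nothing else is assumed about athenapdf's options or where it writes its output.

```shell
# Hypothetical helper (not part of athenapdf): convert HTML arriving on stdin
# using the "-" input shown above.
# DOCKER is an assumed override, e.g. DOCKER="sudo docker"; it defaults to "docker".
html_to_pdf() {
    # Refuse to run when stdin is a terminal, since there is nothing to pipe.
    if [ -t 0 ]; then
        echo "usage: <command producing HTML> | html_to_pdf" >&2
        return 2
    fi
    # -i keeps stdin attached to the container; "-" tells athenapdf to read it.
    ${DOCKER:-docker} run -i --rm -v "$(pwd)":/converted/ arachnysdocker/athenapdf athenapdf -
}
```

Usage would look like: echo "<h1>stdin test</h1>" | html_to_pdf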
