This repository has been archived by the owner on Dec 9, 2018. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Comparison
WANMIN LIU edited this page May 29, 2014
·
23 revisions
This page compares common approaches to present PDF files online. Please read the paper [Online publishing via pdf2htmlEX](http://coolwanglu.github.io/pdf2htmlEX/doc/tb108wang.html) for more details.
Convert to HTML 5 | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins | |
---|---|---|---|---|---|---|
Example | pdf2htmlEX | PDF.js | pdftoppm (poppler) Google Doc | pdftohtml (poppler) | Adobe PDF Plugin | N/A |
Briefing | PDF elements are converted into corresponding or closest HTML elements | PDF file is loaded, parsed and rendered by Javascript | PDF pages are converted into images and shown in web pages | Similar as “Convert to HTML 5”, but with much less features | Official plugin | Non-official PDF plugins, Flash-based plugins or others |
Open source | Yes | Some (pdf.js) | Poppler is open source. Google Doc may be based on poppler as well, because they showed same errors. | Some (pdftohtml) | No | Maybe |
Free | Yes | Some | Some | Some | Yes | Some |
note: There are free and/or open source tools for all but Adobe PDF plugin.
Convert to HTML 5 (pdf2htmlEX) | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins | |
---|---|---|---|---|---|---|
Processing (server-side) | Normal, one time | None | Slow, one time | Fast, one time | None | None, usually |
Loading (client-side) | Fast | Fast | Slow | Fast | Fast | Fast |
Rendering (client-side) | Fast | Slow | Fast | Fast | Fast | Fast, usually |
Network cost | Small 1 | Small | Large 2 | Small | Small | Small |
1: HTTP compression is required
2: Could be Huge if higher resolution is needed
Convert to HTML 5 (pdf2htmlEX) | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins | |
---|---|---|---|---|---|---|
HTML 5 | Yes | Yes, usually | No | No | No | No |
CSS | Yes | Yes | No | Yes | No | No |
Javascript | No | Yes | No | No | No | No |
Third-party plugin | No | No | No | No | Yes | Yes |
Convert to HTML 5 (pdf2htmlEX) | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins | |
---|---|---|---|---|---|---|
Full PDF Feature ? | No, but usually enough | Maybe | Yes | No | Yes | Maybe |
Text Extraction (select/copy/search) | Yes | Yes, with text layer | No, usually 1 | Yes | Yes | Maybe |
Embedding Font | Yes | Yes | Yes | No | Yes | Yes, usually |
Link | Yes | Yes | No, usually 2 | Yes | Yes | Maybe |
Accurate rendering (layout/spacing) | Yes, usually 3 | Yes | Yes | No | Yes | Yes, usually |
Read while loading | Yes | Yes | Yes | Yes | No | Maybe |
1: Text extraction can be supported with a text layer
2: Link may be handled with Javascript
3: There are PDF elements which cannot be converted into HTML losslessly
Convert to HTML 5 (pdf2htmlEX) | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins | |
---|---|---|---|---|---|---|
Customizable UI/Theme | Yes | Yes | Yes | Yes | No | No, usually 1 |
Extensible | Yes | Yes | Yes | Yes | No | Maybe 2 |
1: For some plugins there are commercial licensed versions with customizable UI
2: Some plugins have API available