[Feature request] Add pdf and djvu support #139

Open
1 of 2 tasks
ghost opened this issue Jul 15, 2019 · 48 comments
Labels
enhancement New feature or request

Comments

@ghost

ghost commented Jul 15, 2019

Hi,
Please consider adding pdf and djvu format support to this awesome ebook viewer to become a universal ebook viewer.
thanks.

  • pdf

  • djvu

@johnfactotum johnfactotum added the enhancement New feature or request label Jul 17, 2019
@artemisresende

Hello @johnfactotum, I suggest implementing PDF support with Mozilla's PDF.js project. I think that is a good starting point because it uses only ES6, without other libraries, and it's very efficient.

PDF reading would be a good piece of functionality for the app, because the app is amazing! Combined with some other new features like book listing (as suggested in other requests), this will be great.

If you need a pull request, I can study the app and make one when I have some time.

Lastly, I have to applaud you because your work is fantastic!!

@johnfactotum
Owner

PDF.js is probably the more sensible option to go with, although I haven't really looked much into it. Other possible ways include: Poppler (probably too low level), Evince, or convert to EPUB (easiest option but the result will be unsatisfactory).

I see that this is a popular request, but I'm a bit unsure whether it would really be a good idea to add PDF/DjVu support because

  • I made Foliate because there wasn't any EPUB/Kindle reader available on Linux that satisfied my personal needs. For PDF and DjVu files, there are already a lot of other good viewers out there.
  • PDF and DjVu are quite complex and very different from EPUB/Kindle formats. Foliate has a pretty small codebase that is highly coupled with Epub.js and the EPUB format in general.

Basically, I don't see a lot of advantage in using Foliate to view PDF/DjVu files. Maybe one gets to use the dictionary/translation/TTS tools? Or sync reading locations? But those features can more easily and sensibly be implemented for other PDF viewers or as standalone programs than by adding PDF/DjVu capabilities to Foliate.

That being said, I will certainly welcome pull requests, as long as they're not overly complicated and don't deviate too much from the goals of the project.

@artemisresende

artemisresende commented Oct 1, 2019

Yeah, you're right, it could be hard and not too attractive for the project.

Initially I suggested this because I think it is easier for us, from the user's perspective, to centralize our documents in a single application and enjoy all these features, as you mentioned.

@technodrome

Thanks for the project, it looks very promising. However, I wholeheartedly disagree that there is no need to include PDF support. Quite the opposite. Linux PDF readers are awful, to put it mildly. There is literally only one reader which can invert colors properly (that is, with a configurable black or dark grey background and whitish text): Sumatra, and even that runs only on Wine/CrossOver. Forget highlights/exports. So having one reader which supports both massively popular leading book formats, EPUB and PDF, makes absolutely total sense. Supporting only one of them will leave this possibly very nice and useful software just that: half-baked.

I do hope this much-needed feature lands as mozilla's pdf.js is a very nice project and there is support for theming. Highlights and notes are just what every real student/programmer/professional needs. No more proprietary formats.

Add that, notes and highlights with JSON export and you are literally the king of readers. Forget overbloated nonsense like Calibre.

@johnfactotum
Owner

Yes, the lack of a proper night mode is what bugs me about Evince, which I otherwise happily use for PDF files. I think in Zathura you can configure light and dark colors, although I haven't really tried it myself. The current CSS filter based invert in Foliate isn't satisfactory, either. Maybe they can be improved with SVG filters, but I'm not sure.

I'm currently working on a GtkBuilder-based rewrite of Foliate, that will make things cleaner, more maintainable, and address the problems raised in #176. It will make it easier to implement PDF support, because the current code is kind of messy.

@itprojects
Contributor

@johnfactotum If one day you do decide to "test" if PDF support could work, perhaps the following could be of use:

  1. Make a new app/project

com.github.johnfactotum.Foliate.pdf

  2. Start with the simplest PDF.js implementation

This is to have one working button - open.

  3. Allow/encourage other people to develop the new app in a direction convergent with Foliate. You become a code reviewer for that project.

If these steps produce something of value that can be merged with Foliate, then the two become one. If not, then no loss - creativity is not a linear process.

Note: the fact that you can make a JS app in Gnome-shell, and have it be as powerful as a version written in a traditional programming language (C++/Java/Python), is simply mind-blowing.

@johnfactotum
Owner

johnfactotum commented Oct 31, 2019

For what it's worth, I downloaded PDF.js's prebuilt viewer and used a plain WebView to open it. It works -- you don't even have to implement an "open" button because the viewer already has one, and it can open everything just fine.

But the experience isn't great. It's kind of slow -- a lot slower than your usual PDF.js in Firefox, and for some reason there's no kinetic scrolling. It's easy to test this by simply opening the PDF.js demo in Epiphany. Perhaps it's a WebKit problem (WebKit is not fully supported by PDF.js, according to its FAQ). Maybe one could try making a custom viewer, instead of building on the default one, but it's going to take more time and effort.

It seems to me that it might be easier to use Evince (as a library, not the app) instead. Evince has a clean API directly accessible through GObject introspection, and it supports more formats and performs better than PDF.js.

But in any case, how to implement annotations is going to be the biggest problem. It would be very easy to add PDF support if annotations aren't needed.

@itprojects
Contributor

If you're going to implement Evince, maybe it's better to just close this feature request.

Evince already works well enough as a separate app!

@technodrome

@johnfactotum I am not sure how Evince handles color inversion. Literally every other run-of-the-mill PDF viewer has the same old dumb features with a black-on-white approach. But look at what is happening around us: more and more apps, even market-leading operating systems, are adding dark mode as default. It is becoming mainstream; not everyone loves those hideously glaring white screens so omnipresent on Android.

I have problems with my eyesight so I cannot look at that brutal white background. It is just awful. I am using Dark Reader as a Chrome extension, which intelligently darkens (inverts) 98% of webpages to dark mode. It is fantastic. It is dynamic and overridable. Maybe a look at the source code, since it is JavaScript, could hint at a thing or two about possible approaches to implementing such beautifully customizable features (its UI offers per-page CSS overrides), which we desperately need in order to have at least one normal reader which can open worldwide popular formats. I mean, it is the 21st century and there still is not any clever and sleek app available to do this kind of thing! Just writing this I cannot believe it is so.

Annotations are a big issue, in my opinion. People need them. Researchers do, programmers do, too. Think of people reading books in a foreign language: the same applies. Without them, you feel like someone is holding one of your hands behind your back so you cannot resume your normal workflow. If you want this app you created (and I dare say it is a very nice one) to be a real thing with real recognition and worldwide usage, you will need annotations, exports, color highlights. This is the basic toolkit of a student or pro, whatever they do. Can't really get around it. If you don't add this, this project just becomes yet another of those countless hapless open-source apps which tried to reinvent the wheel, started with massive enthusiasm just to crash and burn badly because they simply did not take user workflows into consideration and tried to strong-arm the user into something unintuitive.

I do hope I have finally found the ultimate app which would finally, after years, enable me to study and learn with ease. It is just like the difference between Java and Ruby: both will do the job eventually, only the pain involved makes the difference of great versus poor experience for the user.

So fingers crossed!

@itprojects
Contributor

itprojects commented Nov 11, 2019

@technodrome Thank you for saying what many of us are thinking.

About colour inversion: mupdf and zathura have color inversion via a tint mode, and it works great. The only problem is that there is no GUI, it's just keyboard shortcuts. [Zathura requires a small config file change.]

Zathura mode

For those who want to try the zathura feature in the above images:

Make a file /home/MYUSERNAME/.config/zathura/zathurarc

set recolor-lightcolor \#bea58b
set recolor-darkcolor \#000000
set default-bg \#bea58b

Then Ctrl+R to change colours.
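
The config step above can also be scripted. This is only a minimal sketch, assuming zathura reads its config from $XDG_CONFIG_HOME/zathura/zathurarc (falling back to ~/.config). The name write_zathurarc is a hypothetical helper, not something zathura ships, and it refuses to overwrite an existing file.

```shell
#!/bin/sh
# Hypothetical helper (not part of zathura): writes the recolor settings
# quoted above into zathurarc, without touching an existing file.
write_zathurarc() {
    config_dir="${XDG_CONFIG_HOME:-$HOME/.config}/zathura"
    config_file="$config_dir/zathurarc"

    if [ -e "$config_file" ]; then
        echo "$config_file already exists; add the lines by hand" >&2
        return 1
    fi

    mkdir -p "$config_dir"
    # The quoted 'EOF' keeps the backslashes before '#' literal.
    cat > "$config_file" <<'EOF'
set recolor-lightcolor \#bea58b
set recolor-darkcolor \#000000
set default-bg \#bea58b
EOF
}
```

Source the file and run write_zathurarc once, then press Ctrl+R inside zathura as described above.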

@navidR

navidR commented Dec 5, 2019

If you're going to implement Evince, maybe it's better to just close this feature request.

Evince already works well enough as a separate app!

@itprojects No, it doesn't. Evince is an old code base that lacks a lot of features. Foliate gained traction because it provides a modern reading experience for users. By modern, I mean annotations, dictionary, etc. The dictionary and lookup feature is particularly important for people whose first language is not English.

@itprojects
Contributor

@navidR Evince does have plenty of technical debt to pay.

Just to clarify:

Evince is an old code base that lacks a lot of features.

That's exactly why it would make sense to close the feature request, instead of implementing Evince.

Using PDF.js may or may not yield a different outcome.

Evince already works well enough as a separate app!

The focus is on the words "well enough", it's not perfect, nothing ever is.

@johnfactotum
Owner

@navidR Lookup is indeed a very important feature. I think ideally it should be implemented at the toolkit or desktop level, so it can be used by all applications (like the Look Up feature in macOS). It probably makes more sense to do it in the shell, as GNOME Shell already has a search provider API that can be used to perform the lookup with various apps. Maybe there should be a portal for this.

@navidR

navidR commented Dec 7, 2019

I think the GNOME Foundation is too busy funding their useless animations. So for the time being, if we can get it to work in Foliate for both PDF and EPUB, that would be a wonderful step.

@ror6ax

ror6ax commented Mar 20, 2020

https://github.com/RussCoder/djvujs seems to be alive and providing necessary functionality.

@digitalethics

PDF-based ebooks may be hundreds of pages long and contain graphics that require fast PDF rendering engines. Alfresco has benchmarked most of them in their PDF rendering engine performance and fidelity comparison, with MuPDF[License] and pdfium coming out on top. Security-wise and given its licensing the latter is probably the best choice, performance- and feature-wise probably MuPDF. Can anyone help investigate what the current state of annotation features is for these two libraries?

@timonson

timonson commented May 4, 2020

I would also like to mention mupdf as source for implementing the pdf feature.

@aquaspy

aquaspy commented May 29, 2020

I would also love PDF support so I can use only Foliate (the majority of my files are in PDF, and sometimes I really prefer not to convert them so I can keep them readable on all my platforms).

(I approve adding PDF without annotations, at least for now.)

@Dr-Terrible

But in any case, how to implement annotations is going to be the biggest problem.

The W3C's standard for annotations covers both EPUB and PDF formats; its adoption was already proposed here #249 (comment).

There isn't the need to implement the entire RESTful part of the W3C standard (which includes both remote sharing and online publishing platforms). Annotations of only local docs is good enough as a starting point.

As a side note, consider that in academia all the commercial software for bibliographic publication/review uses the W3C standard for annotations internally. Exporting/importing notes from such software would be a breeze.

@johnfactotum
Owner

The W3C's standard for annotations covers both EPUB and PDF formats

That's good, but I was referring more to how to implement them using PDF.js or other PDF backends. Most PDF libraries are pretty low level, and even basic stuff like having a selectable text layer is not trivial.[1] The higher level ones might not be easy to modify/extend to work well with the annotation model.

What makes Epub.js so easy to use is that it's high-level enough that you don't need to spend time redoing all the basic things like loading, paginating, adding highlights, etc., but at the same time it remains easily extendable and customizable. With most PDF libraries, it feels that they are either way, way too low level, or there's a built-in UI that can't be easily changed or integrated. (Ultimately, EPUB as a format is itself at a much higher level than PDF, so I guess that's a big reason, too.)

I'm not saying that it's necessarily going to be difficult. And obviously I'm not against having this feature. I guess my points are

  • PDF is a pretty complicated format; for now, I personally do not have the time, desire, nor the expertise required to work on this. (But PRs will obviously be welcome.)
  • Simply because Foliate works reasonably well for EPUBs, it doesn't mean that when it gets PDF support it's going to do a better job than existing solutions; in all likelihood it might end up being inferior in many respects.
  • PDF support, if it ever gets added, will probably share close to zero code with existing Foliate components, apart from re-using some trivial pieces of UI and things like dictionary integration. I can understand the desire to use one app to open many different formats, but there are no real benefits apart from having one less launcher entry on your desktop.

[1] Lector, for example, renders PDFs as images. No annotations, no searching, no nothing.

@johnfactotum
Owner

@navidR For what it's worth, right now you can get Gnome Dictionary to look up selected text in Evince with a shortcut, by using a script:

#!/bin/sh
gnome-dictionary --look-up "$(wl-paste --primary)"

Here we use wl-clipboard to get the selected text on Wayland. On X11 one can use xclip.

Then go to GNOME Settings > Keyboard Shortcuts, and set a custom shortcut to run this script. That's it! You can now look up words in Gnome Dictionary from any app!
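
A variant of the script that covers both Wayland and X11 might look like the sketch below. It assumes that a set WAYLAND_DISPLAY variable is a good enough signal for a Wayland session, and that wl-clipboard or xclip is installed. The function names selection_command and lookup_selection are hypothetical.

```shell
#!/bin/sh
# Sketch: look up the current primary selection in GNOME Dictionary on
# either Wayland or X11. Both function names are made up for this example.

# Print the command that reads the primary selection for this session.
selection_command() {
    if [ -n "${WAYLAND_DISPLAY:-}" ]; then
        echo "wl-paste --primary"            # from wl-clipboard
    else
        echo "xclip -o -selection primary"   # X11 fallback
    fi
}

lookup_selection() {
    # The inner substitution is left unquoted on purpose, so the command
    # string is split into the program name and its arguments.
    gnome-dictionary --look-up "$($(selection_command))"
}
```

To use it as the shortcut target, add a final line that calls lookup_selection.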

@csrgxtu

csrgxtu commented Aug 16, 2020

feature request to support pdf

@johnfactotum johnfactotum mentioned this issue Aug 28, 2020
@poke1024

@johnfactotum Really love foliate. Having some support for PDFs would be so great.

I use Foliate as a replacement for Apple Books, i.e. as an ebook library manager. For that purpose, it would already be very beneficial if I could add PDF files to the library. Clicking on them could then open them in the standard PDF reader. Having PDF files in the library, searchable by name, would already be a great addition.

I think the whole discussion here about including a first-class PDF reader with lookup is sort of too complex for a first step.

@ghost

ghost commented Jun 7, 2021

+1 for pdf/djvu support. it would be awesome

@larrasket

+1

@VarLad

VarLad commented Jun 19, 2021

Well, now let's talk about the complexity of being able to support both.

Let's start with djvu. Any of the devs, any ideas?

@itprojects
Contributor

Massive and sustained effort will be required to even get to the level of the main Foliate features.

Complexity in this case: writing two new apps, almost completely unrelated to Foliate.

[com.github.johnfactotum.Foliate.pdf]

[com.github.johnfactotum.Foliate.djvu]

To maintain the three formats PDF, DJVU, and EPUB, at least one dedicated developer for each will be required, all year round.

The Library will have to be re-designed.

Using Foliate in combination with Evince makes more sense. The DJVU/PDF Foliate will look and feel too much the same, in every case, because Foliate (rightly) adheres to Gnome design guidelines.

@yozachar

csbooks does that, but I don't think it's open source. AUR

@VarLad

VarLad commented Jun 23, 2021

But evince doesn't support djvu

Is there any similar lightweight alternative for djvu?

@itprojects
Contributor

@VarLad Evince does support DJVU. Do you have the evince-common package?

A lightweight alternative would be zathura (with zathura-djvu zathura-pdf-poppler).

MuPDF? Opens most file formats. Epub and PDF.

@im-n1

im-n1 commented Jan 17, 2022

Just wanna add +1 to the PDF support. It's 2022 and there's still no great PDF reader on Linux that can remember where I left off.

@ghost

ghost commented Jan 17, 2022

@im-n1 zathura remembers that

@knakamura8

Is there any intention to implement this, or some roadmap that I cannot seem to find on the wiki? I appreciate that it seems complex to implement (read the above discussion). That having been said, some of the posts are dated months/years back, so I am curious as to whether or not the position is maintained that this is still a wontfix. Anyway, really enjoy the utility, regardless, cheers to the contributors and maintainers.

@StanczakDominik

I'd just like to point out that KDE's Okular is great for PDFs, and it doesn't look like it's been mentioned elsewhere in the thread.

@johnfactotum
Owner

Is there any intention to implement this, or some roadmap that I cannot seem to find on the wiki? I appreciate that it seems complex to implement (read the above discussion). That having been said, some of the posts are dated months/years back, so I am curious as to whether or not the position is maintained that this is still a wontfix. Anyway, really enjoy the utility, regardless, cheers to the contributors and maintainers.

I believe at this moment it is still very unlikely to get fixed in the near future. That being said, I do have some thoughts on how this could be implemented eventually.

I think I didn't really think things through on this. My previous statement that it would reuse close to zero code is probably false. Foliate already has basic support for fixed layout EPUBs, and it needs to support them no matter what.

So the most sensible approach would be essentially converting PDF files to fixed layout EPUBs, except preferably in an on-the-fly, on-demand way.

So the key is probably to improve support for fixed layout EPUBs first. Then it should be relatively straightforward to add support for other fixed layout formats.

@Noobao

Noobao commented May 11, 2022

Adding +1 to PDF support. Foliate is a really amazing application and this is currently the only feature it lacks for it to be my main reader for all my books.

@martinpescador

+1 .pdf, etc.

My use case involves both .epub and .pdf in the same workflow, namely research. I came here because I was somewhat puzzled by the absence of .pdf support. I haven't had the time or inclination to look at the different implementations of readers, and I am only a user of these apps, but I like Foliate, and having to use more than one app for the same search is annoying.

Ideally I'd like to have a fully featured reader with integrated file management that allows complex searches across multiple files and is able to connect to a bibliographic database, like Zotero. A diverse ecology with different apps is cool; there is freedom to create and scratch your particular itch. All good. Though there is certainly scope for software development in this area.

@knakamura8

Will this be part of the roadmap within #962?

@johnfactotum
Owner

johnfactotum commented Nov 19, 2022

Yes. One important feature of the new renderer is that it doesn't require all sections and resources to be loaded. Without this, either you have to convert the whole PDF to EPUB at once, or you'd need a totally different renderer that doesn't share any code with the EPUB renderer. Now it would be possible to implement it by using PDF.js for the rendering of individual pages, but using the same programming interface for handling inter-page layout as fixed-layout EPUB/Kindle or CBZ books.

The fixed layout renderer really needs to be improved first, though. Most importantly, it currently lacks zooming and continuous scrolling. Though I guess this need not be a blocker, as some support would be better than no support at all.

@dejalavidavolar


Yes. One important feature of the new renderer is that it doesn't require all sections and resources to be loaded. Without this, either you have to convert the whole PDF to EPUB at once, or you'd need a totally different renderer that doesn't share any code with the EPUB renderer. Now it would be possible to implement it by using PDF.js for the rendering of individual pages, but using the same programming interface for handling inter-page layout as fixed-layout EPUB/Kindle or CBZ books.

The fixed layout renderer really needs to be improved first, though. Most importantly, it currently lacks zooming and continuous scrolling. Though I guess this need not be a blocker, as some support would be better than no support at all.

thanks!!!!!

@pradyumnac

pradyumnac commented Jan 27, 2023

Among all readers, Foliate is the best in terms of snappiness and UX. Only PDF support is missing.

Are you guys working on this (especially PDF support)? That's the feature I miss most, as others have mentioned in the thread.

I don't have much experience in JS (renderer), but I will be happy to help in any way possible.

@jiiiijiij

Add pdf and foliate will rule

@payrim

payrim commented Jun 29, 2023

Could you please just make it so we could add PDF books to the library and open them with external apps (such as zathura)? I like all my books to be in one place. <3<3

@johnfactotum johnfactotum mentioned this issue Aug 31, 2023
@loynoir

loynoir commented Sep 12, 2023

Would be nice to have .pdf support within foliate, as both calibre and okular support it.


FYI

  • okular seems to have native support for .pdf.

https://github.com/search?q=repo%3AKDE%2Fokular%20pdftohtml&type=code

  • calibre seems to support .pdf using the external binary pdftohtml, which comes from the poppler package.

https://github.com/kovidgoyal/calibre/blob/master/src/calibre/ebooks/pdf/pdftohtml.py#L28-L36

https://github.com/kovidgoyal/calibre/blob/master/bypy/linux/__main__.py#L46

https://github.com/kovidgoyal/calibre/blob/master/bypy/sources.json#L378


@johnfactotum

I suggest Foliate use pdftohtml as a workaround, the same way calibre does.
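
For illustration only, a conversion along those lines could be wrapped roughly as follows. This is a sketch that assumes poppler's pdftohtml is on PATH. The pdf_to_html name, the output file name, and the choice of flags are my own and are not taken from calibre's code.

```shell
#!/bin/sh
# Hypothetical wrapper: convert one PDF to a single HTML file with
# poppler's pdftohtml. The flag choice is illustrative, not calibre's.
pdf_to_html() {
    pdf="$1"
    outdir="$2"

    if ! command -v pdftohtml >/dev/null 2>&1; then
        echo "pdftohtml not found; install poppler's utilities" >&2
        return 127
    fi

    mkdir -p "$outdir"
    # -enc UTF-8: output encoding; -noframes: one HTML file, no frameset;
    # -q: suppress messages.
    pdftohtml -enc UTF-8 -noframes -q "$pdf" "$outdir/index.html"
}
```

Usage would be something like pdf_to_html book.pdf /tmp/book-html, after which the generated HTML could be opened like any other reflowable content.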

@johnfactotum
Owner

Some PDF support has been added in 1512c9d. I must add that it's very basic, a word that here really means "fairly terrible", as in highly bugged and experimental.

The upside is that it was very easy to implement with the new renderer's architecture (only ~100 lines of Foliate's own code).

To make it usable, though, the fixed layout renderer really needs to be improved. It'd probably be rewritten at some point.

@Jose-jme

Hello, please consider adding pdf and djvu format support to this awesome ebook viewer so it can become a universal ebook viewer. Thanks.

* [x] pdf

* [ ] djvu

Hello, so we're starting the project

@Jose-jme

Cantaten

@aehlke

aehlke commented Jun 12, 2024

Would love to borrow the pdf.js dark mode extensions from https://github.com/shivaprsd/doqment
