Improvements slow OpenLibrary API calls #8

Open
Mozzo1000 opened this issue Jul 17, 2024 · 1 comment
Labels
backend Backend related enhancement New feature or request frontend Frontend related

Comments

@Mozzo1000
Owner

Background

The frontend calls the OpenLibrary API directly to fetch book information (title, description, ISBN, cover image, etc.) and is fully responsible for powering the frontend search, which lets us search for books by title or ISBN.

Our backend does not talk to OpenLibrary; everything is handled by the frontend, which then sends the information to the backend to save books in the user's library. The backend saves only the most necessary information, so we do not, for example, save cover images.
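For reference, the requests involved look roughly like the sketch below. It is written in Python purely for illustration (it is not the frontend code), the helper names `search_url` and `cover_url` are hypothetical, and the URL patterns are OpenLibrary's public search and covers endpoints as we understand them.

```python
from urllib.parse import urlencode

# Public OpenLibrary endpoints (assumed stable; check the OpenLibrary docs).
SEARCH_URL = "https://openlibrary.org/search.json"
COVERS_URL = "https://covers.openlibrary.org/b/isbn"


def search_url(query: str, limit: int = 10) -> str:
    """Build a search URL for a free-text query (a title or an ISBN)."""
    return f"{SEARCH_URL}?{urlencode({'q': query, 'limit': limit})}"


def cover_url(isbn: str, size: str = "M") -> str:
    """Build a cover image URL for an ISBN; size is S, M or L."""
    if size not in ("S", "M", "L"):
        raise ValueError("size must be one of 'S', 'M' or 'L'")
    return f"{COVERS_URL}/{isbn}-{size}.jpg"
```

Every search and every cover shown in the UI is one such round trip to OpenLibrary, which is why its latency shows up directly in the user experience.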

The problem

The OpenLibrary API is slow, which keeps us from fetching the information quickly enough to avoid hurting the user experience. Searching is exceptionally slow, and even fetching a cover image can take long enough to time out, leaving us with no image at all.
This is okay for now: it works, and since it is a free service, the performance is nothing to complain about.

Proposed solution

OpenLibrary is open data and provides data dumps that can be downloaded and handled locally. There has been previous work on importing the data dumps into a database that is easier to search.
Doing something similar, as well as serving all the cover images ourselves, should give us more control and shorten the distance to the data.
We need to be able to do free-text search on titles as well as search by ISBN (we have thus far focused only on ISBN-13). We also need to be able to retrieve cover images in different sizes by ISBN.
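A minimal sketch of what those two lookups could look like against a local database. Everything here is an assumption for illustration: the `books` table, its columns, and the function names are hypothetical, and a plain `LIKE` match stands in for proper full-text search. The ISBN-13 check uses the standard alternating 1/3 weighting.

```python
import sqlite3
from typing import Optional


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical books table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books (isbn13 TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    return conn


def normalize_isbn13(raw: str) -> Optional[str]:
    """Strip hyphens and spaces; return the 13 digits, or None if invalid."""
    digits = raw.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return None
    # ISBN-13 checksum: weights alternate 1, 3, 1, 3, ... and must sum to 0 mod 10.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return digits if total % 10 == 0 else None


def find_by_isbn13(conn: sqlite3.Connection, raw: str):
    """Exact lookup by ISBN-13; returns (isbn13, title) or None."""
    isbn = normalize_isbn13(raw)
    if isbn is None:
        return None
    return conn.execute(
        "SELECT isbn13, title FROM books WHERE isbn13 = ?", (isbn,)
    ).fetchone()


def search_by_title(conn: sqlite3.Connection, text: str, limit: int = 10):
    """Substring match on title (stand-in for full-text search).

    Note: % and _ in the input act as LIKE wildcards in this sketch.
    """
    return conn.execute(
        "SELECT isbn13, title FROM books WHERE title LIKE ? "
        "ORDER BY title LIMIT ?",
        (f"%{text}%", limit),
    ).fetchall()
```

With millions of rows a substring `LIKE` scan will not be fast enough, so a real implementation would need a full-text index or a dedicated search engine; the sketch only pins down the two query shapes we need.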

Some disadvantages of this solution are that OpenLibrary has millions of books in their database and that the dumps are around 40 GB in size (excluding all cover images). We will have to develop a pipeline that builds the local database from the new dumps that are released regularly, and ingestion will probably take a long time.
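As a rough sketch of the ingestion step, assuming the dump layout is one record per line with five tab-separated columns (type, key, revision, last modified, JSON) and that edition records carry `title` and `isbn_13` fields. The function name `iter_editions` is hypothetical, and the layout should be verified against an actual dump before relying on it.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_editions(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (isbn13, title) pairs from dump lines, skipping anything else."""
    for line in lines:
        # Assumed columns: type, key, revision, last_modified, JSON payload.
        parts = line.rstrip("\n").split("\t", 4)
        if len(parts) != 5:
            continue  # malformed line
        record_type, _key, _revision, _modified, raw = parts
        if record_type != "/type/edition":
            continue  # authors, works, etc. are ignored in this sketch
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip records with unparsable JSON
        title = record.get("title")
        if not title:
            continue
        for isbn in record.get("isbn_13", []):
            yield isbn, title
```

Because the function takes any iterable of lines, a compressed dump can be streamed through it with `gzip.open(path, "rt", encoding="utf-8")` without loading the whole file into memory, which matters at this size.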

More research on this topic needs to be done.

@Mozzo1000 Mozzo1000 added enhancement New feature or request frontend Frontend related backend Backend related labels Jul 17, 2024
@Mozzo1000
Owner Author

I spent some time looking into this, and fixing the search experience was surprisingly easy. The work is being done in a separate repo, since it does not have to be something that can only be used with this project. We might also want to let the user select which search API to use when installing the service.
The cover images are an entirely different beast: they look to be around 1.2 TB in total, which is a lot more than a regular person wants to handle for this kind of service.

https://github.com/mozzo1000/openlibrary-local-db

https://fosstodon.org/@mozzo/112808589345794906

This issue will remain open even though most work is being done in a separate repo at the moment.
