Background
The frontend calls the OpenLibrary API directly to fetch book information (title, description, ISBN, cover image, etc.) and is fully responsible for powering the frontend search, which lets us search for books by title or ISBN.
Our backend does not talk to OpenLibrary at all; the frontend handles everything and then sends the information to the backend, which saves books to the user's library. The backend only stores the most necessary information, so we do not, for example, save cover images.
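For context, this is roughly the kind of calls the frontend makes today. The endpoints are OpenLibrary's public search and covers APIs; the function names and the trimmed-down result shape are illustrative only, not the project's actual code.

```ts
// Sketch of the OpenLibrary calls the frontend relies on today.
// Endpoints are OpenLibrary's public API; function names and the
// minimal result shape are illustrative only.

interface SearchResult {
  title: string;
  isbn?: string[]; // search.json returns an array of ISBNs per result
}

// Free-text search; also accepts fielded queries such as "isbn:9780140328721".
async function searchBooks(query: string): Promise<SearchResult[]> {
  const res = await fetch(
    `https://openlibrary.org/search.json?q=${encodeURIComponent(query)}&limit=10`
  );
  const data = await res.json();
  return data.docs as SearchResult[];
}

// Cover image URL by ISBN; size is "S", "M" or "L".
function coverUrl(isbn: string, size: "S" | "M" | "L" = "M"): string {
  return `https://covers.openlibrary.org/b/isbn/${isbn}-${size}.jpg`;
}
```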
The problem
The OpenLibrary API is slow, which prevents us from fetching information quickly enough to avoid impacting the user experience. Searching is exceptionally slow, and even fetching a cover image can take long enough for the request to time out, leaving us with no image at all.
This is okay for now; it works, and since it is a free service, the performance is nothing to complain about.
Proposed solution
OpenLibrary is open data and provides data dumps that can be downloaded and handled locally. There has been previous work on importing these dumps into a database that is easier to search.
Doing something similar, as well as serving all the cover images ourselves, should give us more control and shorten the distance to the data.
We need to be able to do free-text search on titles as well as lookups by ISBN (so far we have only focused on ISBN-13), and to retrieve cover images in different sizes by ISBN.
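As a thought experiment, the local service's surface could look something like the sketch below. The interface, method names, and the BookHit shape are assumptions for the sake of illustration, not a decided design.

```ts
// Hypothetical surface for a local search service backed by the dumps.
// Method names, parameters and the BookHit shape are assumptions.

interface BookHit {
  title: string;
  isbn13: string;
  description?: string;
}

interface BookSearchService {
  // Free-text search over titles.
  searchByTitle(query: string, limit?: number): Promise<BookHit[]>;

  // Exact lookup by ISBN-13.
  getByIsbn(isbn13: string): Promise<BookHit | null>;

  // URL (or path) to a cover image in the requested size.
  coverUrl(isbn13: string, size: "S" | "M" | "L"): string;
}
```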
Some disadvantages of this solution: OpenLibrary has millions of books in their database, and the dumps are around 40 GB in size (excluding cover images). We will have to develop a pipeline for building the local database from the new dumps that are released regularly, and ingestion will probably take a long time.
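A minimal sketch of what one ingestion step could look like, assuming the editions dump has already been downloaded and decompressed. Each dump line is tab-separated (type, key, revision, last_modified, JSON record); the file handling below is real Node.js APIs, but the insertEdition placeholder and overall flow are assumptions, not an existing pipeline.

```ts
// Minimal ingestion sketch: stream the (decompressed) editions dump and
// pick out the fields we care about. Each dump line is tab-separated:
// type, key, revision, last_modified, JSON record.
// insertEdition is a placeholder for the real database write.

import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

async function ingestEditions(path: string): Promise<void> {
  const rl = createInterface({ input: createReadStream(path) });

  for await (const line of rl) {
    const columns = line.split("\t");
    if (columns[0] !== "/type/edition") continue;

    const record = JSON.parse(columns[4]);
    const isbn13: string | undefined = record.isbn_13?.[0];
    if (!isbn13 || !record.title) continue;

    // Placeholder: insert into whatever database ends up backing search,
    // e.g. Postgres/SQLite with a full-text index on title.
    insertEdition({ key: columns[1], isbn13, title: record.title });
  }
}

function insertEdition(row: { key: string; isbn13: string; title: string }): void {
  // to be implemented against the chosen database
}

// e.g. ingestEditions("ol_dump_editions_latest.txt");
```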
More research on this topic needs to be done.
I spent some time looking into this, and fixing the search experience was surprisingly easy. Work is being done in a separate repo, since there is no reason to create something that can only be used with this project. We might also want the user to select which search API to use when installing the service.
The cover images are an entirely different beast to handle; it looks to be around 1.2 TB of images, and that is a lot more than a regular person wants to host for this kind of service.