Background
The frontend calls the OpenLibrary API directly to fetch book information (title, description, ISBN, cover image, etc.) and is fully responsible for powering the frontend search, which lets us search for books by title or ISBN.
Our backend does not talk to OpenLibrary at all; the frontend handles everything and then sends the information to the backend, which saves books to the user's library. The backend only stores the most necessary information, so we do not, for example, save cover images.
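For context, this is roughly the kind of calls the frontend makes today. The endpoints are OpenLibrary's public search and covers APIs; the function names and the trimmed-down result shape are illustrative only, not the project's actual code.

```ts
// Sketch of the OpenLibrary calls the frontend relies on today.
// Endpoints are OpenLibrary's public API; function names and the
// minimal result shape are illustrative only.

interface SearchResult {
  title: string;
  isbn?: string[]; // search.json returns an array of ISBNs per result
}

// Free-text search; also accepts fielded queries such as "isbn:9780140328721".
async function searchBooks(query: string): Promise<SearchResult[]> {
  const res = await fetch(
    `https://openlibrary.org/search.json?q=${encodeURIComponent(query)}&limit=10`
  );
  const data = await res.json();
  return data.docs as SearchResult[];
}

// Cover image URL by ISBN; size is "S", "M" or "L".
function coverUrl(isbn: string, size: "S" | "M" | "L" = "M"): string {
  return `https://covers.openlibrary.org/b/isbn/${isbn}-${size}.jpg`;
}
```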
The problem
The OpenLibrary API is slow, which prevents us from fetching information quickly enough to avoid impacting the user experience. Searching is exceptionally slow, and even fetching a cover image can take long enough for the request to time out, leaving us with no image at all.
This is okay for now; it works, and since it is a free service, the performance is nothing to complain about.
Proposed solution
OpenLibrary is open data and provides data dumps that can be downloaded and handled locally. There has been previous work on importing these dumps into a database that is easier to search.
Doing something similar, as well as serving all the cover images ourselves, should give us more control and shorten the distance to the data.
We need to be able to do free-text search on titles as well as lookups by ISBN (so far we have only focused on ISBN-13), and to retrieve cover images in different sizes by ISBN.
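As a thought experiment, the local service's surface could look something like the sketch below. The interface, method names, and the BookHit shape are assumptions for the sake of illustration, not a decided design.

```ts
// Hypothetical surface for a local search service backed by the dumps.
// Method names, parameters and the BookHit shape are assumptions.

interface BookHit {
  title: string;
  isbn13: string;
  description?: string;
}

interface BookSearchService {
  // Free-text search over titles.
  searchByTitle(query: string, limit?: number): Promise<BookHit[]>;

  // Exact lookup by ISBN-13.
  getByIsbn(isbn13: string): Promise<BookHit | null>;

  // URL (or path) to a cover image in the requested size.
  coverUrl(isbn13: string, size: "S" | "M" | "L"): string;
}
```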
Some disadvantages of this solution: OpenLibrary has millions of books in their database, and the dumps are around 40 GB in size (excluding cover images). We will have to develop a pipeline for building the local database from the new dumps that are released regularly, and ingestion will probably take a long time.
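A minimal sketch of what one ingestion step could look like, assuming the editions dump has already been downloaded and decompressed. Each dump line is tab-separated (type, key, revision, last_modified, JSON record); the file handling below is real Node.js APIs, but the insertEdition placeholder and overall flow are assumptions, not an existing pipeline.

```ts
// Minimal ingestion sketch: stream the (decompressed) editions dump and
// pick out the fields we care about. Each dump line is tab-separated:
// type, key, revision, last_modified, JSON record.
// insertEdition is a placeholder for the real database write.

import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

async function ingestEditions(path: string): Promise<void> {
  const rl = createInterface({ input: createReadStream(path) });

  for await (const line of rl) {
    const columns = line.split("\t");
    if (columns[0] !== "/type/edition") continue;

    const record = JSON.parse(columns[4]);
    const isbn13: string | undefined = record.isbn_13?.[0];
    if (!isbn13 || !record.title) continue;

    // Placeholder: insert into whatever database ends up backing search,
    // e.g. Postgres/SQLite with a full-text index on title.
    insertEdition({ key: columns[1], isbn13, title: record.title });
  }
}

function insertEdition(row: { key: string; isbn13: string; title: string }): void {
  // to be implemented against the chosen database
}

// e.g. ingestEditions("ol_dump_editions_latest.txt");
```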
More research on this topic needs to be done.
I spent some time looking into this, and fixing the search experience was surprisingly easy. Work is being done in a separate repo, since there is no reason to create something that can only be used with this project. We might also want the user to select which search API to use when installing the service.
The cover images are an entirely different beast to handle; it looks to be around 1.2 TB of images, and that is a lot more than a regular person wants to host for this kind of service.