Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Work In Progress] Gracefully reload the ZIM Library #734

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Optimus-NP
Copy link

@Optimus-NP Optimus-NP commented Feb 22, 2025

Fixes #729

Refactor and Improve Code Formatting in kiwix-serve.cpp

  • Introduced listenDirectoryChanges() function (Linux-only) to monitor directory changes and reload the library dynamically.
  • Used inotify to detect modifications in watched directories and trigger library updates.
  • Implemented a dedicated monitor thread to watch for changes in the specified directory.
  • Ensured proper synchronization and thread safety while handling directory monitoring.
  • Reformatted and improved code readability by adjusting indentation, spacing, and alignment.
  • Fixed inconsistent whitespace in function definitions, loops, and conditions.
  • Standardized #include ordering for better organization.
  • Reordered #include statements to maintain consistency across platforms.
  • Improved error handling by adding better structured try-catch blocks.
  • Standardized macro definitions and improved formatting for readability.
  • Improved logging messages for better debugging and clarity.

Testing

 /usr/local/bin/kiwix-serve --port=8080 --library /tmp/zimlibrary/linux_library.xml --monitorLibrary

After running the specified command, the console logs indicated a successful configuration reload following modifications to the file /tmp/zimlibrary/linux_library.xml. To verify the changes, I accessed the web interface at http://localhost:8080/ and confirmed that the library updates were applied as expected.

Refactor and Improve Code Formatting in kiwix-serve.cpp

- Introduced `listenDirectoryChanges()` function (Linux-only) to monitor directory changes and reload the library dynamically.
- Used `inotify` to detect modifications in watched directories and trigger library updates.
- Implemented a dedicated **monitor thread** to watch for changes in the specified directory.
- Ensured proper synchronization and thread safety while handling directory monitoring.
- Reformatted and improved code readability by adjusting indentation, spacing, and alignment.
- Fixed inconsistent whitespace in function definitions, loops, and conditions.
- Standardized `#include` ordering for better organization.
- Reordered `#include` statements to maintain consistency across platforms.
- Improved error handling by adding better structured `try-catch` blocks.
- Standardized macro definitions and improved formatting for readability.
- Improved logging messages for better debugging and clarity.
@kelson42
Copy link
Contributor

kelson42 commented Mar 4, 2025

@Optimus-NP Thank you for your PR. This is really a lot of code with different purposes. Can you please put all the cosmetic changes in a dedicated PR?

@kelson42 kelson42 requested a review from veloman-yunkan March 4, 2025 04:20
@kelson42
Copy link
Contributor

kelson42 commented Mar 4, 2025

@Optimus-NP Thank younfor your PR. This is really a lot of code with different purposes. Can you please put all the cosmetocs changes in a dedicated PR?

Regarding your solution, I really don't know if this is the good approach and not even sure this is what is requested here. To me, it seems to fix #735 with a linux specific approach (so won't work on Windows).

@kelson42 kelson42 self-requested a review March 4, 2025 04:26
@kelson42 kelson42 marked this pull request as draft March 4, 2025 04:27
@Optimus-NP
Copy link
Author

Optimus-NP commented Mar 9, 2025

Hi @kelson42 @rgaudin ,

Thank you for reviewing the PR I shared. I initially misunderstood the ask in the issue: #729, but I now have a clearer understanding. Please take a look at my revised approach below.

Identifying Keys to Purge

Currently, when we reload the library via the Kiwix Manager from KiwixServer::reload, Kiwix Manager does not return any data to Kiwix Server. Moving forward, I plan to modify this behavior so that Kiwix Manager returns information about books that have been added, deleted, or modified. Using this data, we can then call the relevant purge APIs in the Varnish cache to update only the affected entries.

Purging in the Varnish Cache

Although I’m not fully familiar with the data structures used in the Varnish cache, if the data is stored per book, updating the signature of the KiwixManager::reload function should allow us to determine which cache keys need to be purged. Based on my research into Varnish cache documentation—particularly the CLI tool varnishadm—it appears possible to purge specific books from the cache rather than clearing the entire catalog for library.kiwix.org. This would enable more efficient cache management.

Request for Input from Kiwix Maintainers

I’d love to hear your thoughts on this approach. Additionally, I would appreciate any insights into how Varnish cache is currently implemented in front of the servers for library.kiwix.org.

I’m excited about working on this task and would love to implement this solution or explore better alternatives you might suggest.

@evrial
Copy link

evrial commented Mar 9, 2025

Honestly I think the most graceful way is keeping things simple and accept a path as single valid input to search zim files. This way you don't need any monitor thread and platform specific code.

@rgaudin
Copy link
Member

rgaudin commented Mar 10, 2025

@Optimus-NP there is not and should not be anything about varnish in kiwix. Varnish is an example of the need we have to be aware when the library is reloaded. We have a featuere that automatically refreshes the library ; we should have a way to be informed when this happens. That's the core of #729. The rest is context to justify the new feature request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Knowing when library has been reloaded
4 participants