BOS README.md
Python library for processing Bibles in various formats
The Bible Organisational System (BibleOrgSys or BOS) grew out of Python code first written back in 2010 to read and display Bibles. It only took a few small projects to realise that they had things in common, especially the need to iterate through Bible "books", chapters, and verses.
The other realisation was the need for data standards, e.g., for Bible book codes, Bible abbreviations (for linking from one Bible resource to another), etc. So this project began (after trying JSON and rejecting it because it cannot include comments) by hand-crafting some XML files for data sets. (XML was chosen because it is a standard that can be loaded into most programming languages, and is hence usable by others beyond this Python library.) These files can be found in the DataFiles folder. Each datafile also has a "converter" script to load, validate, and convert the data into Python lists and dicts (and from there, it is easily exportable as JSON, pickles, or whatever).
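The load, validate, and convert steps described above can be sketched roughly as follows. This is a minimal illustration only: the XML layout, the element names, and the `load_book_codes` function are assumptions made up for this example and are not taken from the actual DataFiles folder or its converter scripts.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML layout invented for this sketch;
# the real files in the DataFiles folder may look quite different.
SAMPLE_XML = """\
<BibleBooksCodes>
  <!-- XML allows comments like this one, which JSON does not -->
  <book><code>GEN</code><name>Genesis</name><chapters>50</chapters></book>
  <book><code>EXO</code><name>Exodus</name><chapters>40</chapters></book>
</BibleBooksCodes>
"""


def load_book_codes(xml_text):
    """Load the XML, validate each entry, and return a dict keyed by book code."""
    root = ET.fromstring(xml_text)
    books = {}
    for element in root.findall("book"):
        code = element.findtext("code")
        name = element.findtext("name")
        chapters = element.findtext("chapters")
        # Validation: all fields present, chapter count numeric, codes unique
        if not code or not name or not chapters:
            raise ValueError("book entry is missing a required field")
        if not chapters.isdigit():
            raise ValueError(f"chapter count for {code} is not a number")
        if code in books:
            raise ValueError(f"duplicate book code: {code}")
        books[code] = {"name": name, "chapters": int(chapters)}
    return books


book_codes = load_book_codes(SAMPLE_XML)
# Once the data is in plain dicts, exporting to JSON is a one-liner
json_text = json.dumps(book_codes, indent=2)
```

Keeping the hand-edited XML as the source and treating the Python dicts and JSON as derived output means the comments stay with the data that people actually maintain.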
This led on to writing modules that provide an API for these datasets; these modules can be found in the Reference folder.
Other modules to import and export various Bible formats can be found in the Formats folder.
An internal, indexed Bible resource representation was created, as seen in the Internals folder. The internal representation is based on USFM lines (because USFM is used in real life for MANY Bible translations), along with some custom markers that hold additional fields and ease processing (such as segment end markers).
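To illustrate the idea of a representation built from USFM lines plus custom markers, here is a small sketch. The function names and the `p-end` marker are hypothetical and chosen only for this example; the markers and data structures actually used in the Internals folder may differ.

```python
def parse_usfm_line(line):
    """Split one USFM line into a (marker, text) pair."""
    line = line.strip()
    if not line.startswith("\\"):
        return ("", line)  # continuation text with no marker of its own
    marker, _, text = line[1:].partition(" ")
    return (marker, text)


def to_internal_lines(usfm_text):
    """Convert USFM text to a list of (marker, text) entries.

    A hypothetical 'p-end' entry is added wherever a paragraph closes,
    so that later processing does not have to infer segment ends.
    """
    entries = []
    paragraph_open = False
    for raw_line in usfm_text.splitlines():
        if not raw_line.strip():
            continue  # skip blank lines
        marker, text = parse_usfm_line(raw_line)
        # A new paragraph or chapter closes any paragraph that is still open
        if marker in ("p", "c") and paragraph_open:
            entries.append(("p-end", ""))
            paragraph_open = False
        entries.append((marker, text))
        if marker == "p":
            paragraph_open = True
    if paragraph_open:
        entries.append(("p-end", ""))
    return entries


SAMPLE_USFM = r"""\id GEN
\c 1
\p
\v 1 In the beginning God created the heavens and the earth.
"""

internal_lines = to_internal_lines(SAMPLE_USFM)
```

With explicit end markers in the list, an exporter for a nested format (such as XML or HTML) can close elements as it reaches them instead of tracking state itself.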
(More of the original design thinking and (oldish) documentation can be seen in the Documentation folder.)
Eventually BibleOrgSys became the basis for the Freely-Given.org Bible Drop Box service, which had the benefit of making the library robust enough to handle Bible formats in various stages of correctness.
- A way to define a Bible along with metadata (such as name, abbreviation, publication year, translators, copyright, licence, etc., etc.)
- Must be able to handle original language (Hebrew, Greek) Bibles as well as translations in any world language
- A way to iterate works, "books", chapters, and verses (W/B/C/V)
- A way to communicate this W/B/C/V information between windows and even between apps -- see here
- A way to map between different versifications, i.e., the numbering of chapters and verses can differ but we still want to find the same content. (NOTE: Although versification mapping is allowed for in most parts of the system, this vital part has never been completed -- see here for more information on what is expected.)
- A standard, internal Bible representation
- Parsers to read various different Bible formats into the internal representation -- these might be individual files or folders of files (which can be loaded by multiple threads)
- The parsers require a strict mode to catch and document errors (for a Bible translator trying to fix/improve their work) and also a forgiving mode to load a Bible file into a reader even if it's not perfect
- Exporters to write various different Bible formats from the internal representation
- A way to integrate additional resources (such as Bible dictionaries) with internal Bibles (e.g., to create a Bible-study app)
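As a rough illustration of the book/chapter/verse references and versification mapping mentioned in the list above, consider the sketch below. It is not the BibleOrgSys API (the list notes that versification mapping has never been completed); the `VerseRef` type, the `map_reference` function, and the table name are all hypothetical. The two sample entries reflect well-known differences between common English numbering and Hebrew numbering.

```python
from typing import NamedTuple


class VerseRef(NamedTuple):
    """A book/chapter/verse reference (hypothetical type for this sketch)."""
    book: str
    chapter: int
    verse: int


# Illustrative table with only two entries; a real mapping would need many more.
ENGLISH_TO_HEBREW = {
    # English Malachi 4:1 is Malachi 3:19 in Hebrew numbering
    VerseRef("MAL", 4, 1): VerseRef("MAL", 3, 19),
    # The superscription of Psalm 51 is counted as verses in Hebrew numbering
    VerseRef("PSA", 51, 1): VerseRef("PSA", 51, 3),
}


def map_reference(ref, table):
    """Return the equivalent reference in the target versification.

    References that are not in the table are assumed to be numbered
    identically in both schemes and are returned unchanged.
    """
    return table.get(ref, ref)


mapped = map_reference(VerseRef("MAL", 4, 1), ENGLISH_TO_HEBREW)
unchanged = map_reference(VerseRef("GEN", 1, 1), ENGLISH_TO_HEBREW)
```

Storing only the differences keeps the table small, at the cost of assuming that anything unlisted is identical in both schemes.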
An old version of BibleOrgSys is on PyPI, but we are in the (slow) process of breaking BibleOrgSys into smaller components and putting each of them separately onto PyPI. We hope to implement versification mapping and complete the PyPI uploading by the end of 2024.
We are also investigating ways of speeding up the system including:
- C or Rust functions for CPython
- Alternative Python implementations and build tools such as PyPy, Py2Exe, or PyInstaller
- A stand-alone Rust or Golang (Go) Bible compiler (to build to our internal Bible format)