There are quite a few EPUB books that don't conform to the EPUB specification and thus fail the parsing validation checks in EpubReader. Sometimes it is desirable to turn off some of those checks and ignore the parts of the book that couldn't be parsed.
Proposed solution
Go through all EPUB parsing validation checks and create one configuration option per check, so that each check can be turned on or off individually. For example, the value of the package/manifest/item/id attribute must be unique; otherwise it is impossible to determine which manifest item a spine entry references. A new PackageReaderOptions.SkipManifestItemsWithDuplicateIds configuration property will tell PackageReader whether to skip duplicate manifest items or throw an exception.
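To make the duplicate ID example concrete, here is a minimal sketch of how such a toggle could behave. Only the property name SkipManifestItemsWithDuplicateIds comes from the proposal. The ManifestItem and ManifestReader types, the stand-in PackageReaderOptions class, the choice to keep the first occurrence, and the exception type are assumptions made for illustration and are not the actual EpubReader code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a parsed <item> element of the manifest.
public sealed class ManifestItem
{
    public ManifestItem(string id, string href) { Id = id; Href = href; }
    public string Id { get; }
    public string Href { get; }
}

// Stand-in options class; only the property name is taken from the proposal.
public sealed class PackageReaderOptions
{
    // false (the default here) means "throw on duplicate IDs".
    public bool SkipManifestItemsWithDuplicateIds { get; set; }
}

public static class ManifestReader
{
    // Returns the items with unique IDs, either skipping duplicates
    // or throwing, depending on the option.
    public static List<ManifestItem> ReadItems(
        IEnumerable<ManifestItem> parsedItems, PackageReaderOptions options)
    {
        var result = new List<ManifestItem>();
        var seenIds = new HashSet<string>();
        foreach (ManifestItem item in parsedItems)
        {
            // HashSet.Add returns false when the ID was already seen.
            if (!seenIds.Add(item.Id))
            {
                if (options.SkipManifestItemsWithDuplicateIds)
                {
                    continue; // assumption: keep the first occurrence, drop later ones
                }
                // Assumption: a generic exception type, used only for this sketch.
                throw new InvalidOperationException(
                    $"Duplicate manifest item ID: \"{item.Id}\".");
            }
            result.Add(item);
        }
        return result;
    }
}
```

With the option set to true, a manifest containing two items with the ID "a" would yield only the first of them; with the option left at false, the same input would throw.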
Additionally, create three configuration presets:
STRICT: all validation checks are enabled;
RELAXED: ignore the errors that are most common in real-world EPUB books (default option);
IGNORE_ALL_ERRORS: turn off all validation checks and try to salvage as much data as possible.
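The three presets could be sketched as a factory that maps a preset to a set of per-check flags. The preset names come from the list above. The ValidationPreset enum, the ValidationOptions class, the SkipInvalidSpineItems flag, and the decision about which checks RELAXED turns off are all hypothetical, chosen only to show the shape of the idea.

```csharp
using System;

// Preset names as given in the proposal; the enum itself is hypothetical.
public enum ValidationPreset
{
    STRICT,
    RELAXED,
    IGNORE_ALL_ERRORS
}

// Hypothetical flat options class with one flag per validation check.
public sealed class ValidationOptions
{
    // Name taken from the proposal.
    public bool SkipManifestItemsWithDuplicateIds { get; set; }

    // Invented second check, present only to show that presets can differ per flag.
    public bool SkipInvalidSpineItems { get; set; }

    public static ValidationOptions FromPreset(ValidationPreset preset)
    {
        switch (preset)
        {
            case ValidationPreset.STRICT:
                // All flags stay false, so every check is enforced.
                return new ValidationOptions();
            case ValidationPreset.RELAXED:
                // Assumption: duplicate IDs count as a "common" error here.
                return new ValidationOptions
                {
                    SkipManifestItemsWithDuplicateIds = true
                };
            case ValidationPreset.IGNORE_ALL_ERRORS:
                // Every check is turned off.
                return new ValidationOptions
                {
                    SkipManifestItemsWithDuplicateIds = true,
                    SkipInvalidSpineItems = true
                };
            default:
                throw new ArgumentOutOfRangeException(nameof(preset));
        }
    }
}
```

Because a preset only supplies starting values, a caller could still override a single check afterwards, for example by starting from RELAXED and setting SkipInvalidSpineItems to true.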
Additional context
Some of those options are already implemented and documented here: https://os.vers.one/EpubReader/malformed-epub/