Library-internal representation of floats #1594
-
@MartinThoma
-
I am the co-Project Leader of ISO 32000 (the core PDF spec) and CTO at the PDF Association. What used to be Annex C "Implementation Limits" in earlier editions of the PDF spec was removed a number of years ago because it was not vendor-neutral: it only reflected the implementation choices of a single vendor on a single platform at a point in time about 15 years ago. Obviously hardware and software have both changed in that period, and different implementations have different requirements. Using doubles is common practice in PDF parsers; however, that brings the usual problems of accumulated error, precision, and accuracy, so you must also take care that calculations remain within an acceptable range. If you are not rendering, there is certainly less to worry about, but there are PDFs out there that do really silly things (like explicitly rescaling very large content down to the unit square (or less) and then scaling it back up again for no reason, while expecting perfect alignment of objects). Note also that PDF now has several features where the expectation of 32-bit-only integer values is unreasonable or simply wrong: new crypto, geospatial features, measurement properties, and movie activation, to name a few. Depending on what kind of software you are implementing, these may also influence your design choices.
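To make the scale-down-then-scale-up concern concrete, here is a minimal Python sketch that measures the exact round-trip error when both steps are done in doubles. The helper name `round_trip_error`, the sample coordinates, and the scale factor are all made up for illustration; they are not taken from any real PDF or library.

```python
from fractions import Fraction


def round_trip_error(value: float, scale: float) -> Fraction:
    """Exact absolute error after scaling down and back up in doubles."""
    shrunk = value * scale             # product is rounded to a double
    restored = shrunk * (1.0 / scale)  # inverse is rounded, then the product is rounded
    # Fraction(float) is exact, so the difference is the true error.
    return abs(Fraction(restored) - Fraction(value))


# Hypothetical page-space coordinates and a hypothetical "shrink" factor.
coords = [72.0, 595.276, 841.89, 123456.789]
scale = 1.0 / 1_000_000

errors = [round_trip_error(c, scale) for c in coords]
worst_relative = max(float(err / Fraction(c)) for err, c in zip(errors, coords))
print("worst relative round-trip error:", worst_relative)
```

Any error here is tiny in relative terms, but it need not be zero, so code that checks whether two objects align after such a transform would probably want a tolerance (for example `math.isclose`) rather than `==`.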
-
PyMuPDF: https://discord.com/channels/770681584617652264/983871937711341618/1070764033529086103
-
PDF documents contain lots of floats. So far, PyPDF has parsed them as Decimal, but we are thinking about switching to float (an IEEE 754 double).
How do other libraries do this?
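For anyone comparing the two options, here is a small Python sketch of the trade-off. `parse_real` is a hypothetical helper written for this post, not PyPDF's actual parsing code; it only assumes that a PDF real number arrives as an ASCII token such as `0.1` or `-.002`.

```python
from decimal import Decimal


def parse_real(token: bytes, use_decimal: bool = False):
    """Parse a PDF real-number token as Decimal or as float (hypothetical helper)."""
    text = token.decode("ascii")
    return Decimal(text) if use_decimal else float(text)


as_decimal = parse_real(b"0.1", use_decimal=True)
as_float = parse_real(b"0.1")

# Decimal keeps the decimal digits written in the file, so this is exactly 0.3.
decimal_sum = as_decimal * 3
# A double stores the nearest binary fraction to 0.1, so the product is slightly off.
float_sum = as_float * 3

print(decimal_sum)  # 0.3
print(float_sum)    # 0.30000000000000004
```

Decimal preserves what the file says, while float is the hardware-native type and is generally much faster for arithmetic. Which matters more depends on whether the library needs to write values back unchanged or mostly does calculations with them.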