TST: Add LzwCodec for encoding #2883
Can you clarify what you intend to do with the encoding?
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2883      +/-   ##
==========================================
+ Coverage   96.24%   96.27%   +0.02%
==========================================
  Files          51       52       +1
  Lines        8625     8692      +67
  Branches     1722     1734      +12
==========================================
+ Hits         8301     8368      +67
  Misses        187      187
  Partials      137      137
I want to have an easy way to check if the decoding does the right thing. This allows us to change the LZW implementation with the confidence that we don't break workflows.
@Lucas-C Maybe the encoder is interesting for fpdf2? It wasn't in any discussion, but only mentioned in py-pdf/fpdf2#691
I would recommend to make the encoder roughly the same as the decoder, id est passing |
@stefan6419846 Done :-) 👍
Are we sure that we want to expose this as a public module while we do not officially support encoding objects with LZW? I tend to make |
Codecs might be kept private and called directly from filters. As for the tests as they currently are, you have proved that you have a function and its inverse: I would have liked at least a minimal test that also checks the compressed data.
Good point, I made it private. I'm uncertain about the module name, but as it is private it should not matter too much.
@pubpub-zz You're absolutely right. I've added two examples. I would feel even better if there were documented (non-encoded, encoded) pairs that we could add to the test suite, but for now that should be fine.
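To illustrate why a round-trip check alone is not enough, here is a minimal sketch. It does not use the codec from this PR: `HexCodec`, `BrokenCodec`, `KNOWN_PAIRS` and both helper functions are hypothetical names, and hex encoding from the standard library stands in for LZW only because its output is easy to verify by hand.

```python
import binascii


class HexCodec:
    """Stand-in codec (assumption): hex encoding instead of LZW."""

    def encode(self, data: bytes) -> bytes:
        return binascii.hexlify(data)

    def decode(self, data: bytes) -> bytes:
        return binascii.unhexlify(data)


class BrokenCodec:
    """Wrong on purpose: it never transforms the data, yet still round-trips."""

    def encode(self, data: bytes) -> bytes:
        return data

    def decode(self, data: bytes) -> bytes:
        return data


# Hand-checked (non-encoded, encoded) pairs for the stand-in codec.
KNOWN_PAIRS = [
    (b"", b""),
    (b"AB", b"4142"),
    (b"\x00\xff", b"00ff"),
]


def round_trips(codec) -> bool:
    """Only proves that decode() inverts encode()."""
    return all(codec.decode(codec.encode(plain)) == plain for plain, _ in KNOWN_PAIRS)


def matches_known_pairs(codec) -> bool:
    """Also pins the encoded bytes, so a consistent mistake is caught."""
    return all(
        codec.encode(plain) == encoded and codec.decode(encoded) == plain
        for plain, encoded in KNOWN_PAIRS
    )


print(round_trips(HexCodec()), matches_known_pairs(HexCodec()))
print(round_trips(BrokenCodec()), matches_known_pairs(BrokenCodec()))
```

`BrokenCodec` passes the round-trip check but fails on the pinned pairs, which is the gap the extra examples are meant to close.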
Yes, it could make for an interesting addition! I opened py-pdf/fpdf2#1271 to suggest this feature. Just to be clear @MartinThoma : are you explicitly allowing |
This is a change I wanted to do for a while :-)
While we might only need decoding for pypdf, having both decoding and encoding in one class massively helps with testing. We can still get it wrong, but it's harder to get both the encoder and the decoder wrong in a consistent way.
This PR adds an abstract `Codec` class as well as an `LzwCodec` implementation.
We could even use hypothesis for property-based testing for all codecs :-)
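As a rough sketch of the idea (not the code in this PR): an abstract base class with `encode` and `decode`, plus a textbook LZW that writes fixed 16-bit codes. `ToyLzwCodec` is a hypothetical name, the method signatures are an assumption, and the output is deliberately not the variable-width bit layout that the PDF `LZWDecode` filter expects. The seeded `random` loop stands in for a hypothesis test; with hypothesis installed, `@given(st.binary())` could replace it.

```python
import random
import struct
from abc import ABC, abstractmethod


class Codec(ABC):
    """Assumed interface: one class owns both directions."""

    @abstractmethod
    def encode(self, data: bytes) -> bytes:
        """Return the encoded form of data."""

    @abstractmethod
    def decode(self, data: bytes) -> bytes:
        """Return the original bytes for encoded data."""


class ToyLzwCodec(Codec):
    """Textbook LZW with fixed 16-bit big-endian codes (not the PDF format)."""

    MAX_CODES = 1 << 16  # a 16-bit code cannot address more entries

    def encode(self, data: bytes) -> bytes:
        table = {bytes([i]): i for i in range(256)}
        codes = []
        current = b""
        for byte in data:
            candidate = current + bytes([byte])
            if candidate in table:
                current = candidate
                continue
            codes.append(table[current])
            if len(table) < self.MAX_CODES:
                table[candidate] = len(table)
            current = bytes([byte])
        if current:
            codes.append(table[current])
        return struct.pack(f">{len(codes)}H", *codes)

    def decode(self, data: bytes) -> bytes:
        codes = struct.unpack(f">{len(data) // 2}H", data)
        if not codes:
            return b""
        table = {i: bytes([i]) for i in range(256)}
        previous = table[codes[0]]
        result = bytearray(previous)
        for code in codes[1:]:
            if code in table:
                entry = table[code]
            elif code == len(table):
                # The code refers to the entry that is about to be created.
                entry = previous + previous[:1]
            else:
                raise ValueError(f"invalid LZW code {code}")
            result += entry
            if len(table) < self.MAX_CODES:
                table[len(table)] = previous + entry[:1]
            previous = entry
        return bytes(result)


codec = ToyLzwCodec()
rng = random.Random(0)  # fixed seed keeps the run reproducible
for _ in range(200):
    # A tiny alphabet forces repeated substrings, so the table is exercised.
    sample = bytes(rng.choice(b"AB\x00\xff") for _ in range(rng.randrange(64)))
    assert codec.decode(codec.encode(sample)) == sample
```

Because the encoder and decoder build their tables independently, a mistake in only one of them shows up as a failed round trip on some input.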