Return string rather than bytes from SVG2PDFPreprocessor conversion. #1025

SpencerPark · 2019-05-14T01:29:36Z

The conversion should always return valid json but it seems it was slipping by unnoticed for a little while. As suggested in #904, #847 is causing the output extraction preprocessor to try and convert the bytestring to json which is raising an error.

This should fix #904.

MSeal · 2019-05-14T22:40:35Z

I commented in the issue, but I believe this was already addressed in nbconvert 5.5 release. Can you confirm the issue is fix / present in 5.5 before we look at making any change here?

Converting to ascii here is not safe either.

SpencerPark · 2019-05-14T23:27:21Z

Woops my bad, thought I was up to date. It is fixed in 5.5.0. #904 can be closed regardless of the rest of our discussion here.

The conversion to ascii shouldn't be a problem here as the string is encoded in base64 so all chars are in ascii ranges.

I still think it is strange to be returning bytes from the pdf conversion though as it adds output to a notebook cell (even if just intermediately). The newest extractoutput preprocessor handles both cases but the comments suggest it is expecting a b64 string (as opposed to b64 bytestring). binascii's a2b_base64 handles it by letting it pass though.

# Binary files are base64-encoded, SVG is already XML
if mime_type in {'image/png', 'image/jpeg', 'application/pdf'}:
    # data is b64-encoded as text (str, unicode),
    # we want the original bytes
    data = a2b_base64(data)

I'll defer to your expertise on whether it is acceptable to have invalid JSON while the notebook json is in memory (an represented as a python object where bytes is fine) potentially for performance reasons? nbformat also seems to handle bytes fine and encodes it during the serialization but displaying a pdf from jupyter would include a base64 string in the output.

MSeal · 2019-05-14T23:58:26Z

Yeah I tried making a similar change and there were some valid use-cases and patterns where the binary is expected for the in-memory representation. I can go dredge up the exact patterns but we probably can't change it in 5.x versions.

Going to close for now as the root issue is solved.

SpencerPark · 2019-05-15T00:41:15Z

Fair enough, thanks for the review!

Return string rather than bytes from SVG2PDFPreprocessor conversion.

95c3d3c

MSeal closed this May 14, 2019

MSeal added this to the no action milestone Sep 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Return string rather than bytes from SVG2PDFPreprocessor conversion. #1025

Return string rather than bytes from SVG2PDFPreprocessor conversion. #1025

SpencerPark commented May 14, 2019

MSeal commented May 14, 2019

SpencerPark commented May 14, 2019

MSeal commented May 14, 2019

SpencerPark commented May 15, 2019

Return string rather than bytes from SVG2PDFPreprocessor conversion. #1025

Return string rather than bytes from SVG2PDFPreprocessor conversion. #1025

Conversation

SpencerPark commented May 14, 2019

MSeal commented May 14, 2019

SpencerPark commented May 14, 2019

MSeal commented May 14, 2019

SpencerPark commented May 15, 2019