pdf variant does not successfully include ⛄ (U+26C4 SNOWMAN WITHOUT SNOW) #6050
Comments
Font issue, I'd say.

$ pbpaste | echars

They are in the same block in Unicode, but of course fonts don't cover whole blocks.
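The point that both characters sit in the same Unicode block (Miscellaneous Symbols, U+2600 to U+26FF) can be checked with Python's standard `unicodedata` module. This is only an illustrative sketch: the `describe` helper and the `MISC_SYMBOLS` constant are names made up for this example, and it says nothing about which fonts cover either code point.

```python
import unicodedata

# Miscellaneous Symbols block: U+2600..U+26FF (inclusive).
MISC_SYMBOLS = range(0x2600, 0x2700)


def describe(ch):
    """Return (code point label, Unicode name, in-Miscellaneous-Symbols flag)."""
    cp = ord(ch)
    return f"U+{cp:04X}", unicodedata.name(ch), cp in MISC_SYMBOLS


# The two characters discussed in this issue.
for ch in "\u2603\u26c4":
    print(describe(ch))
```

Sharing a block is a property of the Unicode code charts only; a font's character map lists individual code points, so one snowman can be present while the other is missing.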
Is this an issue with the pdf that comes out of xml2rfc, or the pdfized pdf-rendering-of-the-htmlized-text that comes out of the datatracker? If the former, let's move this to the xml2rfc repo?
PDF generated by
in case it wasn't clear, i don't intend draft-dkg-rfcediting-non-ascii-ietf-tooling to ever become an RFC! that's just a test harness so i can push back on some of the FUD i was hearing about how non-ASCII text might be broken. I'm unaware of any RFC use case that would need either SNOWMAN character, but the demonstration is intended to highlight problems and identify structural issues in unicode coverage and transmission before some RFC really does try to use a symbol that isn't well-supported in one of the output formats.

The problem pdf i found was generated by the datatracker -- i don't know what toolchain was used. When generating the file locally with

thanks for looking into it, i really appreciate all the work that has been done on making the RFC series capable of including robust, modern documents with a stable and expansive character set.
Thanks @dkg - I understand what you're doing - and what you provide above is enough for me to know which invocation of weasyprint to study. It's the one in the xml2rfc environment used by the datatracker when it generates formats from xml submissions, which may well not have the right font set installed - we'll go look.
(for the record, this I-D has been much more useful than just identifying the SNOWMAN weirdness -- it demonstrated that use cases i heard active concerns about during IETF 117 (cyrillic text, mathematical symbols) do work fine. what you see in my reports are the corner cases where things remain broken -- but the real takeaway from this for me is that the use cases people actually care about are not broken. thanks for all the work that has gone into this!)
In the web view, on my machine, the "snowman without snow" comes from the "Apple Color Emoji" font, and the "snowman with snow" comes from the "Menlo" font. I guess that's because the CSS says

That same CSS is passed into Weasyprint when making the PDF, and these are the fonts that end up in the PDF:

Not sure where/why "DejaVu" is picked up from, but I guess it doesn't have the character. Since we want to use Noto, should we add https://fonts.google.com/noto/specimen/Noto+Emoji?
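For readers unfamiliar with why two similar characters can end up in different fonts, here is a toy model of per-character font fallback. It is a sketch only: `pick_font`, the font names `FontA` and `FontB`, and their coverage sets are hypothetical placeholders, not the real stack from the CSS or the real coverage of any font named in this thread.

```python
def pick_font(codepoint, stack, coverage):
    """Return the first font in `stack` whose coverage includes `codepoint`.

    Returns None when no font in the stack has a glyph, which corresponds
    to a missing character in the rendered output.
    """
    for font in stack:
        if codepoint in coverage.get(font, set()):
            return font
    return None


# Hypothetical data for illustration only (not measured from real fonts).
STACK = ["FontA", "FontB"]
COVERAGE = {
    "FontA": {0x2603},          # has SNOWMAN only
    "FontB": {0x2603, 0x26C4},  # has both snowmen
}

for cp in (0x2603, 0x26C4):
    print(f"U+{cp:04X} -> {pick_font(cp, STACK, COVERAGE)}")
```

In this model, removing `FontB` from the stack makes the lookup for U+26C4 return `None` while U+2603 still resolves. That matches the shape of the symptom reported here, under the unverified assumption that the PDF environment lacks a font covering U+26C4.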
Describe the issue
draft-dkg-rfcediting-non-ascii-ietf-tooling is a test draft that contains multiple non-ascii characters. they all render just fine in the text and html variants, but the pdf variant fails to include ⛄ (U+26C4 SNOWMAN WITHOUT SNOW). it renders ☃ (U+2603 SNOWMAN) with no problem, though. Maybe this has something to do with codepoint coverage of the default fonts.