Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that Font.charToGlyph won't fail because String.fromCodePoint is given an invalid code point (issue 11768) #11769

Merged

Conversation

Snuffleupagus
Copy link
Collaborator

Please note: This patch on its own is not sufficient to address the underlying problem in the referenced issue, hence why no test-case is included since the actual bug still needs to be fixed.

As can be seen in the specification, https://tc39.es/ecma262/#sec-string.fromcodepoint, String.fromCodePoint will throw a RangeError for invalid code points.

In the event that a CMap, in a composite font, contains invalid data and/or we fail to parse it correctly, it's thus possible that the glyph mapping that we build end up with entires that cause String.fromCodePoint to throw and thus Font.charToGlyph to break.
If that happens, as is the case in issue #11768, significant portions of a page/document may fail to render which seems very unfortunate.

While this patch doesn't fix the underlying problem, it's hopefully deemed useful not only for the referenced issue but also to prevent similar bugs in the future.

@Snuffleupagus Snuffleupagus force-pushed the charToGlyph-fontCharCode-range branch from 9651efe to 3e4bb4f Compare March 31, 2020 21:44
@Snuffleupagus Snuffleupagus changed the title Ensure that Font.charToGlyph won't fail completely (issue 11768) Ensure that Font.charToGlyph won't fail because String.fromCodePoint is given an invalid code point (issue 11768) Mar 31, 2020
@Snuffleupagus Snuffleupagus force-pushed the charToGlyph-fontCharCode-range branch from 3e4bb4f to 251c187 Compare March 31, 2020 21:45
@Snuffleupagus
Copy link
Collaborator Author

/botio-linux preview

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/4d1091cde93c899/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/4d1091cde93c899/output.txt

Total script time: 2.44 mins

Published

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/1c937e61dc82d6c/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/e2e9e68fdf4f92e/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://54.215.176.217:8877/e2e9e68fdf4f92e/output.txt

Total script time: 1.79 mins

  • Font tests: FAILED
  • Unit tests: FAILED
  • Regression tests: FAILED

Image differences available at: http://54.215.176.217:8877/e2e9e68fdf4f92e/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/1c937e61dc82d6c/output.txt

Total script time: 19.70 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/1c937e61dc82d6c/reftest-analyzer.html#web=eq.log

typeof fontCharCode === "number"
? String.fromCodePoint(fontCharCode)
: "";
let fontChar = "";
Copy link
Contributor

@timvandermeij timvandermeij Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method called often? If so, it may be beneficial to leave it undefined here and only set it to the empty string in the exception cases to avoid unnecessary string allocation in the common case when the code point is valid.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method called often?

Yes, but comparing the run-time of the tests here to other PRs there really doesn't seem to be any noticeable slowdowns overall (I think browsers may be able to optimize/re-use strings internally).

and only set it to the empty string in the exception cases

While I did consider that, it's not clear to me that it's entirely necessary here (also the code would become a bit "messy" in that case IMHO).
Also, you're still having to allocate the variable (albeit implicitly to undefined) which is probably why it doesn't seem to matter in practice.

: "";
let fontChar = "";
if (typeof fontCharCode === "number") {
if (fontCharCode < 0x110000) {
Copy link
Contributor

@timvandermeij timvandermeij Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it perhaps be better to check for < 0 or > 0x10FFFF just like in the specification, so also negative numbers are handled here? If that is not deemed necessary, can we at least change this to <= 0x10FFFF so it's easier to match with the spec?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it perhaps be better to check for < 0 or > 0x10FFFF just like in the specification,

I thought about the < 0 case, but honestly it seems to me that that would indicate much larger underlying problems in the code to the point that I'd not mind a breaking error (also considering that the String.fromCodePoint changes landed ~1.5 years ago and this is the first time they've caused a problem).

can we at least change this to <= 0x10FFFF so it's easier to match with the spec?

If you think the current formatting is too confusing, maybe because of the amount of digits involved, I suppose that that can be changed :-)

@Snuffleupagus Snuffleupagus force-pushed the charToGlyph-fontCharCode-range branch from 251c187 to 14ea345 Compare April 1, 2020 20:39
…nt` is given an invalid code point (issue 11768)

*Please note:* This patch on its own is *not* sufficient to address the underlying problem in the referenced issue, hence why no test-case is included since the *actual* bug still needs to be fixed.

As can be seen in the specification, https://tc39.es/ecma262/#sec-string.fromcodepoint, `String.fromCodePoint` will throw a RangeError for invalid code points.

In the event that a CMap, in a composite font, contains invalid data and/or we fail to parse it correctly, it's thus possible that the glyph mapping that we build end up with entires that cause `String.fromCodePoint` to throw and thus `Font.charToGlyph` to break.
If that happens, as is the case in issue 11768, significant portions of a page/document may fail to render which seems very unfortunate.

While this patch doesn't fix the underlying problem, it's hopefully deemed useful not only for the referenced issue but also to prevent similar bugs in the future.
@Snuffleupagus Snuffleupagus force-pushed the charToGlyph-fontCharCode-range branch from 14ea345 to 87142a6 Compare April 3, 2020 07:49
@pdfjsbot
Copy link

pdfjsbot commented Apr 3, 2020

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/aaffa554e119241/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 3, 2020

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/243971614f89fb7/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Apr 3, 2020

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/aaffa554e119241/output.txt

Total script time: 19.74 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/aaffa554e119241/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

pdfjsbot commented Apr 3, 2020

From: Bot.io (Windows)


Failed

Full output at http://54.215.176.217:8877/243971614f89fb7/output.txt

Total script time: 60.00 mins

@timvandermeij timvandermeij merged commit 702fec5 into mozilla:master Apr 4, 2020
@timvandermeij
Copy link
Contributor

Thanks!

@Snuffleupagus Snuffleupagus deleted the charToGlyph-fontCharCode-range branch April 4, 2020 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants