Correct handling of unbraced kerns followed by spaces. #751

kohler · 2017-06-30T01:52:46Z

Example: a\mkern-1mu b should be parsed like a\mkern{-1mu}b. It wasn't, now it is.

Did not realize that Parser.nextToken.text can contain spaces (it can). Handle that.

This may allow, for instance, #727 to use the actual TeX definitions, rather than adding spurious braces.
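
To make the failure concrete, here is a minimal standalone sketch that runs the two size regexes from this PR's src/Parser.js diff against a size with and without a trailing space. It does not call KaTeX; the variable names and the sample strings are assumptions chosen for illustration, standing in for the text the parser accumulates when a token carries a trailing space.

```javascript
// Regexes copied from the src/Parser.js diff in this PR.
// Everything else here (names, sample strings) is illustrative only.
const sizeBefore = /^[-+]? *(?:$|\d+|\d+\.\d*|\.\d*) *[a-z]{0,2}$/;
const sizeAfter = /^[-+]? *(?:$|\d+|\d+\.\d*|\.\d*) *[a-z]{0,2}\s*$/;

// Hypothetical accumulated text for "\mkern-1mu b": the size, then a space.
const plainSize = "-1mu";
const sizeWithTrailingSpace = "-1mu ";

console.log(sizeBefore.test(plainSize));             // true
console.log(sizeBefore.test(sizeWithTrailingSpace)); // false: trailing space rejected
console.log(sizeAfter.test(sizeWithTrailingSpace));  // true: trailing whitespace accepted
```

The old pattern anchors `$` directly after the unit, so any text ending in a space stops matching; the new pattern tolerates whitespace before the end of the string.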

kevinbarabash

Looks good. Small nit regarding how we consume spaces.

kevinbarabash · 2017-06-30T02:37:59Z

src/Parser.js

@@ -789,7 +789,7 @@ Parser.prototype.parseSizeGroup = function(optional) {
    let res;
    if (!optional && this.nextToken.text !== "{") {
        res = this.parseRegexGroup(
-            /^[-+]? *(?:$|\d+|\d+\.\d*|\.\d*) *[a-z]{0,2}$/, "size");
+            /^[-+]? *(?:$|\d+|\d+\.\d*|\.\d*) *[a-z]{0,2}\s*$/, "size");


We use this.consumeSpaces(); in other places. It might be good to do the same here for consistency. It should be noted that consumeSpaces does not consume all whitespace, only plain old spaces. I wonder if it should be upgraded?

There's no way to call consumeSpaces() here; this is before the tokens are created. It would need to be moved into this.parseRegexGroup. And that would be a bigger project: it would change the meaning. Please take the commit as is.

How will this \s* interact with newlines? I suppose KaTeX doesn't support the double newline -> \par conversion, but if it did, I suppose this regex might cause trouble...

Agreed with @kohler on consumeSpaces -- that's only for when spaces have already been parsed by the tokenizer.

To avoid further discussion on hypothetical future problems or inconsistencies, I will make the regex spaces only. :)

And spaces-only is certainly more consistent with the rest of the regex! Thanks for pushing further.

Actually, I think the behavior you've implemented with \s* is more consistent with the rest of KaTeX, which treats \n and spaces identically. (just did some tests on https://khan.github.io/KaTeX/ ) I think it would be confusing for \kern 1em\n... to fail.

I would actually argue for \s* in both places. The following renders like x\kern 1em y in LaTeX.

x\kern 1em y

Ah, but whitespace conversion is already handled by KaTeX's lexer, which changes all [ \r\n\t]+ sequences to a single space character.

New commit has test making this clear.
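
A rough sketch of why a spaces-only pattern is enough once whitespace has been collapsed. This is not KaTeX's lexer: `collapseWhitespace` is a hypothetical helper that only mimics the `[ \r\n\t]+` rule described above, and `sizeSpacesOnly` is an assumed spaces-only form of the size regex (trailing ` *` in place of `\s*`).

```javascript
// Illustration only: mimics the whitespace rule described in this thread,
// replacing each run of [ \r\n\t] with a single space.
function collapseWhitespace(input) {
    return input.replace(/[ \r\n\t]+/g, " ");
}

// Assumed spaces-only variant of the size regex (not copied from the PR).
const sizeSpacesOnly = /^[-+]? *(?:$|\d+|\d+\.\d*|\.\d*) *[a-z]{0,2} *$/;

const rawSize = "1em\n";                          // size followed by a newline
const normalizedSize = collapseWhitespace(rawSize); // "1em "

console.log(sizeSpacesOnly.test(rawSize));        // false: raw newline is not a space
console.log(sizeSpacesOnly.test(normalizedSize)); // true: newline became a space
```

If the parser only ever sees text that has been through this kind of normalization, a newline after the size never reaches the regex as `\n`, so ` *` and `\s*` behave the same in practice.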

Ah, nice! (And that's where double-newline handling would go.)

kevinbarabash · 2017-06-30T02:38:57Z

test/katex-spec.js

@@ -1037,17 +1040,42 @@ describe("A non-braced kern parser", function() {
    const emKern = "\\kern1em";
    const exKern = "\\kern 1 ex";
    const muKern = "\\kern 1mu";
+    const abKern1 = "a\\mkern1mub";
+    const abKern2 = "a\\kern-1mub";
+    const abKern3 = "a\\kern-1mu b";


Thanks for filling out the tests. Only abKern3 would fail without the changes to Parser.js.

edemaine · 2017-06-30T15:11:47Z

I didn't even realize that unbraced size arguments were supported by KaTeX! (because I always added extra spaces, it seems) Thanks for fixing this.

edemaine · 2017-06-30T15:32:12Z

I approve this PR, but I'll leave it to @kevinbarabash to confirm and push the button.

kevinbarabash · 2017-06-30T17:40:20Z

@kohler thanks for the changes. @edemaine thanks for looking this over. Always good to have a second pair of eyes.

kevinbarabash reviewed Jun 30, 2017

kevinbarabash self-assigned this Jun 30, 2017

Correct handling of unbraced kerns followed by spaces.

922db7d

Did not realize that `Parser.nextToken.text` can contain spaces (it can). Handle that.

kohler force-pushed the fix-unbraced-kern branch from 07c527e to 922db7d Compare June 30, 2017 15:28

edemaine mentioned this pull request Jun 30, 2017

Implement \coloneqq, \colonequals, etc. based on mathtools and colonequals #727

Merged

kevinbarabash merged commit 87b9123 into KaTeX:master Jun 30, 2017

kohler deleted the fix-unbraced-kern branch June 30, 2017 19:10
