yield in nested function should be parsed as identifier #552

RReverser · 2017-05-20T13:28:04Z

function *f1() {
  function g() {
    return yield / 1;
  }
}

function *f2() {
  () => yield / 1
}

function *f3() {
  ({
    g() {
      return yield / 1;
    }
  })
}

and other variations currently cause "Unterminated regular expression".

This is due to us (#525 cc @marijnh) trying to extend the Sweet.js algorithm to distinguish yield as an operator from yield as an identifier at the tokenizer level, but, after playing around, I'm not sure it's possible to do that correctly for all variations of nested functions.
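To make the ambiguity concrete, here is a minimal sketch that runs in plain JavaScript and does not involve Acorn at all. The names makeGenerators, outer and direct are made up for illustration. The source is passed through new Function so that it is always evaluated as sloppy-mode code, where yield is a legal identifier outside generator bodies.

```javascript
// Hypothetical demo names; nothing here is part of Acorn.
const makeGenerators = new Function(`
  function* outer() {
    function g() {
      var yield = 8;      // plain identifier: g is not a generator
      return yield / 2;   // so this "/" is a division operator
    }
    return g();
  }
  function* direct() {
    // Directly in a generator body, yield is an operator, so the
    // "/" after it starts a regular expression literal.
    return yield /1/;
  }
  return { outer, direct };
`);

const { outer, direct } = makeGenerators();
const divisionResult = outer().next().value; // the number 4
const yieldedValue = direct().next().value;  // a RegExp object
```

The same two tokens, yield followed by a slash, need opposite readings depending on which function encloses them, and that is the information a tokenizer running without a parser would have to reconstruct.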

marijnh · 2017-05-20T18:38:18Z

It should be possible to further refine the token context tracking to accurately track what kind of function the tokenizer is currently in, and to have the loop in inGeneratorContext stop at the first function context it finds.
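A rough sketch of that idea, assuming a simplified context stack where each entry records whether it opens a function and whether that function is a generator. The entry shape (isFunction, generator) and the function name innermostFunctionIsGenerator are assumptions for illustration and are not Acorn's actual data structures.

```javascript
// Walk the context stack from the innermost entry outwards and stop
// at the first function context, whatever kind of function it is.
function innermostFunctionIsGenerator(contextStack) {
  for (let i = contextStack.length - 1; i >= 0; i--) {
    const ctx = contextStack[i];
    if (ctx.isFunction) return ctx.generator === true;
  }
  return false; // not inside any function
}

// Hypothetical stack while tokenizing the body of g in f1 above.
const insideNestedFunction = [
  { isFunction: true, generator: true },  // function *f1() {
  { isFunction: true, generator: false }, //   function g() {
  { isFunction: false },                  //     some nested block
];

// Hypothetical stack for a block directly inside the generator.
const insideGenerator = [
  { isFunction: true, generator: true },  // function *f1() {
  { isFunction: false },                  //   some nested block
];

const nestedResult = innermostFunctionIsGenerator(insideNestedFunction);
const generatorResult = innermostFunctionIsGenerator(insideGenerator);
```

With these stacks, nestedResult is false and generatorResult is true. The sketch only covers functions whose bodies open and close with braces, since those are the only ones for which a stack entry can be pushed and popped reliably.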

RReverser · 2017-05-20T19:11:42Z

For normal functions, sure, but how would we track arrow functions, for example? Even in a full-featured parser it takes quite a few hacks to detect them, and here

() => a + b - yield / 1

we would somehow need to figure out if we are still in the body of arrow function expression or not.

I'm somewhat afraid of the tokenizer getting bloated with things that were previously needed only by the parser, but if you see a clean way to do that, I'd be happy to take a look.

marijnh · 2017-05-20T20:34:35Z

Oh, right, it'd also have to work for brace-less arrow bodies. Yeah, that might be too difficult to get right. That's a shame, though.

(I noticed Esprima has a half-hearted, largely broken implementation of this algorithm, but only uses it when tokenizing without parsing. That seems a bit dodgy. Sweet.js itself has been upgraded to ES6, but fails on more corner cases than Acorn, so there's not much to be learned from that either.)

I'm not sure which would be worse, not supporting pure tokenization at all, or supporting it but having it break on corner cases like this.

RReverser · 2017-05-20T20:48:25Z

Agreed... I guess let's keep this issue open for now and try a "best effort" approach that supports at least the contexts we can easily recognise.

marijnh · 2018-09-11T09:06:57Z

That solution was backed out again due to being problematic, so I'm reopening this.

RReverser added a commit that referenced this issue May 31, 2017

Add f_stat token context

a923b6a

Partial fix for #552

marijnh closed this as completed in effe659 Sep 10, 2018

marijnh mentioned this issue Sep 11, 2018

Consider whether parse-less tokenizing is viable in the long run #589

Open

marijnh reopened this Sep 11, 2018

marijnh closed this as completed in 22585b1 Sep 11, 2018
