track line and column #59

Merged
bakkot merged 7 commits into master from more-position
May 3, 2020

Conversation

bakkot
Contributor

@bakkot bakkot commented Apr 22, 2020

cc @rbuckton; this changes location tracking to include line and column numbers, not just offsets.
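
As a rough illustration of what tracking line and column in addition to offsets involves, here is a minimal sketch. The `Position` field names match the shape quoted later in this review; the `positionAt` helper, the 1-based line and column convention, and the sample input are assumptions made for illustration, not the code in this PR.

```typescript
// Field names mirror the Position shape quoted later in this review.
// Assumption: line and column are 1-based, offset is 0-based.
interface Position {
  line: number;
  column: number;
  offset: number;
}

// Hypothetical helper: derive line/column for an offset by scanning for '\n',
// the only line terminator the tool recognizes according to this thread.
function positionAt(source: string, offset: number): Position {
  let line = 1;
  let lineStart = 0; // offset of the first character on the current line
  for (let i = 0; i < offset && i < source.length; i++) {
    if (source[i] === '\n') {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, column: offset - lineStart + 1, offset };
}

// Offset 18 is the "R" of "Return" on the second line.
console.log(positionAt('1. Let x be 1.\n1. Return x.', 18));
// → { line: 2, column: 4, offset: 18 }
```

A real tokenizer would more likely update line and column incrementally as it consumes characters rather than rescanning from the start.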

@bakkot bakkot force-pushed the more-position branch 3 times, most recently from 0149581 to 393cd89 Compare April 22, 2020 01:00
@bakkot bakkot mentioned this pull request Apr 30, 2020
@bakkot
Contributor Author

bakkot commented May 1, 2020

@rbuckton I addressed your comments; do you want to take another look?

@rbuckton
Contributor

rbuckton commented May 1, 2020

Out of curiosity, what does ecmarkdown consider to be a "line terminator"? Not all languages/editors agree ([1], [2]), so we should make sure ecmarkup, ecmarkdown, and grammarkdown agree.

[1] https://twitter.com/SeaRyanC/status/1253037372263952387
[2] microsoft/TypeScript#38078 (comment)

@bakkot
Contributor Author

bakkot commented May 1, 2020

> Out of curiosity, what does ecmarkdown consider to be a "line terminator"?

'\n'.

Contributor

@rbuckton rbuckton left a comment


A few minor suggestions, but you're welcome to ignore them.

src/parser.ts Outdated
@@ -365,31 +370,32 @@ export class Parser {

   pushPos() {
     if (this._posStack) {
-      this._posStack.push(this.getPos());
+      this._posStack.push(this.getPos() as Position);
Contributor

NOTE: TS has a postfix-! syntax as a simple cast to remove null/undefined, but just as with the as cast, both are type-level only with no runtime enforcement.

This is also possibly unsound if you somehow end up with a defined this._posStack but an undefined node.location. Perhaps a better approach would be to ensure the value is defined:

Suggested change
- this._posStack.push(this.getPos() as Position);
+ const pos = this.getPos();
+ if (pos) {
+   this._posStack.push(pos);
+ }

Contributor

Alternatively, you could change the body in this way (if you are using TS 3.7 or later):

pushPos() {
  const pos = this.getPos();
  if (pos) this._posStack?.push(pos);
}
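
To make the point about casts concrete, here is a small sketch showing that an `as` cast performs no runtime check, while an explicit guard does. The free-standing `getPos` function and the two stack variables are hypothetical stand-ins for the parser members discussed above.

```typescript
interface Position {
  line: number;
  column: number;
  offset: number;
}

// Hypothetical stand-in for the parser's getPos(): it may return undefined.
function getPos(hasLocation: boolean): Position | undefined {
  return hasLocation ? { line: 1, column: 1, offset: 0 } : undefined;
}

// With a cast: this type-checks, but undefined still lands on the stack.
const castStack: Position[] = [];
castStack.push(getPos(false) as Position);

// With a guard: the undefined value never reaches the stack.
const guardedStack: Position[] = [];
const pos = getPos(false);
if (pos) guardedStack.push(pos);

console.log(castStack.length, castStack[0], guardedStack.length);
// → 1 undefined 0
```

The guard changes behavior slightly (nothing is pushed when there is no position), so a matching pop would need to tolerate an empty stack.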

src/parser.ts Outdated
     }
   }

   popPos() {
-    return this._posStack ? this._posStack.pop() : -1;
+    return this._posStack ? this._posStack.pop() : undefined;
Contributor

Suggested change
- return this._posStack ? this._posStack.pop() : undefined;
+ return this._posStack?.pop();

src/parser.ts Outdated
-    return this._posStack && tok.location ? tok.location.pos : -1;
+  // TODO rename to getStart ?
+  getPos(node: Node | Token = this._t.peek()) {
+    return this._posStack && node.location ? node.location.start : undefined;
Contributor

Suggested change
- return this._posStack && node.location ? node.location.start : undefined;
+ return this._posStack && node.location?.start;

src/parser.ts Outdated
   }

   getEnd(node: Node | Token) {
-    return this._posStack && node.location ? node.location.end : -1;
+    return this._posStack && node.location ? node.location.end : undefined;
Contributor

Suggested change
- return this._posStack && node.location ? node.location.end : undefined;
+ return this._posStack && node.location?.end;
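
The ternary and the optional-chaining forms should behave the same as long as the stack is either an array or undefined. The sketch below checks that under this assumption; the `Located` interface and the two free functions are hypothetical simplifications of the parser methods, with the stack passed in as a parameter.

```typescript
interface Position {
  line: number;
  column: number;
  offset: number;
}

// Hypothetical simplification of Node | Token: only the optional location matters here.
interface Located {
  location?: { start: Position; end: Position };
}

// Original form: explicit ternary.
function getEndTernary(posStack: Position[] | undefined, node: Located): Position | undefined {
  return posStack && node.location ? node.location.end : undefined;
}

// Suggested form: optional chaining. If posStack is undefined, `&&` yields
// undefined; if location is missing, `?.` yields undefined.
function getEndChained(posStack: Position[] | undefined, node: Located): Position | undefined {
  return posStack && node.location?.end;
}

const located: Located = {
  location: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: 4, offset: 3 },
  },
};

console.log(getEndTernary([], located) === getEndChained([], located)); // → true
console.log(getEndTernary(undefined, located), getEndChained(undefined, located)); // → undefined undefined
```

If the stack could hold another falsy value such as null or false, the `&&` form would return that value instead of undefined, so the two forms are only interchangeable under the stated assumption.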

let actualStart: Position = start ?? (this.popPos() as Position);
let actualEnd: Position =
end ??
(this._t.previous === undefined
Contributor

I think this could also be:

let actualEnd: Position = end ?? { line: 1, column: 1, offset: 0, ...this._t.previous?.location?.end };
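
The spread form works because spreading undefined into an object literal contributes no properties, so the defaults written before it survive. A minimal sketch, where `fallbackEnd`, `endOrDefault`, and the `PrevToken` shape are hypothetical names introduced for illustration:

```typescript
interface Position {
  line: number;
  column: number;
  offset: number;
}

// Hypothetical shape for the tokenizer's previous token.
interface PrevToken {
  location?: { end: Position };
}

const fallbackEnd: Position = { line: 1, column: 1, offset: 0 };

// If previous (or its location) is undefined, the spread adds nothing and the
// fallback fields remain; otherwise the real end position overrides them.
function endOrDefault(end: Position | undefined, previous: PrevToken | undefined): Position {
  return end ?? { ...fallbackEnd, ...previous?.location?.end };
}

console.log(endOrDefault(undefined, undefined));
// → { line: 1, column: 1, offset: 0 }
console.log(endOrDefault(undefined, { location: { end: { line: 3, column: 5, offset: 40 } } }));
// → { line: 3, column: 5, offset: 40 }
```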

src/tokenizer.ts Outdated
@@ -425,12 +429,20 @@ export class Tokenizer {
     }
   }

-  enqueueLookahead(tok: Token, pos: number) {
+  getLocation() {
Contributor

Suggested change
- getLocation() {
+ getLocation(): Position {

Useful if you decide to rename or add a new member.

src/parser.ts Outdated
@@ -306,7 +310,8 @@ export class Parser {
       this._t.next();
     }

-    return this.finish({ name: 'text', contents });
+    let endLoc = !this._posStack || lastRealTok === null ? undefined : lastRealTok.location!.end;
Contributor

Suggested change
- let endLoc = !this._posStack || lastRealTok === null ? undefined : lastRealTok.location!.end;
+ let endLoc = this._posStack && lastRealTok?.location?.end;

Contributor Author

> lastRealTok?.location?.end

The ! is fine after location because it really is always defined if it's reached - this._posStack being truthy implies every token has a location property.

... Well, that's what I thought, but using lastRealTok?.location!.end breaks tests. Which is surprising to me; I had thought the ! was just a type assertion, but it appears to be changing the semantics of the generated code. I guess TS is parsing it as the end of the "chain"? That's really surprising to me.

... Oh, I see this has already been discovered and fixed upstream. I guess I'll just // @ts-ignore this line until that's released.
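
The difference between a chain that continues and a chain that has ended can be shown without relying on any particular compiler version. In the sketch below, `chained` keeps the property access inside the optional chain, while `chainEnded` reproduces by hand what happens when the chain stops after `?.location`; the `Loc` and `Tok` shapes and both function names are hypothetical.

```typescript
interface Loc {
  end: number;
}

// Assumption for this sketch: every token has a location, as described above.
interface Tok {
  location: Loc;
}

// The whole chain short-circuits: when tok is null, `.location.end` is never evaluated.
function chained(tok: Tok | null): number | undefined {
  return tok?.location.end;
}

// The chain ends at the intermediate value, so `.end` is always read,
// even when that value is undefined.
function chainEnded(tok: Tok | null): number {
  const loc = tok?.location;
  return (loc as Loc).end; // TypeError at runtime when tok is null
}

console.log(chained(null)); // → undefined

try {
  chainEnded(null);
} catch (err) {
  console.log(err instanceof TypeError); // → true
}
```

Both functions return the same value for a non-null token; they only diverge when the token is null, which matches the kind of test failure described above.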

@rbuckton
Contributor

rbuckton commented May 1, 2020

> > Out of curiosity, what does ecmarkdown consider to be a "line terminator"?
>
> '\n'.

For parsing or emit? Only supporting \n for parse isn't Windows friendly. It should at least support CR, LF, and CRLF, but should probably support the same line terminators HTML uses, or possibly any additional line terminators recognized by markdown implementations such as commonmark (given that ecmarkdown claims to be a markdown-like syntax). I think I need to go over what line terminators are supported by grammarkdown as well in the near future.

@bakkot
Contributor Author

bakkot commented May 1, 2020

> For parsing or emit?

For both.

I mostly would like to maintain ecmarkdown as a tool specifically for processing parts of the ECMAScript specification. In that context, the goal is not to accept all reasonable inputs, but rather to accept only a very constrained set of inputs and reject all others, so that there are as few ways of writing the spec as possible. Concretely, if a contributor to ECMA-262 mixed CRLF line endings into an algorithm, it would be rejected by me in my capacity as editor, which means it should ideally be rejected by the tooling so that the contributor can find that out for themselves.

(That said, we should certainly at least give better errors than the generic parse error or malformed output you'd get now.)
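
One way to give a better error than a generic parse failure would be to check for carriage returns up front and report where the first one occurs. This is only a sketch of the idea, not what ecmarkdown does; the `assertLfOnly` name, the error wording, and the 1-based line and column convention are assumptions.

```typescript
// Hypothetical pre-parse check: reject any input containing '\r' and say where it is.
function assertLfOnly(source: string): void {
  const offset = source.indexOf('\r');
  if (offset === -1) return; // only '\n' line endings (or none): accept

  const before = source.slice(0, offset);
  const line = before.split('\n').length; // 1-based line of the '\r'
  const column = offset - (before.lastIndexOf('\n') + 1) + 1; // 1-based column
  throw new Error(
    `Unexpected carriage return at line ${line}, column ${column}; only "\\n" line endings are accepted`
  );
}

assertLfOnly('1. Let x be 1.\n1. Return x.'); // accepted, no error

try {
  assertLfOnly('a\nbc\r\nd');
} catch (err) {
  console.log((err as Error).message);
  // → Unexpected carriage return at line 2, column 3; only "\n" line endings are accepted
}
```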

Is there a particular reason to support inputs which would be rejected as parts of ECMA-262?

> given that ecmarkdown claims to be a markdown-like syntax

Well, I'm also planning on changing that, since it seems like overkill for what this tool is actually used for.

@bakkot bakkot merged commit 747edc1 into master May 3, 2020
@bakkot
Contributor Author

bakkot commented May 3, 2020

@rbuckton I adopted most of your suggestions and filed #67 to track line endings; thank you!

@bakkot bakkot deleted the more-position branch May 3, 2020 00:51