Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Proxy or forgo ParenthesizedExpression and ParenthesizedType AST nodes in favor of "comment-like" parentheses trivia #15481

Closed
jcmoore opened this issue Apr 30, 2017 · 4 comments
Labels
Declined The issue was declined as something which matches the TypeScript vision Suggestion An idea for TypeScript

Comments

@jcmoore
Copy link

jcmoore commented Apr 30, 2017

I'm guessing this is a pretty big ask, but here goes...

A (naive) review of prior issues (esp. #920) suggests that Parenthesized Nodes were added to support the typing system. My understanding of the typing system is vague at best, but the inclusion of Parenthesized Nodes makes traversing makes ASTs more tedious because, very often, Expression and Type Nodes need to be recursively checked to potentially unwrap the more meaningful underlying node (especially when searching for structural patterns).

Existing workarounds include:

  • heavily leveraging higher level traversal APIs
  • preprocessing whole ASTs to remove Parenthesized Nodes before use traversal
    • but this may add significant complexity should it become desirable to print/emit the AST again without mucking up the parsing precedence

The suggestion (if possible and in order of preference) is either:

  • to optionally (or altogether*) omit Parenthesized Nodes, choosing instead to merge parenthesis trivia directly into the first (would-be) child node that is not a Parenthesized Node
    • the effect would be similar to how (some?) comments are merged into node.jsDoc
    • if made optional, the functionality could be exposed as a sibling API of createSourceFile or createSourceFile itself could be made to accept an additional optional setting parameter
      • (if an additional parameter is accepted, it may make sense to have it be a "multi-field" configuration object to reduce the likelihood of future method signature changes)
  • to continue to include Parenthesized Nodes in the AST, but implement them as Proxies that transparently behave like the underlying non-parenthesized descendent node with the ability to tap into the un-proxied data (probably by some specialized getter/symbol/method)
    • this would be a breaking change on the Parenthesized Node interfaces and it might be especially difficult to properly handle node.parent links

Definitely an optional configuration parameter for createSourceFile() (or a new sibling method thereto) would be preferable and I HOPE that it wouldn't be too difficult to merge data -- that would have otherwise resided in the Parenthesized Nodes -- directly into a given non-parenthesized node.


(*) Of course if there were an appetite to remove Parenthesized Nodes altogether, deprecation would need to be considered because of the breaking change to existing tooling

@mhegazy
Copy link
Contributor

mhegazy commented May 1, 2017

Nodes makes traversing makes ASTs more tedious

Not sure i agree with that. but assuming i did, this would be a large breaking change to all our API consumers, and a substantial change to the code base for the value of convenience writing a tree traversal switch statement.

@mhegazy mhegazy added Suggestion An idea for TypeScript Declined The issue was declined as something which matches the TypeScript vision labels May 1, 2017
@mhegazy mhegazy closed this as completed May 1, 2017
@jcmoore
Copy link
Author

jcmoore commented May 2, 2017

Not sure this appeal and clarification will make a difference... but:

The extent to which adding a sibling method to createSourceFile() (say createModifiedSourceFile() for lack of a better name) and then implementing parenthesis as trivia might create breaking changes for supporting/related methods surprises me, but this is easily beyond my scope of understanding.

However, I'd like to try to clarify when and how Parenthesized Nodes can be inconvenient -- which I do not believe concerns the writing of switch statements. Please consider the following examples linking to astexplorer.net:

  1. alpha["beta"].gamma
  2. alpha.beta.gamma
  3. (((alpha)).beta).gamma
  4. // multiline -- see below
alpha  ./**/beta//
 .
gamma

Say, for example, you wanted to search a bunch of files for the expression alpha.betagiven each file as a SourceFile z.

You could find examples (2) and (4) with something like the following:

z.forEachChild(node =>
    node.kind === SyntaxKind.PropertyAccessExpression &&
    node.right.kind === SyntaxKind.Identifier &&
    node.right.text === "beta" &&
    node.left.kind === SyntaxKind.Identifier &&
    node.left.text === "alpha" &&
    console.log("found", node))

but you would miss (3), despite its semantic similarity, unless you did something more like:

z.forEachChild(function (node) {
    if (node.kind === SyntaxKind.PropertyAccessExpression &&
        node.right.kind === SyntaxKind.Identifier &&
        node.right.text === "beta") {

        var temp = node.left;
        while (temp.kind === SyntaxKind.ParenthesizedExpression) {
            temp = temp.expression;
        }

        if (temp.kind === SyntaxKind.Identifier &&
            temp.text === "alpha") {
            console.log("found", node);
        }
    }
})

(For good reason, neither approach would find (1) )

The more extensive the structure of the search query, the more occasions you might need to iteratively/recursively pass over Parenthesized Nodes.

Say further that you had hoped to make a VSCode plugin that exposed AST-based search to end-users via a text input. A relatively simple approach might be to parse the input text to AST and -- provided an AST of the document being searched -- attempt to match some subtree of the input to some subtree of the document. The presence of Parenthesized Nodes can make locating semantically meaningful matches a much more involved process.

Going somewhat beyond search, I'd also make a case for AST-centric mechanisms for evaluating code similarity -- given one or more SourceFiles, finding semantically similar subtrees therein is complicated by the existence of Parenthesized Nodes.

If you have recommendations for approaches preferable to what I've described above that would also be helpful.

I may be misunderstanding something, but it seems to me that while parenthesis might be essential during token parsing for stickiness/precedence, once the AST is assembled Parenthesized Nodes wouldn't communicate anything to the compiler/type-checker that couldn't reasonably be communicated in some other way (for instance as extra metadata/trivia on the Expression or Type nodes themselves).

@RyanCavanaugh
Copy link
Member

RyanCavanaugh commented May 2, 2017

We have a function called skipParentheses, I'd recommend doing the same.

Any unary operator could be hoisted out into node metadata. I don't see what's gained by making the tree into "It's a tree, sometimes, except when there are parens, in which case there are flags over here" for the sake of not having to call skipParentheses when you decide you don't care about parens. We've managed to write > 100K LOC like this; it's not really a problem.

@jcmoore
Copy link
Author

jcmoore commented May 3, 2017

Thanks for pointing out skipParentheses(), good to know*.

I've not fully read nor fully understood the JS spec, but I would point out parenthesis used in the manner that would generate a ParenthesizedExpression in the parser seems to be referred to as a Grouping Operator which could be considered a unary operator (lowercase) but is distinct from canonical Unary Operators (uppercase). I think there is a significant nuance here in that, Unary Operator CAN change the expression value of their operand, but the Grouping Operator CANNOT.

Also, I think it's curious that you might consider an AST without explicit Parenthesized Nodes to be any more "exceptional" than an AST without explicit Comment Nodes, but I guess one is entitled to their opinion.

Of course "we're not going to consider this proposal simply because we don't think it's important" is a response I can accept.

(*) Though frankly I don't think it makes implementing subtree comparisons much cleaner -- which is ultimately the reason for my ask. After all, I've already implemented something like skipParentheses() as I've linked in the OP -- i.e. "traversal APIs".

@microsoft microsoft locked and limited conversation to collaborators Jun 14, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Declined The issue was declined as something which matches the TypeScript vision Suggestion An idea for TypeScript
Projects
None yet
Development

No branches or pull requests

3 participants