Proposal: CST-like Bogus AST Nodes #1867
Replies: 4 comments 12 replies
-
This is true but I don't think it adds much value because authors can always fall back to The question would then also be why we even need the function in the first place because it would then only be visible to the parser crate (or the future, an AST crate) but the parser has no interest in ever iterating over the AST. I assume what you wanted is to make this method "public to rome only". I don't know of a way to do that in Rust that has no "friendly assemblies" that are allowed to poke into internals.
The AST currently doesn't expose a
or write them by hand... The main challenge I see is that every union needs its dedicated name. What would you call
We would probably need to have a |
Beta Was this translation helpful? Give feedback.
-
For @jamiebuilds My first question is: what's the different between an
How would that work with linters? If we decorate the whole statement - like in this example - in a bogus, analyzers won't be able see anything. Unless we will have analyzers with different access rights. For @MichaReiser
I believe |
Beta Was this translation helpful? Give feedback.
-
My main worry is about the number of types this will introduce. We also need to consider what parents a node can have if a AST is invalid. For example, a |
Beta Was this translation helpful? Give feedback.
-
I spend some more time investing error recovery and explored what Swift's doing. These are my findings. Error Recovery
Desired properties
Token Rules
Rome Parser's approach
Downsides
Swift's approach
Upsides
Downsides
Implementation ProposalIdeally, the decision if a node should be wrapped inside of an
ExamplesIllegal Class IDclass await {
a;
} Rome Parser - JsClassStatement:
- class_token: 'class'
- id: UnknownBinding
- 'await'
- l-paren: '{'
- members: List
- JsPropertyClassMember:
- id: Identifier
- 'a'
- ';'
- r-paren: '}' Swift Assuming there's no - JsUnknownStatement:
- 'class'
- Unknown:
- Unknown:
- 'await'
- '{'
- List:
- JsPropertyClassMember:
- id: Identifier
- 'a'
- ';'
- '}' Illegal Reference Identifierawait + test; Rome Parser - JsExpressionStatement:
- expression: JsBinaryExpression
- left: JsUnknownExpression
- 'await'
- operator: '+'
- right: JsIdentifierExpression
- identifier: IdentifierReference
- name: 'test'
Swift - JsExpressionStatement:
- expression: JsBinaryExpression
- left: JsUnknownExpression
- Unknown:
- 'await'
- operator: '+'
- right: JsIdentifierExpression
- identifier: IdentifierReference
- name: 'test'
Variableslet a, [b]; Rome Parser - JsVariableStatement:
- kind: 'let'
- declarations: List
- VariableDeclaration:
- id: JsIdentifierBinding
- identifier: 'a'
- ','
- JsUnknownVariableDeclaration:
- JsArrayPattern:
- '['
- elements: List
- JsIdentifierBinding
- identifier: 'b'
- ']' Swift - JsUnknownStatement:
- kind: 'let'
- Unknown:
- VariableDeclaration:
- id: JsIdentifierBinding
- identifier: 'a'
- ','
- Unknown:
- JsArrayPattern:
- '['
- elements: List
- JsIdentifierBinding:
- identifier: 'b'
- ']' Expressionsa + b + Rome Parser JsExpressionStatement:
- expression: JsBinaryExpression
- left: JsBinaryExpression
- left: JsIdentifierExpression
- identifier: JsIdentifierReference
- name: 'a'
- operator: '+'
- right: JsIdentifierExpression
- identifier: JsIdentifierReference
- name: 'b'
- operator: '+'
- right: (missing required) Swift JsExpressionStatement:
- expression: JsUnknownExpression
- JsBinaryExpression:
- left: JsIdentifierExpression
- identifier: JsIdentifierReference (identifier)
- name: 'a'
- operator: '+'
- right: JsIdentifierExpression
- identifier: JsIdentifierReference
- name: 'b'
- '+' Missing Members{ , } Rome Parser - JsObjectExpression:
- l-paren: '{'
- members: AstSeparatedList:
- missing element
- ','
- r-paren: '};
Swift - JsUnknownExpression
- '{'
- Unknown
- missing element
- ','
- '};
Unexpected Tokenimport 4 from "mod"; Rome Parser JsUnknownStatement:
- 'import_kw'
- 4
- 'from_kw'
- JsModuleSource:
- source: '"mod"'
Swift JsUnknownStatement:
- 'import_kw'
- Unknown:
- 4
- 'from_kw'
- JsModuleSource:
- source: '"mod"'
Import Statementsimport { a , b } from "mod" assert { type: "json", other: }; Rome ParserJsImportStatement:
- 'import'
- JsNamedImportClause:
- specifier_list: JsNamedImportSpecifierList
- l_brack: '{'
- specifiers: AstSeparatedList
- JsNamedShorthandImportSpecifier:
- JsBindingIdentifier:
- 'a'
- ','
- JsNamedShorthandImportSpecifier:
- JsBindingIdentifier:
- 'b'
- r_brack: '}'
- 'from'
- JsModuleSource:
source: "mod"
- JsImportAssertion
- assert_kw: 'assert'
- l_brack: '{'
- entries: AstSeparatedList
- JsImportAssertionEntry:
- key: 'type'
- colon: ':'
- value: '"json"'
- JsImportAssertionEntry:
- key: 'other'
- colon: ':'
- value: missing required
- r_brack: '}' SwiftJsUnknownStatement:
- 'import'
- Unknown:
- JsNamedImportSpecifierList:
- l_brack: '{'
- specifiers: AstSeparatedList
- JsNamedShorthandImportSpecifier:
- JsBindingIdentifier:
- 'a'
- ','
- JsNamedShorthandImportSpecifier:
- JsBindingIdentifier
- 'b'
- r_brack: '}'
- 'from'
- JsModuleSource:
source: "mod"
- Unknown
- 'assert'
- '{'
- Unknown:
- JsImportAssertionEntry:
- key: 'type'
- colon: ':'
- value: '"json"'
- Unknown:
- 'other'
- ':'
- '}' The An option would be to contain the error by introducing an Imports 2import { a , a } from "mod"; Rome ParserJsImportStatement:
- 'import'
- JsNamedImportClause:
- specifier_list: JsNamedImportSpecifierList
- l_brack: '{'
- specifiers: AstSeparatedList
- JsNamedShorthandImportSpecifier:
- JsBindingIdentifier:
- 'a'
- ','
- JsUnknownNamedImportSpecifier:
- 'a'
- r_brack: '}'
- 'from'
- JsModuleSource:
source: "mod" SwiftJsUnknownStatement:
- 'import'
- Unknown:
- Unknown:
- '{'
- Unknown:
- JsNamedShorthandImportSpecifier:
- JsBindingIdentifier:
- 'a'
- ','
- Unknown:
- 'a'
- r_brack: '}'
- 'from'
- JsModuleSource:
source: "mod" SummarySwift's approach errors on the safe side by converting a node to an I particularly like Swift's generic approach to determine when a node must be converted to an |
Beta Was this translation helpful? Give feedback.
-
Description
CST nodes have a lot of qualities that make it nice to represent both errors and error recovery.
Take the following code:
It should look something like this in CST form:
We can tell a few things from this:
JsFunctionStatement
even though it contains errors.foo
is the identifier.Just as easily we could have created this CST:
Or this one:
Or this one:
It that sense, the CST is a very clear representation of how our error recovery works, and it provides a lot of flexibility for us to improve.
But we throw all that out as soon as we're using the AST.
In our current design we have a limited set of "Bogus" (Unknown) nodes where we can say an error exists. But it doesn't provide us the flexibility to recover inside of the node. It also forces us to pretend an error is more wide-spread than it really should be.
But what if we made these bogus nodes more CST-like?
This provides us with all of the flexibility of the CST:
AST navigation
Let's take this AST:
There are a number of different operations an analyzer could run on this:
To summarize:
BogusJsStatement
.While I think you should be able to access the
BogusJsStatement
, I don't think it should give you access to its children.My biggest concern is that analyzers, particularly third-party ones, would eventually end up depending on indexes of the
BogusJsStatement
and it would make every change to our error recovery a breaking change.Luckily, Rust allows us to make it so that those indexes are not a public API:
However, we should consider what types we return for child nodes trying to reach back up the tree into the bogus node.
FunctionBody::parent()
could return any of:AnyJsNode
Result<AnyJsNode, AnyBogusJsNode>
AnyJsFunction | AnyBogusJsNode
Some(AnyJsFunction)
I'm not sure which is the best choice, but the last two of those would require us to codegen
.parent()
methods everywhere.Beta Was this translation helpful? Give feedback.
All reactions