- bool literal isn't mentioned anywhere, I propose
BOOL_LIT
should be added toliteral
- I propose
NIL
should beNIL_LIT
for consistency colcom
is identical tocolon
. Why differentiate?IND
without curly braces /optInd
is not explained exactly- I propose
COMMENT
should be changed toDOC_COMMENT
or similar to be more explicit about its actual meaning forExpr
isn't explained in the manual, or elsewhere^+
and^*
is not defined to include and optional trailing separator, but in reality it does, eg:simpleStmt ^+ ';'
- there are many instances where the
^+
and^*
are paraphrased in some way, for example:
identColonEquals = IDENT (comma IDENT)* comma?
sometimes with and sometimes without trailing separator.
My proposition would be to change it to:
^+
and ^*
don't allow optional trailing separators and
^+?
and ^*?
do allow optional trailing separators.
rewriting the example to:
identColonEquals = IDENT ^+? comma
expr
definition containscaseStmt
which is inconsistent, because e.g.forStmt
was renamed toforExpr
just for that
pragmaStmt
,importStmt
,exportStmt
,fromStmt
andincludeStmt
call foroptInd
but really there is no whitespace restriction on the compiler side eg:
import std/strutils
except
substr, substr
compiles just fine
11. the current definition of module
and stmt
:
module = stmt ^* (';' / IND{=})
stmt = (IND{>} complexOrSimpleStmt^+(IND{=} / ';') DED)
/ simpleStmt ^+ ';'
strictly requires IND{>}
and DED
around a complexOrSimpleStmt
and makes a nim file like this:
let a = 9
impossible
12. the way ifExpr
is defined
condExpr = expr colcom expr optInd
('elif' expr colcom expr optInd)*
'else' colcom expr
ifExpr = 'if' condExpr
is incorrect, since
discard if true:
true
else :
echo "hi"
true
is valid syntax, but in the definition it allows only a single expr
per block.
so it should be more like this:
condExpr = expr colcom stmt
('elif' expr colcom stmt)*
'else' colcom stmt
which looks an awful lot like condStmt
, so why not just do
ifExpr = ifStmt
, like you did with forExpr
?
stmt
can be an expr
after all, via exprStmt
.
13. caseStmt
only appears as part of an expr
. It should be part of the complexOrSimpleStmt
definition as well (maybe renamed to caseExpr
for the use in expr
for constistency)
14. blockStmt
and blockExpr
are excactly the same. why not define one with the other, like with the forStmt
?
15. another example of a missed use for ^+
declColonEquals = identWithPragma (comma identWithPragma)* comma?
(':' optInd typeDesc)? ('=' optInd expr)?
# this could be shortened to
declColonEquals = identWithPragma ^+? comma
(':' optInd typeDesc)? ('=' optInd expr)?
- experimental feature: parameter constraints are not mentioned in the syntax, even though term rewriting macros are (they are called
pattern
) this could be done by:
# changing
paramList = '(' declColonEquals ^* (comma/semicolon) ')'
# to
paramList = '(' paramColonEquals ^* (comma/semicolon) ')'
# and adding
paramColonEquals = (identWithPragma ^+? identWithPragma)
(':' optInd typeDesc paramConstraint?)? ('=' optInd expr)?
paramConstraint = '{' expr '}'
- the definition of
variable
doesn't allow for pragmas like so:
var x {.deprecated.}: char
since it uses identColonEquals
, it could probably be changed to use declColonEquals
, this would also solve 23.
18. the operator precedence table should mention single '|' and '~' in the OP8
row
19. typeDesc
and typeDefAux
have the same definition, why differentiate?
20. the definition of typeDef
is incorrect and unnecessarily long
typeDef = identWithPragmaDot genericParamList? '=' optInd typeDefAux
indAndComment? / identVisDot genericParamList? pragma '=' optInd typeDefAux
indAndComment?
# should be
typeDef = (identWithPragmaDot genericParamList? '=' optInd typeDefAux
indAndComment?) / (identVisDot genericParamList? pragma '=' optInd typeDefAux
indAndComment?)
# and can be shortened to
typeDef = (identWithPragmaDot genericParamList?) / (identVisDot genericParamList? pragma)
'=' optInd typeDefAux indAndComment?
- the definition of
identVisDot
is incorrect
identVisDot = symbol '.' optInd symbol OPR?
# this strictly requires symbol.symbol, rendering the following syntax invalid
type a = int
# it should be
identVisDot = symbol ('.' optInd symbol)? OPR?
- the definition of
constant
is incorrect
constant = (varTuple / identWithPragma) (colon typeDesc)? '=' optInd expr indAndComment
# this doesn't allow for
const (b, c) = (9, 9)
# it should be
constant = varTuple / (identWithPragma (colon typeDesc)? '=' optInd expr) indAndComment
identColonEquals
should usesymbol
instead ofIDENT
, or not used in places were the
`symbol`
is valid (see 17), otherwise this would be invalid syntax:
var `b` = 9
objectPart
inobjectDecl
should be optional
objectDecl = 'object' pragma? ('of' typeDesc)? COMMENT? objectPart
# to
objectDecl = 'object' pragma? ('of' typeDesc)? COMMENT? objectPart?
else this is invalid syntax:
type a = object
- the recursive definition of
objectPart
is inconsistent with other definitions and wrong,
objectPart = IND{>} objectPart^+IND{=} DED
/ objectWhen / objectCase / 'nil' / 'discard' / declColonEquals
since it would allow for syntax like:
type a = object
a: int
b: int
I propose a change to:
objectPart = section(objectWhen / objectCase / 'nil' / 'discard' / declColonEquals)
which covers all cases and is much simpler and constistent with prior conventions
26. the definition of conceptParam
seems incorrect.
conceptParam = ('var' | 'out')? symbol
conceptDecl = 'concept' conceptParam ^* ',' (pragma)? ('of' typeDesc ^* ',')?
&IND{>} stmt
would make this syntax impossible (straight from the docs).
type
MyConcept = concept x, var v, ref r, ptr p, static s, type T
...
It could be changed to:
conceptParam = ('var'|'ref'|'ptr'|'static'|'type'|'out')? symbol
- the definition for primary seems way too complicated for what it's saying
- the fact that arbitrary parentheses are allowed around
stmt
s andexpr
s is not really documented in the grammar spec - castExpr is just has a trailing
/
at the end - what are these:
'cast'|'addr'|'type'
doing inprimarySuffix
- no
objectConstr
even though it has specific syntax objectBranches
should not allow for anelif
branch, or at least I couldn't construct a valid example.
QUESTION:
what is this about?
colonBody = colcom stmt postExprBlocks?
A:
template and macro call bodys
template d(b: int, c: untyped): int =
0
var a: int = d(3): discard true
QUESTION:
what is this syntax about?
type a = object
discard
nil
A: for objectCase and objectWhen
type a = object
case kind: bool
of true:
a: int
of false:
discard # used to be nil as well
QUESTION:
postExprBlocks
A:
macro related
QUESTION:
out
A:
unused keyword so far
QUESTION:
qualifiedSuffix
primarySuffix = '(' (exprColonEqExpr comma?)* ')'
| '.' optInd symbol ('[:' exprList ']' ( '(' exprColonEqExpr ')' )?)? generalizedLit?
| DOTLIKEOP optInd symbol generalizedLit?
| '[' optInd exprColonEqExprList optPar ']'
| '{' optInd exprColonEqExprList optPar '}'
| &( '`'|IDENT|literal|'cast'|'addr'|'type') expr # command syntax
- what's with the generalizedLit at the end
- what's the single `
- what kind of suffix are the {}
====================================================================
Hi there,
I've been working on a tree sitter parser for nim for a while now and recently got back to it after a longer pause. The parser is nearing feature completion. However there are still a few roadblocks, one being the following. If you have a deeper understanding of the nim grammar, your help would be very appreciated.
grammar spec, regarding postExprBlocks
and colonBody
:
postExprBlocks = ':' stmt? ( IND{=} doBlock
| IND{=} 'of' exprList ':' stmt
| IND{=} 'elif' expr ':' stmt
| IND{=} 'except' exprList ':' stmt
| IND{=} 'finally' ':' stmt
| IND{=} 'else' ':' stmt )*
colonBody = colcom stmt postExprBlocks?
variable = (varTuple / identColonEquals) colonBody? indAndComment
exprStmt = simpleExpr
(( '=' optInd expr colonBody? )
/ ( expr ^+ comma
postExprBlocks
))?
So these are all the places in the spec where postExprBlocks
and colonBody
occur.
From my understanding colonBody
is about the body for a template/macro call like:
templ:
discard
The way the spec currently describes it, colonBody
is allowed either after a variable declaration, like:
let a = templ: # variable > identColonEquals
discard
or after an assignment, like:
a = templ: # exprStmt > simpleExpr '=' expr
discard
With the current spec a normal template call like:
templ:
discard
seems impossible. There only is this fragment in exprStmt
/ ( expr ^+ comma
postExprBlocks
))?
which allows for a syntax like expr colon stmt
.
This fragment however strictly requires at least 1 additional expr
before the colon,
so it actually calls for expr expr colon stmt
or expr expr comma expr ... colon stmt
.
This looks an awful lot like the command call syntax, which is allowed for template calls like this,
however it seems oddly out of place here, since:
- it requires at least 1 argument,
- there is no equivalent for a normal function call with parentheses,
- the command call syntax is already taken care of by
primarySuffix
, from what I understand. So actuallyexpr colon stmt
is already enough to allow for:
templ 1, 2:
discard
and
4. it uses postExprBlocks
and not colonBody
.
This brings me to my second problem: postExprBlocks
.
First things that confuses me is this:
postExprBlocks = ':' stmt? ( ...
postExprBlocks
allowing for another colon stmt
? colonBody
already does that, and it can be followed by postExprBlocks
,
so it would allow for a syntax like:
templ:
discard:
discard
which of course is nonsense, but what would be an actual use case of this? Since the nesting is already possible just with colonBody
.
Second thing is this:
doBlock = 'do' paramListArrow pragma? colcom stmt
only appears in the postExprBlocks
in the whole grammar spec.
I can only assume that doBlock
is about the do notation.
Looking at the examples:
sort(cities) do (x,y: string) -> int:
cmp(x.len, y.len)
is completely impossible with the current grammar spec, because:
postExprBlocks
requires a:
before thedoBlock
postExprBlocks
can only appear after a declaration, or an assignment (incolonBody
) or afterexpr expr
(inexprStmt
)
Third thing is:
| IND{=} 'of' exprList ':' stmt
| IND{=} 'elif' expr ':' stmt
| IND{=} 'except' exprList ':' stmt
| IND{=} 'finally' ':' stmt
| IND{=} 'else' ':' stmt )*
Frankly my understanding of macro related syntax is limited so this might be obvious, but I don't know when these could ever appear.
If you could explain any of the above problems, it would be greatly appreciated.
Thank you :)
===================================
ANSWER:
-
Turns out
colonBody
is a fraud, where it's supposed to be,postExprBlocks
is actually used in the parser.nim.
/ ( expr ^+ comma
postExprBlocks
))?
isn't entirely accurate and is indeed about the command syntax. The duplication might be unnecessary.
postExprBlocks = ':' stmt? ( IND{=} doBlock
| IND{=} 'of' exprList ':' stmt
| IND{=} 'elif' expr ':' stmt
| IND{=} 'except' exprList ':' stmt
| IND{=} 'finally' ':' stmt
| IND{=} 'else' ':' stmt )*
should actually be
postExprBlocks = doBlock /
(':' stmt? ( IND{=} doBlock
| IND{=} 'of' exprList ':' stmt
| IND{=} 'elif' expr ':' stmt
| IND{=} 'except' exprList ':' stmt
| IND{=} 'finally' ':' stmt
| IND{=} 'else' ':' stmt )*