Make a spec-looking reference implementation of outer grammar. #4965

Open
phadej opened this issue Dec 20, 2017 · 3 comments


@phadej
Collaborator

phadej commented Dec 20, 2017

  • The grammar is currently written as a comment in the parser source:

    -- $grammar
    --
    -- @
    -- CabalStyleFile ::= SecElems
    --
    -- SecElems ::= SecElem* '\n'?
    -- SecElem ::= '\n' SecElemLayout | SecElemBraces
    -- SecElemLayout ::= FieldLayout | FieldBraces | SectionLayout | SectionBraces
    -- SecElemBraces ::= FieldInline | FieldBraces | SectionBraces
    -- FieldLayout ::= name ':' line? ('\n' line)*
    -- FieldBraces ::= name ':' '\n'? '{' content '}'
    -- FieldInline ::= name ':' content
    -- SectionLayout ::= name arg* SecElems
    -- SectionBraces ::= name arg* '\n'? '{' SecElems '}'
    -- @
    --
    -- and the same thing but left factored...
    --
    -- @
    -- SecElems ::= SecElem*
    -- SecElem ::= '\n' name SecElemLayout
    -- | name SecElemBraces
    -- SecElemLayout ::= ':' FieldLayoutOrBraces
    -- | arg* SectionLayoutOrBraces
    -- FieldLayoutOrBraces ::= '\n'? '{' content '}'
    -- | line? ('\n' line)*
    -- SectionLayoutOrBraces ::= '\n'? '{' SecElems '\n'? '}'
    -- | SecElems
    -- SecElemBraces ::= ':' FieldInlineOrBraces
    -- | arg* '\n'? '{' SecElems '\n'? '}'
    -- FieldInlineOrBraces ::= '\n'? '{' content '}'
    -- | content
    -- @
    --
    -- Note how we have several productions with the sequence:
    --
    -- > '\n'? '{'
    --
    -- That is, an optional newline (and indent) followed by a @{@ token.
    -- In the @SectionLayoutOrBraces@ case you can see that this makes it
    -- not fully left factored (because @SecElems@ can start with a @\n@).
    -- Fully left factoring here would be ugly, and though we could use a
    -- lookahead of two tokens to resolve the alternatives, we can't
    -- conveniently use Parsec's 'try' here to get a lookahead of only two.
    -- So instead we deal with this case in the lexer by making a line
    -- where the first non-space is @{@ lex as just the @{@ token, without
    -- the usual indent token. Then in the parser we can resolve everything
    -- with just one token of lookahead and so without using 'try'.

  • This is better done as a separate package in this repo (cabal-trifecta or cabal-megaparsec), as there are technical limitations to having it as a test-suite in the Cabal package.
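
The lexer rule described at the end of the grammar comment can be sketched in a few lines. This is only an illustration using `base`, not Cabal's actual lexer: the `LineStart` and `SectionStyle` types and the `lexLineStart` and `sectionStyle` functions are hypothetical names, and indentation is simplified to counting leading spaces.

```haskell
-- Hypothetical token for the start of a physical line (not Cabal's real lexer type).
data LineStart
  = Indent Int   -- newline plus this many leading spaces
  | OpenBrace    -- first non-space character on the line is '{'
  deriving (Eq, Show)

-- Classify the start of one line and return the remaining text.
-- A line that opens with '{' yields only OpenBrace: no Indent token is emitted.
lexLineStart :: String -> (LineStart, String)
lexLineStart line =
  case span (== ' ') line of
    (_, '{' : rest) -> (OpenBrace, rest)
    (ws, rest)      -> (Indent (length ws), rest)

data SectionStyle = Braces | Layout
  deriving (Eq, Show)

-- After `name arg*`, a single token of lookahead picks the alternative
-- of SectionLayoutOrBraces, so no backtracking ('try') is needed.
sectionStyle :: [LineStart] -> SectionStyle
sectionStyle (OpenBrace : _) = Braces
sectionStyle _               = Layout
```

In GHCi, `map (fst . lexLineStart) ["library", "  {", "  }"]` evaluates to `[Indent 0,OpenBrace,Indent 2]`. Because the brace line never produces an `Indent` token, the parser does not have to look past a newline token to find the `{`.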

cc @sboosali

@hvr
Member

hvr commented Dec 20, 2017

Could we use that as the basis for an exactprint-style parser?

@phadej
Collaborator Author

phadej commented Dec 20, 2017

@hvr, Field omits comments. I'd rather not deal with them in this. An exactprint-style parser can of course be a fork, if it's suitable, but it's not a goal.

@sboosali
Collaborator

sboosali commented Dec 20, 2017 via email


5 participants