.. include:: links.rst


Grammar Directives
------------------

|TatSu| allows *directives* in the grammar that control the behavior of the generated parsers. All directives are of the form ``@@name :: <value>``. For example:

.. code::

    @@ignorecase :: True


The *directives* supported by |TatSu| are described below.


``@@grammar :: <word>``
~~~~~~~~~~~~~~~~~~~~~~~

Specifies the name of the grammar, and provides the base name for the classes in parser source-code generation.


``@@comments :: <regexp>``
~~~~~~~~~~~~~~~~~~~~~~~~~~

Specifies a regular expression to identify and exclude inline (bracketed) comments before the text is scanned by the parser. For ``(* ... *)`` comments:

.. code::

    @@comments :: /\(\*((?:.|\n)*?)\*\)/

.. note::
   Prior to 5.12.1, comments implicitly had the `(?m) <https://docs.python.org/3/library/re.html#re.MULTILINE>`_ option defined. This is no longer the case.
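
The pattern can be checked outside |TatSu| with Python's ``re`` module. The following is a minimal sketch of the regular expression semantics only. The sample text is invented, and the sketch does not show how |TatSu| applies the pattern internally:

.. code:: python

    import re

    # The same pattern as in the directive above, written as a raw string.
    COMMENTS = re.compile(r'\(\*((?:.|\n)*?)\*\)')

    text = "a (* first *) b (* spans\ntwo lines *) c"

    # The lazy quantifier stops at the nearest closing "*)", so each
    # comment is matched separately, and "(?:.|\n)" lets a comment
    # span several lines.
    bodies = COMMENTS.findall(text)

    # Removing every match leaves only the non-comment text.
    stripped = COMMENTS.sub('', text)

Here ``bodies`` holds the text inside each comment and ``stripped`` holds the input with both comments removed.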

``@@eol_comments :: <regexp>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Specifies a regular expression to identify and exclude end-of-line comments before the text is scanned by the parser. For ``# ...`` comments:

.. code::

    @@eol_comments :: /#([^\n]*?)$/

.. note::
   Prior to 5.12.1, eol_comments implicitly had the `(?m) <https://docs.python.org/3/library/re.html#re.MULTILINE>`_ option defined. This is no longer the case.
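
The multiline option matters for this pattern because ``$`` matches at the end of each line only when that option is active. The sketch below illustrates the difference with Python's ``re`` module. The sample text is invented, and whether a given grammar needs ``(?m)`` added explicitly is an assumption to verify against the |TatSu| version in use:

.. code:: python

    import re

    text = "x = 1  # first\ny = 2  # second"

    # Without (?m), "$" matches only at the end of the whole string,
    # so only the comment on the last line is found.
    without_flag = re.findall(r'#([^\n]*?)$', text)

    # With (?m), "$" also matches before each newline, so the comment
    # on every line is found.
    with_flag = re.findall(r'(?m)#([^\n]*?)$', text)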

``@@ignorecase :: <bool>``
~~~~~~~~~~~~~~~~~~~~~~~~~~

When set to ``True``, |TatSu| ignores case when parsing tokens. Defaults to ``False``:


.. code::

    @@ignorecase :: True


``@@keyword :: {<word>|<string>}+``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Specifies the list of strings or words that the grammar should treat as *"keywords"*.
The directive may appear more than once. See the `Reserved Words and Keywords`_ section for an explanation.

.. _`Reserved Words and Keywords`: syntax.html#reserved-words-and-keywords


``@@left_recursion :: <bool>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Enables left-recursive rules in the grammar. See the `Left Recursion`_ section for an explanation.

.. _`Left Recursion`: left_recursion.html


``@@namechars :: <string>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

A list of (non-alphanumeric) characters that should be considered part of names when using the `@@nameguard`_ feature:

.. code::

    @@namechars :: '-_$'

.. _`@@nameguard`: #nameguard-bool


``@@nameguard :: <bool>``
~~~~~~~~~~~~~~~~~~~~~~~~~

When set to ``True``, avoids matching tokens when the next character in the input sequence is alphanumeric or one of the characters listed in ``@@namechars``. Defaults to ``True``. See the `'text' expression`_ for an explanation.

.. code::

    @@nameguard :: False

.. _`'text' expression`: syntax.html?highlight=nameguard#text-or-text
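
The guard described above can be sketched in plain Python. This is a simplified illustration of the rule as stated in this section, not |TatSu|'s implementation; the function ``guarded_match`` and its parameters are hypothetical:

.. code:: python

    def guarded_match(text, pos, token, namechars=''):
        """Return the position after ``token`` if it matches at ``pos``
        and passes the name guard, otherwise ``None``.

        Hypothetical helper: it only mirrors the rule described in the
        documentation and is not part of TatSu.
        """
        if not text.startswith(token, pos):
            return None
        end = pos + len(token)
        if end < len(text):
            following = text[end]
            # Reject the match when the token is immediately followed by
            # an alphanumeric character or one of the extra name characters.
            if following.isalnum() or following in namechars:
                return None
        return end

Under this sketch, ``guarded_match('INITIALIZE', 0, 'IN')`` is rejected because ``I`` follows the token, while ``guarded_match('IN-PLACE', 0, 'IN')`` succeeds unless ``-`` is passed in ``namechars``.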


``@@parseinfo :: <bool>``
~~~~~~~~~~~~~~~~~~~~~~~~~

When ``True``, the parser will add parse information to every ``AST`` and ``Node`` generated by the parse under a ``parseinfo`` field. The information will include:

* ``rule``: the rule name that parsed the node
* ``pos``: the initial position for the node in the input
* ``endpos``: the final position for the node in the input
* ``line``: the initial input line number for the element
* ``endline``: the final line number for the element

Enabling ``@@parseinfo`` allows semantic actions to report precisely on locations in the input source code.
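
As an illustration of how these fields might be used for reporting, the sketch below builds a message from a stand-in record. The ``Info`` tuple and the ``describe`` function are hypothetical and are not |TatSu|'s own classes. The sketch also assumes that ``pos`` and ``endpos`` are character offsets with ``endpos`` exclusive, and that line numbers are zero-based; check both assumptions against the actual ``parseinfo`` values:

.. code:: python

    from collections import namedtuple

    # Hypothetical stand-in carrying the fields listed above.
    Info = namedtuple('Info', ['rule', 'pos', 'endpos', 'line', 'endline'])


    def describe(source, info):
        """Format a short report for the span covered by ``info``.

        Assumes offsets index into ``source`` with an exclusive end,
        and that ``line`` is zero-based.
        """
        snippet = source[info.pos:info.endpos]
        return f"{info.rule} at line {info.line + 1}: {snippet!r}"


    source = "x = 1\ny = oops\n"
    info = Info(rule='expression', pos=10, endpos=14, line=1, endline=1)
    message = describe(source, info)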


``@@whitespace :: <regexp>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Provides a regular expression for the whitespace to be ignored by the parser. If no definition is
provided, then ``r'(?m)\s+'`` is used as the default. For example, to skip only tabs and spaces:

.. code::

    @@whitespace :: /[\t ]+/

To disable any parsing of whitespace, use ``None`` for the definition:

.. code::

    @@whitespace :: None
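
The practical difference between the default pattern and ``/[\t ]+/`` is that the latter does not consume newlines. The sketch below shows this with Python's ``re`` module. It illustrates the regular expressions only, on invented sample text, and makes no claim about how |TatSu| applies them internally:

.. code:: python

    import re

    DEFAULT = re.compile(r'(?m)\s+')        # default whitespace pattern
    TABS_AND_SPACES = re.compile(r'[\t ]+')  # pattern from the example above

    text = " \t\n  next"

    # The default pattern consumes the space, the tab, the newline,
    # and the two spaces that follow it.
    default_end = DEFAULT.match(text).end()

    # The restricted pattern stops before the newline.
    restricted_end = TABS_AND_SPACES.match(text).end()

With the restricted pattern the newline is left in the input, so a grammar that uses it would presumably need to match line ends explicitly in its rules.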