Skip to content

Handle comments in sway-fmt-v2 #2122

@mitchmindtree

Description

@mitchmindtree

This is a dedicated issue for tracking the handling of comments in sway-fmt-v2, part of #1516. I'll add thoughts/ideas here as I progress.


Currently, sway-fmt-v2 works by using sway_parse to parse the file into a syntax tree. This is useful as it allows us to take advantage of the same parsing that the compiler uses and trivially apply formatting based on the item types.

Unfortunately, comments are omitted during the parsing stage, as they are irrelevant to the compilation of the program. In sway-fmt-v2's case this means we end up ignoring user comments and the current implementation deletes them entirely.

Retrieving spans

This problem is discussed in #1517. We can at least start by retrieving the Spans during lexing, however we likely still do not want to include them within the syntax tree.

Interleaving comments

Without having comments in the syntax tree itself, we'll need a general way of interleaving the comments into the syntax tree traversal so that we know if items we are formatting precede/follow/intersect comments. Comments can appear anywhere, including in the middle of a leaf node's span (e.g. between a function argument and the next comma punctuation).

One way of approaching this might be to add a Span::intersects method that allows for testing whether a span overlaps another. One way to achieve this could be to use a BTreeMap<Span, Comment>, and lookup this map with the item's span as we traverse the AST to check whether or not we intersect any comments.

This requires Span being Ord, which currently is not the case. One Ord implementation for Spans that could make sense is to order them by their AST visitation order. In other words, enclosing spans should always precede sub-spans. Handling file paths in the Ord implementation is less clear. Potentially, this should be split into a separate FileSpan type to separate this concern from the Span itself.

Preserving whitespace

The whitespace between a comment's span and the following/preceding item can also be intentional and should likely be preserved to some extent. E.g. if a comment follows an item on the same line of text, we don't want to push the comment below by inserting a newline after the item.

MVP requirements

As a very initial implementation, we should not attempt to do any reformatting of the comments themselves, but instead simply preserve the comments as they are. For comments that exist between AST nodes, we can start by adding them them verbatim with the same surrounding whitespace as what they had previously. We can think about fancier comment handling / rearranging of comments after crossing the MVP line and replacing V1.


TODO

  • Investigate how comments are handled in rustfmt to see if it provides any useful insights. Edit: rustfmt's comment.rs module alone is bigger than sway-fmt-v2 in its entirety 😂 In general, I think we have a little better interop with our compiler internals and can avoid some of the manual parsing required there in favour of updating sway-parse.
  • Add a CommentedTokenStream to sway_parse along with a strip_comments method. Add lex_commented and CommentedTokenStream to sway_parse #2123.
  • Implement Ord for Span as mentioned above. If unsuitable for sway-types due to paths, use our own Span type with conversions implemented. Edit: after taking another look at Span I think we don't want to use it directly. Instead we should use our own span (i.e. a simple byte range) and handle one file at a time.
  • Use CommentedTokenStream to collect a map of spans -> comments in sway-fmt-v2.
  • Come up with a generic way of preserving comments around/within items using the span->comment map.

Metadata

Metadata

Assignees

Labels

P: highShould be looked at if there are no critical issues leftcompiler: parserEverything to do with the parserformatterteam:compilerCompiler Team

Type

No type

Projects

Status

Done

Relationships

None yet

Development

No branches or pull requests

Issue actions