Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch indentation when removing braces (and other bug fixes in -indent -rewrite) #17522

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

adpi2
Copy link
Member

@adpi2 adpi2 commented May 16, 2023

Primary goal

The goal of this PR is to ensure the -ident rewrite does not break the compilation, by patching the indentation where needed. It fixes #17456.

For instance, if we consider this piece of code:

def foo = {
  val msg = "Hello, World!"
println(msg)
}

It used to be rewritten to:

def foo =
  val msg = "Hello, World!"
println(msg) // [Error] Not found: msg

After the PR it is rewritten to:

def foo =
  val msg = "Hello, World!"
  println(msg)

They are many other patterns of compilation breakages involving indentation: in control structure, in pattern matching, in argument passing.

The PR consists mostly in two parts:

  • when entering a new region, it computes the required indentation, based on the required indentation of the enclosing region, and the current indentation of the first line of the region.
  • on each line of the source, it checks that the indentation is valid, and patches it otherwise.

Some important considerations are:

  • an indentation that is seamingly wrong, but does not break the compilation, should not be patched. For instance
def foo = {
  val msg = "Hello, World!"
    println(msg)
}

should stay unchanged. The rewrite is not a formatting tool, it tries to do as few changes as possible.

Secondary goals

While working on this I found many other bugs or unintended behaviors that I fixed along the way.

Disabling rewrite in Outline Parser

The outline parser, that parses the source path, should not emit any patch.

Preservation of comments

The indent rewrite should not remove any comments.

def foo(bool: Boolean) = 
  if (bool) /* 1 */ { /* 2 */
    ???
  /* 3 */ } /* 4 */  else /* 5 */ { /* 6 */
    ???
  /* 7 */ } /* 8 */

should be rewritten to

def foo(bool: Boolean) = 
  if (bool) /* 1 */ /* 2 */
    ???
  /* 3 */ /* 4 */  else /* 5 */ /* 6 */
    ???
  /* 7 */ /* 8 */

The associated test is indent-comments.scala.

Back-quoting operators in brace-less syntax

foo >=> {
  bar
}

is rewritten to

foo `>=>`:
  bar

Do not remove braces after }

foo { 5 } {
  6
}

should not be rewritten to:

foo { 5 }: // missing arguments for method foo
  6

but left unchanged.

Do not remove braces of blocks in sequence of statements

Fix #10986

{
  given Foo = foo1
  doSomethingWithFoo
}
{
  given Foo = foo2
  doSomethingWithFoo
}

should not be rewritten to:

  given Foo = foo1
  doSomethingWithFoo
  given Foo = foo2
  doSomethingWithFoo

In general we cannot remove the braces of a standalone block in a sequence of statements.

Back-quoting end ident in end match {

Otherwise it is interpreted as an end marker.

Tests and results

Added tests

Rewritting dotty to indent

The output of running the rewrite in the scala3-library and scala3-compiler can be found in #17618.

In summary, it changed 13.381 lines and removed 10.034 lines. It removed more than 10.000 pairs of {}. In the other 3.350 lines there are some } else { rewritten to else, some } // comment rewritten to // comment but also some fixed indentations.

Here are a few noticeable patches:

  • fixed indentation in here
  • a fixed off-by-one indentation here
  • some preserved comments here
  • a backquoted operator here

Running the rewrite in the community build

Here is the current status of running the rewrite in the entire community build (in #17617):

Community Build A

project status comments
izumiReflect ✔️ fixed an invalid control-structure pattern manually
scalaSTM ✔️
scalatest ✔️ after some formatting fixes (-no-ident was used)
a test fails because of source lines
scalatestplusTestNG ✔️
scissLucre ✔️
zio ✔️ fixed an off-by-one indentation manually

Community Build B

project status comments
cats
catsEffect3
catsMtl ✔️
coop ✔️
discipline ✔️
disciplineMunit ✔️
disciplineSpecs2 ✔️
fs2
monocle ✔️
munit
munitCatsEffect ✔️
perspective ✔️
scalacheckEffect
scodec ✔️
scodecBits ✔️
simulacrumScalafixAnnotations ✔️
spire ✔️
http4s ✔️

Community Build C

project status comments
akka ✔️
betterfiles
cask ✔️
effpi ✔️
endpoints4s ✔️
fansi ✔️
fastparse ✔️
geny ✔️
intent ✔️
jacksonModuleScala ✔️
libretto ✔️
minitest ✔️
onnxScala ✔️
oslib ✔️
parboiled2
playJson ✔️
pprint ✔️
protoquill
requests ✔️
scalacheck ✔️
scalaCollectionCompat ✔️
scalaJava8Compat ✔️
scalap ✔️
scalaParallelCollections ✔️
scalaParserCombinators ✔️
scalaPB ✔️
scalatestplusScalacheck ✔️
scalaXml ✔️
scalaz ✔️
scas ✔️
sconfig ✔️
shapeless ✔️
sourcecode ✔️
specs2 ✔️
stdLib213 ✔️
ujson ✔️
upickle ✔️
utest ✔️
verify ✔️
xmlInterpolator ✔️

Why does the rewrite fails in some community build projects?

  • If the project cannot compile without -no-indent it is not possible to rewrite it to -indent.
  • In general it is recommended to run -new-syntax -rewrite before running -indent -rewrite. Some exotic patterns that are valid with the new control structure syntax are not with the combination of the old and brace-less syntax. For instance:

This is valid:

while (true) {
  { ??? }
  (???)
}

This is valid:

while true do
  { ??? }
  (???)

But this is not valid:

while (true)
  { ??? }
  (???) //end of statement expected but '(' found

@adpi2 adpi2 force-pushed the insert-indent branch 10 times, most recently from e355923 to 017c518 Compare May 22, 2023 15:55
@adpi2 adpi2 force-pushed the insert-indent branch 12 times, most recently from a415bdc to c7a3eb5 Compare May 30, 2023 05:31
@adpi2 adpi2 changed the title Patch indentation with -indent -rewrite Patching indentation in -indent -rewrite to not break compilation May 30, 2023
adpi2 added a commit to adpi2/scala3 that referenced this pull request May 31, 2023
If the first indentation of the region is greater than the indentation
of the enclosing region, we use it to indent the whole region.
Otherwise we use the incremented indentation of the enclosing region.

```scala
  def foo = {
        x // we replicate indentation of x downward in region
    y
  }
```

```scala
  def foo = {
x // indentation of x is incorrect, we increment enclosing indentation
  y
  }
```

A bigger indentation than the required one is permitted except
just after a closing brace.

```scala
def bar = {
  x
    .toString // permitted indentation
  def foo = {
  }
    bar //  must be unindented, to not fall into the body of foo
}
```

And other bug fixes (see scala#17522)
@@ -1545,7 +1553,7 @@ object Scanners {

/* Initialization: read first char, then first token */
nextChar()
nextToken()
(this: @unchecked).nextToken()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initialization of ToIndentParser is unsafe because the ctor of Parser calls in.nextToken() where in is a ToIndentScanner and can access uninitialized fields in ToIndentParser. In practice the if token != EMPTY in ToIndentScanner.nextToken prevents this from happening.


import ast.untpd._

class ScriptParser(source: SourceFile)(using Context) extends Parser(source) {
Copy link
Member Author

@adpi2 adpi2 Jun 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that ScriptParser is no so much useful anymore and I would like to avoid adding a ToIndentScriptParser.

@adpi2 adpi2 requested a review from odersky June 19, 2023 16:19
@adpi2 adpi2 assigned odersky and unassigned adpi2 Jun 19, 2023
@odersky odersky assigned adpi2 and unassigned odersky and adpi2 Jun 26, 2023
@odersky
Copy link
Contributor

odersky commented Jun 26, 2023

@adpi2 Some tests are still failing.

adpi2 added 3 commits July 5, 2023 11:50
We cannot remove braces after a new line,
to prevent from removing braces in stat seq.
@adpi2
Copy link
Member Author

adpi2 commented Jul 5, 2023

The CI is green again.

@adpi2 adpi2 assigned odersky and unassigned adpi2 Jul 5, 2023
@julienrf julienrf added this to the 3.3.2 milestone Jul 10, 2023
@Kordyjan Kordyjan modified the milestones: 3.3.2, 3.4.0 Aug 9, 2023
@adpi2
Copy link
Member Author

adpi2 commented Aug 16, 2023

ping @odersky?

@dwijnand dwijnand modified the milestones: 3.4.0, 3.5.0, 3.4.1 Dec 11, 2023
@Kordyjan Kordyjan modified the milestones: 3.4.1, 3.4.2 Feb 14, 2024
@Kordyjan Kordyjan modified the milestones: 3.4.2, 3.5.0 Mar 28, 2024
@Kordyjan Kordyjan removed this from the 3.5.0 milestone May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rewrite compiler flag on some generated code can break it -indent -rewrite breaks code blocks
5 participants