
Split cases data files into smaller tests and introduce more subdirectories #3078

Open
saroad2 opened this issue May 19, 2022 · 8 comments

@saroad2
Contributor

saroad2 commented May 19, 2022

Description
We've made a lot of progress improving the testing process in #3034 and #3062.

A problem that still remains is that the existing cases data files are ENORMOUS. Those data files can contain 300 lines or more, which is way too much for a test. Case files, and tests in general, should be short and concise so that we can track down errors as easily as possible.

I would like to suggest scanning all of our cases data files and splitting them into small cases, each case testing only one behavior.
That way, when a test fails, we'll see exactly which case failed and on which line it failed, without searching through hundreds of lines of code.

Why now?

Up until now, every case data file had to be added manually to a list of cases. This is no longer the case after #3062: now, whenever you add a test file to a directory, it is added automatically to the relevant cases list.

Now we can afford to have as many cases data files as we want without writing all of them into huge, hard-coded lists. All of our data files are read and processed automatically according to the directory they're in.
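
For reference, this is roughly how a cases list is consumed after #3062. The parametrize/read_data/assert_format names follow the pytest output quoted later in this thread; the test name and the default-mode call are assumptions, so treat this as a sketch rather than the exact code in tests/test_format.py:

import pytest

from tests.util import all_data_cases, assert_format, read_data  # assumed import path


@pytest.mark.parametrize("filename", all_data_cases("simple_cases"))
def test_simple_format(filename: str) -> None:
    # Every data file found under the "simple_cases" directory becomes its own
    # parametrized test case; nothing has to be registered by hand anymore.
    source, expected = read_data("simple_cases", filename)
    assert_format(source, expected)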

How can we do it?

Very simple:

  • We change the all_data_cases function to also search for cases in subdirectories of the given directory (see the sketch after this list).
  • We split each directory of cases into subdirectories grouped by subject. For example, in the "simple cases" directory we can add a "comments" subdirectory for testing comment formatting, a "functions" subdirectory for testing function formatting, etc.
  • We move every existing case to its relevant subdirectory.
  • If a case is too big, we split it into smaller cases in that directory.

If we keep each case in the right directory, all of the tests should keep passing without any changes to the tests themselves.
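
As a rough illustration, here is a minimal sketch of what the recursive lookup could look like, assuming all_data_cases lives in tests/util.py, takes a directory name under the test data root, and returns case names that read_data can resolve back to files (the exact signature in the repository may differ):

from pathlib import Path
from typing import List

# Assumed layout: the data root sits next to tests/util.py in a "data" directory.
DATA_DIR = Path(__file__).parent / "data"


def all_data_cases(subdir_name: str) -> List[str]:
    """Collect case names from a data directory and all of its subdirectories."""
    subdir = DATA_DIR / subdir_name
    cases: List[str] = []
    for path in sorted(subdir.rglob("*.py")):  # rglob recurses into subdirectories
        # Keep the path relative to the top-level directory so cases in
        # subdirectories (e.g. "comments/comment_after_def") stay unambiguous.
        cases.append(str(path.relative_to(subdir).with_suffix("")))
    return cases

Since the parametrized tests just consume whatever names this returns, moving cases into subdirectories wouldn't require touching the test functions themselves.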

Additional context

This is a continuation of a discussion between @felix-hilden and me over at #3035.

@saroad2 saroad2 added the T: enhancement New feature or request label May 19, 2022
@felix-hilden felix-hilden added C: maintenance Related to project maintenance, e.g. CI, testing, policy changes, releases and removed T: enhancement New feature or request labels May 19, 2022
@saroad2
Contributor Author

saroad2 commented May 30, 2022

@JelleZijlstra, @ichard26, @cooperlees, I would love to hear your opinions on this so I can start implementing it.

This is a direct continuation of all the other work we've done on unit testing.

@cooperlees
Collaborator

Howdy - this sounds great to me, and I have no objections to lots of smaller modules and test data files, since all_data_cases now removes the overhead of registering each case by hand and allows easier scaling, etc.

I'm happy to review and help get these improvements merged. I just ask that we do lots of smaller PRs to get there, and I'll try to do faster reviews so you don't get blocked :)

@JelleZijlstra
Collaborator

Honestly I don't see a big problem with the current size of the data files. Having too many small files makes it harder to navigate.

@felix-hilden
Collaborator

The size is less of a bother to me too, but I do dislike the fact that the expected output can be a long scroll away from the input. Another option to fix that could be to change the file format to allow multiple test blocks.

@saroad2
Contributor Author

saroad2 commented May 31, 2022

Hey @JelleZijlstra, thanks for the honest opinion.

There are a few reasons why we would want smaller test cases. Take a look at the long_strings.py test case. It has 661 lines and a total of 67 statements. When we test those statements as part of test_preview_format, we are actually running 67 individual tests all at once, one for each formatted statement.

This is fine when all tests are passing, but it can be hell when the test fails: the error output is humongous, long, and confusing, and searching for a single error in a 661-line test file is unnecessary at best.

If we manage to split this case into 67 smaller cases, one for each statement, we get a few advantages:

  1. If one statement fails, the others will still pass, which will help us narrow down exactly where the problem is.
  2. If we add the exact location of the case file to the error message, people won't get lost trying to find where the error is.
  3. When reading a test case, one won't need to run through hundreds of lines of code; they would read a case of 10-20 lines at most. This will make all our test cases more readable and understandable.
  4. Adding and removing cases would be as easy as writing a short new file, rather than editing huge case files.
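
For concreteness, a split-out case could be as small as the following (a hypothetical file such as data/simple_cases/comments/comment_after_assignment.py; it follows the existing read_data convention where the source comes first and the expected output follows a "# output" line):

x = 1  # a short inline comment

# output

x = 1  # a short inline comment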

I hope this convinces you that splitting the test cases is the right move. If you have any caveats, I would love to hear them.

@JelleZijlstra
Collaborator

JelleZijlstra commented May 31, 2022

Why would we need to read through the whole file? That sounds like it can be solved by making the test machinery output better line numbers.

I tried making a small change that breaks tests (formatting 1e6 as 1e+6). Here's the output I got:

_________________________ test_python_36[numeric_literals_skip_underscores] _________________________

filename = 'numeric_literals_skip_underscores'

    @pytest.mark.parametrize("filename", all_data_cases("py_36"))
    def test_python_36(filename: str) -> None:
        source, expected = read_data("py_36", filename)
        mode = black.Mode(target_versions=PY36_VERSIONS)
>       assert_format(source, expected, mode, minimum_version=(3, 6))

tests/test_format.py:108: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/util.py:74: in assert_format
    _assert_format_equal(expected, actual)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

expected = '#!/usr/bin/env python3.6\n\nx = 123456789\nx = 1_2_3_4_5_6_7\nx = 1e1\nx = 0xB1ACC\nx = 0.00_00_006\nx = 12_34_567j\nx = 0.1_2\nx = 1_2.0\n'
actual = '#!/usr/bin/env python3.6\n\nx = 123456789\nx = 1_2_3_4_5_6_7\nx = 1e+1\nx = 0xB1ACC\nx = 0.00_00_006\nx = 12_34_567j\nx = 0.1_2\nx = 1_2.0\n'

    def _assert_format_equal(expected: str, actual: str) -> None:
        if actual != expected and not os.environ.get("SKIP_AST_PRINT"):
            bdv: DebugVisitor[Any]
            out("Expected tree:", fg="green")
            try:
                exp_node = black.lib2to3_parse(expected)
                bdv = DebugVisitor()
                list(bdv.visit(exp_node))
            except Exception as ve:
                err(str(ve))
            out("Actual tree:", fg="red")
            try:
                exp_node = black.lib2to3_parse(actual)
                bdv = DebugVisitor()
                list(bdv.visit(exp_node))
            except Exception as ve:
                err(str(ve))
    
        if actual != expected:
            out(diff(expected, actual, "expected", "actual"))
    
>       assert actual == expected
E       AssertionError

tests/util.py:56: AssertionError
--------------------------------------- Captured stderr call ----------------------------------------
Expected tree:
file_input
  simple_stmt
    expr_stmt
      NAME '#!/usr/bin/env python3.6\n\n' 'x'
      EQUAL ' ' '='
      NUMBER ' ' '123456789'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1_2_3_4_5_6_7'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1e1'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0xB1ACC'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0.00_00_006'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '12_34_567j'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0.1_2'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1_2.0'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  ENDMARKER ''
/file_input
Actual tree:
file_input
  simple_stmt
    expr_stmt
      NAME '#!/usr/bin/env python3.6\n\n' 'x'
      EQUAL ' ' '='
      NUMBER ' ' '123456789'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1_2_3_4_5_6_7'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1e+1'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0xB1ACC'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0.00_00_006'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '12_34_567j'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0.1_2'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1_2.0'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  ENDMARKER ''
/file_input
--- expected
+++ actual
@@ -1,10 +1,10 @@
 #!/usr/bin/env python3.6
 
 x = 123456789
 x = 1_2_3_4_5_6_7
-x = 1e1
+x = 1e+1
 x = 0xB1ACC
 x = 0.00_00_006
 x = 12_34_567j
 x = 0.1_2
 x = 1_2.0

To fix the test I need two things: the data filename and the approximate line where it's failing. The line is pretty easy to find, because we give a diff with line numbers, but the filename is harder to find because we print so much junk before it.

Here are some concrete things that would help me fix this test failure faster:

  • Hide the pytest stack trace; it doesn't add anything. I don't care about the code in _assert_format_equal.
  • Hide the AST dump; it has never helped me.
  • Use fewer context lines around the diff; 4 is a bit much.
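
A rough sketch of the kind of changes this would imply in the helper shown above (the names follow tests/util.py as quoted in this thread, but the opt-in environment variable and the exact diff call here are assumptions, not the current implementation):

import difflib
import os


def _assert_format_equal(expected: str, actual: str) -> None:
    # pytest skips frames marked with this flag when rendering tracebacks,
    # so a failure points at the test instead of this helper.
    __tracebackhide__ = True

    if actual == expected:
        return

    # Print the AST dump only when explicitly requested, instead of by default.
    if os.environ.get("PRINT_AST_DUMP"):  # hypothetical opt-in variable
        ...  # DebugVisitor output, as in the current helper

    # n=1 keeps only one context line around each change, so the diff stays
    # focused on the failing statement.
    diff_text = "".join(
        difflib.unified_diff(
            expected.splitlines(keepends=True),
            actual.splitlines(keepends=True),
            "expected",
            "actual",
            n=1,
        )
    )
    raise AssertionError("formatted output differs from expected:\n" + diff_text)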

@felix-hilden
Collaborator

True, having better test output would be half the battle. Maybe we could add a pytest argument for those who want to see verbose test output (no experience with it, though).
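
A minimal sketch of what such an opt-in flag could look like in tests/conftest.py; the option name --verbose-format-output is made up for illustration:

import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--verbose-format-output",
        action="store_true",
        default=False,
        help="Print AST dumps and full diffs when a formatting test fails.",
    )


@pytest.fixture
def verbose_format_output(request):
    # Test helpers can take this fixture (or read request.config directly)
    # to decide how much output to produce on failure.
    return request.config.getoption("--verbose-format-output")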

The other half is writing tests. It's inconvenient to scroll around the file needlessly, although in fairness it has worked well so far! So… the inconvenience of scrolling vs. the inconvenience of making a large set of changes and having more files or more test blocks… Honestly, I'm not sure 🤷‍♂️

@saroad2
Contributor Author

saroad2 commented Jun 1, 2022

In my humble opinion, keeping these long case files is a mistake, for all the reasons I mentioned above.

In the end, it's up to you. If you choose to split the cases, I would do that happily.
