
Split cases data files into smaller tests and introduce more subdirectories #3078

Open
saroad2 opened this issue May 19, 2022 · 8 comments

@saroad2
Contributor

saroad2 commented May 19, 2022

Description
We've made a lot of progress improving the testing process in #3034 and #3062.

A problem that still remains is that the existing cases data files are ENORMOUS. Those data files can contain 300 lines or more, which is way too much for a test. Case files, and tests in general, should be short and concise so that we can track down errors as easily as possible.

I would like to suggest scanning all of our cases data files and splitting them into small cases, each case testing only one behavior.
That way, when a test fails, we'll see exactly which case failed and on which line it failed, without searching through hundreds of lines of code.

Why now?

Up until now, every case data file had to be added manually to a list of cases. This is no longer the case after #3062: now, whenever you add a test file to a directory, it is added automatically to the relevant cases list.

Now we can afford to have as many cases data files as we want without writing all of them into huge, hard-coded lists. All of our data files are read and processed automatically according to the directory they're in.
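
For reference, this is roughly how a cases list is consumed after #3062. The parametrize/read_data/assert_format names follow the pytest output quoted later in this thread; the test name and the default-mode call are assumptions, so treat this as a sketch rather than the exact code in tests/test_format.py:

import pytest

from tests.util import all_data_cases, assert_format, read_data  # assumed import path


@pytest.mark.parametrize("filename", all_data_cases("simple_cases"))
def test_simple_format(filename: str) -> None:
    # Every data file found under the "simple_cases" directory becomes its own
    # parametrized test case; nothing has to be registered by hand anymore.
    source, expected = read_data("simple_cases", filename)
    assert_format(source, expected)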

How can we do it?

Very simple:

  • We change the all_data_cases function to also search for cases in subdirectories of the given directory (see the sketch after this list).
  • We split each directory of cases into subdirectories grouped by subject. For example, in the "simple cases" directory we can add a "comments" subdirectory for testing comment formatting, a "functions" subdirectory for testing function formatting, etc.
  • We move every existing case to its relevant subdirectory.
  • If a case is too big, we split it into smaller cases in that directory.

If we keep each case in the right directory, all of the tests should keep passing without any changes to the tests themselves.
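
As a rough illustration, here is a minimal sketch of what the recursive lookup could look like, assuming all_data_cases lives in tests/util.py, takes a directory name under the test data root, and returns case names that read_data can resolve back to files (the exact signature in the repository may differ):

from pathlib import Path
from typing import List

# Assumed layout: the data root sits next to tests/util.py in a "data" directory.
DATA_DIR = Path(__file__).parent / "data"


def all_data_cases(subdir_name: str) -> List[str]:
    """Collect case names from a data directory and all of its subdirectories."""
    subdir = DATA_DIR / subdir_name
    cases: List[str] = []
    for path in sorted(subdir.rglob("*.py")):  # rglob recurses into subdirectories
        # Keep the path relative to the top-level directory so cases in
        # subdirectories (e.g. "comments/comment_after_def") stay unambiguous.
        cases.append(str(path.relative_to(subdir).with_suffix("")))
    return cases

Since the parametrized tests just consume whatever names this returns, moving cases into subdirectories wouldn't require touching the test functions themselves.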

Additional context

This is a continuation of a discussion between @felix-hilden and me over at #3035.

@saroad2 saroad2 added the T: enhancement New feature or request label May 19, 2022
@felix-hilden felix-hilden added C: maintenance Related to project maintenance, e.g. CI, testing, policy changes, releases and removed T: enhancement New feature or request labels May 19, 2022
@saroad2
Contributor Author

saroad2 commented May 30, 2022

@JelleZijlstra, @ichard26, @cooperlees, I would love to hear your opinions on this so I can start implementing it.

This is a direct continuation of all the other work we've done on unit testing.

@cooperlees
Collaborator

Howdy - this sounds great to me, and I have no objections to lots of smaller modules and test data files, since all_data_cases now removes the overhead of registering each case by hand and allows easier scaling, etc.

I'm happy to review and help get these improvements merged. I just ask that we do lots of smaller PRs to get there, and I'll try to do faster reviews so you don't get blocked :)

@JelleZijlstra
Collaborator

Honestly I don't see a big problem with the current size of the data files. Having too many small files makes it harder to navigate.

@felix-hilden
Collaborator

The size is less of a bother to me too, but I do dislike the fact that the expected output can be a long scroll away from the input. Another option to fix that could be to change the file format to allow multiple test blocks.

@saroad2
Contributor Author

saroad2 commented May 31, 2022

Hey @JelleZijlstra, thanks for the honest opinion.

There are a few reasons why we would want smaller test cases. Take a look at the long_strings.py test case. It has 661 lines and a total of 67 statements. When we test those statements as part of test_preview_format, we are actually running 67 individual tests all at once, one for each formatted statement.

This is fine when all tests are passing, but it can be hell when the test fails: the error output is humongous, long, and confusing, and searching for a single error in a 661-line test file is unnecessary at best.

If we manage to split this case into 67 smaller cases, one for each statement, we get a few advantages:

  1. If one statement fails, the others will still pass, which will help us narrow down exactly where the problem is.
  2. If we add the exact location of the case file to the error message, people won't get lost trying to find where the error is.
  3. When reading a test case, one won't need to run through hundreds of lines of code; they would read a case of 10-20 lines at most. This will make all our test cases more readable and understandable.
  4. Adding and removing cases would be as easy as writing a short new file, rather than editing huge case files.
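
For concreteness, a split-out case could be as small as the following (a hypothetical file such as data/simple_cases/comments/comment_after_assignment.py; it follows the existing read_data convention where the source comes first and the expected output follows a "# output" line):

x = 1  # a short inline comment

# output

x = 1  # a short inline comment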

I hope this convinces you that splitting the test cases is the right move. If you have any caveats, I would love to hear them.

@JelleZijlstra
Collaborator

JelleZijlstra commented May 31, 2022

Why would we need to read through the whole file? That sounds like it can be solved by making the test machinery output better line numbers.

I tried making a small change that breaks tests (formatting 1e6 as 1e+6). Here's the output I got:

_________________________ test_python_36[numeric_literals_skip_underscores] _________________________

filename = 'numeric_literals_skip_underscores'

    @pytest.mark.parametrize("filename", all_data_cases("py_36"))
    def test_python_36(filename: str) -> None:
        source, expected = read_data("py_36", filename)
        mode = black.Mode(target_versions=PY36_VERSIONS)
>       assert_format(source, expected, mode, minimum_version=(3, 6))

tests/test_format.py:108: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/util.py:74: in assert_format
    _assert_format_equal(expected, actual)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

expected = '#!/usr/bin/env python3.6\n\nx = 123456789\nx = 1_2_3_4_5_6_7\nx = 1e1\nx = 0xB1ACC\nx = 0.00_00_006\nx = 12_34_567j\nx = 0.1_2\nx = 1_2.0\n'
actual = '#!/usr/bin/env python3.6\n\nx = 123456789\nx = 1_2_3_4_5_6_7\nx = 1e+1\nx = 0xB1ACC\nx = 0.00_00_006\nx = 12_34_567j\nx = 0.1_2\nx = 1_2.0\n'

    def _assert_format_equal(expected: str, actual: str) -> None:
        if actual != expected and not os.environ.get("SKIP_AST_PRINT"):
            bdv: DebugVisitor[Any]
            out("Expected tree:", fg="green")
            try:
                exp_node = black.lib2to3_parse(expected)
                bdv = DebugVisitor()
                list(bdv.visit(exp_node))
            except Exception as ve:
                err(str(ve))
            out("Actual tree:", fg="red")
            try:
                exp_node = black.lib2to3_parse(actual)
                bdv = DebugVisitor()
                list(bdv.visit(exp_node))
            except Exception as ve:
                err(str(ve))
    
        if actual != expected:
            out(diff(expected, actual, "expected", "actual"))
    
>       assert actual == expected
E       AssertionError

tests/util.py:56: AssertionError
--------------------------------------- Captured stderr call ----------------------------------------
Expected tree:
file_input
  simple_stmt
    expr_stmt
      NAME '#!/usr/bin/env python3.6\n\n' 'x'
      EQUAL ' ' '='
      NUMBER ' ' '123456789'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1_2_3_4_5_6_7'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1e1'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0xB1ACC'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0.00_00_006'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '12_34_567j'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0.1_2'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1_2.0'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  ENDMARKER ''
/file_input
Actual tree:
file_input
  simple_stmt
    expr_stmt
      NAME '#!/usr/bin/env python3.6\n\n' 'x'
      EQUAL ' ' '='
      NUMBER ' ' '123456789'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1_2_3_4_5_6_7'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1e+1'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0xB1ACC'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0.00_00_006'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '12_34_567j'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '0.1_2'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  simple_stmt
    expr_stmt
      NAME 'x'
      EQUAL ' ' '='
      NUMBER ' ' '1_2.0'
    /expr_stmt
    NEWLINE '\n'
  /simple_stmt
  ENDMARKER ''
/file_input
--- expected
+++ actual
@@ -1,10 +1,10 @@
 #!/usr/bin/env python3.6
 
 x = 123456789
 x = 1_2_3_4_5_6_7
-x = 1e1
+x = 1e+1
 x = 0xB1ACC
 x = 0.00_00_006
 x = 12_34_567j
 x = 0.1_2
 x = 1_2.0

To fix the test I need two things: the data filename and the approximate line where it's failing. The line is pretty easy to find, because we give a diff with line numbers, but the filename is harder to find because we print so much junk before it.

Here are some concrete things that would help me fix this test failure faster:

  • Hide the pytest stack trace; it doesn't add anything. I don't care about the code in _assert_format_equal.
  • Hide the AST dump; it has never helped me.
  • Use fewer context lines around the diff; 4 is a bit much.
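
A rough sketch of the kind of changes this would imply in the helper shown above (the names follow tests/util.py as quoted in this thread, but the opt-in environment variable and the exact diff call here are assumptions, not the current implementation):

import difflib
import os


def _assert_format_equal(expected: str, actual: str) -> None:
    # pytest skips frames marked with this flag when rendering tracebacks,
    # so a failure points at the test instead of this helper.
    __tracebackhide__ = True

    if actual == expected:
        return

    # Print the AST dump only when explicitly requested, instead of by default.
    if os.environ.get("PRINT_AST_DUMP"):  # hypothetical opt-in variable
        ...  # DebugVisitor output, as in the current helper

    # n=1 keeps only one context line around each change, so the diff stays
    # focused on the failing statement.
    diff_text = "".join(
        difflib.unified_diff(
            expected.splitlines(keepends=True),
            actual.splitlines(keepends=True),
            "expected",
            "actual",
            n=1,
        )
    )
    raise AssertionError("formatted output differs from expected:\n" + diff_text)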

@felix-hilden
Collaborator

True, having better test output would be half the battle. Maybe we could add a pytest argument for those who want to see verbose test output (no experience with it, though).
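
A minimal sketch of what such an opt-in flag could look like in tests/conftest.py; the option name --verbose-format-output is made up for illustration:

import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--verbose-format-output",
        action="store_true",
        default=False,
        help="Print AST dumps and full diffs when a formatting test fails.",
    )


@pytest.fixture
def verbose_format_output(request):
    # Test helpers can take this fixture (or read request.config directly)
    # to decide how much output to produce on failure.
    return request.config.getoption("--verbose-format-output")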

The other half is writing tests. It's inconvenient to scroll around the file needlessly, although in fairness it has worked well so far! So… the inconvenience of scrolling vs. the inconvenience of making a large set of changes and having more files or more test blocks… Honestly, I'm not sure 🤷‍♂️

@saroad2
Contributor Author

saroad2 commented Jun 1, 2022

In my humble opinion, keeping these long case files is a mistake, for all the reasons I mentioned above.

In the end, it's up to you. If you choose to split the cases, I would do that happily.
