Skip to content

Conversation

@david-yz-liu
Copy link
Contributor

@david-yz-liu david-yz-liu commented Jul 22, 2021

Steps

  • For new features or bug fixes, add a ChangeLog entry describing what your PR does.
  • Write a good description on what the PR does.

Description

My interest in astroid's lookup algorithm was piqued so I found some outstanding issues to investigate. This PR contains three logical changes, but I decided to submit just one PR because my fixes are all in the same function, LookupMixin._filter_stmts, which is the main helper method for determining which assignment/delete statements are "relevant" for a given variable.

To help review, I split up the PR into three different commits, one for each issue. If you'd like, I can also make three separate PRs from them, no big deal. I also added a bunch of test cases for each one---including some test cases for existing correct behaviour, as I wasn't sure whether these cases for lookup were being tested indirectly elsewhere in the test suite. Happy to delete some tests I added if they're redundant.

  1. False positive undefined-loop-variable with pylint 2.5.3 pylint#3711. The issue here was that an assignment statement to a variable in one branch of an if statement overrode statements associated with a use of that variable in another branch:

    >>> import astroid
    >>> ast = astroid.parse("""
    ...     x = 10
    ...     if x > 10:
    ...         x = 100
    ...     elif x > 0:
    ...         x = 200
    ... """)
    >>> node = [n for n in ast.nodes_of_class(astroid.Name) if n.name == 'x'][1]  # the x in x > 0
    >>> node.lookup('x')  # This should return the `x` from `x = 10`
    (<Module.builtins l.0 at 0x1f3b2e1fdf0>, ())

    Fixed it by explicitly ignoring statements that "are_exclusive" with the variable in _filter_stmts. Note that there had previously been a check for this, I just moved the check up a bit.

  2. Scope lookup problem when using closures #180. The issue here was that AssignName nodes representing function parameters didn't have a "_parent_stmts" value consistent with the other nodes in the function's body: doing node.statement().parent returns the parent of the FunctionDef, which won't result in a "match" at this check in _filter_stmts:

    https://github.com/PyCQA/astroid/blob/22e0cdcfcdff3ea80142ba6b90758b42a85ed8e0/astroid/node_classes.py#L1178-L1179

    This results in the following error:

    >>> ast = astroid.parse("""
    ...     def f(x):
    ...         x = 100
    ...         if True:
    ...             print(x)
    ... """)
    >>> name = [n for n in ast.nodes_of_class(astroid.Name) if n.name == 'x'][0]
    >>> name.lookup('x')  # should only be the x = 100
    (<FunctionDef.f2 l.1 at 0x1f3b39b4dc0>, [<AssignName.x l.2 at 0x1f3b39b45b0>, <AssignName.x l.3 at 0x1f3b39b4b80>])

    One subtlety: because of this check,

    https://github.com/PyCQA/astroid/blob/22e0cdcfcdff3ea80142ba6b90758b42a85ed8e0/astroid/node_classes.py#L1222-L1225

    the bug only occurs when the variable being used does not have the same parent as the assignment statement overwriting the function parameter, which is why I included the strange if True: above. (Comment: I wonder if this check was actually an attempt to fix this same problem?)

    The fix I added was to have a special case for Arguments nodes when updating _stmt_parents. Not a big fan of having special cases like this, but not sure if there's a cleaner way.

  3. (Edit: the change is inspired by, but does not fix, the issue) pylint doesn't understand the scope of except clause variables pylint#626. The issue here is pretty straightforward, ExceptHandlers have their own local scope for the exception variable, as of Python 3. From the Python spec:

    When an exception has been assigned using as target, it is cleared at the end of the except clause.

    Here's an example:

    >>> ast = astroid.parse("""
    ...     try:
    ...         1 / 0
    ...     except ZeroDivisionError as e:
    ...         pass
    ...     print(e)
    ... """)
    >>> name = [n for n in ast.nodes_of_class(astroid.Name) if n.name == 'e'][0]
    >>> name.lookup('e')  # Should be empty
    (<Module l.0 at 0x1f3b39bbc70>, [<AssignName.e l.4 at 0x1f3b39bbd00>])

    My fix here was a bit clunky, adding a special case to reset the accumulated assignments in _filter_stmts if the variable being was was inside the ExceptHandler that bound that variable, and otherwise to skip that AssignName. I think it would be possible (preferred?) to make ExceptHandler its own scope (and subclass LocalsDictNodeNG), but this was a bigger change and so didn't want to make that change without discussion.

Type of Changes

Type
🐛 Bug fix

Related Issue

Closes pylint-dev/pylint#3711
Closes #180
Closes Ref pylint-dev/pylint#626. The issue still occurs in pylint, see #1111 (comment).

@Pierre-Sassoulas Pierre-Sassoulas added Bug 🪳 topic-control-flow Control flow understanding labels Jul 29, 2021
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.6.6 milestone Jul 29, 2021
@Pierre-Sassoulas Pierre-Sassoulas self-requested a review July 30, 2021 19:26
@Pierre-Sassoulas Pierre-Sassoulas added the pylint-tested PRs that don't cause major regressions with pylint label Jul 30, 2021
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, thank you for the very detailed merge request. I tried to include that in pylint and this does fix 3711 (which was broken before see the tests suite in pylint-dev/pylint#4777. ), but regarding pylint's 626 I don't think it does the result stay the same. Or maybe I did not understand something ?

I get the following:

____________________________________________________________________________________ test_functional[unused_variable] _____________________________________________________________________________________

self = <pylint.testutils.lint_module_test.LintModuleTest object at 0x7fe67f24d640>

    def runTest(self):
>       self._runTest()
E       AssertionError: Wrong results for file "unused_variable":
E       
E       Expected in testdata:
E        166: unused-variable
E        168: undefined-variable

@david-yz-liu
Copy link
Contributor Author

david-yz-liu commented Jul 30, 2021

No @Pierre-Sassoulas that's totally my bad. I got so excited about the change that I must have messed up testing with pylint.

After double-checking just now: The change in the third commit does fix an astroid error in how exceptions are handled, and in particular the test test_except_var_after_block_single [1] does currently fail on master. I incorrectly thought that fixing the bug here would also fix the Pylint error, but that's not the case---the latter requires additional work in the VariablesChecker I think, or perhaps making ExceptHandler its own scope.

Anyway, I see two different options:

  1. I can remove the third commit and just keep the first two commits, closing the first two issues.
  2. I can keep the third commit and just remove references to the Pylint issue 626, since that's still outstanding.

In all cases, I can rebase and force push, or make new commits on top of this branch, or create a separate branch/PR. Not sure what workflow you'd prefer.

Let me know what you'd like me to do, and sorry again for the inconvenience!

[1] Note: I just made a small commit beba399 clarifying the test that fails on master; there are a few others as well. (Edit: squashed the commit with a force push, but the tests reflect this comment.)

@Pierre-Sassoulas
Copy link
Member

Thank you for the explanation. I think we can keep the third commit, and only remove the "Closes 626" from the changelog. Could you provide the use case the third commit fixes so I can add it in #4777, please ? :)

@david-yz-liu
Copy link
Contributor Author

Updated to remove the reference to issue 626. Here's an example of an improvement to pylint:

def main(lst):
    try:
        raise ValueError
    except ValueError as e:
        pass

    for e in lst:
        pass

    # e will be undefined if lst is empty
    print(e)

main([])  # Raises UnboundLocalError

Currently pylint doesn't report any errors with the above code, but after this change, it correctly reports

$ python -m pylint test.py
************* Module test
test.py:11:10: W0631: Using possibly undefined loop variable 'e' (undefined-loop-variable)

@Pierre-Sassoulas
Copy link
Member

Thank for the example ! pylint did not raise anything before (Old result available on the pylint MR, I updated the tests suite with your example) and now rightly raise an undefined-loop-variable on print(e). I wonder if it should be undefined-variable instead ? If that's an issue, do you think it's an issue with astroid or should we change a check in pylint ?

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.6.6, 2.7.0 Aug 1, 2021
@Pierre-Sassoulas
Copy link
Member

Regarding the conflict I had during merge, I think a change was introduced independentely and I took your version. Do not hesitate to rebase and fix the conflict yourself if I did that wrong :)

@david-yz-liu
Copy link
Contributor Author

@Pierre-Sassoulas thank you!

Regarding the error, I think undefined-loop-variable is appropriate---the same error is reported by pylint if you remove the try block altogether, i.e. on

def main(lst):
    for e in lst:
        pass

    # e will be undefined if lst is empty
    print(e)  # also undefined-loop-variable

main([])  # Raises UnboundLocalError

My understanding is that's the whole point of this message, so I don't think it's a problem.

Regarding the merge conflict, there was a conflict because #1097 changed the code right above my change in _filter_stmts. I resolved the conflict just by keeping both sets of changes, so you can see the diff now doesn't affect #1097's change at all. I force-pushed so the branch is back to the original 3-commit structure. I think it's good to merge :)

@Pierre-Sassoulas
Copy link
Member

Make sense, thank you for clearing that up :) ! And thank you for the huge work on this, this is going to be the main feature of 2.6.6 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug 🪳 pylint-tested PRs that don't cause major regressions with pylint topic-control-flow Control flow understanding

Projects

None yet

Development

Successfully merging this pull request may close these issues.

False positive undefined-loop-variable with pylint 2.5.3 Scope lookup problem when using closures

2 participants