
Making Pylint faster 2 #519

Merged: 4 commits merged into pylint-dev:master on Mar 30, 2018

Conversation

@nickdrozd (Contributor)

My last PR contained what I would describe as compiler-style
optimizations, mainly loop unrolling and getting rid of runtime type
checking and dynamic attribute handling. I applied those changes more
or less mechanically, without really understanding the purpose of all
the functions involved.

In spite of those improvements, the last yappi graph still showed
major hotspots in _get_return_nodes_skip_functions and
_get_yield_nodes_skip_lambdas (and also, to a lesser extent, in
_get_assign_nodes). Thinking about how those functions were being
used, I realized that a lot of unnecessary work was still being done.
Consider return statements, for instance. Most node types (e.g.
function calls, assign statements, comprehensions) cannot legally
contain them, so if we're searching for return statements, we can
pass those nodes by immediately. The only nodes that can contain
return statements are nodes with multi-line bodies, like If and
FunctionDef, and thus they are the only ones that need to implement
_get_return_nodes_skip_functions. Similar reasoning holds for the
other changes here. (I'm sure there is plenty of cleverness to be applied
further.)
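
To make the shape of the change concrete, here is a minimal sketch of the
idea (class names and method bodies are simplified for illustration; this
is not the actual astroid code):

```python
# Minimal sketch, not the actual astroid implementation.

class NodeNG:
    """Base node: most node types cannot legally contain a return
    statement, so the search over them stops immediately."""
    def _get_return_nodes_skip_functions(self):
        return iter(())


class Return(NodeNG):
    """A return statement yields itself."""
    def _get_return_nodes_skip_functions(self):
        yield self


class If(NodeNG):
    """Only nodes with multi-line bodies need to recurse into their blocks."""
    def __init__(self, body=(), orelse=()):
        self.body = list(body)
        self.orelse = list(orelse)

    def _get_return_nodes_skip_functions(self):
        for child in self.body + self.orelse:
            yield from child._get_return_nodes_skip_functions()


class FunctionDef(NodeNG):
    """A nested function definition is skipped entirely: its return
    statements belong to it, not to the enclosing scope being searched."""
    def __init__(self, body=()):
        self.body = list(body)

    def _get_return_nodes_skip_functions(self):
        return iter(())


def returns_of(func):
    """Hypothetical helper: find the return nodes of a function by walking
    its body children, so nested FunctionDefs along the way are skipped."""
    for child in func.body:
        yield from child._get_return_nodes_skip_functions()
```

Compare this with the generic nodes_of_class traversal, which has to visit
every child of every node regardless of type.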

To test these changes, I ran pylint (with pylint's pylintrc
file) against pycodestyle.py, as before, as well as against pylint
itself. These cases differ in two important ways: 1) pylint is a
multi-file project with nested directories, while pycodestyle.py is
just a single file, and 2) pylint is written to pass pylint
without any errors, while pycodestyle.py raises lots of pylint
errors.

Despite their differences, pylint ran about twenty seconds faster in
both cases:

pycodestyle.py

Before

real	0m55.362s
user	0m24.856s
sys	0m30.283s

real	0m54.343s
user	0m24.282s
sys	0m29.905s

real	0m54.996s
user	0m24.679s
sys	0m30.169s

After

real	0m34.826s
user	0m16.159s
sys	0m18.539s

real	0m35.294s
user	0m16.434s
sys	0m18.723s

real	0m35.089s
user	0m16.328s
sys	0m18.658s

pylint

Before

real	3m34.382s
user	1m38.719s
sys	1m55.555s

real	3m34.185s
user	1m38.867s
sys	1m55.192s

real	3m34.986s
user	1m38.878s
sys	1m55.964s

After

real	3m13.449s
user	1m29.708s
sys	1m43.619s

real	3m13.627s
user	1m29.739s
sys	1m43.785s

real	3m13.680s
user	1m29.919s
sys	1m43.661s

Now, as these optimizations become increasingly specialized, there
comes a risk that the code will become increasingly inelegant.
Certainly some elegance has already been sacrificed by replacing
nodes_of_class with type-specific variations. But I think that
pylint's slowness is a major usability issue, especially for large
projects, and therefore the tradeoff is definitely worth it.

Assign statements can only appear in multiline bodies (like function
definitions) and in the value side of assign statements (e.g. `b = 4`
is an assign node contained in `a = b = 4`), so there is no need to
check other nodes for them.

Return statements can only appear in multiline bodies (like function
definitions), so there is no need to check other nodes for them.

Yield statements can only appear in multiline bodies (like function
definitions) and in lambdas, so there is no need to check other nodes
for them.

        for child_node in self.body:
            yield from child_node._get_assign_nodes()

    def _get_yield_nodes_skip_lambdas(self):
@brycepg (Contributor) commented on the diff, Mar 19, 2018

Do you think it would make sense to turn some of these methods into mixin classes to reduce code duplication? I know this won't work for functions that use node-specific attributes, but I see _get_yield_nodes_skip_lambdas is exactly the same in multiple locations.

@PCManticore (Contributor)

Thanks for tackling this @nickdrozd! My main concern is, as you said, that these changes start to sacrifice the readability of the code for performance, with the code becoming extremely specialized and too clever for its own sake. We should try to strike a balance between the two, so some degree of code generalisation should be kept around.

I have two points for discussion for this PR. As Bryce mentioned, this patch introduces some duplicate methods, which are also a bit hard to understand without the context of why they were implemented this way (for new contributors, for instance). I suggest we also keep them separated from the node implementations themselves, since right now performance tricks mingle with the rest of the code. I don't have anything particular in mind; maybe we could implement these in something like astroid.node_specialisation. I imagine for now these methods could be part of mixins or even specialized functions per kind of node. Also, this code has to be heavily documented to explain why each function is implemented the way it is.

The other point, as mentioned in your other PR, is that not having an automated way to verify that a new feature or change does not hurt performance will definitely be a disadvantage, even with these specialisations. We should invest some effort into this before coming up with new specialisations, since we're not going to run yappi ourselves periodically.

A third point that came to mind while writing this comment is more of a question, really. I wonder if it's worth looking into rewriting some of astroid's stack in a more efficient language rather than doing it all in Python. After a point, there won't be many more specialized changes that can be applied to get a significant boost in performance. I know @ceridwen also thought about this at some point. If we could rewrite the AST in C (maybe as a CPython extension or in Cython, something along those lines), we could run these tree functions from C rather than from pure Python. This could lead to the biggest performance gain, with the benefit that the code could be kept super simple and not overly specialized. Unfortunately, I definitely don't have time for a big project like this one.

@ceridwen (Contributor)

Something that I would like to do (but I also don't have time for big projects at this point) is to create a corpus of Python code for testing astroid against, or at least run pytest on all the standard library modules to make sure it doesn't crash. The same corpus could be used for profiling/performance testing. As it is, we have two problems. First, all the benchmarks that anyone has ever run astroid/pylint against are awfully specific, which is fine for users who are only concerned with how fast astroid is on their particular projects, but not good for astroid development, because we don't know whether optimizing for one project might hurt performance on others. Second, we don't have any continuous integration set up for performance regressions, so any future change to the code base could easily slow down astroid without its author realizing it.

As for this patch in particular, have we dropped support for Python 2.7? If so, someone should really update the readme and PyPI tags, because they still claim support for 2.7. (Dropping support for 2.7 before it officially EOLs in 2020 seems like a bad idea to me, but I won't reopen that discussion if it's already been decided.) Beyond that, as @brycepg says, there's a lot of code duplication that could probably be reduced, at least with mixins.

@PCManticore (Contributor)

I agree with @ceridwen that we need performance regression testing.

@ceridwen regarding Python 2 support, we decided to drop it in pylint-dev/pylint#1763. This is restricted to the future major releases, pylint 2.0 and astroid 2.0. We'll still release bug fixes for the current astroid 1.6 and pylint 1.8 until Python 2 is EOL, so probably until January 2020.

@brycepg (Contributor) commented Mar 22, 2018

One problem I've found with CI performance regression tests is that AppVeyor/Travis CI are not made to provide performance consistency (Benchmarks, Travis CI issue). Plus they would be too slow; the CI is already nearly unbearably slow.

I think it wouldn't be too difficult for me to set up a folder in pylint for integration tests and curate some number of projects to run pylint against using pytest-benchmark. These benchmarks could then be run locally on a developer's machine when needed (with some performance tuning, such as setting the CPU governor to performance), with an automated setup step for downloading the projects to benchmark.

I'll open a new issue for performance integration tests, but I don't think they should block this PR; it's pretty obvious that astroid shouldn't search assignment statements for return nodes.
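
A rough sketch of what such a test could look like (the corpus layout,
project names, and helper function are hypothetical; pytest-benchmark
supplies the benchmark fixture):

```python
# Hypothetical sketch of a pylint performance integration test using
# pytest-benchmark. The corpus layout and project names are made up.
import subprocess

import pytest

PROJECTS = {
    "pycodestyle": "corpus/pycodestyle",
    "requests": "corpus/requests",
}


def run_pylint(path):
    # The exit code is ignored: we only care how long linting takes,
    # not whether the target project is lint-clean.
    subprocess.run(
        ["pylint", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )


@pytest.mark.parametrize("name,path", sorted(PROJECTS.items()))
def test_lint_project(benchmark, name, path):
    # pytest-benchmark repeats the call and reports mean/stddev per project.
    benchmark(run_pylint, path)
```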

@brycepg (Contributor) commented Mar 26, 2018

I'd say this change generally makes code faster (definitely not slower):

----------------------------- benchmark: 14 tests ------------------------------
Name (time in s)                        Mean            StdDev            Rounds
--------------------------------------------------------------------------------
test_pycodestyle (0004_nickdro)       6.5699 (1.0)      0.6506 (1.0)           3
test_pycodestyle (0003_master)        8.7538 (1.33)     0.9468 (1.46)          3
test_yapf (0004_nickdro)              9.7190 (1.48)     1.0315 (1.59)          3
test_yapf (0003_master)              10.0176 (1.52)     0.9679 (1.49)          3
test_pyfunctional (0004_nickdro)     10.9815 (1.67)     1.2713 (1.95)          3
test_pyfunctional (0003_master)      10.8861 (1.66)     2.1725 (3.34)          3
test_utilisnips (0004_nickdro)       10.2468 (1.56)     0.7364 (1.13)          3
test_utilisnips (0003_master)        10.6300 (1.62)     0.8745 (1.34)          3
test_requests (0004_nickdro)         14.8844 (2.27)     1.5987 (2.46)          3
test_pgcli (0004_nickdro)            15.1486 (2.31)     1.5961 (2.45)          3
test_requests (0003_master)          15.8045 (2.41)     1.8531 (2.85)          3
test_pgcli (0003_master)             16.3026 (2.48)     1.6357 (2.51)          3
test_lektor (0004_nickdro)           27.9670 (4.26)     3.4107 (5.24)          3
test_lektor (0003_master)            29.9270 (4.56)     4.0418 (6.21)          3
--------------------------------------------------------------------------------

The corpus is here:
https://github.com/brycepg/pylint-corpus

@PCManticore (Contributor)

Thanks for tackling the performance benchmarks, Bryce! Regarding the output, which values are from before this patch and which are from after? It would be great if you could offer a small breakdown of those values.

Regarding the PR itself, this is not blocked by the performance benchmarks, but by my other point regarding the specialization, separation and code duplication. I'm waiting on @nickdrozd to follow up. Overall I think the patch and its improvements are great to have; it just needs some polishing before getting in.

@brycepg (Contributor) commented Mar 27, 2018

0004_nickdro is nickdrozd's branch; 0003_master is the master branch.
test_<repo> is the benchmark for a Python repository (taken from the most downloaded, or most starred on GitHub, with a pylintrc). pylint is run three times on each repository and the mean time in seconds is taken (lower is better).

It appears pycodestyle is the most improved of all the repos in terms of percentage difference, which makes sense since that's where Nick took the call graphs. Only pyfunctional didn't see an improvement (both results were really close).

Unfortunately I do not have access to a dedicated server for proper performance testing, and these measurements were taken in a paravirtualized machine.

@nickdrozd (Contributor, Author)

Okay, I figured out what I think is a reasonably elegant way to consolidate all that repeated code, the MultiLineBlockMixin. In fact you're all witnesses to history, as this is the first mixin I've written in my entire career.

@PCManticore It would certainly be nice if a bird flew by and dropped Pylint-rewritten-in-C into our hands, but short of that I doubt it will ever happen. And is it really needed anyway? I think there are still plenty of performance gains to be had just from changes to the existing Python code.

@PCManticore (Contributor)

@nickdrozd Definitely, this would be a huge project that none of us has the time or the incentive to take on. Although it would be interesting to think about if we're aiming for pylint to become as fast as pyflakes & co at linting: even if we optimize the existing Python code as much as we can, it will never be as fast as the likes of pyflakes & co.

@PCManticore (Contributor) left a comment

Looks great now! Thanks for the efforts @nickdrozd! Left two small comments; let me know when you're ready for the merge.



class MultiLineBlockMixin:
    def _get_multi_line_blocks(self):
@PCManticore (Contributor)

We could use a decorators.cachedproperty here.

@@ -131,3 +131,41 @@ def real_name(self, asname):
        raise exceptions.AttributeInferenceError(
            'Could not find original name for {attribute} in {target!r}',
            target=self, attribute=asname)


class MultiLineBlockMixin:
@PCManticore (Contributor)

Can you add a docstring on why we have this mixin?

`_multi_line_block_fields` is a list of strings indicating which
fields contain multi-line blocks. `_get_multi_line_blocks` dynamically
accesses these attributes the first time it is called and then caches
the resulting references so as to avoid repeated expensive attribute
lookups.
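
For illustration, here is a simplified sketch of the mixin along those
lines (not the exact code from this PR; per the review comment above, the
caching could equally be done with decorators.cachedproperty):

```python
# Simplified, self-contained sketch of the mixin described above.

class MultiLineBlockMixin:
    """Shared implementation for nodes whose children live in multi-line
    block fields (e.g. body/orelse), the only places where assign, return
    and yield nodes can appear."""

    # Subclasses list the names of their block fields, e.g. ('body', 'orelse').
    _multi_line_block_fields = ()

    def _get_multi_line_blocks(self):
        # Resolve the block attributes once and cache the references,
        # avoiding repeated getattr lookups on later calls.
        try:
            return self._multi_line_blocks
        except AttributeError:
            self._multi_line_blocks = tuple(
                getattr(self, field)
                for field in self._multi_line_block_fields
            )
            return self._multi_line_blocks

    def _get_return_nodes_skip_functions(self):
        for block in self._get_multi_line_blocks():
            for child_node in block:
                yield from child_node._get_return_nodes_skip_functions()
```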
@nickdrozd (Contributor, Author)

@PCManticore done

@PCManticore merged commit 75b6c1a into pylint-dev:master on Mar 30, 2018
@PCManticore (Contributor)

Thanks @nickdrozd ! Looking forward to the next improvements! :)
