
Making Pylint faster 2 #519

Merged: 4 commits merged into pylint-dev:master on Mar 30, 2018

Conversation

@nickdrozd (Contributor)

My last PR contained what I would describe as compiler-style
optimizations, mainly loop unrolling and getting rid of runtime type
checking and dynamic attribute handling. I applied those changes more
or less mechanically, without really understanding the purpose of all
the functions involved.

In spite of those improvements, the last yappi graph still showed
major hotspots in _get_return_nodes_skip_functions and
_get_yield_nodes_skip_lambdas (and also, to a lesser extent, in
_get_assign_nodes). Thinking about how those functions were being
used, I realized that a lot of unnecessary work was still being done.
Consider return statements, for instance. Most node types (e.g.
function calls, assign statements, comprehensions) cannot legally
contain them, so if we're searching for return statements, we can
pass those nodes by immediately. The only nodes that can contain
return statements are nodes with multi-line bodies, like If and
FunctionDef, and thus they are the only ones that need to implement
_get_return_nodes_skip_functions. Similar reasoning holds for the
other changes here. (I'm sure there is plenty of cleverness to be applied
further.)
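
To make the shape of the change concrete, here is a minimal sketch of the
idea (class names and method bodies are simplified for illustration; this
is not the actual astroid code):

```python
# Minimal sketch, not the actual astroid implementation.

class NodeNG:
    """Base node: most node types cannot legally contain a return
    statement, so the search over them stops immediately."""
    def _get_return_nodes_skip_functions(self):
        return iter(())


class Return(NodeNG):
    """A return statement yields itself."""
    def _get_return_nodes_skip_functions(self):
        yield self


class If(NodeNG):
    """Only nodes with multi-line bodies need to recurse into their blocks."""
    def __init__(self, body=(), orelse=()):
        self.body = list(body)
        self.orelse = list(orelse)

    def _get_return_nodes_skip_functions(self):
        for child in self.body + self.orelse:
            yield from child._get_return_nodes_skip_functions()


class FunctionDef(NodeNG):
    """A nested function definition is skipped entirely: its return
    statements belong to it, not to the enclosing scope being searched."""
    def __init__(self, body=()):
        self.body = list(body)

    def _get_return_nodes_skip_functions(self):
        return iter(())


def returns_of(func):
    """Hypothetical helper: find the return nodes of a function by walking
    its body children, so nested FunctionDefs along the way are skipped."""
    for child in func.body:
        yield from child._get_return_nodes_skip_functions()
```

Compare this with the generic nodes_of_class traversal, which has to visit
every child of every node regardless of type.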

To test these changes, I ran pylint (with pylint's pylintrc
file) against pycodestyle.py, as before, as well as against pylint
itself. These cases differ in two important ways: 1) pylint is a
multi-file project with nested directories, while pycodestyle.py is
just a single file, and 2) pylint is written to pass pylint
without any errors, while pycodestyle.py raises lots of pylint
errors.

Despite their differences, pylint ran about twenty seconds faster in
both cases:

pycodestyle.py

Before

real	0m55.362s
user	0m24.856s
sys	0m30.283s

real	0m54.343s
user	0m24.282s
sys	0m29.905s

real	0m54.996s
user	0m24.679s
sys	0m30.169s

After

real	0m34.826s
user	0m16.159s
sys	0m18.539s

real	0m35.294s
user	0m16.434s
sys	0m18.723s

real	0m35.089s
user	0m16.328s
sys	0m18.658s

pylint

Before

real	3m34.382s
user	1m38.719s
sys	1m55.555s

real	3m34.185s
user	1m38.867s
sys	1m55.192s

real	3m34.986s
user	1m38.878s
sys	1m55.964s

After

real	3m13.449s
user	1m29.708s
sys	1m43.619s

real	3m13.627s
user	1m29.739s
sys	1m43.785s

real	3m13.680s
user	1m29.919s
sys	1m43.661s

Now, as these optimizations become increasingly specialized, there
comes a risk that the code will become increasingly inelegant.
Certainly some elegance has already been sacrificed by replacing
nodes_of_class with type-specific variations. But I think that
pylint's slowness is a major usability issue, especially for large
projects, and therefore the tradeoff is definitely worth it.

Assign statements can only appear in multiline bodies (like function
definitions) and in the value side of assign statements (e.g. `b = 4`
is an assign node contained in `a = b = 4`), so there is no need to
check other nodes for them.

Return statements can only appear in multiline bodies (like function
definitions), so there is no need to check other nodes for them.

Yield statements can only appear in multiline bodies (like function
definitions) and in lambdas, so there is no need to check other nodes
for them.

        for child_node in self.body:
            yield from child_node._get_assign_nodes()

    def _get_yield_nodes_skip_lambdas(self):
@brycepg (Contributor) commented on the diff, Mar 19, 2018

Do you think it would make sense to turn some of these methods into mixin classes to reduce code duplication? I know this won't work for functions that use node-specific attributes, but I see _get_yield_nodes_skip_lambdas is exactly the same in multiple locations.

@PCManticore (Contributor)

Thanks for tackling this @nickdrozd! My main concern is, as you said, that these changes start to sacrifice the readability of the code for performance, with the code becoming extremely specialized and too clever for its own sake. We should try to strike a balance between the two, so some degree of code generalisation should be kept around.

I have two points for discussion for this PR. As Bryce mentioned, this patch introduces some duplicate methods, which are also a bit hard to understand without the context of why they were implemented this way (for new contributors, for instance). I suggest we also keep them separated from the node implementations themselves, since right now performance tricks mingle with the rest of the code. I don't have anything particular in mind; maybe we could implement these in something like astroid.node_specialisation. I imagine for now these methods could be part of mixins or even specialized functions per kind of node. Also, this code has to be heavily documented to explain why each function is implemented the way it is.

The other point, as mentioned in your other PR, is that not having an automated way to verify that a new feature or change does not hurt performance will definitely be a disadvantage, even with these specialisations. We should invest some effort into this before coming up with new specialisations, since we're not going to run yappi ourselves periodically.

A third point that came to mind while writing this comment is more of a question, really. I wonder if it's worth looking into rewriting some of astroid's stack in a more efficient language rather than doing it all in Python. After a point, there won't be many more specialized changes that can be applied to get a significant boost in performance. I know @ceridwen also thought about this at some point. If we could rewrite the AST in C (maybe as a CPython extension or in Cython, something along those lines), we could run these tree functions from C rather than from pure Python. This could lead to the biggest performance gain, with the benefit that the code could be kept super simple and not overly specialized. Unfortunately, I definitely don't have time for a big project like this one.

@ceridwen (Contributor)

Something that I would like to do (but I also don't have time for big projects at this point) is to create a corpus of Python code for testing astroid against, or at least run pytest on all the standard library modules to make sure it doesn't crash. The same corpus could be used for profiling/performance testing. As it is, we have two problems. First, all the benchmarks that anyone has ever run astroid/pylint against are awfully specific, which is fine for users who are only concerned with how fast astroid is on their particular projects, but not good for astroid development, because we don't know whether optimizing for one project might hurt performance on others. Second, we don't have any continuous integration set up for performance regressions, so any future change to the code base could easily slow down astroid without its author realizing it.

As for this patch in particular, have we dropped support for Python 2.7? If so, someone should really update the readme and PyPI tags, because they still claim support for 2.7. (Dropping support for 2.7 before it officially EOLs in 2020 seems like a bad idea to me, but I won't reopen that discussion if it's already been decided.) Beyond that, as @brycepg says, there's a lot of code duplication that could probably be reduced, at least with mixins.

@PCManticore (Contributor)

I agree with @ceridwen that we need performance regression testing.

@ceridwen regarding Python 2 support, we decided to drop it in pylint-dev/pylint#1763. This is restricted to the future major releases, pylint 2.0 and astroid 2.0. We'll still release bug fixes for the current astroid 1.6 and pylint 1.8 until Python 2 is EOL, so probably until January 2020.

@brycepg (Contributor) commented Mar 22, 2018

One problem I've found with CI performance regression tests is that AppVeyor/Travis CI are not made to provide performance consistency (Benchmarks, Travis CI issue). Plus they would be too slow; the CI is already nearly unbearably slow.

I think it wouldn't be too difficult for me to set up a folder in pylint for integration tests and curate some number of projects to run pylint against using pytest-benchmark. These benchmarks could then be run locally on a developer's machine when needed (with some performance tuning, such as setting the CPU governor to performance), with an automated setup step for downloading the projects to benchmark.

I'll open a new issue for performance integration tests, but I don't think they should block this PR; it's pretty obvious that astroid shouldn't search assignment statements for return nodes.
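
A rough sketch of what such a test could look like (the corpus layout,
project names, and helper function are hypothetical; pytest-benchmark
supplies the benchmark fixture):

```python
# Hypothetical sketch of a pylint performance integration test using
# pytest-benchmark. The corpus layout and project names are made up.
import subprocess

import pytest

PROJECTS = {
    "pycodestyle": "corpus/pycodestyle",
    "requests": "corpus/requests",
}


def run_pylint(path):
    # The exit code is ignored: we only care how long linting takes,
    # not whether the target project is lint-clean.
    subprocess.run(
        ["pylint", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )


@pytest.mark.parametrize("name,path", sorted(PROJECTS.items()))
def test_lint_project(benchmark, name, path):
    # pytest-benchmark repeats the call and reports mean/stddev per project.
    benchmark(run_pylint, path)
```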

@brycepg (Contributor) commented Mar 26, 2018

I'd say this change generally makes code faster (definitely not slower):

----------------------------- benchmark: 14 tests ------------------------------
Name (time in s)                        Mean            StdDev            Rounds
--------------------------------------------------------------------------------
test_pycodestyle (0004_nickdro)       6.5699 (1.0)      0.6506 (1.0)           3
test_pycodestyle (0003_master)        8.7538 (1.33)     0.9468 (1.46)          3
test_yapf (0004_nickdro)              9.7190 (1.48)     1.0315 (1.59)          3
test_yapf (0003_master)              10.0176 (1.52)     0.9679 (1.49)          3
test_pyfunctional (0004_nickdro)     10.9815 (1.67)     1.2713 (1.95)          3
test_pyfunctional (0003_master)      10.8861 (1.66)     2.1725 (3.34)          3
test_utilisnips (0004_nickdro)       10.2468 (1.56)     0.7364 (1.13)          3
test_utilisnips (0003_master)        10.6300 (1.62)     0.8745 (1.34)          3
test_requests (0004_nickdro)         14.8844 (2.27)     1.5987 (2.46)          3
test_pgcli (0004_nickdro)            15.1486 (2.31)     1.5961 (2.45)          3
test_requests (0003_master)          15.8045 (2.41)     1.8531 (2.85)          3
test_pgcli (0003_master)             16.3026 (2.48)     1.6357 (2.51)          3
test_lektor (0004_nickdro)           27.9670 (4.26)     3.4107 (5.24)          3
test_lektor (0003_master)            29.9270 (4.56)     4.0418 (6.21)          3
--------------------------------------------------------------------------------

The corpus is here:
https://github.com/brycepg/pylint-corpus

@PCManticore (Contributor)

Thanks for tackling the performance benchmarks, Bryce! Regarding the output, which values are from before this patch and which are from after? It would be great if you could offer a small breakdown of those values.

Regarding the PR itself, this is not blocked by the performance benchmarks, but by my other point regarding the specialization, separation and code duplication. I'm waiting on @nickdrozd to follow up. Overall I think the patch and its improvements are great to have; it just needs some polishing before getting in.

@brycepg (Contributor) commented Mar 27, 2018

0004_nickdro is nickdrozd's branch; 0003_master is the master branch.
test_<repo> is the benchmark for a Python repository (taken from the most downloaded, or most starred on GitHub, with a pylintrc). pylint is run three times on each repository and the mean time in seconds is taken (lower is better).

It appears pycodestyle is the most improved of all the repos in terms of percentage difference, which makes sense since that's where Nick took the call graphs. Only pyfunctional didn't see an improvement (both results were really close).

Unfortunately I do not have access to a dedicated server for proper performance testing, and these measurements were taken in a paravirtualized machine.

@nickdrozd (Contributor, Author)

Okay, I figured out what I think is a reasonably elegant way to consolidate all that repeated code, the MultiLineBlockMixin. In fact you're all witnesses to history, as this is the first mixin I've written in my entire career.

@PCManticore It would certainly be nice if a bird flew by and dropped Pylint-rewritten-in-C into our hands, but short of that I doubt it will ever happen. And is it really needed anyway? I think there are still plenty of performance gains to be had just from changes to the existing Python code.

@PCManticore (Contributor)

@nickdrozd Definitely, this would be a huge project that none of us has the time or the incentive to take on. Although it would be interesting to think about if we're aiming for pylint to become as fast as pyflakes & co at linting: even if we optimize the existing Python code as much as we can, it will never be as fast as the likes of pyflakes & co.

@PCManticore (Contributor) left a comment

Looks great now! Thanks for the efforts @nickdrozd! Left two small comments; let me know when you're ready for the merge.



class MultiLineBlockMixin:
    def _get_multi_line_blocks(self):
@PCManticore (Contributor)

We could use a decorators.cachedproperty here.

@@ -131,3 +131,41 @@ def real_name(self, asname):
        raise exceptions.AttributeInferenceError(
            'Could not find original name for {attribute} in {target!r}',
            target=self, attribute=asname)


class MultiLineBlockMixin:
@PCManticore (Contributor)

Can you add a docstring on why we have this mixin?

`_multi_line_block_fields` is a list of strings indicating which
fields contain multi-line blocks. `_get_multi_line_blocks` dynamically
accesses these attributes the first time it is called and then caches
the resulting references so as to avoid repeated expensive attribute
lookups.
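
For illustration, here is a simplified sketch of the mixin along those
lines (not the exact code from this PR; per the review comment above, the
caching could equally be done with decorators.cachedproperty):

```python
# Simplified, self-contained sketch of the mixin described above.

class MultiLineBlockMixin:
    """Shared implementation for nodes whose children live in multi-line
    block fields (e.g. body/orelse), the only places where assign, return
    and yield nodes can appear."""

    # Subclasses list the names of their block fields, e.g. ('body', 'orelse').
    _multi_line_block_fields = ()

    def _get_multi_line_blocks(self):
        # Resolve the block attributes once and cache the references,
        # avoiding repeated getattr lookups on later calls.
        try:
            return self._multi_line_blocks
        except AttributeError:
            self._multi_line_blocks = tuple(
                getattr(self, field)
                for field in self._multi_line_block_fields
            )
            return self._multi_line_blocks

    def _get_return_nodes_skip_functions(self):
        for block in self._get_multi_line_blocks():
            for child_node in block:
                yield from child_node._get_return_nodes_skip_functions()
```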
@nickdrozd (Contributor, Author)

@PCManticore done

@PCManticore merged commit 75b6c1a into pylint-dev:master on Mar 30, 2018
@PCManticore (Contributor)

Thanks @nickdrozd ! Looking forward to the next improvements! :)
