Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rebar3 Erlang compiler: performance improvements #2322

Merged
merged 3 commits into from
Aug 10, 2020

Conversation

max-au
Copy link
Contributor

@max-au max-au commented Aug 9, 2020

This PR contains 3 commits, that work best together, but could also be applied independently.
First, for large projects, rebar_compiler_erl spends vast majority of time re-scanning file system in order to find parse_transform/behaviour file by the name (e.g. "cth_readable" parse_transform that is applied to all files for "TEST" profile). When there are several hundred directories to scan, it becomes a bottleneck.
Caching results of this scan helps to bring down analysis of a large repo from 11 minutes to just 3, on 8-core intel CPU.

Second improvement runs EPP concurrently within an application, saving considerable amount of time for large applications.

Last change circumvents documented interfaces for digraph save/load, and uses internal undocumented structures. Better approach would be to make a PR to digraph module in OTP, but it will take several years, and a dangerous switch "don't check if the graph is acyclic".

@ferd ferd self-assigned this Aug 9, 2020
This commit makes rebar_compiler_erl to start a gen_server that acts
as a filesystem cache, avoiding re-scanning it over and over.
@max-au max-au force-pushed the max-au/cache-epp-deps branch from 543a1b7 to 725a732 Compare August 10, 2020 00:29
Copy link
Collaborator

@ferd ferd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this PR as something we need for large setups. I'm currently not noticing any performance drop on smaller repos (i.e. rebar3's own), though there's one bit I highlighted where error handling needs improving.

If you're short on time for these, I can do the changes myself.

not lists:member(Path, AbsIncls)],
%% Add the rest
[digraph:add_edge(G, Source, Incl) || Incl <- AbsIncls],
%% admittedly, mark_dirty does not need to happen every time
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It now needs to happen every time, since it has been taken out of the call-site doing the spawning in most cases. Since this function is now expected to be called on a change, the comment is outdated and mark_dirty must happen every time to cover for the parent site no longer doing it. It's a bit of a messy split in responsibility now so updating the comment to mention that would make snese.

src/rebar_compiler_dag.erl Show resolved Hide resolved
end,
Control ! {deps, self(), AbsIncls}
end
).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be safer to just have a spawn_link to propagate crashes or a spawn_monitor to catch and report them in the finalise call. The custom compilers here can be user-defined plugins and I would expect Compiler:dependencies/3-4 call to be possibly failure prone.

The reporting does not necessarily need to be crystal clear (just the stacktrace is what we had before), but as of this patch, we would instead likely just hang indefinitely in case of a compiler plugin crash.

@tsloughter
Copy link
Collaborator

Ah good, @ferd, my main question was going to be, "what is the effect on small and medium size projects".

@ferd
Copy link
Collaborator

ferd commented Aug 10, 2020

Ah good, @ferd, my main question was going to be, "what is the effect on small and medium size projects".

Most of the high cost in general would come from copying top-level State records since those contain a lot of data. This PR splits off on Compiler:dependencie/3-4 calls which only contain lists of files and are generally far smaller. So the RAM amount (which I'm most concerned with) is unlikely to move a lot there.

The coordination cost is also likely to be rather small there, and offset by the DAG loading and cache for this specific PR.

I'm always at around 12s for a full rebar3 rebuild from scratch (was at ~11s here) and ~1s for partial builds (including erts start as an escript), which is on par with current master on my VPS where I did the previous benchmarks.

@max-au
Copy link
Contributor Author

max-au commented Aug 10, 2020

Yes, I tested this on smaller codebases, there are no regressions (other than memory, for cases when a very large/deep source tree is cached).
I will address issues and resubmit shortly.

max-au added 2 commits August 10, 2020 12:27
Instead of sequentially running *.erl (or any other) files
one-by-one, spawn a separate process, in a scatter-gather way,
then wait for all processes to complete.
Combined with the previous change it introduces a significant
speed-up for large repositories, up to a number of cores that
the host has.
Saves a few seconds when loading DAG with thousands of files.
Unfortunately, uses internal knowledge about digraph implementation.
@max-au max-au force-pushed the max-au/cache-epp-deps branch from 725a732 to 8fca720 Compare August 10, 2020 19:38
@max-au
Copy link
Contributor Author

max-au commented Aug 10, 2020

Addressed review comments.
And only now I found rebar_parallel.erl, which I should've probably used. Can be done as a separate refactor though.

@ferd
Copy link
Collaborator

ferd commented Aug 10, 2020

Yeah we can always refactor at a later point, but get this in 3.14 first.

@ferd ferd merged commit 0952780 into erlang:master Aug 10, 2020
Comment on lines +103 to +109
{NewFs, Files} = list_directory(Dir, Fs),
case lists:member(Name, Files) of
true ->
{{true, filename:join(Dir, Name)}, NewFs};
false ->
resolve(Name, NewFs, Tail)
end.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently writing a fix for this; this search needs to be recursive in order to work since we support directory recursion. Not having it is currently causing issues for some builds in 3.14, but I'll try to re-shape it such that it doesn't cause problems in terms of performance if you don't need recursion to work.

Copy link
Collaborator

@ferd ferd Sep 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems there's already some recursion below, so there might just be an issue with properly returning sub-paths rather than the top directory.

ferd added a commit to ferd/rebar3 that referenced this pull request Sep 1, 2020
This was due to a subtle mistake in the new code in
erlang#2322 which optimized the search
behaviour of files not found by EPP. The patch mistakenly always kept
the top-level src/ directory for an app even if it did a lookup in
recursive ones.

This patch tracks and maintains the proper subdirectory hierarchy. To
keep performance adequate, the lists:keyfind/3 function is used, which
is also now a NIF to the same extent as lists:member/2; manual lookups
would likely end up reducing performance, particularly in deep
hierarchies.

A test case has been added to track the regression.
Reported by @elbrujohalcon
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants