Fix DAG resolving for recursive directories #2535

Merged Apr 17, 2021 (1 commit)

@ferd ferd commented Apr 15, 2021

This fixes #2534

While fixing the bug reported above, as explained in #2534 (comment)
I found out that the DAG preparation took in include paths that were
explicit, and did not resolve them properly (and therefore silently
failed to track updates). On the other hand, the compiler worked fine,
which highlighted a difference between options passed to EPP via the
compiler, and those we pass internally when building the DAG.

I fixed this, which in turn caused problems with test cases for
recursive parse transforms; the reason being that modules were being
resolved as source files under _build/, which turned out to be due to
the EPP resolver we wrote that would go recursive in all directories to
find files. So when the top-level directory for an app was passed in
(which is needed for include files in legacy cases), it accidentally
would find build modules before the rest.

This was fixed by removing the recursion in EPP, which in turn broke the
recursive lookup of behaviours in the DAG; this required going back to the
rebar_compiler_erl module's paths and sending in explicit lookup paths
for includes (which are also in a set used for behaviours and parse
transforms!).

So here we are:

  1. Take the recursive lookup out of EPP (which is used only for erl
    files anyway)
  2. Move the path expansion around that to be done in the compiler
    behaviour's context function instead since it's required for include
    files as well (EPP won't cover these)
  3. Ensure that path expansion in the context function also respects the
    rebar.config recursion options for the erlang compiler, which will
    prevent potential clashes around subdirectories with conflicting (or
    optionally-built) files
  4. Add a test for the new case
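
As an illustration of steps 1 to 3, here is a minimal sketch of the idea, not the actual rebar3 code: directories are expanded once, up front, according to a recursion flag, and the lookup itself only checks the directories it is given. The module and function names (`include_lookup_sketch`, `expand_dirs/2`, `find_file/2`) are hypothetical; only OTP standard library calls are used.

```erlang
-module(include_lookup_sketch).
-export([expand_dirs/2, find_file/2]).

%% Hypothetical helper: expand each base directory into itself plus, when
%% Recursive is true, all of its subdirectories. This stands in for the
%% path expansion done once in the compiler context, where the recursion
%% option can be honoured.
-spec expand_dirs([file:filename()], boolean()) -> [file:filename()].
expand_dirs(Dirs, false) ->
    Dirs;
expand_dirs(Dirs, true) ->
    lists:flatmap(fun(Dir) -> [Dir | subdirs(Dir)] end, Dirs).

%% Depth-first listing of subdirectories, in sorted order.
%% Note: symlink loops are not guarded against in this sketch.
subdirs(Dir) ->
    case file:list_dir(Dir) of
        {ok, Names} ->
            Paths = [filename:join(Dir, N) || N <- lists:sort(Names)],
            lists:flatmap(fun(P) ->
                                  case filelib:is_dir(P) of
                                      true -> [P | subdirs(P)];
                                      false -> []
                                  end
                          end, Paths);
        {error, _} ->
            []
    end.

%% Non-recursive lookup: only the listed directories are checked, in order,
%% so passing a top-level directory never descends into _build/ by accident.
-spec find_file(file:filename(), [file:filename()]) ->
          {ok, file:filename()} | error.
find_file(_Name, []) ->
    error;
find_file(Name, [Dir | Rest]) ->
    Path = filename:join(Dir, Name),
    case filelib:is_regular(Path) of
        true -> {ok, Path};
        false -> find_file(Name, Rest)
    end.
```

With this split, the caller decides which directories are recursive (for example the source directories) and which are not (for example the top-level directory kept for legacy includes).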

This is one interesting case of tests partially saving our asses, once
more. They're slow but they're good.

CC @max-au since this touches fun code y'all needed, but I expect no breakage.

@ferd ferd force-pushed the fix-epp-resolving-recursion branch from e956803 to 8133c24 Compare April 15, 2021 17:34
Src <- rebar_dir:all_src_dirs(RebarOpts, ["src"], [])
]) ++
%% top-level dir for legacy stuff
[OutDir],

For review's sake: I got the list for these paths from lines 242-246. Since we were passing them to the compiler, they had to be right already.


erszcz commented Apr 16, 2021

I confirm this fixes #2534. Thank you!

@ferd ferd merged commit c30dd48 into erlang:master Apr 17, 2021

max-au commented Apr 21, 2021

found no breakages so far. Thanks!


ferd commented Apr 21, 2021

> found no breakages so far. Thanks!

well there was one but we got it quick enough and cut a patch release: #2540


MYStill commented Dec 5, 2024

> found no breakages so far. Thanks!
>
> well there was one but we got it quick enough and cut a patch release: #2540
#2535
If the project directory hierarchy is complex, this can cause a stack overflow. With 5000 files and a large directory list, memory usage becomes very large.
image
In version 3.24.0, I rolled this change back. I'm not sure whether that causes other issues, but the stack overflow no longer occurs.


ferd commented Dec 5, 2024

That list accumulator is just the pending queue of parallel compilation tasks. There's as many workers as there are either files or schedulers. All the tasks then get enqueued in the coordinator.

I guess the only way we could fix this is to make a buffered synchronous pool that has a max queue depth (say 10x the worker count) after which, calls hang.

The list of results is, however, still going to be queued up and accumulated. Is the problem really the accumulation of tasks on line 198, or do you think the accumulation of results on line 237 is also going to be a problem after that?
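
For illustration, a buffered synchronous pool of the kind described above could look roughly like the following. This is a minimal sketch under assumptions, not rebar3's worker pool: the module name `bounded_pool_sketch` and its `start/2`, `submit/2`, and `drain/1` functions are hypothetical. The coordinator withholds its reply to `submit/2` once the pending queue is full, so the caller hangs until a worker frees a slot. Results are still accumulated in a list, as noted above.

```erlang
-module(bounded_pool_sketch).
-export([start/2, submit/2, drain/1]).

-record(st, {workers, busy = 0, max, pending, blocked,
             results = [], drainer = none}).

%% Start a coordinator with Workers concurrent tasks and at most
%% MaxQueue accepted-but-not-started tasks.
start(Workers, MaxQueue) when Workers > 0, MaxQueue >= 0 ->
    spawn_link(fun() ->
                       loop(#st{workers = Workers, max = MaxQueue,
                                pending = queue:new(),
                                blocked = queue:new()})
               end).

%% Blocks the caller until the pool has accepted the task.
submit(Pool, Fun) when is_function(Fun, 0) ->
    Ref = make_ref(),
    Pool ! {submit, self(), Ref, Fun},
    receive {Ref, accepted} -> ok end.

%% Waits for all accepted tasks to finish, returns their results
%% (in completion order) and stops the pool. No submits may follow.
drain(Pool) ->
    Ref = make_ref(),
    Pool ! {drain, self(), Ref},
    receive {Ref, Results} -> Results end.

loop(#st{busy = 0, drainer = {From, Ref}, results = Rs}) ->
    %% Nothing is running, so nothing is pending or blocked either.
    From ! {Ref, lists:reverse(Rs)},
    ok;
loop(St) ->
    receive
        {submit, From, Ref, Fun} -> loop(accept(From, Ref, Fun, St));
        {done, Result}           -> loop(finish(Result, St));
        {drain, From, Ref}       -> loop(St#st{drainer = {From, Ref}})
    end.

%% A free worker runs the task right away; otherwise it is queued if
%% there is room, or parked without a reply so the caller hangs.
accept(From, Ref, Fun, St = #st{busy = B, workers = W}) when B < W ->
    run(Fun),
    From ! {Ref, accepted},
    St#st{busy = B + 1};
accept(From, Ref, Fun, St = #st{pending = P, max = Max, blocked = Bl}) ->
    case queue:len(P) < Max of
        true ->
            From ! {Ref, accepted},
            St#st{pending = queue:in(Fun, P)};
        false ->
            St#st{blocked = queue:in({From, Ref, Fun}, Bl)}
    end.

%% A worker finished: record the result, start the next pending task,
%% then admit one parked caller into the slot that opened up.
finish(Result, St0 = #st{busy = B, results = Rs}) ->
    St1 = St0#st{busy = B - 1, results = [Result | Rs]},
    St2 = case queue:out(St1#st.pending) of
              {{value, Fun}, P} -> run(Fun), St1#st{busy = B, pending = P};
              {empty, _}        -> St1
          end,
    case queue:out(St2#st.blocked) of
        {{value, {From, Ref, Next}}, Bl} ->
            accept(From, Ref, Next, St2#st{blocked = Bl});
        {empty, _} ->
            St2
    end.

run(Fun) ->
    Pool = self(),
    spawn_link(fun() -> Pool ! {done, Fun()} end).
```

A call such as `bounded_pool_sketch:start(8, 80)` would correspond to the "10x the worker count" queue depth suggested above. The sketch bounds the number of queued tasks only; it does nothing about the size of each task or of the accumulated results.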


MYStill commented Dec 6, 2024

image
1. When the InDirs and DepOpts parameters are particularly large, the memory used by each Work closure that captures them is also large.
2. I tried caching these Args and passing them as parameters through the work queue. Memory then stabilizes, but copying InDirs between processes also seems to slow compilation down.
3. Local testing also shows that the parameters affect the memory size of the anonymous functions. Calling erts_debug:size(InDirs) returns approximately 1000000.
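
The effect described in these three points can be reproduced in isolation. The following is a minimal sketch with made-up data, not the reporter's project: `closure_size_sketch` and `demo/0` are hypothetical names, and `erts_debug:size/1` and `erts_debug:flat_size/1` are debugging helpers from OTP's `erts_debug` module that report sizes in words. `size/1` counts shared subterms once, while `flat_size/1` counts them every time they appear, which approximates what a term costs once it is copied to another process.

```erlang
-module(closure_size_sketch).
-export([demo/0]).

demo() ->
    %% Stand-in for a large directory list (1000 made-up path strings).
    InDirs = [lists:duplicate(20, $a) ++ integer_to_list(N)
              || N <- lists:seq(1, 1000)],
    %% 100 closures, each capturing the same InDirs term.
    Work = [fun() -> {length(InDirs), N} end || N <- lists:seq(1, 100)],
    #{indirs_words => erts_debug:size(InDirs),
      %% Sharing preserved: InDirs is counted once for the whole list.
      shared_words => erts_debug:size(Work),
      %% Sharing lost: InDirs is counted once per closure.
      flat_words => erts_debug:flat_size(Work)}.
```

In this sketch `flat_words` is at least 100 times `indirs_words`, while `shared_words` stays close to a single copy, which is consistent with memory growing once closures are spread across processes.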


ferd commented Dec 6, 2024

Yeah what I'm wondering here is whether there's even enough memory to carry the list of all directories and options in memory.

I can imagine only tricky mechanisms to fix this:

  1. change the whole algorithm to work progressively to load less data in memory (eg. do a few files at a time). Probably too complex to do without a full rewrite because the DAG itself is loaded in memory
  2. create references or representations to directories and options that better share memory (by using IDs that refer to ETS tables containing the options, for example, or possibly binaries that are reference-counted when possible). This still likely represents major changes to the serialization mechanism and a few interesting logic changes, but it doesn't require rewriting the underlying data structures nor algorithm.
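
To make the second option more concrete, here is a minimal sketch of the ID-based idea, assuming a plain public ETS table: the large options term is stored once and each task closure only captures the table reference and a small integer ID. The names `shared_opts_sketch`, `new/0`, `store/2`, and `make_work/3` are hypothetical and nothing here reflects how rebar3 serializes its options.

```erlang
-module(shared_opts_sketch).
-export([new/0, store/2, make_work/3]).

%% Create a table that any worker process can read from.
new() ->
    ets:new(shared_opts, [set, public, {read_concurrency, true}]).

%% Store a large term once and hand back a small ID for it.
store(Tab, Opts) ->
    Id = erlang:unique_integer([positive]),
    true = ets:insert(Tab, {Id, Opts}),
    Id.

%% Build a task that captures only Tab, Id and File. The large term is
%% fetched when the task runs. ets:lookup/2 copies it into the calling
%% process, so each running worker still pays for one copy, but queued
%% tasks stay small.
make_work(Tab, Id, File) ->
    fun() ->
            [{Id, Opts}] = ets:lookup(Tab, Id),
            {File, length(Opts)}
    end.
```

The trade-off is that memory is bounded by the number of running workers rather than by the number of queued tasks; whether the per-lookup copy is acceptable would have to be measured on a real build.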

Ultimately, it's going to be really hard to do any of this without an actual app to work on because without access to the build, I can't even profile or look at anything on my own.


MYStill commented Dec 8, 2024

https://github.com/MYStill/rebar_indirs_test.git
If a reproducible environment is needed, I have tried to simulate one here and hope it is helpful.

Development

Successfully merging this pull request may close these issues.

Inconsistency between compiling and recompiling source files under src/ subdirs
5 participants