Rewrite recursive cfg traversal to non-recursive #495

dberlin · 2019-02-07T21:47:57Z

This removes the rest of the stack usage in the forward CFG traversal.

On the very large BB testcases I have (millions of BBs), this reduces stack usage to essentially nothing.

This implementation is also somewhat faster, for other reasons.
The recursive implementation touches many more nodes because of how it recurses: it touches each node, starts processing the successors, and then recurses on itself (which then processes that node's successors). Depending on the CFG structure, when it pops back up, it touches a whole bunch of successors that have already been processed (it only discovers later that it already checked them).
This makes for fairly bad cache behavior, etc.

This implementation does not do that: already-visited successors are cut off before any other processing is done.
It is a minor thing for most CFGs, of course.
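To make the idea concrete, here is a minimal sketch of a non-recursive forward traversal that keeps its own stack and rejects already-visited successors before doing anything else with them. The `Node` type and the `forwardTraversal` name are hypothetical and only for illustration; this is not the code in the patch, and RetDec's real CFG classes look different.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical minimal CFG node (an assumption for this sketch).
struct Node {
    int id;
    std::vector<Node*> succs;
};

// Returns the ids of all nodes reachable from `start` in depth-first
// preorder, using an explicit stack instead of recursion.
std::vector<int> forwardTraversal(Node* start) {
    assert(start != nullptr);
    std::vector<int> order;
    std::unordered_set<Node*> visited;
    // Each entry is a node plus the index of its next unexplored successor.
    std::vector<std::pair<Node*, std::size_t>> stack;

    visited.insert(start);
    order.push_back(start->id);
    stack.emplace_back(start, 0);

    while (!stack.empty()) {
        auto& top = stack.back();
        if (top.second == top.first->succs.size()) {
            stack.pop_back();  // All successors handled; backtrack.
            continue;
        }
        Node* succ = top.first->succs[top.second++];
        // Already-seen successors are cut off here, before any other work.
        if (visited.insert(succ).second) {
            order.push_back(succ->id);
            stack.emplace_back(succ, 0);  // `top` is not used after this point.
        }
    }
    return order;
}
```

The stack lives on the heap, so its depth is bounded by available memory rather than by the thread's call stack.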

I have not updated the reverse traversal. In practice, the next step really should be to turn this into a generic depth first iterator rather than duplicate the code.

The code is based on code I wrote for LLVM's depth first iterator (http://llvm.org/doxygen/DepthFirstIterator_8h_source.html), and is meant to be fairly easy to turn into one.
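For illustration, a generic depth first iterator could look roughly like the sketch below. It is a simplified assumption of the shape, not LLVM's `df_iterator` and not the code in this PR: `BasicBlock`, `DepthFirstIterator`, `atEnd`, and the requirement that the node type exposes a `succs` member are all made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical graph node used only for this illustration.
struct BasicBlock {
    char name;
    std::vector<BasicBlock*> succs;
};

// Minimal preorder depth-first iterator. NodeT is assumed to expose a
// `succs` member that is an indexable container of NodeT*.
template <typename NodeT>
class DepthFirstIterator {
public:
    explicit DepthFirstIterator(NodeT* start) {
        assert(start != nullptr);
        visited_.insert(start);
        stack_.emplace_back(start, 0);
    }

    // True once every reachable node has been produced.
    bool atEnd() const { return stack_.empty(); }

    // Current node; must not be called when atEnd() is true.
    NodeT* operator*() const { return stack_.back().first; }

    // Advances to the next unvisited node in preorder.
    DepthFirstIterator& operator++() {
        while (!stack_.empty()) {
            auto& top = stack_.back();
            if (top.second < top.first->succs.size()) {
                NodeT* succ = top.first->succs[top.second++];
                if (visited_.insert(succ).second) {
                    stack_.emplace_back(succ, 0);
                    return *this;
                }
            } else {
                stack_.pop_back();  // Node exhausted; backtrack.
            }
        }
        return *this;
    }

private:
    std::unordered_set<NodeT*> visited_;
    std::vector<std::pair<NodeT*, std::size_t>> stack_;
};
```

A caller would drive it with a loop such as `for (DepthFirstIterator<BasicBlock> it(entry); !it.atEnd(); ++it) { ... }`, which lets forward and reverse traversals share one implementation if the successor access is made configurable.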

I just don't have time at the moment to do it :)

dberlin · 2019-02-07T22:57:48Z

(the initial version had a bad rebase)

s3rvac · 2019-02-08T08:04:06Z

Thank you for the PR! 👍 I will review it and will let you know.

s3rvac

I have included several comments concerning the code. Most importantly, we will need to figure out why some of the regression tests started failing.

src/llvmir2hll/graphs/cfg/cfg_traversal.cpp

dberlin · 2019-02-08T16:23:09Z

Hey Petr, I'll fix the compile failure on 4.9.

As for the regtests, how do I run those?
I have -DRETDEC_TESTS=On with cmake (I always test stuff :P), but the llvmir2hll tests all pass.

If there is a way I can run the failing tests, I'm happy to debug it.

If there's no way to run them, I'll start the simple way by printing out the CFG traversal nodes on large testcases and seeing if they differ.

dberlin · 2019-02-08T16:27:11Z

Forget it, found the other regression test repo.
I'll try to get it set up and running.

s3rvac · 2019-02-08T17:17:35Z

Great 👍 Yes, all instructions are here. Let me know if you have any issues with the setup or execution of the tests. You can then run one of the failing tests via

python runner.py integration.factorial -r factorial.thumb.gcc.O2.g.elf

The outputs from the test will then be in

retdec-regression-tests/integration/factorial/outputs/Test_2017 (factorial.thumb.gcc.O2.g.elf)

dberlin · 2019-02-09T09:13:53Z

I know what's wrong. I mixed too much logic into the iterator version.
Fix coming.

dberlin · 2019-02-11T03:02:08Z

I've updated it to the latest version of the patch, which just moves the logic into a (local) depth first iterator to make it easier to debug.

I have not finished making all the requested changes, but wanted to put a working version here. It passes the regression test that the previous version failed.

I've tested this version by instrumenting pre/post versions of the compiler to print traversals and ensuring the same traversals occur in the same order (using https://gist.github.com/39013640d89f08ec285a04e68d7197bf for the pre).
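As a rough sketch of that kind of check: once each build has printed its traversal, one node name per entry, the two logs can be compared to find the first point where they diverge. The `firstDivergence` helper below is hypothetical and is not part of the gist or the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the index of the first entry at which two traversal logs differ.
// If one log is a strict prefix of the other, returns the shorter length.
// Returns -1 when both logs are identical.
std::ptrdiff_t firstDivergence(const std::vector<std::string>& before,
                               const std::vector<std::string>& after) {
    const std::size_t common = std::min(before.size(), after.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (before[i] != after[i]) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    if (before.size() != after.size()) {
        return static_cast<std::ptrdiff_t>(common);
    }
    return -1;
}
```

Reporting only the first mismatch keeps the output readable on large testcases, where one early difference tends to shift everything after it.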

There are a few spurious differences in traversals that I am tracking down, but they don't affect the generated code AFAICT.

Some of them are caused by the fact that functions in callinfo et al. are sorted in pointer order, so the overall processing order of some things is not deterministic.

I know y'all do SCC finding, but you store the results in sets with only a pointer sort order. So you lose the topological ordering of the SCCs and the ordering of functions within each SCC.

Even with ASLR disabled, this means that, for example, computeFuncInfoDefinition does not run on functions in the same order from run to run. This is easy to see by printing the function names as they are processed.

IMHO, the SCC sets should be vectors so they maintain order. Tarjan will never try to add duplicates anyway.
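A small sketch of the difference, using a hypothetical `Function` struct rather than RetDec's classes: iterating a `std::set` of pointers yields elements in address order, whereas a `std::vector` yields them in the order they were discovered.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a function object; not RetDec's class.
struct Function {
    std::string name;
};

// Names in the iteration order of a pointer-keyed set, i.e. sorted by address.
std::vector<std::string> namesInSetOrder(const std::vector<Function*>& discovered) {
    std::set<Function*> bySet(discovered.begin(), discovered.end());
    std::vector<std::string> names;
    for (Function* f : bySet) {
        names.push_back(f->name);
    }
    return names;
}

// Names in discovery order, which a vector preserves as-is.
std::vector<std::string> namesInVectorOrder(const std::vector<Function*>& discovered) {
    std::vector<std::string> names;
    for (Function* f : discovered) {
        assert(f != nullptr);
        names.push_back(f->name);
    }
    return names;
}
```

With heap-allocated objects the addresses depend on the allocator, so the set order can change between runs while the vector order stays tied to the algorithm that produced it.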

s3rvac · 2019-02-13T06:52:42Z

Thank you for the fix 👍 As for the non-determinism in our tools, we are aware of that (#209, #479) and we would like to gradually fix it. Several of the collections that we use should have been vectors, but we realized that long after we had used sets in the implementation.

s3rvac · 2019-03-06T06:46:05Z

Hi Daniel. We are planning to release a new version of RetDec. It would be cool if this PR could be part of that release. Can you please summarize the current status of this PR?

dberlin · 2019-03-07T06:19:04Z

Sure. I tested the updated version before I had to run away, and it seems to work fine for the regression tests, etc. It is, AFAICT, functionally correct at this point. I haven't had a chance to see if there are comments or other requested changes, and unfortunately I don't think I'll get back to it for a while :( (I had a bunch of free time and now I suddenly don't)

s3rvac · 2019-03-07T06:20:57Z

That's alright. All our tests pass. Is it OK if I merge the branch into master?

dberlin · 2019-03-07T06:21:41Z

It is perfectly fine by me!

s3rvac · 2019-03-07T08:17:51Z

Great. Thank you for the valuable contribution! 👍

dberlin force-pushed the rewrite-cfg-traversal branch from d0e7aee to 09c7bbe Compare February 7, 2019 22:56

s3rvac self-requested a review February 8, 2019 06:53

s3rvac self-assigned this Feb 8, 2019

s3rvac added enhancement C-llvmir2hll labels Feb 8, 2019

s3rvac requested changes Feb 8, 2019

Rewrite recursive cfg traversal to non-recursive

1484963

dberlin force-pushed the rewrite-cfg-traversal branch from 09c7bbe to 1484963 Compare February 11, 2019 02:50

s3rvac merged commit f197a43 into avast:master Mar 7, 2019

s3rvac added a commit that referenced this pull request Mar 7, 2019

Mention #495 in CHANGELOG.

c2e9cbb

s3rvac mentioned this pull request Apr 12, 2019

retdec-llvmir2hll fails in SimpleCopyPropagationOptimizer due to insufficient stack space #403

Open
