The current state:
Our binary to LLVM IR decoding represents registers as global variables.
Our low-level analyses make heavy use of Reaching Definition Analysis (RDA), which halts the [register] tracking at function starts - i.e. it is not inter-procedural, and therefore none of the analyses using it are inter-procedural either.
LLVM IR analyses are very strict - they do not make simplifying assumptions, and if they are not able to prove an optimization correct, they do not do it. Some of them are inter-procedural, and therefore very complex and expensive.
Some of our high-level analyses are inter-procedural (e.g. -global-to-local, -dead-global-assign), and therefore very complex and expensive.
The backend (llvmir2hll) is also strict and takes inter-procedural relations into account.
All of this has the following consequences:
Many analyses are very complex, expensive, and not even necessarily correct (they are very error-prone).
A lot of clutter in the resulting decompilation.
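To make the RDA limitation above concrete, here is a minimal sketch of an intra-procedural reaching-definitions pass over a toy, straight-line instruction list. The instruction format and the function name `reaching_definitions` are hypothetical and only illustrate the idea; this is not RetDec's actual implementation, and real RDA also has to handle branches and loops.

```python
# Toy model (assumption): a function body is a straight-line list of
# (op, register) pairs, where "def" writes a register and "use" reads it.

def reaching_definitions(body):
    """Map the index of each use to the index of the definition reaching it.

    A value of None means no definition was found inside this function,
    i.e. the tracking stopped at the function start.
    """
    last_def = {}  # register -> index of the latest definition seen so far
    result = {}
    for i, (op, reg) in enumerate(body):
        if op == "def":
            last_def[reg] = i
        elif op == "use":
            result[i] = last_def.get(reg)
    return result

# f1 defines ecx and uses it; f2 only uses ecx.
f1 = [("def", "ecx"), ("use", "ecx")]
f2 = [("use", "ecx")]

rd_f1 = reaching_definitions(f1)
rd_f2 = reaching_definitions(f2)
```

In this sketch the use in `f2` has no reaching definition, even if `f1` ran just before it: the analysis never looks across the function boundary.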
Proposal:
Transform all registers to local variables at some point (i.e. localize them).
Do not translate them (binary to LLVM IR) as local variables right away; that would make the translation less general.
The cleanest solution would probably be to localize them right after the decoding, so that all analyses (ours and LLVM's) work on the same register representation. This would however require modifications to all of our analyses, so don't do it right away.
Do the localization after our low-level passes, and before LLVM passes. LLVM does not care about the nature of our registers, and therefore no modifications are needed.
Reduce the number of our high-level passes - some will become obsolete after localization, others can be moved.
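The localization step proposed above could look roughly like the following sketch. It works on a toy module representation, not on real LLVM IR; the `REGISTERS` set, the `(op, variable)` instruction format, and the `function.register` naming scheme are all assumptions made for illustration.

```python
# Toy model (assumption): a module maps function names to lists of
# (op, variable) pairs; registers are ordinary globals named in REGISTERS.
REGISTERS = {"eax", "ecx"}

def localize_registers(functions):
    """Return a new module where each register global is replaced by a
    per-function local variable. Non-register globals are left untouched."""
    localized = {}
    for name, body in functions.items():
        new_body = []
        for op, var in body:
            if var in REGISTERS:
                var = f"{name}.{var}"  # hypothetical name for the new local
            new_body.append((op, var))
        localized[name] = new_body
    return localized

module = {
    "f1": [("store", "ecx"), ("store", "g2")],
    "f2": [("load", "ecx"), ("load", "g2")],
}
localized = localize_registers(module)
```

After this step `f1` and `f2` no longer share `ecx`, so later passes can treat it like any other local, while the real global `g2` keeps its inter-procedural meaning.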
Pros:
Cleaner and more compact decompiled code.
Less complex analyses.
Less expensive (i.e. faster) analyses.
This will uncover some other RetDec problems -> more issues.
Cons:
Loss of info needed for inter-procedural register analysis - probably not really needed - see Hex-Rays below.
This will uncover some other RetDec problems -> more issues.
Hex-Rays experiments:
Experiments with the Hex-Rays decompiler showed that it probably does a version of this and does not care about the possible loss of inter-procedural relations on registers.
At the ASM level, I changed the instructions to write to ecx instead of g2 in f1(), and to read from ecx instead of g2 in f2().
Even though an inter-procedural analysis (like the one RetDec currently performs) would find out that ecx = rand(); in f1() is used in a subsequently called function f2() and therefore should not be removed, Hex-Rays ignores this and throws the assignment away. It will use an uninitialized value representing ecx in f2().
This happens in both selective decompilation (function-by-function) and full decompilation (Produce file -> Create C file...).
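One way to read this observation is that, once a register is a local, a plain dead-store elimination is free to drop the write in f1(). The sketch below models that on a toy straight-line instruction list; the instruction format and the helper `remove_dead_local_stores` are hypothetical, and nothing here claims to reflect how Hex-Rays works internally.

```python
# Toy model (assumption): a function body is a straight-line list of
# (op, variable) pairs with "store" and "load" operations.

def remove_dead_local_stores(body, local_vars):
    """Drop stores to locals that are never loaded later in the same function.

    Stores to anything outside local_vars (e.g. globals) are always kept,
    because another function might read them.
    """
    kept = []
    live = set()  # locals that are read later in the function
    for op, var in reversed(body):
        if op == "load":
            live.add(var)
        elif op == "store" and var in local_vars:
            if var not in live:
                continue  # nobody reads this value: the store is dead
            live.discard(var)  # this store satisfies the later loads
        kept.append((op, var))
    kept.reverse()
    return kept

f1 = [("store", "ecx")]  # models: ecx = rand();
f2 = [("load", "ecx")]   # models: a read of ecx

f1_as_local = remove_dead_local_stores(f1, {"ecx"})
f1_as_global = remove_dead_local_stores(f1, set())
f2_as_local = remove_dead_local_stores(f2, {"ecx"})
```

With `ecx` treated as a local, the store in `f1` disappears and `f2` is left reading a value that was never written inside it, which matches the uninitialized `ecx` seen in the experiment. With `ecx` treated as a global, the store has to stay.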
P.S.
Thanks to the discrete LLVM pass system used in RetDec, the whole localization will be implemented as a single, independent pass. It will be enabled by default, but it will be easy to disable on demand if needed.