diff --git a/docs/cheatsheet.rst b/docs/cheatsheet.rst index 6b69b81be34a..7af444fc4579 100644 --- a/docs/cheatsheet.rst +++ b/docs/cheatsheet.rst @@ -168,3 +168,44 @@ Modifiers - ``override``: States that this function, modifier or public state variable changes the behavior of a function or modifier in a base contract. +Optimizer Step Dependencies +=========================== + +=========================================== =============================================================== =========================================== +Step Steps that should run before Steps that should run after +=========================================== =============================================================== =========================================== +:ref:`a ` ExpressionSplitter, CommonSubexpressionEliminator UnusedAssignEliminator +:ref:`C ` SSATransform, DeadCodeEliminator +:ref:`c ` ExpressionSplitter, SSATransform [#]_ UnusedPruner, UnusedAssignEliminator +:ref:`D ` +:ref:`d ` +:ref:`E ` ExpressionSplitter, SSATransform, CommonSubexpressionEliminator +:ref:`e ` +:ref:`F ` LiteralRematerialiser +:ref:`f ` +:ref:`g ` +:ref:`h ` +:ref:`I ` StructuralSimplifier +:ref:`i ` FunctionHoister, ExpressionSplitter +:ref:`j ` +:ref:`L ` SSATransform +:ref:`l ` +:ref:`M ` ExpressionSplitter, SSATransform, ForLoopConditionIntoBody +:ref:`m ` +:ref:`n ` +:ref:`O ` LiteralRematerialiser +:ref:`o ` +:ref:`p ` LiteralRematerialiser FullInliner, UnusedPruner +:ref:`r ` +:ref:`S ` SSATransform +:ref:`s ` ExpressionSplitter, SSATransform, CommonSubexpressionEliminator +:ref:`T ` +:ref:`t ` LiteralRematerialiser +:ref:`U ` +:ref:`u ` +:ref:`V ` CommonSubexpressionEliminator, UnusedPruner +:ref:`v ` UnusedPruner +:ref:`x ` ForLoopConditionIntoBody +=========================================== =============================================================== =========================================== + +.. [#] It is enough if SSATransform was used once at any point before running the step. diff --git a/docs/internals/optimizer.rst b/docs/internals/optimizer.rst index 604560554ad5..2c1be0ba9692 100644 --- a/docs/internals/optimizer.rst +++ b/docs/internals/optimizer.rst @@ -1,4 +1,4 @@ -.. index:: optimizer, optimiser, common subexpression elimination, constant propagation +.. index:: ! optimizer, ! optimiser, ! common subexpression elimination, ! constant propagation .. _optimizer: ************* @@ -385,6 +385,20 @@ The preprocessing components perform transformations to get the program into a certain normal form that is easier to work with. This normal form is kept during the rest of the optimization process. +Some of the steps in this group should be considered prerequisites for all other components. +Currently these are: + +- :ref:`disambiguator` +- :ref:`function-hoister` +- :ref:`function-grouper` +- :ref:`for-loop-init-rewriter` + +Each one establishes a certain property that all other steps preserve and implicitly assume +to be always satisfied by the code. +Not satisfying the property will not only make optimization less effective but in some cases may also +lead to the produced code being wrong. +For this reason the optimizer always runs these prerequisites before applying the main sequence. + .. _disambiguator: Disambiguator @@ -458,6 +472,12 @@ is transformed to This transformation can also be useful when paired with LoopInvariantCodeMotion, since invariants in the loop-invariant conditions can then be taken outside the loop. +Loops that already have a literal constant as iteration condition are not transformed. + +To avoid unnecessary rewriting, it is recommended to run this step after StructuralSimplifier. + +May destroy the :ref:`expression-split ` form. + .. _for-loop-init-rewriter: ForLoopInitRewriter @@ -832,10 +852,13 @@ value, one of them will always be unused. The UnusedPruner or the UnusedAssignEliminator will then be able to fully eliminate such variables. -This step is especially efficient if the ExpressionSplitter is run -before. If the code is in pseudo-SSA form, -the values of variables are available for a longer time and thus we -have a higher chance of expressions to be replaceable. +The longer the values of variables are available, the higher the chance of an expression being replaceable. +For this reason the step could potentially be more effective when used on code in pseudo-SSA form. +This property, however, is not easily destroyed once established, and in most cases it is enough +if SSATransform was used once at any point before running the step. +The difference may also not be very significant in practice. + +This step is especially efficient if the ExpressionSplitter is run before. The ExpressionSimplifier will be able to perform better replacements if the CommonSubexpressionEliminator was run right before it. @@ -853,19 +876,29 @@ It tries to match patterns like ``X + 0`` on each subexpression. During the matching procedure, it resolves variables to their currently assigned expressions to be able to match more deeply nested patterns even when the code is in pseudo-SSA form. +In fact, running CommonSubexpressionEliminator before helps this component track equivalent subexpressions. Some of the patterns like ``X - X -> 0`` can only be applied as long as the expression ``X`` is movable, because otherwise it would remove its potential side-effects. Since variable references are always movable, even if their current value might not be, the ExpressionSimplifier is again more powerful -in split or pseudo-SSA form. +in :ref:`expression-split ` or :ref:`pseudo-SSA ` form. + +Some of the simplifications destroy the :ref:`expression-split ` form. .. _literal-rematerialiser: LiteralRematerialiser ^^^^^^^^^^^^^^^^^^^^^ -To be documented. +If a variable referenced in an expression is known to have a literal value, replaces that use of +the variable with the literal. + +This is mostly used so that other components do not have to rely on the DataflowAnalyzer. + +While the step destroys the :ref:`expression-split ` form in the strict sense, +it does not introduce any nested function calls, which may be good enough for many steps that require +this property. .. _load-resolver: @@ -875,9 +908,10 @@ LoadResolver Optimisation stage that replaces expressions of type ``sload(x)`` and ``mload(x)`` by the value currently stored in storage resp. memory, if known. -Works best if the code is in SSA form. +Also evaluates simple ``keccak256(a, c)`` when the value at memory location ``a`` is known and ``c`` +is a constant ``<= 32``. -Prerequisites: Disambiguator, ForLoopInitRewriter. +Works best if the code is in SSA form. Statement-Scale Simplifications ------------------------------- @@ -898,7 +932,7 @@ ConditionalSimplifier The ConditionalSimplifier inserts assignments to condition variables if the value can be determined from the control-flow. -Destroys SSA form. +Destroys the :ref:`SSA ` form. Currently, this tool is very limited, mostly because we do not yet have support for boolean types. Since conditions only check for expressions being nonzero, @@ -914,9 +948,7 @@ Future features: - allow replacements by ``1`` - take termination of user-defined functions into account -Works best with SSA form and if dead code removal has run before. - -Prerequisite: Disambiguator. +Works best with SSA form and if DeadCodeEliminator has run before. .. _conditional-unsimplifier: @@ -948,10 +980,6 @@ performs similar tasks that do depend on data flow. The ControlFlowSimplifier does record the presence or absence of ``break`` and ``continue`` statements during its traversal. -Prerequisite: Disambiguator, FunctionHoister, ForLoopInitRewriter. - -Important: Introduces EVM opcodes and thus can only be used on EVM code for now. - .. _dead-code-eliminator: DeadCodeEliminator @@ -969,8 +997,6 @@ code and thus are considered reachable. Because variables declared in a ``for`` loop's init block have their scope extended to the loop body, we require ForLoopInitRewriter to run before this step. -Prerequisites: ForLoopInitRewriter, FunctionHoister, FunctionGrouper. - .. _equal-store-eliminator: EqualStoreEliminator @@ -985,7 +1011,8 @@ CommonSubexpressionEliminator, because SSA will make sure that the variables will not change and the CommonSubexpressionEliminator re-uses exactly the same variable if the value is known to be the same. -Prerequisites: Disambiguator, ForLoopInitRewriter. +Works best on code without literal arguments, which is the case when the code is in the +:ref:`expression-split ` form. .. _unused-pruner: @@ -1000,6 +1027,8 @@ but its value is discarded. All movable expression statements (expressions that are not assigned) are removed. +Does not remove circular references. + .. _structural-simplifier: StructuralSimplifier @@ -1018,6 +1047,8 @@ a structural level: This component uses the DataflowAnalyzer. +LiteralRematerialiser should be run before this step. + .. _block-flattener: BlockFlattener @@ -1066,7 +1097,8 @@ declarations inside conditional branches will not be moved out of the loop. ExpressionSplitter and SSATransform should be run upfront to obtain better results. -Prerequisites: Disambiguator, ForLoopInitRewriter, FunctionHoister. +The transformation will not move loop-invariant condition out of the condition block of the loop. +This can be addressed by running ForLoopConditionIntoBody beforehand. Function-Level Optimizations @@ -1093,9 +1125,7 @@ function ``f_1`` that takes only one argument, i.e., Other optimization steps will be able to make more simplifications to the function. The optimization step is mainly useful for functions that would not be inlined. -Prerequisites: Disambiguator, FunctionHoister. - -LiteralRematerialiser is recommended as a prerequisite, even though it's not required for +LiteralRematerialiser is recommended before, even though it's not required for correctness. .. _unused-function-parameter-pruner: @@ -1103,7 +1133,7 @@ correctness. UnusedFunctionParameterPruner ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -This step removes unused parameters in a function. +This step removes unused parameters from function definitions. If a parameter is unused, like ``c`` and ``y`` in, ``function f(a,b,c) -> x, y { x := div(a,b) }``, we remove the parameter and create a new "linking" function as follows: @@ -1114,12 +1144,10 @@ remove the parameter and create a new "linking" function as follows: function f2(a,b,c) -> x, y { x := f(a,b) } and replace all references to ``f`` by ``f2``. -The inliner should be run afterwards to make sure that all references to ``f2`` are replaced by +FullInliner should be run afterwards to make sure that all references to ``f2`` are replaced by ``f``. -Prerequisites: Disambiguator, FunctionHoister, LiteralRematerialiser. - -The step LiteralRematerialiser is not required for correctness. It helps deal with cases such as: +The step LiteralRematerialiser is recommended but not required for correctness. It helps deal with cases such as: ``function f(x) -> y { revert(y, y} }`` where the literal ``y`` will be replaced by its value ``0``, allowing us to rewrite the function. @@ -1167,8 +1195,6 @@ be read once we leave the function's scope, so the statement will be removed onl Best run in SSA form. -Prerequisites: Disambiguator, ForLoopInitRewriter. - .. _equivalent-function-combiner: EquivalentFunctionCombiner @@ -1178,7 +1204,6 @@ If two functions are syntactically equivalent, while allowing variable renaming but not any re-ordering, then any reference to one of the functions is replaced by the other. - The actual removal of the function is performed by the UnusedPruner. Function Inlining @@ -1207,8 +1232,6 @@ Example: The function to be inlined has the form of ``function f(...) -> r { r : The result of this inlining is always a single expression. -This component can only be used on sources with unique names. - .. _full-inliner: FullInliner @@ -1222,13 +1245,52 @@ code than more efficient code. In some cases, though, inlining a function can have positive effects on subsequent optimizer steps. This is the case if one of the function arguments is a constant, for example. -During inlining, a heuristic is used to tell if the function call -should be inlined or not. -The current heuristic does not inline into "large" functions unless -the called function is tiny. Functions that are only used once -are inlined, as well as medium-sized functions, while function -calls with constant arguments allow slightly larger functions. +The transform changes code of the form + +.. code-block:: none + function f(a, b) -> c { /* ... */ } + let z := f(x, y) + +into + +.. code-block:: none + + function f(a, b) -> c { /* ... */ } + let f_b := y + let f_a := x + let f_c + /* ... code of f, with replacements: a -> f_a, b -> f_b, c -> f_c */ + let z := f_c + +During inlining, a heuristic is used to tell if the function call should be inlined or not. + +A function call will never be inlined if: + +- The function definition contains the ``leave`` opcode. +- The function is large and inlining is not *aggressive* (details below). +- The function definition contains any direct, recursive calls to itself or the call being inlined is recursive. + Note that indirect recursive calls (going through other functions) do not disqualify the function from inlining. +- Argument expressions passed into the call can potentially have side effects. + This check is very broad, rejecting anything that is not simply a literal or an identifier, since + the condition is easy to satisfy by running ExpressionSplitter beforehand. + +Otherwise, if any of the following conditions is met, the call is inlined unconditionally: + +- The function is tiny. +- The function is only ever called once. + +In the remaining cases the decision to inline or not depends on the size of the function. +The size used here is not the actual bytecode size (which is not known to the optimizer), but rather +a score calculated at the Yul level, which approximates that size. +Inlining is performed if the size does not exceed a predefined threshold, which is a little higher +if the inlining is aggressive and yet higher when the call has some constant arguments. + +The aggressive inlining is performed when: + +- The :ref:`yul-memoryguard` is present. +- The function is not recursive (not even indirectly). +- The legacy Yul->EVM transform is not used (i.e. the target is not the ``homestead`` :ref:`EVM version `). In the future, we may include a backtracking component that, instead of inlining a function right away, only specializes it, @@ -1238,7 +1300,7 @@ we can run the optimizer on this specialized function. If it results in heavy gains, the specialized function is kept, otherwise the original function is used instead. -FunctionHoister and ExpressionSplitter are recommended as prerequisites since they make the step +FunctionHoister and ExpressionSplitter are recommended before this step since they make it more efficient, but are not required for correctness. In particular, function calls with other function calls as arguments are not inlined, but running ExpressionSplitter beforehand ensures that there are no such calls in the input. @@ -1263,8 +1325,24 @@ It does not make use of any information concerning the commutativity of the opco if moving the value of a variable to its place of use would change the order of any function call or opcode execution, the transformation is not performed. +Code of the form + +.. code-block:: none + + let a1 := mload(y) + let a2 := mul(x, 4) + sstore(a2, a1) + +is transformed into + +.. code-block:: none + + sstore(mul(x, 4), mload(y)) + Note that the component will not move the assigned value of a variable assignment or a variable that is referenced more than once. +The transformation is also not applied to loop conditions, because those are +evaluated with each loop. The snippet ``let x := add(0, 2) let y := mul(x, mload(2))`` is not transformed, because it would cause the order of the call to the opcodes ``add`` and @@ -1276,6 +1354,8 @@ Because of that, the snippet ``let x := add(0, 2) let y := mul(x, 3)`` is transformed to ``let y := mul(add(0, 2), 3)``, even though the ``add`` opcode would be executed after the evaluation of the literal ``3``. +Destroys the :ref:`expression-split ` form. + .. _ssa-reverser: SSAReverser @@ -1326,6 +1406,8 @@ by ``a`` (until ``a`` is re-assigned). The UnusedPruner will then eliminate the variable ``a_1`` altogether and thus fully reverse the SSATransform. +Destroys the :ref:`SSA ` form. + .. _stack-compressor: StackCompressor @@ -1362,6 +1444,8 @@ which are always movable. If the value is very cheap or the variable was explicitly requested to be eliminated, the variable reference is replaced by its current value. +Destroys the :ref:`expression-split ` form. + .. _for-loop-condition-out-of-body: ForLoopConditionOutOfBody diff --git a/docs/yul.rst b/docs/yul.rst index 58cfc992fd99..0d7de11ceae2 100644 --- a/docs/yul.rst +++ b/docs/yul.rst @@ -1023,6 +1023,8 @@ option. See :ref:`Using the Commandline Compiler ` for details about the Solidity linker. +.. _yul-memoryguard: + memoryguard ^^^^^^^^^^^ diff --git a/libyul/optimiser/ControlFlowSimplifier.h b/libyul/optimiser/ControlFlowSimplifier.h index 3ebbe7c9aeee..77953a10cca7 100644 --- a/libyul/optimiser/ControlFlowSimplifier.h +++ b/libyul/optimiser/ControlFlowSimplifier.h @@ -45,8 +45,6 @@ class TypeInfo; * and ``continue`` statements during its traversal. * * Prerequisite: Disambiguator, FunctionHoister, ForLoopInitRewriter. - * - * Important: Introduces EVM opcodes and thus can only be used on EVM code for now. */ class ControlFlowSimplifier: public ASTModifier { diff --git a/libyul/optimiser/StructuralSimplifier.h b/libyul/optimiser/StructuralSimplifier.h index 734b2e53af8e..295c92101b93 100644 --- a/libyul/optimiser/StructuralSimplifier.h +++ b/libyul/optimiser/StructuralSimplifier.h @@ -34,8 +34,6 @@ namespace solidity::yul * The LiteralRematerialiser should be run before this. * * Prerequisite: Disambiguator. - * - * Important: Can only be used on EVM code. */ class StructuralSimplifier: public ASTModifier {