You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Improvements to the IR would solve a handful of outstanding issues, and also pave the way for much more optimal codegen than we can currently produce.
Signless Integral Types
Signless integral types, with explicit signed operations, is a more natural translation of Wasm semantics, that would allow us to omit a lot of unnecessary assertions that are emitted due to the semantics of our IR requiring stronger type safety than Wasm provides. This is a major priority for generation of efficient programs.
Structured Control Flow
Our current HIR representation is a hybrid CFG and dataflow graph in SSA form, with only unstructured control flow primitives. Both WebAssembly and Miden Assembly only have structured control flow, and so in the process of translating through our IR, we lower structured control flow to unstructured, then have to lift it back again. This is ripe for introducing bugs, and indeed that has been the case. The treeify pass we use to simplify CFGs for lowering to MASM has been a regular source of issues, and there is at least one major outstanding bug that I aim to fix with this refactoring.
To facilitate structured control flow in the IR, my plan is to refactor the IR into something a bit more RVSDG (Regionalized Value-State Dependency Graph) shaped, in the spirit of MLIR. The primary difference is that operations in the IR can have regions, which have one or more blocks, which in turn contain operations. Regions can have operands, which correspond to the block arguments of the entry block of that region; and they can have results. An example of a region is a function body - the function parameters are the "operands" of the region, it has one or more blocks, in which instructions representing primitive operations are sequenced, and are terminated with control flow operations.
A more pertinent example for our purposes, is if/then/else conditional branches, where each branch consists of a region (which have no operands), consisting of a single block. You can nest these arbitrarily deep. This has the effect of making control flow hierarchical and structured, ideal for lowering to Miden Assembly. The other structured operation we'd implement is a do/while loop construct, consisting of two regions, before and after. The before region has operands that represent loop-carried state, and is responsible for returning a condition that signals whether to transfer control out of the loop, or continue it. The after region acts as the loop body, and must yield operands to the before region on each iteration. Depending on how you represent the two regions, you can represent both do/while and while style loops.
This representational change will allow us to significantly simplify parts of the compilation pipeline, and emit more efficient code than we do today.
Additional Extensions
A few other items I'd like to set up infra for as part of this:
Make the set of IR operations open, rather than closed, and use traits/interfaces to abstract over families of operations (e.g. binary operators)
Make canonicalization a core part of defining an operation. Canonicalization would then be run as part of the pipeline, and allow all subsequent passes to rely on canonical forms for optimization/analysis.
Implement folding as part of the operation interface, this will allow us to fold constants into operations for tighter codegen, and makes further analysis and optimization simpler.
Some basic conversion/legalization infrastructure. This is to ensure that when we go to lower to Miden Assembly, or lower from some initial phase of the pipeline to a later one, that we guarantee that all ops after that point are converted/legal forms. Right now we just panic if we hit an illegal form that we expected to have been converted earlier in the pipeline, which is not ideal.
Some basic pattern matching infrastructure. This would allow us to express transformations as rewrite rules over patterns. The primary reason for this is to be able to implement a set of peephole optimizations to emit more efficient code, but in a way that is open-ended and doesn't require us to implement individual passes for each transformation (or try to cram them all into one pass).
Roadmap
The initial refactoring for the two main goals I plan to implement during the Beta 1 milestone, as it will be necessary to address some outstanding bugs/blockers in the current pipeline. The nice-to-have extras will be implemented as-needed.
This has to happen in concert with packaging support, and implementation of annotations/types in the assembler, so I don't have a precise timeline for when this will be done, but I consider it my main priority for the next couple weeks. I've spiked the implementation of this in my bitwalker/ir-refactoring branch, and will continue iterating on it there until it is in a state where a PR can be opened.
The text was updated successfully, but these errors were encountered:
Improvements to the IR would solve a handful of outstanding issues, and also pave the way for much more optimal codegen than we can currently produce.
Signless Integral Types
Signless integral types, with explicit signed operations, is a more natural translation of Wasm semantics, that would allow us to omit a lot of unnecessary assertions that are emitted due to the semantics of our IR requiring stronger type safety than Wasm provides. This is a major priority for generation of efficient programs.
Structured Control Flow
Our current HIR representation is a hybrid CFG and dataflow graph in SSA form, with only unstructured control flow primitives. Both WebAssembly and Miden Assembly only have structured control flow, and so in the process of translating through our IR, we lower structured control flow to unstructured, then have to lift it back again. This is ripe for introducing bugs, and indeed that has been the case. The treeify pass we use to simplify CFGs for lowering to MASM has been a regular source of issues, and there is at least one major outstanding bug that I aim to fix with this refactoring.
To facilitate structured control flow in the IR, my plan is to refactor the IR into something a bit more RVSDG (Regionalized Value-State Dependency Graph) shaped, in the spirit of MLIR. The primary difference is that operations in the IR can have regions, which have one or more blocks, which in turn contain operations. Regions can have operands, which correspond to the block arguments of the entry block of that region; and they can have results. An example of a region is a function body - the function parameters are the "operands" of the region, it has one or more blocks, in which instructions representing primitive operations are sequenced, and are terminated with control flow operations.
A more pertinent example for our purposes, is if/then/else conditional branches, where each branch consists of a region (which have no operands), consisting of a single block. You can nest these arbitrarily deep. This has the effect of making control flow hierarchical and structured, ideal for lowering to Miden Assembly. The other structured operation we'd implement is a do/while loop construct, consisting of two regions, before and after. The before region has operands that represent loop-carried state, and is responsible for returning a condition that signals whether to transfer control out of the loop, or continue it. The after region acts as the loop body, and must yield operands to the before region on each iteration. Depending on how you represent the two regions, you can represent both do/while and while style loops.
This representational change will allow us to significantly simplify parts of the compilation pipeline, and emit more efficient code than we do today.
Additional Extensions
A few other items I'd like to set up infra for as part of this:
Roadmap
The initial refactoring for the two main goals I plan to implement during the Beta 1 milestone, as it will be necessary to address some outstanding bugs/blockers in the current pipeline. The nice-to-have extras will be implemented as-needed.
This has to happen in concert with packaging support, and implementation of annotations/types in the assembler, so I don't have a precise timeline for when this will be done, but I consider it my main priority for the next couple weeks. I've spiked the implementation of this in my
bitwalker/ir-refactoring
branch, and will continue iterating on it there until it is in a state where a PR can be opened.The text was updated successfully, but these errors were encountered: