feat: implement rodata initialization #260

bitwalker · 2024-07-26T04:15:34Z

NOTE: This PR is a work in progress, and blends multiple issues together as I work through the implementation and testing of rodata init, along with some related issues in our test suite.

A rough summary of what this contains:

- Refactor the compiler driver to support multiple inputs
- Refactor the compiler pipeline to support a mix of MASM and HIR inputs
- Refactor the compiler pipeline to always output a midenc_codegen_masm::Program
- Move responsibility for assembly of a miden_core::Program into midenc_codegen_masm::Program::assemble
- Implement support for generating two different types of extra setup code for executable programs:
  - Data segment initialization on program start, before invoking the entrypoint
  - Test harness, which allows initializing memory of the program from the advice stack on startup. This is primarily intended to allow initializing shadow stack memory such that a Rust-compiled function can be called as if the caller is itself a Rust function, which is useful for testing, and necessary for testing non-immediate inputs.
- Implement a richer Miden VM test executor, which allows initializing advice inputs in addition to the operand stack. It also captures much more information about the program under test, and can dump a variety of useful information when errors occur.
- Make use of the fact that we parse DWARF debug info when parsing Wasm modules, to allow reifying actual source spans during translation to HIR. This depends on the Wasm module containing DWARF debug info, and on the source paths it contains actually mapping to files on disk, so that we can read them into memory in order to compute byte offsets (WIP)
- Fix the blake3 and get_inputs tests in the integration test suite (WIP, trying to get debug info working so that it is easier to determine where an assertion is being raised when the test fails; would also like to test that the test harness itself is set up properly).
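The data segment initialization step in the summary above can be pictured roughly as follows. This is a minimal sketch and not the code the compiler actually generates: `DataSegment`, `LinearMemory`, and `run_with_rodata` are hypothetical names, and memory is modeled as a plain byte vector rather than the Miden VM's word-addressed memory.

```rust
/// Hypothetical description of a read-only data segment: where it lives and its bytes.
struct DataSegment {
    offset: usize,
    bytes: Vec<u8>,
}

/// Hypothetical stand-in for a program's linear memory.
struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    fn new(size: usize) -> Self {
        Self { bytes: vec![0; size] }
    }

    /// Copy one segment into memory, rejecting segments that do not fit.
    fn init_segment(&mut self, segment: &DataSegment) -> Result<(), String> {
        let end = segment
            .offset
            .checked_add(segment.bytes.len())
            .ok_or_else(|| "segment end overflows".to_string())?;
        if end > self.bytes.len() {
            return Err(format!("segment at {} does not fit in memory", segment.offset));
        }
        self.bytes[segment.offset..end].copy_from_slice(&segment.bytes);
        Ok(())
    }
}

/// Initialize every data segment first, and only then invoke the entrypoint.
fn run_with_rodata<T>(
    memory: &mut LinearMemory,
    segments: &[DataSegment],
    entrypoint: impl FnOnce(&LinearMemory) -> T,
) -> Result<T, String> {
    for segment in segments {
        memory.init_segment(segment)?;
    }
    Ok(entrypoint(&*memory))
}
```

The point of the ordering is that the entrypoint never observes uninitialized read-only data; a segment that does not fit is reported before any user code runs.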

This PR will remain a draft until the blake3 and get_inputs tests are working, at which point I will try to rework the commits in this PR into more granular changes. At this point, there are debugging statements and other cruft present throughout the code, and the test suite may be broken due to some of the changes not being handled yet.

codegen/masm/src/masm/program.rs

hir/src/asm/isa.rs

sdk/stdlib-sys/src/stdlib/crypto/hashes.rs

tests/integration/src/rust_masm_tests/abi_transform/tx_kernel.rs

This commit introduces a new `CompilerTestBuilder` type, which is used to configure compiler tests before they are instantiated. This lets us build up the configuration that will be used to generate compiler inputs in multiple stages, and only instantiate it once we wish to run the test. Prior to this, we mixed these stages together, which made it difficult to configure the input generation step in more than one place, which is a problem in more complex tests, or edge case tests which do not neatly fit the mold of one of the default configurations. In addition to `CompilerTestBuilder`, there are also two configuration types introduced, `CargoTest` and `RustcTest`, which configure tests based on `cargo` or `rustc` outputs (as inputs to `midenc`). These are used in conjunction with `CompilerTestInputType` to instantiate a `CompilerTestBuilder` for one of the three input types currently used for compiler testing.
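The staged configuration described in this commit message follows the usual builder pattern. The sketch below only illustrates that shape: `TestBuilder`, `Test`, `InputKind`, and all of their methods and fields are invented for illustration and do not reflect the real `CompilerTestBuilder` API.

```rust
/// Illustrative only: the kinds of inputs a test may be built from.
#[derive(Debug, Clone, PartialEq)]
enum InputKind {
    Cargo { manifest: String },
    Rustc { source: String },
}

/// Stage 1: accumulates configuration; nothing is compiled yet.
#[derive(Debug, Clone)]
struct TestBuilder {
    input: InputKind,
    entrypoint: Option<String>,
    flags: Vec<String>,
}

/// Stage 2: an instantiated test whose configuration is frozen.
#[derive(Debug)]
struct Test {
    input: InputKind,
    entrypoint: Option<String>,
    flags: Vec<String>,
}

impl TestBuilder {
    /// Start from a (hypothetical) cargo-based default configuration.
    fn cargo(manifest: &str) -> Self {
        Self {
            input: InputKind::Cargo { manifest: manifest.to_string() },
            entrypoint: None,
            flags: Vec::new(),
        }
    }

    /// Start from a (hypothetical) rustc-based default configuration.
    fn rustc(source: &str) -> Self {
        Self {
            input: InputKind::Rustc { source: source.to_string() },
            entrypoint: None,
            flags: Vec::new(),
        }
    }

    /// Each setter consumes and returns the builder, so configuration can be
    /// layered in several places before the test is instantiated.
    fn with_entrypoint(mut self, name: &str) -> Self {
        self.entrypoint = Some(name.to_string());
        self
    }

    fn with_flag(mut self, flag: &str) -> Self {
        self.flags.push(flag.to_string());
        self
    }

    /// Only here is the configuration turned into a runnable test.
    fn build(self) -> Test {
        Test { input: self.input, entrypoint: self.entrypoint, flags: self.flags }
    }
}
```

Because every setter returns the builder, a helper can apply shared defaults and an individual test can still add its own flags before calling `build`.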

tests/integration/src/compiler_test.rs

greenhat

Hey, I think intrinsics::mem::store_dw silently doing nothing might cause trouble. See

compiler/codegen/masm/intrinsics/mem.masm

Lines 571 to 575 in 401845e

    
    export.store_dw # [waddr, index, offset, value]
        # TODO: implement
        # cleanup the operand stack
        dropw
    end

There are a few calls to it in both the blake3 and get_inputs tests, and I'm no longer sure that we're not hitting them on the happy path.
I suggest crashing there (via an assert) until I have it implemented.

bitwalker · 2024-08-01T09:20:15Z

@greenhat Yeah I've been suspicious that we're hitting it and not aware of it, and unfortunately the way I stubbed out that intrinsic doesn't help! Good call on making it assert, I'll give it a unique error code too, so we'll know for sure if we hit it.
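To make the difference between the two stubbing styles concrete, here is a small Rust model of the idea; it is not MASM and not the actual intrinsic. The operand stack is modeled as a `Vec<u64>` with its top at the end, and `Trap` and the error code value are hypothetical.

```rust
/// Hypothetical error code reserved for "unimplemented intrinsic was reached".
const ERR_STORE_DW_UNIMPLEMENTED: u32 = 0x0001_0001;

/// Hypothetical trap raised by a failed assertion.
#[derive(Debug, PartialEq)]
struct Trap {
    code: u32,
}

/// Silent stub: mirrors `dropw` by dropping the top four stack elements and
/// reporting success, which hides the fact that nothing was stored.
fn store_dw_silent(stack: &mut Vec<u64>) -> Result<(), Trap> {
    let new_len = stack.len().saturating_sub(4);
    stack.truncate(new_len);
    Ok(())
}

/// Asserting stub: fails immediately with a code unique to this intrinsic,
/// so a test failure points straight at the missing implementation.
fn store_dw_asserting(_stack: &mut Vec<u64>) -> Result<(), Trap> {
    Err(Trap { code: ERR_STORE_DW_UNIMPLEMENTED })
}
```

With the silent version, a caller that later loads the double word reads stale memory and fails somewhere unrelated; with the asserting version, the first call fails and the code identifies the culprit.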

I've got the last debug info PR for the VM in review now, so I'm going to get back to this today and try to get it wrapped up now that I can actually see what's going on.

* Diagnostics infra is now implemented in miden-core
* Unified source code abstraction and source spans
* Compiler still retains the `DiagnosticsHandler`, with some minor improvements, but `CodeMap` is replaced with `SourceManager`
* Use unified spans to emit precise debug information from the compiler
* Use precise information in assembler to produce debug info useful for high-level languages
* Numerous cleanups along the way to try and tidy up the way we represent and use diagnostics throughout the compiler

…ents

Previously, the Br, CondBr, and Switch branch instructions all used slightly different representations for their successor information, for no real purpose. This commit unifies these, as well as provides a more uniform API (e.g. `analyze_branch`'s `BranchInfo` is also unified in the same manner).

The Switch instruction type is also enhanced with the ability to pass block arguments to successor blocks in each arm, as well as the default case. These statements will be lowered using a binary search approach on the discriminant value, so as to lower to the least number of conditional branches possible. Branch weights would be exceedingly useful here, but we don't have any meaningful way to obtain that information from Rust currently.

To better support the new Switch API, the `switch` builder has been updated to use a specialized builder that guarantees that the `Switch` is well-formed. The Wasm frontend has had its lowering reworked to make use of the new builder, and take advantage of the support for successor arguments.
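The binary search lowering mentioned in this commit message can be sketched as below. This is an illustration of the general technique under simplifying assumptions, not the compiler's implementation: `Lowered`, `lower_switch`, and `dispatch` are hypothetical names, blocks are plain `u32` ids, arms are assumed sorted by discriminant with no duplicates, and block arguments are not modeled.

```rust
/// Hypothetical decision tree produced by lowering a switch.
#[derive(Debug, PartialEq)]
enum Lowered {
    /// Unconditional jump to a block.
    Jump(u32),
    /// `if discriminant == value { then } else { otherwise }`
    IfEq { value: u32, then: Box<Lowered>, otherwise: Box<Lowered> },
    /// `if discriminant < pivot { below } else { at_or_above }`
    IfLt { pivot: u32, below: Box<Lowered>, at_or_above: Box<Lowered> },
}

/// Lower `arms` (pairs of discriminant and target block, sorted by
/// discriminant) by repeatedly splitting the arm list in half.
fn lower_switch(arms: &[(u32, u32)], default: u32) -> Lowered {
    match arms {
        [] => Lowered::Jump(default),
        [(value, target)] => Lowered::IfEq {
            value: *value,
            then: Box::new(Lowered::Jump(*target)),
            otherwise: Box::new(Lowered::Jump(default)),
        },
        _ => {
            let mid = arms.len() / 2;
            Lowered::IfLt {
                pivot: arms[mid].0,
                below: Box::new(lower_switch(&arms[..mid], default)),
                at_or_above: Box::new(lower_switch(&arms[mid..], default)),
            }
        }
    }
}

/// Walk the tree for a concrete discriminant, returning the target block and
/// the number of conditional branches evaluated along the way.
fn dispatch(tree: &Lowered, discriminant: u32) -> (u32, usize) {
    match tree {
        Lowered::Jump(target) => (*target, 0),
        Lowered::IfEq { value, then, otherwise } => {
            let next = if discriminant == *value { then } else { otherwise };
            let (target, branches) = dispatch(next, discriminant);
            (target, branches + 1)
        }
        Lowered::IfLt { pivot, below, at_or_above } => {
            let next = if discriminant < *pivot { below } else { at_or_above };
            let (target, branches) = dispatch(next, discriminant);
            (target, branches + 1)
        }
    }
}
```

In this model the number of conditional branches taken on any path grows with the logarithm of the number of arms, whereas a linear chain of equality tests grows with the number of arms itself. Without branch weights, the split is always at the midpoint.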

greenhat

Great job! Just a couple of minor notes.

frontend-wasm/src/module/module_env.rs

greenhat · 2024-08-06T09:04:09Z

frontend-wasm/src/miden_abi/transform.rs

+            stdlib::mem::PIPE_DOUBLE_WORDS_TO_MEMORY => return TransformStrategy::ReturnViaPointer,
+            _ => (),
+        },
+        "std::crypto::hashes::blake3" => match function_id {


The "std::crypto::hashes::blake3" string literal is also used in the stdlib::crypto::hashes module. It'd be better to define it in one place.
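One way to act on this suggestion is to hoist the literal into a shared constant and reference it from both sites. The sketch below is hypothetical: the `module_names` module, the `Strategy` enum, and both functions are made up for illustration, and only the string itself comes from the diff.

```rust
/// Hypothetical shared home for module path strings.
mod module_names {
    pub const BLAKE3: &str = "std::crypto::hashes::blake3";
}

/// Illustrative stand-in for a transform strategy.
#[derive(Debug, PartialEq)]
enum Strategy {
    ReturnViaPointer,
    Unchanged,
}

/// First user: a `&str` constant can be used directly as a match pattern.
fn strategy_for(module: &str) -> Strategy {
    match module {
        module_names::BLAKE3 => Strategy::ReturnViaPointer,
        _ => Strategy::Unchanged,
    }
}

/// Second user: builds a fully qualified function name from the same constant.
fn qualified_name(function: &str) -> String {
    format!("{}::{}", module_names::BLAKE3, function)
}
```

With a single definition, renaming the module path is a one-line change and the two call sites cannot drift apart.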

greenhat · 2024-08-06T09:14:35Z

I'm taking a closer look into the failing tests now. From the get-go, I see that some expected tests are failing, so UPDATE_EXPECT=1 should take care of those issues, and we will see what's failing next.

greenhat · 2024-08-06T13:00:44Z

After fixing the expected code tests I'm seeing a bunch of failures:

- instructions::*_i64, because the stdlib is missing;
- all Wasm CM tests, due to an outdated cargo-component (fixed in my "chore: add test coverage reporting job to CI" #262, where I'll ~~split~~ update the handling of the new rustc warning for dead code ASAP);
- intrinsics::*, due to the MASM artifact being a library and not an executable.

I'm preparing a PR to fix the "not an executable" error.

greenhat · 2024-08-06T13:24:07Z

#271 is ready to merge

bitwalker added 2 commits July 26, 2024 00:01

wip: unify compilation, rodata init, test harness

1dac466

wip: derive source spans from dwarf debug info in wasm frontend

093d75f

bitwalker added the bug, codegen, and blocker labels Jul 26, 2024

bitwalker requested a review from greenhat July 26, 2024 04:15

bitwalker self-assigned this Jul 26, 2024

greenhat reviewed Jul 29, 2024

codegen/masm/src/masm/program.rs

greenhat reviewed Jul 29, 2024

hir/src/asm/isa.rs

greenhat reviewed Jul 29, 2024

sdk/stdlib-sys/src/stdlib/crypto/hashes.rs

greenhat reviewed Jul 29, 2024

tests/integration/src/rust_masm_tests/abi_transform/tx_kernel.rs

bitwalker added 2 commits July 29, 2024 23:41

feat(codegen): propagate source spans from hir to masm

20b6022

greenhat reviewed Jul 30, 2024

tests/integration/src/compiler_test.rs

greenhat reviewed Jul 31, 2024

bitwalker added 6 commits August 2, 2024 06:08

chore: bump rust toolchain

a7c4390

wip: update to latest miden vm patchset

e573066

wip: support compiled libraries, linker flags

5ec7db7

fix: clap error formatting, unused deps

453467c

fix(cli): improve help output, hide plumbing flags

34ed979

greenhat mentioned this pull request Aug 6, 2024

chore: add test coverage reporting job to CI #262

Draft

bitwalker added 7 commits August 6, 2024 02:09

chore: move miden-vm deps to latest commit included in 0.10 releasef

c3af0cc

test: use nextest test runner

1b17a2f

fix: bug in program.has_entrypoint

833a7e9

fix: handle behavior change between codemap and sourcemanager

768da77

test: temporarily disable i128 tests until emulator supports mast

2a61993

test: disable tx_kernel and stdlib tests until rodata issues fixed

651edfe

greenhat approved these changes Aug 6, 2024


bitwalker added 3 commits August 6, 2024 13:07

feat: improve inference of project type based on driver flags

e3538fc

fix: various tests, cli bugs, vm test executor, test builder api

fd91a8a

fix: linking of standard library in tests

e319222

bitwalker closed this Aug 6, 2024

bitwalker mentioned this pull request Aug 6, 2024

wip: alpha release #272

Merged

bitwalker deleted the bitwalker/rodata-init branch September 6, 2024 20:09
