Guide for creating object files using wat2wasm #1658

Closed
mwilliamson opened this issue Apr 6, 2021 · 13 comments

Comments

@mwilliamson
Contributor

I'm writing a compiler that targets Wasm by generating .wat files, and then compiling those using wat2wasm. However, there are some parts of the runtime that I'd like to write using another language, such as C (or Rust), rather than having to hand-craft .wat files. It seems as though a potential route is to compile the .wat files using wat2wasm --relocatable, and to compile the C files to Wasm object files using Clang/LLVM, and then to link them together with wasm-ld.
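As a rough sketch of the route described above, the three steps could be assembled as command lines like this. The helper `build_commands` and the file names are hypothetical, and the `--no-entry` and `--export-all` linker flags are just one plausible choice for a module without a `_start` function. Nothing here confirms that the output of `wat2wasm --relocatable` carries enough linking metadata for `wasm-ld`; the code only builds the commands and does not run them.

```python
from typing import List


def build_commands(wat_file: str, c_file: str, output: str) -> List[List[str]]:
    """Return the command lines for the proposed pipeline (nothing is executed).

    Assumes the two inputs have different base names, so that their
    object files do not overwrite each other.
    """
    wat_obj = wat_file.rsplit(".", 1)[0] + ".o"
    c_obj = c_file.rsplit(".", 1)[0] + ".o"
    return [
        # 1. Text format -> relocatable Wasm object file.
        ["wat2wasm", "--relocatable", wat_file, "-o", wat_obj],
        # 2. C runtime -> Wasm object file.
        ["clang", "--target=wasm32", "-c", c_file, "-o", c_obj],
        # 3. Link both object files into one module.
        ["wasm-ld", "--no-entry", "--export-all", wat_obj, c_obj, "-o", output],
    ]


if __name__ == "__main__":
    for cmd in build_commands("program.wat", "runtime.c", "program.wasm"):
        print(" ".join(cmd))
```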

Is there a guide describing how to do this? For instance, it'd be useful to have information on things like how to import functions from other modules, or have relocatable data (which doesn't seem possible at the moment since data can only be referenced by index rather than an identifier?)

So far, I've mostly pulled together information from:

Also, I realise this might not be the right place to ask, but I couldn't find an appropriate mailing list, forum or other community where this sort of thing can be discussed. Any pointers in that direction would be much appreciated!

@kripken
Member

kripken commented Apr 6, 2021

Interesting question... I'm not sure if this is possible atm.

One option might be to write not wat files but LLVM .s files. Still text, and they can include relocations.

cc @aardappel who has experience with related topics of mixing language outputs.

@tlively
Member

tlively commented Apr 6, 2021

Not exactly the guide you're looking for, but if you haven't seen it already, you might be interested in the specification of the Wasm object file binary format: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md.

@mwilliamson
Contributor Author

Thanks both! As it happens, I'm looking at that very link (the object file binary format) right now. Given I have something that currently generates the text format, I'm thinking the most straightforward thing to do would be to change it to emit in the binary format instead, and add in the custom sections.
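To give a feel for what "add in the custom sections" involves, here is a minimal sketch of writing a custom section in the binary format. It emits only the module header and a `linking` section containing the metadata version (2, per the linking document); a real object file also needs the symbol table subsection and the usual type, function and code sections. The helper names (`uleb128`, `custom_section`, `minimal_object`) are made up for illustration, and the result is not claimed to be something `wasm-ld` would accept on its own.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 requires a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def custom_section(name: str, payload: bytes) -> bytes:
    """Wrap a payload as a custom section: id 0, size, name, then contents."""
    name_bytes = name.encode("utf-8")
    body = uleb128(len(name_bytes)) + name_bytes + payload
    return b"\x00" + uleb128(len(body)) + body


# Magic number followed by the binary format version (1), little-endian.
WASM_HEADER = b"\x00asm" + (1).to_bytes(4, "little")
LINKING_VERSION = 2


def minimal_object() -> bytes:
    """Header plus a 'linking' section that holds only the version field."""
    return WASM_HEADER + custom_section("linking", uleb128(LINKING_VERSION))


if __name__ == "__main__":
    print(minimal_object().hex(" "))
```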

@aardappel
Contributor

Yes, working with wasm-ld is the best way to accomplish this, but I don't know how well --relocatable works.

I've done this myself in the past, but rather than going through wat2wasm, I emitted a binary directly. That is not as scary as it sounds: here is a small stand-alone header-only C++ class that writes linkable .o files:

https://github.com/aardappel/lobster/blob/master/dev/src/lobster/wasm_binary_writer.h

(Note in particular the method Finish at the end which writes all the linking data specified by the doc that @tlively pointed to).

This is a doc that describes how all of that works, with an example of how to use the above class: https://github.com/aardappel/lobster/blob/master/docs/source/implementation_wasm.md
And a larger test: https://github.com/aardappel/lobster/blob/master/dev/src/lobster/wasm_binary_writer_test.h

Not sure what language you're working in, but it should be easy to translate.

Linking to actual external runtime code is pretty easy, see e.g. here:

https://github.com/aardappel/lobster/blob/7fadff69303a7a6b2f90254ab949d04cedaa6718/dev/src/lobster/wasm_binary_writer_test.h#L41-L48

My talk at the Wasm Summit is going to cover this topic to some extent :)

@mwilliamson
Contributor Author

Thanks for the advice! I've added in a writer for the Wasm binary format, and also for object files (so adding in symbol tables and relocations). However, I'm currently stuck on mutable globals. I've declared some globals as mutable, but when wasm-ld makes the final executable, all of the globals from my object file are immutable. I've tried changing the flags on the globals to local binding and hidden visibility, and also tried explicitly enabling the mutable globals feature, but that didn't seem to have any effect. Any idea how I get wasm-ld to preserve the mutability of globals?

@sbc100
Member

sbc100 commented Apr 11, 2021

We do use mutable globals in some places in LLVM and in emscripten. For example, __stack_pointer is a builtin mutable global.

We also declare some extra mutable globals that the linker doesn't have prior knowledge of. For example see https://github.com/emscripten-core/emscripten/blob/d5b212dd0c185d2cba6b08115c29228bdc4c38a8/system/lib/compiler-rt/stack_limits.S#L21.

You can see how clang compiles that code by looking at the resulting object in the libcompiler-rt library:

$ ar x ./cache/sysroot/lib/wasm32-emscripten/libcompiler_rt.a stack_limits.o
$ wasm-objdump -x stack_limits.o
...
Global[2]:
 - global[1] i32 mutable=1 <__stack_end> - init i32=0
 - global[2] i32 mutable=1 <__stack_base> - init i32=0
...

This mutability is then preserved in the final binaries (otherwise none of our programs would work).

@mwilliamson
Contributor Author

Ah, sorry, I've just realised my error: I was writing the global index into the relocation entry, instead of writing the symbol index, so the relocation entries for globals were actually pointing to functions.
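The mix-up is easy to make because both values are small integers. A minimal sketch of the distinction, assuming a hypothetical in-memory symbol table of `(kind, element_index)` pairs: the relocation entry stores the position of the symbol in that table, not the index of the global it refers to. The entry layout used here (a type byte, then offset and index as LEB128) and the value 7 for `R_WASM_GLOBAL_INDEX_LEB` follow the linking document; the helper names are invented for illustration.

```python
from typing import List, Tuple

R_WASM_GLOBAL_INDEX_LEB = 7  # relocation type from the linking conventions


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def symbol_index_of(symbols: List[Tuple[str, int]], kind: str, element: int) -> int:
    """Find the symbol table position for a given function/global index."""
    return symbols.index((kind, element))


def reloc_entry(reloc_type: int, offset: int, symbol_index: int) -> bytes:
    """Encode one relocation entry (for types that carry no addend)."""
    return bytes([reloc_type]) + uleb128(offset) + uleb128(symbol_index)


if __name__ == "__main__":
    # Hypothetical symbol table: two functions, then one global.
    symbols = [("function", 0), ("function", 1), ("global", 0)]
    global_index = 0
    sym = symbol_index_of(symbols, "global", global_index)
    # Writing global_index (0) here would make the entry refer to symbol 0,
    # which is a function. The symbol index (2) is what belongs in the entry.
    print(reloc_entry(R_WASM_GLOBAL_INDEX_LEB, 10, sym).hex(" "))
```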

@mwilliamson
Contributor Author

Right, I've got my example working (just creating an object file and running wasm-ld on that, I haven't integrated with any C code yet), but one thing I found was that relocations in the global section don't seem to do anything? Specifically, I stored memory addresses in the initialisers of some immutable globals, and those addresses were not relocated. Everything works fine (i.e. addresses are relocated as expected) when I instead use the code section to initialise the (now mutable) globals.

It's entirely possible that I've (again!) not produced the object file correctly, but I just wanted to check whether memory relocations in the global section would be expected to work?
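For illustration, the workaround of initialising a global from the code section might look roughly like the sketch below: emit `i32.const` followed by `global.set`, with both operands written as 5-byte padded LEB128 placeholders, and record where each one sits so relocation entries can be written for them. The opcodes (0x41, 0x24) and relocation type numbers (4 and 7) come from the Wasm spec and the linking document. The function names, the symbol indices, and the choice to measure offsets from the start of this byte string are assumptions; a real writer would use offsets within the code section and also supply an addend for the memory address relocation.

```python
from typing import List, Tuple

R_WASM_MEMORY_ADDR_SLEB = 4  # operand of i32.const holding a data address
R_WASM_GLOBAL_INDEX_LEB = 7  # operand of global.get / global.set


def padded_uleb32(value: int) -> bytes:
    """Unsigned LEB128 padded to exactly 5 bytes."""
    groups = [(value >> (7 * i)) & 0x7F for i in range(5)]
    return bytes([g | 0x80 for g in groups[:4]] + [groups[4]])


def padded_sleb32(value: int) -> bytes:
    """Signed LEB128 padded to exactly 5 bytes (value must fit in an i32)."""
    # Python's >> is arithmetic, so negative values sign-extend correctly.
    groups = [(value >> (7 * i)) & 0x7F for i in range(5)]
    return bytes([g | 0x80 for g in groups[:4]] + [groups[4]])


def emit_global_init(data_symbol: int, global_symbol: int) -> Tuple[bytes, List[Tuple[int, int, int]]]:
    """Emit `i32.const <addr>; global.set <global>` plus (type, offset, symbol) records."""
    code = bytearray()
    relocs: List[Tuple[int, int, int]] = []

    code.append(0x41)  # i32.const
    relocs.append((R_WASM_MEMORY_ADDR_SLEB, len(code), data_symbol))
    code += padded_sleb32(0)  # placeholder, to be patched by the linker

    code.append(0x24)  # global.set
    relocs.append((R_WASM_GLOBAL_INDEX_LEB, len(code), global_symbol))
    code += padded_uleb32(0)  # placeholder, to be patched by the linker

    return bytes(code), relocs


if __name__ == "__main__":
    code, relocs = emit_global_init(data_symbol=3, global_symbol=2)
    print(code.hex(" "), relocs)
```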

@sbc100
Member

sbc100 commented Apr 12, 2021

No, relocations don't work for the global section. Relocations only apply to the code, data and custom sections (the latter used for debug info).

The reason is that all the other sections are created from scratch by the linker and not memcpy'd into place like the code and data sections (these other sections are what we call synthetic sections).

I guess we should make it an error to include relocations for any other section.
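Until the linker reports this itself, an object file writer could guard against it on its own side. The following is a minimal sketch of such a check, assuming the writer keeps a list of the section ids it has emitted, in order, and a list of the section indices its relocation sections target. The section ids (0 custom, 10 code, 11 data) are from the Wasm binary format; `check_reloc_targets` is a made-up helper, not part of any existing tool.

```python
from typing import List

SECTION_CUSTOM = 0
SECTION_CODE = 10
SECTION_DATA = 11

# Per the explanation above: only these sections are copied into place by
# the linker, so only they can meaningfully carry relocations.
RELOCATABLE_SECTIONS = {SECTION_CUSTOM, SECTION_CODE, SECTION_DATA}


def check_reloc_targets(section_ids: List[int], reloc_targets: List[int]) -> List[str]:
    """Return one error message per relocation section with a bad target.

    section_ids:   ids of the sections in the object file, in file order.
    reloc_targets: for each relocation section, the index (into
                   section_ids) of the section it applies to.
    """
    errors = []
    for target in reloc_targets:
        if not 0 <= target < len(section_ids):
            errors.append(f"relocation target {target} is out of range")
        elif section_ids[target] not in RELOCATABLE_SECTIONS:
            errors.append(
                f"section {target} (id {section_ids[target]}) cannot have relocations"
            )
    return errors


if __name__ == "__main__":
    # type(1), function(3), global(6), code(10), data(11), custom(0)
    sections = [1, 3, 6, 10, 11, 0]
    print(check_reloc_targets(sections, [3, 4]))  # code and data targets
    print(check_reloc_targets(sections, [2]))     # global section target
```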

@mwilliamson
Contributor Author

Got it, thanks! An error would be helpful, as would mentioning which sections can have relocations in the linking doc. If it's open to pull requests, I'm happy to try adding things I've learnt that might have been helpful? Alternatively, I'm happy to make issues if that's more useful (and certainly in some cases, even having gotten an example working, I wouldn't be sure what the right explanation would be!)

@sbc100
Member

sbc100 commented Apr 12, 2021

An LLVM bug report about the missing error message in wasm-ld would be great.

A pull request for tool-conventions would also be great!

@mwilliamson
Contributor Author

I've made a pull request here: WebAssembly/tool-conventions#164

@mwilliamson
Contributor Author

I've written up (as best as I can remember!) the changes I had to make to my compiler to produce object files, on the off-chance it's useful to anyone else: https://mike.zwobble.org/2021/04/adventures-in-webassembly-object-files-and-linking/

I'll close this since it doesn't really have anything to do with wabt in the end. Thanks for all the help!
