Conversation
of the monolithic DWARF patch
this is responsible for fulfilling the offset promises
In terms of gas, no changes are observed in 2 tests.
By mentioning all three implicit filenames, the unit's name will appear at index 0
osa1 left a comment
I didn't review the section code line-by-line (that'd require a few weeks of studying DWARF) but added some inline comments based on the question of "what would be the questions I would ask if I had to work on this code".
Other questions:
- I think the sections generated in this patch (debug_line and debug_line_str) are as explained in DWARF section 6.2, right? Would be good to mention this somewhere so that a reader will know where to look for the specification of the format.
- In the PR description: "Sometimes when stepping out of a function, one finds her/himself in assembly land". I thought this was fixed? Is this still a problem with this PR?
- I wonder if there's an existing tool that can generate this information from the source maps?
src/wasm-exts/dwarf5.ml
Outdated
let line_range = 7
let opcode_base = dw_LNS_set_isa

type state = int * (int * int * int) * int * (bool * bool * bool * bool)
Would it make sense to use a record type here so that it'll be clear what is what? This has 5 ints and 4 bools with no names and no documentation.
I tried to add an extended commentary, but seeing above sentiment I may give it a try.
It might actually be clearer, since you can use record punning and field omission, maybe.
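A minimal sketch of what the record suggestion could look like. The field names `disc`, `basic_block`, `prologue_end` and `epilogue_begin` are assumptions based on the DWARF line-number state machine registers; only the tuple shape, the leading address, the location triple and the `stmt` flag are taken from the code under review.

```ocaml
(* Hypothetical record version of the 4-tuple
   int * (int * int * int) * int * (bool * bool * bool * bool). *)
type flags =
  { is_stmt : bool
  ; basic_block : bool      (* assumed meaning of the 2nd bool *)
  ; prologue_end : bool     (* assumed meaning of the 3rd bool *)
  ; epilogue_begin : bool   (* assumed meaning of the 4th bool *)
  }

type state =
  { ip : int                (* address within the sequence *)
  ; loc : int * int * int   (* file, line, column *)
  ; disc : int              (* assumed: discriminator *)
  ; flags : flags
  }

(* Convert from the old tuple shape; punning keeps this short. *)
let of_tuple (ip, loc, disc, (is_stmt, basic_block, prologue_end, epilogue_begin)) =
  { ip; loc; disc; flags = { is_stmt; basic_block; prologue_end; epilogue_begin } }
```

With named fields, a call site such as `s.flags.is_stmt` documents itself, which is the point of the review comment.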
It is a leftover from early times.
This should be renamed
So it looks like this code still supports the old sourcemap emission which was derived from the Motoko source locations attached to wasm instructions. Does the DWARF format use that information too, or only the information in Meta instructions, or both? What added value do the various Meta instructions provide, since I guess they can get in the way of peephole optimization etc. I'm actually wondering if it would be better to put the DWARF information not in an extra instruction, but alongside every instruction like the existing source annotations - then the DWARF instructions wouldn't interfere with code optimization so easily.
src/wasm-exts/customModuleEncode.ml
Outdated
rel addr, (file', line, column + 1), 0, (stmt, false, false, false) in

let joining (prg, state) state' : int list * Dwarf5.Machine.state =
  (* FIXME: quadratic *)
It should be possible to use difference-lists here which would give constant-time concat, thus linear complexity for the fold.
For now we can live with this, I hope.
If not too hard (can you just use an accumulator and reverse at the end?) it might be worth fixing this now - Looks like joining is done in a fold below - this could easily bite us later and might be hard to track down.
Reflecting about this, I think a right-fold with prepending and seed [dw_LNS_advance_pc; 1; - dw_LNE_end_sequence] would exactly do the desired thing. Alas, there is no Seq.fold_right. I'll figure out something.
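For reference, a small sketch of the difference-list idea mentioned in this thread. The names `dlist`, `empty`, `of_list`, `append` and `join_chunks` are hypothetical and not part of the patch; the chunks stand in for the per-state opcode lists that `joining` produces.

```ocaml
(* A difference list represents a list as "the function that prepends it". *)
type 'a dlist = 'a list -> 'a list

let empty : 'a dlist = fun tail -> tail
let of_list (xs : 'a list) : 'a dlist = fun tail -> xs @ tail

(* Appending is just function composition, so it costs O(1). *)
let append (f : 'a dlist) (g : 'a dlist) : 'a dlist = fun tail -> f (g tail)

(* Fold over the sequence of chunks left to right, then materialise the
   result once by applying the difference list to the terminator. *)
let join_chunks (chunks : int list Seq.t) (terminator : int list) : int list =
  let dl =
    Seq.fold_left (fun acc chunk -> append acc (of_list chunk)) empty chunks
  in
  dl terminator
```

Each chunk is copied exactly once when the final list is built, so the total work is linear in the length of the result. The closures nest one level per chunk, so a very long sequence could still stress the stack; this is a sketch of the technique rather than a drop-in replacement.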
@nomeata Your input is always welcome, but optional.
review feedback
@crusso As seen in #1546, eliminating the old-style names section has negative effects on certain tools that also run from the CI. So I won't do that.
Your doubts about |
I didn't mean the names section, which is part of the wasm spec, but the sourcemap itself. I actually don't want to disable the latter since it may be useful for other tools that don't understand DWARF (e.g. my old debugger but also V8/Firefox)
Ok, I'll defer to @nomeata's judgment on how this approach impacts the backend. I'm just talking from my experience with SML.NET where we also encoded the debug info as special instructions which just got in the way of the rest of the codegen. But if you've got something that works, we can revisit later, sure.
Revisiting later sounds reasonable
@crusso My fault, I confused stuff. The sourcemap functionality ( |
@osa1 I am not aware of any. Please note also that sourcemaps contain pro/epilogue information as well as statement and function boundaries. It also tracks redundancies due to inlining (which we don't have at present) and basic blocks (which I haven't tackled yet). So converting from sourcemaps to DWARF would be impoverished at best.
Nowhere, it gets written out as a separate file
looks better actually
src/wasm-exts/customModuleEncode.ml
Outdated
(write_opcodes u8 uleb128 sleb128 write32
  Dwarf5.(prg
          @ [dw_LNS_advance_pc; 1]
          @ (if stmt then [dw_LNS_negate_stmt] else []) (* FIXME: actually irrelevant *)
When the end_sequence flag is present, all other flags are ignored. After all, it marks the IP after the last instruction of the sequence.
data_section m.data;
(* other optional sections *)
name_section em.name;
if !Mo_config.Flags.debug_info then
src/wasm-exts/dwarf5.ml
Outdated
type instr_mode = Regular | Prologue | Epilogue

type state = { ip : int
             ; loc : int * int * int
Guess you could use a location record too, {file; line; col}, but perhaps overkill. You decide.
Yah. I thought about it, but it was late and I probably forgot. I'll see if that looks better.
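A short sketch of the location-record idea, assuming the three ints are file index, line and column in that order (the order used in the tuple earlier in the patch). The helper names `advance_line` and `describe` are made up for illustration.

```ocaml
(* Hypothetical named version of the (file, line, col) triple. *)
type loc = { file : int; line : int; col : int }

(* Functional update touches only the field that changes. *)
let advance_line (delta : int) (l : loc) : loc = { l with line = l.line + delta }

(* Record punning in the pattern binds file, line and col directly. *)
let describe { file; line; col } = Printf.sprintf "file %d, %d:%d" file line col
```

Whether this is overkill depends on how often the triple is taken apart; with only a handful of use sites the tuple may be good enough.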
src/wasm-exts/dwarf5.ml
Outdated
| op :: tail when dw_LNS_negate_stmt = op -> if noisy then Printf.printf "~STMT\n"; standard op; chase tail
| op :: tail when dw_LNS_set_prologue_end = op -> if noisy then Printf.printf "<PRO\n"; standard op; chase tail
| op :: tail when dw_LNS_set_epilogue_begin = op -> if noisy then Printf.printf ">EPI\n"; standard op; chase tail
| op :: tail when - dw_LNE_end_sequence = op -> if noisy then Printf.printf "FIN\n"; extended1 op; chase tail
| op :: tail when - dw_LNE_end_sequence = op -> if noisy then Printf.printf "FIN\n"; extended1 op; chase tail
^ what does this negation do? Is it negation or some weird Ocaml pattern match extension?
It is negation. It is an extra bit of information signifying extended opcode (these need to be written as several bytes). I'll add a comment explaining the scheme.
Actually it prevents an ambiguity between standard opcodes (DW_LNS_*) and extended ones (DW_LNE_*), which otherwise would share the same overlapping ranges. I'll try to hide the minus in a somewhat more subtle way, to reduce the WTF! effect.
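To illustrate the ambiguity described here: in DWARF both DW_LNS_copy and DW_LNE_end_sequence have the value 1, so a bare int cannot tell them apart. The sketch below decodes the sign-tagged convention into a variant type. The type `opcode` and the functions `classify` and `encode` are hypothetical and only cover operand-free opcodes; they are not how the patch writes its bytes.

```ocaml
(* Values as given in the DWARF specification. *)
let dw_LNS_copy = 1
let dw_LNS_advance_pc = 2
let dw_LNE_end_sequence = 1

(* One way to hide the minus sign: make the distinction explicit in a type. *)
type opcode = Standard of int | Extended of int

(* Decode the convention from the thread: negative ints mark extended opcodes. *)
let classify (op : int) : opcode =
  if op < 0 then Extended (- op) else Standard op

(* Operand-free encoding only. An extended opcode is written as a 0 byte,
   then the length of what follows (here 1), then the opcode itself. *)
let encode : opcode -> int list = function
  | Standard n -> [n]
  | Extended n -> [0; 1; n]
```

With this, `- dw_LNE_end_sequence` and `dw_LNS_copy` classify differently even though both constants are 1.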
crusso left a comment
It's kinda hard to review with almost zero knowledge of DWARF. I'd fix the quadratic code if you can foresee it'll be an issue.
9253305 to 31e1bcd
review comment
31e1bcd to f007024
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Done in 0e8b05a. It was an issue: the QuickCheck tests now run in 4 min (vs. 15 min before this change).
hiding it in one module
src/wasm-exts/customModuleEncode.ml
Outdated
(write_opcodes u8 uleb128 sleb128 write32
  Dwarf5.(prg @ [dw_LNS_advance_pc; 1; - dw_LNE_end_sequence]))
let prg0, _ = Seq.fold_left joining ([], start_state) states_seq in
let prg = List.fold_left (Fun.flip (@)) Dwarf5.[dw_LNS_advance_pc; 1; - dw_LNE_end_sequence] prg0 in
Isn't this still pretty inefficient? I expect there's more gains here.
I can't see how: try
fold_left (flip (@)) [x; y; z] [[h; i]; [d; e; f; g]; [a; b; c]]
--> [a; b; c; d; e; f; g; h; i; x; y; z]
It would be interesting to fuse the two lines, but the asymptotics are right (i.e. O(n) with n = length result).
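The two-step scheme from this thread can be run as a standalone sketch. The chunk contents and the names `terminator`, `chunks_in_order`, `reversed_chunks` and `program` are placeholders, with strings standing in for opcodes.

```ocaml
(* Placeholder for [dw_LNS_advance_pc; 1; - dw_LNE_end_sequence]. *)
let terminator = ["advance_pc"; "1"; "end_sequence"]

(* Chunks in program order, as the states are visited. *)
let chunks_in_order = [["a"; "b"; "c"]; ["d"; "e"; "f"; "g"]; ["h"; "i"]]

(* Step 1: collect chunks by consing, which leaves them in reverse order. *)
let reversed_chunks =
  List.fold_left (fun acc chunk -> chunk :: acc) [] chunks_in_order

(* Step 2: prepend each chunk to the accumulator. Fun.flip (@) acc chunk is
   chunk @ acc, so only the chunk is copied, never the accumulated tail. *)
let program = List.fold_left (Fun.flip (@)) terminator reversed_chunks
```

Because `@` copies only its left argument, every element is copied once and the total cost is linear in the length of the result, which matches the claim in the reply.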
Split out the line information related tables into a separate patch.
This won't yet produce statement boundaries, since the codegen is not inserting them yet into the AST.
Observed anomalies (fixing these only makes sense with the entire package) :
- The main compilation unit's filename should be at position 0, currently it is at pos. 1. But no drawbacks are seen. wasmtime bug, phew!
- Sometimes when stepping out of a function, one finds her/himself in assembly land. This might be due to incorrect pro/epilog markers. (Adding a very early prolog-ending marker improved the situation, but it still occurs in other functions. This gives me the hope that placing the marker after the filling of certain locals might eliminate the issue. -- Turns out, the emission of functions' end instruction was messy.)
- Similarly, when stopping on a breakpoint the argument appears as <unavailable>, but after stepping it appears. This is probably a prolog marker problem. I haven't seen this after placing prolog-ending marker after the locals.
- 0x000000000000047e below appears twice, this could be the reason (fixed in 8b84b44)
Progress:
Needs to be done: