Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow invalid ranges in RngListIter #715

Merged
merged 3 commits into from
May 24, 2024

Conversation

jameysharp
Copy link
Contributor

I thought it would be easiest to describe the issue I'm encountering with a patch, but I'm not attached to this particular solution if folks would prefer some other approach.

I've encountered a toolchain which produces invalid ranges, where the end of the range is before the beginning.

I want to fix that toolchain, of course, but I also noticed that tools like llvm-addr2line and llvm-dwarfdump --lookup apparently silently skip the invalid ranges, so they're able to give partial answers. By contrast, gimli returns an error in this case. Then the addr2line crate, for example, bubbles the error all the way out and doesn't return any results.

It would be useful to be able to skip over the invalid ranges, instead of truncating a range-list at the first error. To that end I've deleted the check here in RngListIter::convert_raw, so the caller can decide what they want to do with such ranges. addr2line, for example, already checks that range.begin < range.end and silently ignores the range otherwise.

The specific toolchain where I've encountered this is TinyGo (which uses LLVM), when building WebAssembly with the wasip1 target. I tested TinyGo 0.32.0-dev and LLVM 17, and this was the shortest program I could find which produces invalid ranges:

package main
import "os"
func main() {
        os.Lstat("some-filename")
}

I've encountered a toolchain which produces invalid ranges, where the
end of the range is before the beginning.

I want to fix that toolchain, of course, but I also noticed that tools
like `llvm-addr2line` and `llvm-dwarfdump --lookup` apparently silently
skip the invalid ranges, so they're able to give partial answers. By
contrast, `gimli` returns an error in this case. Then the `addr2line`
crate, for example, bubbles the error all the way out and doesn't return
any results.

It would be useful to be able to skip over the invalid ranges, instead
of truncating a range-list at the first error. To that end I've deleted
the check here in `RngListIter::convert_raw`, so the caller can decide
what they want to do with such ranges. `addr2line`, for example, already
checks that `range.begin < range.end` and silently ignores the range
otherwise.

The specific toolchain where I've encountered this is TinyGo (which uses
LLVM), when building WebAssembly with the `wasip1` target. I tested
TinyGo 0.32.0-dev and LLVM 17, and this was the shortest program I could
find which produces invalid ranges:

```go
package main
import "os"
func main() {
        os.Lstat("some-filename")
}
```
@philipc
Copy link
Collaborator

philipc commented May 23, 2024

This seems fine to me, but I'll take the time to reproduce it and see if any other options are apparent.

@philipc
Copy link
Collaborator

philipc commented May 23, 2024

I couldn't reproduce this with the lastest dev version of tinygo (tinygo version 0.32.0-dev-ad0340d linux/amd64 (using go version go1.22.3 and LLVM version 17.0.1)).

Can you give me a file (email or attach here is fine).

@jameysharp
Copy link
Contributor Author

Here's the wasm I got from tinygo build -target=wasip1 /tmp/test.go, with tinygo version 0.32.0-dev linux/amd64 (using go version go1.21.9 and LLVM version 17.0.6), same tinygo commit ad0340d4 as you tested.

test.wasm.gz

If you run cargo run -p gimli-examples --bin dwarfdump -- test.wasm you should see several messages saying "Failed to dump attribute value: The end of an address range must not be before the beginning." In the dwarfdump example program this doesn't stop execution, unlike what the addr2line crate does, but it does truncate the range list prematurely. This PR changes the dwarfdump output on this test wasm this way:

18409c18409,18410
<                       [ 2] Failed to dump attribute value: The end of an address range must not be before the beginning.
---
>                       [ 2] <offset-pair 0x00003e9b, 0x00003e7c> [0x00003e9b, 0x00003e7c]
>                       [ 3] <offset-pair 0x00003f17, 0x000040f1> [0x00003f17, 0x000040f1]
18514c18515,18516
<                       [ 1] Failed to dump attribute value: The end of an address range must not be before the beginning.
---
>                       [ 1] <offset-pair 0x00002b15, 0x00002b11> [0x00002b15, 0x00002b11]
>                       [ 2] <offset-pair 0x000041fa, 0x00004216> [0x000041fa, 0x00004216]
28023c28025,28032
<                       [ 0] Failed to dump attribute value: The end of an address range must not be before the beginning.
---
>                       [ 0] <offset-pair 0x0000b2bd, 0x0000b2b8> [0x0000b2bd, 0x0000b2b8]
>                       [ 1] <offset-pair 0x0000b307, 0x0000b304> [0x0000b307, 0x0000b304]
>                       [ 2] <offset-pair 0x00000000, 0x00000001> [0x00000000, 0x00000001]
>                       [ 3] <offset-pair 0x0000b3e4, 0x0000b3df> [0x0000b3e4, 0x0000b3df]
>                       [ 4] <offset-pair 0x0000b44a, 0x0000b445> [0x0000b44a, 0x0000b445]
>                       [ 5] <offset-pair 0x0000b4b0, 0x0000b4ab> [0x0000b4b0, 0x0000b4ab]
>                       [ 6] <offset-pair 0x0000b51e, 0x0000b519> [0x0000b51e, 0x0000b519]
>                       [ 7] <offset-pair 0x0000b56a, 0x0000b571> [0x0000b56a, 0x0000b571]
35919c35928
<                       [ 1] Failed to dump attribute value: The end of an address range must not be before the beginning.
---
>                       [ 1] <offset-pair 0x0000d29c, 0x0000d283> [0x0000d29c, 0x0000d283]

@jameysharp
Copy link
Contributor Author

In case you're curious: After more investigation we believe these invalid ranges are being introduced by wasm-opt when tinygo runs the "asyncify" pass over LLVM's output, because we no longer see these invalid ranges if we run tinygo with the -scheduler=none flag, which disables asyncify.

There was a similar issue reported at WebAssembly/binaryen#6406, and actually we can reproduce that issue even with -scheduler=none, since tinygo still runs other wasm-opt passes. However, gimli doesn't try to verify that inlined address ranges fall within the address ranges of their parents, so it doesn't complain about that issue.

At any rate, while the producer of this DWARF is clearly buggy, I still think it's useful for gimli to return the address range as provided and let the caller decide if they want to validate it, so I'm still interested in merging this PR or something equivalent.

Copy link
Collaborator

@philipc philipc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change it to return Ok(None) (same as what we do for tombstones), and also do a similar change for location lists.

@jameysharp
Copy link
Contributor Author

Sure, done!

@philipc philipc merged commit e095972 into gimli-rs:master May 24, 2024
20 checks passed
@jameysharp jameysharp deleted the allow-invalid-ranges branch May 24, 2024 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants