map delete not finding keys when value is largeish, key hash seems to be incorrect #3358
Comments
Digging through LLVM docs, I did come to understand that because get and delete each create a temporary alloca, it is conceivable for those to be different allocas and thus different addresses (though copying the key struct from the execution stack to the C stack twice would be an unfortunate deoptimization). I guess there is something about the real-world code that is causing the contents of the allocas to be different, which I still wasn't able to repro in the simpler example...
For context, changing the map value type to a pointer solved the issue, though it is not ideal since we didn't want to heap allocate for this cache. It seems like the problematic behavior only happens with a larger value size. Or perhaps that's just coincidence / luck.
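The by-value versus by-pointer tradeoff mentioned here can be sketched as follows (the `bigValue` type name and its fields are hypothetical, standing in for the largeish cache value from the report):

```go
package main

import "fmt"

// bigValue mimics a largeish cache value: three string fields,
// 48 bytes total on a 64-bit target (names are illustrative only).
type bigValue struct {
	args, path, query string
}

func main() {
	// By value: each map slot holds the whole struct, so no extra
	// heap allocation is needed per entry for the value itself.
	byValue := map[string]bigValue{}
	byValue["k"] = bigValue{"a", "b", "c"}

	// By pointer: slots hold only a pointer, but every value becomes
	// a separate heap allocation (the downside mentioned above).
	byPointer := map[string]*bigValue{}
	byPointer["k"] = &bigValue{"a", "b", "c"}

	fmt.Println(len(byValue), len(byPointer)) // 1 1
}
```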
I've done a lot of work on the hashing code and the hash maps. I'll take a look at this.
Hmm... can you come up with a reproducer based on the mapbench code that doesn't require patching the runtime to demonstrate it? (I've patched my runtime, but it's much easier if we have something that can
@dgryski Do you need a small amount of code or a small command that reliably reproduces? I can send the latter easily based on |
Both are nice. The small amount of code is better but the small command is useful until that arrives. |
Thanks @dgryski. You'll need your TinyGo built with #3302 and docker available.
Smaller reproducer:
Note that if the
Oh, duh. The hashmap code limits the value size. That's an easy enough fix.
Annoyingly, while this is a bug, I'm not certain it is actually related to the issue at hand, given that the values in the test program are much smaller.
I wonder if this is the same memory corruption from the garbage collector missing pointers that we've been seeing elsewhere? |
I'm having real trouble reproducing this on stock tinygo and wasmtime. :( |
@anuraaga it would be very helpful if you could reproduce this with stock TinyGo. If you can't, I have to assume it's related to the custom GC somehow. Some things that could help:
```diff
 	gcTotalAlloc += uint64(size)
 	gcMallocs++
+	runGC()
+
 	neededBlocks := (size + (bytesPerBlock - 1)) / bytesPerBlock
 	// Continue looping until a run of free blocks has been found that fits the
```
I'll see if I can figure something out. One thing, from the original message
Is this expected? This is with TinyGo
Yes, because the
Thanks - I guess each operation is copying the key to the stack, and it's unfortunate those don't get coalesced, but I agree that's probably just a lack of optimization and not likely a bug (though still not 100% sure on that). Anyway, while it's still not a small code example, I updated the same branch as #3358 (comment) to unplug the custom GC and malloc; now it'll fail with just stock
It's a stack address, so yeah, there is a good chance it is different each time depending on what optimizations happen or don't happen. The address itself is not relevant to the hashmap, though: it only looks at the value stored at the address.
@anuraaga Have you gotten the |
@dgryski Unfortunately, continuing tweaks to make mapbench look more like the WAF code has not repro'd the bug. However, that reminds me that the WAF code itself can actually be run with wazero, so instead of
I know a smaller code example is important, but I'm not sure what else to try to get it to repro in mapbench. I do still want to challenge the stack address observations once more. The code is this:
There is just one stack frame and one local variable
I'll double check the code we're generating for the map operations. |
When generating map operations, the LLVM code looks like
So, because the get/delete/set operations take a pointer to the key, we create an alloca and store the key into it. If LLVM doesn't coalesce that store with the fact that the key value is already a stack value, then that is probably an optimization opportunity for us. However, while the extra key copy is a performance hit, I don't think it's the source of this bug (which I have yet to replicate...) Edit: The punchline here, of course, is that the extra copy was the source of this bug...
Got tinygo runtime logging enabled with
and of course a
The issue has to do with struct padding. The fields for the key struct look like:
Rearranging the fields like this passes:
Explicitly account for those bytes with
... oh no. This looks like a rather gnarly bug. One fix would be to:
But that's probably not the most efficient way to do this.
"proof" via hexdumps:
So, again, this looks like memory corruption, so it's tricky to replicate exactly once I add the debug logging, but in any event we can see that the 3 extra bytes
It seems like there are going to be a bunch of places where we ask LLVM to "store" a value and it copies over only the bytes that are "live", but really we'd prefer to have just a straight
Thanks for the investigation, this looks promising! For reference, I found this issue in Rust. I didn't read it all, but the idea of synthetic padding fields generated by TinyGo for structs seems sort of OK. I don't know if an alloca store would ever conceptually be a memcpy: the fields of the struct could just be in individual slots on the execution stack in wasm, or registers in a CPU, and would be individually copied onto the stack, I guess. That means synthetic padding fields have the downside of likely using more execution stack / registers. Otherwise we'd probably need to memzero after each alloca.
Yeah, I think a memzero is probably the best thing to do here. LLVM should be able to optimize it away when it isn't necessary, I believe.
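The memzero idea can be sketched in plain Go: hash from a zeroed buffer into which only the defined field bytes are copied, so padding positions can never carry garbage (this is an illustrative model of the fix, not the actual compiler change):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

type key struct {
	a int16
	b int32 // 2 padding bytes sit between a and b in memory
}

// hashKey serializes the fields into a zeroed fixed layout and hashes
// that, so the padding positions are always zero ("memzero" first).
func hashKey(k key) uint64 {
	var buf [8]byte // zeroed: stands in for memzero before the store
	binary.LittleEndian.PutUint16(buf[0:], uint16(k.a))
	// bytes 2 and 3 stay zero, exactly where the padding would be
	binary.LittleEndian.PutUint32(buf[4:], uint32(k.b))
	h := fnv.New64a()
	h.Write(buf[:])
	return h.Sum64()
}

func main() {
	fmt.Println(hashKey(key{1, 2}) == hashKey(key{1, 2})) // true
}
```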
Do we need those memsets every time we do stores for structs with padding or just for map keys? Can we get away with just doing it the first time we create the value? Can we tag the hasher functions somehow so that llvm knows the padding needs to be cleared? |
I suppose it is just about map keys / general hashing. In that case, I wonder if fixing #3354's optimization issue with the reflection-based key computation, and always using it instead of the memory-based hash, is an alternative solution. It would be similar to Go's hash, which always iterates over the fields. Ideally the fields could be passed directly (like Go's
So I looked at the LangRef and we may have a bit of a problem here:
(Emphasis mine). Structs are one of the aggregate types. What this says is that the padding bytes in a struct are undefined when stored. So even if the memory is cleared before the store, LLVM is allowed to fill the padding again with garbage (it probably won't, but it is allowed to).
No, only when the alloca itself is used as raw memory (like when it is hashed etc). Normal loads/stores (like
I'm not aware of anything like that.
We could, but it is always going to be slower than hashing the memory directly. Remember, reflect is slow (especially in TinyGo).
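For reference, a field-by-field reflection hash sidesteps padding entirely because it never reads the raw struct memory. A minimal sketch of that approach (illustrative only; it is slower than a memory hash, as noted above):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"reflect"
)

// hashFields hashes a struct field by field via reflection, so
// padding bytes never enter the hash at all.
func hashFields(v interface{}) uint64 {
	h := fnv.New64a()
	rv := reflect.ValueOf(v)
	for i := 0; i < rv.NumField(); i++ {
		// Write a printed form of each field, with a separator so
		// adjacent fields can't run together ambiguously.
		fmt.Fprintf(h, "%v|", rv.Field(i).Interface())
	}
	return h.Sum64()
}

// Fields must be exported for reflect.Value.Interface to work.
type key struct {
	A int16
	B int32
}

func main() {
	fmt.Println(hashFields(key{1, 2}) == hashFields(key{1, 2})) // true
}
```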
I'm going to start on a "zero-the-undef-bytes-for-hashing" patch and see what that looks like. |
Fix for this is in dev. |
I am currently debugging this code that tries to delete the keys in a map (to empty it while retaining bucket storage)
https://github.com/corazawaf/coraza/blob/v3/dev/internal/corazawaf/rulegroup.go#L105
(for the below, note that I tried copying the keys into a slice before deleting to see if it was iterator+delete behavior problem but it was all the same so simplified back to normal map iteration).
I noticed a memory leak, and after adding

```go
if len(transformationCache) > 0 {
	panic("transformationCache not cleared")
}
```

after the deletion, I could verify that the map wasn't getting cleared with TinyGo wasm. It works fine with Go. I have added this instrumentation to hashmap.go:
https://github.com/anuraaga/tinygo/compare/custom-gc...anuraaga:tinygo:hashmapdelete-debug?expand=1
And shrunk down the test to similar one as last time
https://github.com/corazawaf/coraza-proxy-wasm/compare/main...anuraaga:coraza-1225?expand=1#diff-173fbfd8d8844658344b121461b4290d0a85230caae9825240705df8130e8b75R33
```shell
~/go/bin/tinygo build -scheduler=none -target=wasi -opt=2 -o mapbench.wasm
```
The debug output looks something like this:
The key address for get is `0x0000ffd0` and for delete is `0x0000ffb8`. That being said, the hash is the same in this example, so it is able to clear the map; but with the same instrumentation, when looking at the output for the original coraza code, the hash values were also different. I'm not sure why I wasn't able to reproduce this hash value difference, but either way, the key is in a local variable `k`, of which there is only one, so intuitively those addresses should be the same, and the difference is unexpected.

One weird thing I found while making the repro is that the value struct needs to be more than two strings' worth of size to exhibit the behavior: with three fields, get and delete have different addresses, while with two fields they are the same.
Looked through the code in `hashmap.go` and `map.go` (compiler) and couldn't see anything suspicious; the code paths for get/lookup vs delete look basically the same for both, but the difference does cause our real-world code to fail with the map not being cleared. With the code looking the same, the issue may be in IR reduction?

Note the above simpleish test case approach is also applied to the real-world code here (which is where I was observing the address+key value difference):
https://github.com/corazawaf/coraza/compare/v3/dev...anuraaga:scratch?expand=1
The output looks like this (we can see the different hash values):