Skip to content

Conversation

@kripken
Copy link
Member

@kripken kripken commented Jul 29, 2025

Rather than NOTE() everywhere, we can do that in visit(). Also
simplify some other stuff like using Fatal and NDEBUG. Also
remove a totally unused global WASM.

@kripken kripken requested a review from tlively July 29, 2025 20:25
Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@kripken kripken merged commit b7b18a8 into WebAssembly:main Jul 29, 2025
16 checks passed
@kripken kripken deleted the nfc.interp.debug branch July 29, 2025 21:21
@tlively
Copy link
Member

tlively commented Jul 30, 2025

I'm seeing a stack overflow after this change that goes away if I revert this commit.

;; test.wast
(module
 (type $7 (func (result v128)))
 (type $30 (func (param i32 i32)))
 (type $2 (func (result i64)))
 (type $16 (func (param v128 i32) (result f64)))
 (type $0 (func))
 (type $26 (func (param f32) (result f32)))
 (type $41 (func (param f64) (result f64)))
 (type $158 (func (param v128) (result v128)))
 (import "fuzzing-support" "call-export" (func $fimport$3 (type $30) (param i32 i32)))
 (memory $0 16 17 shared)
 (table $0 160 funcref)
 (elem $0 (i32.const 0) $1 $1 $1 $1 $1 $15)
 (export "func_21_invoker" (func $17))
 (func $1 (type $2) (result i64)
  (unreachable)
 )
 (func $15 (type $7) (result v128)
  (if
   (i32.const 0)
   (then
    (drop
     (i32.const 407)
    )
    (unreachable)
   )
  )
  (block (result v128)
   (call $fimport$3
    (i32.rem_u
     (i32.trunc_f32_u
      (call $88
       (f32.add
        (call $88
         (f32.demote_f64
          (call $461
           (call_indirect $0 (type $16)
            (call $462
             (i32x4.lt_s
              (call $462
               (call $15)
              )
              (select
               (v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000)
               (v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000)
               (block (result i32)
                (if
                 (i32.const 0)
                 (then
                  (drop
                   (i32.const 407)
                  )
                  (unreachable)
                 )
                )
                (block (result i32)
                 (drop
                  (v128.load offset=4 align=4
                   (block (result i32)
                    (if
                     (i32.const 0)
                     (then
                      (drop
                       (i32.const 407)
                      )
                      (unreachable)
                     )
                    )
                    (unreachable)
                    (i32.const 0)
                   )
                  )
                 )
                 (i32.const 0)
                )
               )
              )
             )
            )
            (unreachable)
            (i32.const 0)
           )
          )
         )
        )
        (f32.const 0)
       )
      )
     )
     (i32.const 0)
    )
    (i32.const 0)
   )
   (block
    (if
     (i32.const 0)
     (then
      (drop
       (i32.const 407)
      )
      (unreachable)
     )
    )
    (unreachable)
   )
  )
 )
 (func $17 (type $0)
  (drop
   (call_indirect $0 (type $7)
    (i32.const 5)
   )
  )
 )
 (func $88 (type $26) (param $0 f32) (result f32)
  (unreachable)
 )
 (func $461 (type $41) (param $0 f64) (result f64)
  (unreachable)
 )
 (func $462 (type $158) (param $0 v128) (result v128)
  (unreachable)
 )
)

wasm-opt -all test.wast --code-folding --fuzz-exec

@kripken
Copy link
Member Author

kripken commented Jul 30, 2025

@tlively Hmm, do you mean a system stack overflow, or our trap on the stack limit? I get the same results on my machine, with or without this PR, on default builds (release+asserts):

[fuzz-exec] calling func_21_invoker
[host limit stack limit]
[fuzz-exec] calling func_21_invoker
[host limit stack limit]
ignoring comparison of ExecutionResults!
warning: no output file specified, not emitting output

If you're getting a system stack overflow, what system is that on, and what stack size? And is it on a debug or release build?

(I can imagine this alters stack frame sizes. Like perhaps changes to debugging code lead to more or less inlining. But we should hit our own "host limit stack error" trap before the system limit, unless the machine is very constrained.)

@tlively
Copy link
Member

tlively commented Jul 30, 2025

I mean a segfault from a system stack overflow. I see this both on a debug build and on an LTO release build. This is on linux with a stack size of 8MB.

@kripken
Copy link
Member Author

kripken commented Jul 30, 2025

Weird, I am on linux too.

Just to be sure, is this with or without WASM_INTERPRETER_DEBUG?

Do you see anything odd in valgrind or in a gdb stack trace?

@tlively
Copy link
Member

tlively commented Jul 30, 2025

This is without WASM_INTERPRETER_DEBUG. The fuzzer ran into at this after ~3k iterations with the LTO statically linked release build. The stack trace looks like what I would expect. It's deep in the interpreter and a push_back on a local std::vector segfaults. As I would expect, the recursive call is nested very deep inside the function body, so there are many native stack frames for each interpreter function frame.

@kripken
Copy link
Member Author

kripken commented Jul 30, 2025

We may need to reduce the max stack depth then. What value here fixes things for you?

static const Index maxDepth = 250;

@kripken
Copy link
Member Author

kripken commented Jul 30, 2025

Experimenting with ulimit -s, I get a stack overflow when I lower from the default 8192 to 4096, but 5000 and above still work.

Any chance ulimit -s shows something other than 8192 for you?

Otherwise, perhaps it is that you are on an LTO build, mine is normal.

@kripken
Copy link
Member Author

kripken commented Jul 30, 2025

Anyhow, we may just lower that 250. When I change it to 200, ulimit -s 4096 works (which crashed before).

@tlively
Copy link
Member

tlively commented Jul 30, 2025

200 works for my LTO build 👍

@kripken
Copy link
Member Author

kripken commented Jul 30, 2025

Ok, I'll open a PR with that.

kripken added a commit that referenced this pull request Jul 30, 2025
This is an arbitrary limit, and it seems some LTO builds may start to
hit the old limit after #7767

200 should still be enough for anybody
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants