Skip to content

Conversation

@eternaleclipse
Copy link

No description provided.

@sbc100
Copy link
Collaborator

sbc100 commented May 5, 2022

I'm curious to see if all the test pass.. one alternative would be to opt into BigInt specifically by asking for bigint in the type argument.

@sbc100
Copy link
Collaborator

sbc100 commented May 5, 2022

BTW it looks like we only have a single internal user of makeGetValue with i64:

src/library_wasmfs.js:      var length = {{{ makeGetValue('buf', '0', 'i64') }}};

Do you have external usage?

@eternaleclipse
Copy link
Author

eternaleclipse commented May 5, 2022

I agree, making the user explicitly ask for 'bigint' can be a good alternative. I'm not sure if there is currently a good way to extract larger-than-52-bit values (Number.MAX_SAFE_INTEGER) without BigInt though, so going this way a separate patch may be required for i64.

Not sure what is meant by "internal". My interest is in making these two lines from getValue (exposed externally to JavaScript) return BigInt:

case 'i64': return {{{ makeGetValue('ptr', '0', 'i64', undefined, undefined, undefined, undefined, /*noSafe=*/true) }}};

case 'i64': return {{{ makeGetValue('ptr', '0', 'i64') }}};

@sbc100
Copy link
Collaborator

sbc100 commented May 5, 2022

I agree, making the user explicitly ask for 'bigint' can be a good alternative. I'm not sure if there is currently a good way to extract larger-than-52-bit values (Number.MAX_VALUE) without BigInt though, so going this way a separate patch may be required for i64.

Not sure what is meant by "internal". My interest is in making these two lines from getValue (exposed externally to JavaScript) return BigInt:

case 'i64': return {{{ makeGetValue('ptr', '0', 'i64', undefined, undefined, undefined, undefined, /*noSafe=*/true) }}};

case 'i64': return {{{ makeGetValue('ptr', '0', 'i64') }}};

Are you calling getValue('i64') in your code?

@eternaleclipse
Copy link
Author

eternaleclipse commented May 5, 2022

@sbc100
Copy link
Collaborator

sbc100 commented May 5, 2022

Not exactly my code, but yes slightly_smiling_face

https://github.com/AlexAltea/capstone.js/blob/master/src/capstone-wrapper.js#L125

Interesting. Thanks for the link.

Out of interest do you know why that line isn't using i32 given that addresses are 32-bit in WebAssembly (at least for now)?

What should happen if a user uses i64 without WASM_BIGINT? Should we return a hi/lo pair of i32 Numbers?

I wonder if we should also allow "i53" for when we know the value fits in a single JS Number? See the recent work on library_int53.js.

@eternaleclipse
Copy link
Author

eternaleclipse commented May 5, 2022

Capstone is a library for disassembling machine code from binaries, so "address" in this context refers to address of an instruction in the parsed binary's code section.

For me at least, the current state is a bit confusing. If I understand the documentation correctly, WASM_BIGINT refers to the ability for the user to pass BigInt args (return value is not discussed in docs, as far as I know).

I'm not sure why we should tie BigInt args / return values with WASM use - JavaScript supports BigInt without WASM, so maybe we can have:

  • USER_BIGINT=1 - Accept args / return values / getValue / setValue as BigInt
  • USER_PAIR_INT=1 - Accept args / return values / getValue / setValue as pair of hi/low integers (Numbers)
  • INTERNAL_BIGINT=1 - Use BigInt internally for better performance (one might want this disabled for backwards compatibility)

All of these flags can be relevant both to WASM and non-WASM mode.

The hi/low pair is IMHO a good idea and already implemented for makeSetValue, so I see no reason not to use the same logic for return value.
https://github.com/emscripten-core/emscripten/blob/main/src/parseTools.js#L456

As for i53, not sure about this - It feels like like inviting unintended bugs, especially since there are so many instances for native APIs out there using higher bits set (ex. -1 == 0xffffffffffffffff).

@sbc100
Copy link
Collaborator

sbc100 commented May 5, 2022

We are getting off topic here but wouldn't make sense to model // Address (EIP) of this instruction as an I32? One reason for doing that is that it avoids the need to this immediate fix.. but other reason is that dealing with BitInts in JS is actually kind of pain: For example if you want to perform any arithmetic with constants you have to know ahead of time that value is a BigInt. For example, if you want a function and increments an address by 4 you would need to write it differently:

// wasm32 version
void addAddress(addr) { 
  return addr + 4;
}

vs

// wasm64 version
void addAddress(addr) {
  return add + BigInt(4);
}

For these reason we have tried to avoid widespread BitInt usage within emscripten except when actually needed. For example in many cases its convenient for us to model void* and size_t are Numbers / int53 when we know they cannot be outside of this range. For example, if I'm writing a strcpy functions in JS and I want it to work in both wam32 and wasm64.

@sbc100
Copy link
Collaborator

sbc100 commented May 5, 2022

As for i53, not sure about this - It feels like like inviting unintended bugs, especially since there are so many instances for native APIs out there using higher bits set (ex. -1 == 0xffffffffffffffff).

I agree in the general case it might work everything . But for emscripten we are often accepting an argument which is supposed to be a valid pointer (e.g. strcpy) or a valid size_t (e.g. size_t count in fread) then it seems reasonable to check a value is within the int53 change either return an error or accept is as a Number.

See my recent change: #16877

@sbc100
Copy link
Collaborator

sbc100 commented Jun 29, 2022

Sorry, I forgot about this PR when I created #17341. I think that will solve this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants