Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Value::I32/Value::I64 converters from unsigned ints #1624

Merged
merged 3 commits into from
Oct 12, 2020

Conversation

webmaster128
Copy link
Contributor

Description

Allow conversion of Rust u32/u64 to Wasm i32/i64 (which is something different than Rust's i32/i64.

At https://github.com/sunfishcode/wasm-reference-manual/blob/master/WebAssembly.md you see how i32/i64 are used to store both signed and unsigned values and are intepreted based in the operation.

All our imports and exports

// imports

extern "C" {
    fn db_read(key: u32) -> u32;
    fn db_write(key: u32, value: u32);
    fn db_remove(key: u32);

     // …
}

// exports
/// allocate reserves the given number of bytes in wasm memory and returns a pointer
/// to a Region defining this data. This space is managed by the calling process
/// and should be accompanied by a corresponding deallocate
#[no_mangle]
extern "C" fn allocate(size: usize) -> u32 {
    alloc(size) as u32
}

/// deallocate expects a pointer to a Region created with allocate.
/// It will free both the Region and the memory referenced by the Region.
#[no_mangle]
extern "C" fn deallocate(pointer: u32) {
    // auto-drop Region on function end
    let _ = unsafe { consume_region(pointer as *mut Region) };
}

// ...

compile from Rust u32 to Wasm i32.

I hope this is the right approach work with those flexible Wasm types. Let me know what you think.

Review

  • Add a short description of the the change to the CHANGELOG.md file

@webmaster128 webmaster128 changed the title Add i32/i64 converters from unsigned ints Add Value::I32/Value::I64 converters from unsigned ints Sep 15, 2020
@MarkMcCaskey
Copy link
Contributor

Thanks for the PR! Do the changes you've made fix the issue you were having?

When I first read this I expected this to be along the lines of #1602 ;

The one thing I'd say is that we probably want to reinterpret the bits so rather than using as which may truncate (I actually don't know), we may want to do something like i32::from_ne_bytes(u32::to_ne_bytes(num)) and that will reinterpret the bytes as the other type.

One other thing would be that it'd be nice to have a test exercising the new functionality. If you're not sure where to put the test, comment here and we'll help you find a place.

@webmaster128
Copy link
Contributor Author

Thanks @MarkMcCaskey.

Rust's as seems to be well defined to use the least significant bits of the input and reinterprete them as the desired type (https://doc.rust-lang.org/stable/rust-by-example/types/cast.html). So 255u8 as i8 is -1. This is what I suppose is what we want to bring the bits untouched to Wasm.

I added unit tests for the converters. However, I am unsure if and how to write tests for the Wasm VM boundary. I also don't know how this relates to the code in lib/wasmer-types/src/native.rs.

we may want to do something like i32::from_ne_bytes(u32::to_ne_bytes(num))

With this implementation the tests pass as well. We can use it if you want. Seems like a little overhead but is probably easier to review.

Do the changes you've made fix the issue you were having?

I'm in the process of trying to make my project compile and not running anything yet. But what it allows me to do is writing

                "humanize_address" => Function::new_with_env(&store, &i32i32_to_i32, env.clone(), move |mut env, args| {
                    let source_ptr = args[0].unwrap_i32() as u32;
                    let destination_ptr = args[1].unwrap_i32() as u32;
                    let ptr = do_humanize_address::<A, S, Q>(api, &mut env, source_ptr, destination_ptr)?;
                    Ok(vec![ptr.into()]) // ptr is an u32 linear memory offset
                }),

for implementing the import

extern "C" {
    fn humanize_address(source_ptr: u32, destination_ptr: u32) -> u32;
}

@webmaster128
Copy link
Contributor Author

we may want to do something like i32::from_ne_bytes(u32::to_ne_bytes(num))

Thinking about [u8; 4], what about chaing the data store in Value::I32 and Value::I64 in a way that makes clear this is raw data, not Rust signed integers?

pub enum Value<T> {
    /// A 32-bit integer.
    ///
    /// In Wasm integers are sign-agnostic, i.e. this can either be signed or unsigned.
    I32([u8; 4]),

    /// A 64-bit integer.
    ///
    /// In Wasm integers are sign-agnostic, i.e. this can either be signed or unsigned.
    I64([u8; 8]),

   // ...
}

@webmaster128
Copy link
Contributor Author

Updated to latest master. Couls you have another look, @MarkMcCaskey?

Copy link
Contributor

@MarkMcCaskey MarkMcCaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good!

My only real concern is that the tests are a bit more verbose than we need here.

If you could clean them up a bit, I think that would make the code easier to read.

}
assert_eq!(out, 0xaabbccddeeff0011);

let input: u64 = 0xffffffffffffffff;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using u64::max_value() for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the idea is to consistently use a byte notation for all tests. Also I need the same value for the input and expeted output. Can I put u64::max_value() in an u128? Maybe that works.

I can change that, but would favour the same test layout everywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can probably avoid any use of u128 in these tests. We can test the low level part separately but I think if the conversion from u32 -> i32 works in the context of Value then we can test that Value converts to the correct bytes in a u128 separately.

If there was more unsafe going on, then I think it would make sense to test these together, but as is, I just see this as a way to make sure that we don't break conversion logic.

Comment on lines 231 to 238
fn test_value_i32_from_u32() {
let input: u32 = 0x00000000;
let value = Value::<()>::from(input);
let mut out: i128 = 0;
unsafe {
value.write_value_to(&mut out as *mut i128);
}
assert_eq!(out, 0x00000000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like my other comment got lost somehow, I intended to comment on this and say I think we can just do:

        let value = Value::<()>::from(0u32);
        assert_eq!(value, Value::I32(0));

here. I think this tests all the same things and is easier to read in my opinion!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true and easy for u32s in the signed integer range. But what about

        let value = Value::<()>::from(0xffffffffu32);
        assert_eq!(value, Value::I32(/* what goes here now? */));

The idea was to ensure the same bytes go in and out, avoiding decimal representation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a bit confused by the GitHub UI and responded in a top level comment! #1624 (comment)

Copy link
Contributor Author

@webmaster128 webmaster128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback.

I agree the tests are verbose, unfortunately. But I don't see a great way to avoid that, which would keep the bytes in, bytes out idea.

One line we can save for sure is the explicit input variable.

Comment on lines 231 to 238
fn test_value_i32_from_u32() {
let input: u32 = 0x00000000;
let value = Value::<()>::from(input);
let mut out: i128 = 0;
unsafe {
value.write_value_to(&mut out as *mut i128);
}
assert_eq!(out, 0x00000000);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true and easy for u32s in the signed integer range. But what about

        let value = Value::<()>::from(0xffffffffu32);
        assert_eq!(value, Value::I32(/* what goes here now? */));

The idea was to ensure the same bytes go in and out, avoiding decimal representation.

}
assert_eq!(out, 0xaabbccddeeff0011);

let input: u64 = 0xffffffffffffffff;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the idea is to consistently use a byte notation for all tests. Also I need the same value for the input and expeted output. Can I put u64::max_value() in an u128? Maybe that works.

I can change that, but would favour the same test layout everywhere.

@MarkMcCaskey
Copy link
Contributor

MarkMcCaskey commented Oct 9, 2020

That is true and easy for u32s in the signed integer range. But what about

        let value = Value::<()>::from(0xffffffffu32);
       assert_eq!(value, Value::I32(/* what goes here now? */));

The idea was to ensure the same bytes go in and out, avoiding decimal representation.

That should be:

let value = Value::<()>::from(u32::max_value());
assert_eq!(value, Value::I32(-1));

Or alternatively for other patterns of bytes:

let byte_pattern: u32 = 0xabc;
let value = Value::<()>::from(byte_pattern);
assert_eq!(value, Value::I32(byte_pattern as i32));

I think we can decouple the low level u128 byte part of this from the conversion part. If we test those pieces in isolation then we need less test code to cover everything and it's a bit clearer in my opinion.

@webmaster128
Copy link
Contributor Author

okay, then I will update with the first version and do the converstion from unsigned to signed in my calculator. Seems reasonable.

The second option uses the same code in tests and implementation, which is a bit scary. Especially here where we have tests because we don't want to rely on casting rules alone.

@MarkMcCaskey
Copy link
Contributor

okay, then I will update with the first version and do the converstion from unsigned to signed in my calculator. Seems reasonable.

The second option uses the same code in tests and implementation, which is a bit scary. Especially here where we have tests because we don't want to rely on casting rules alone.

That's a good point, you can copy and paste the constant twice as both types to be more explicit about the bit pattern if you want or use the from/to bytes methods.

Thanks!

@webmaster128
Copy link
Contributor Author

or use the from/to bytes methods.

Excellent idea! I forgot about this option for some reason. Updated.

Copy link
Contributor

@MarkMcCaskey MarkMcCaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great -- thanks!

@MarkMcCaskey
Copy link
Contributor

bors r+

@bors
Copy link
Contributor

bors bot commented Oct 12, 2020

@bors bors bot merged commit 038af91 into wasmerio:master Oct 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants