
Add Memset+Memcpy to FutureFeatures.md #1057

Merged
merged 5 commits on May 11, 2017

Conversation

@binji (Member) commented May 9, 2017

See #236 and #977. I added both set_memory and zero_memory since there was discussion about this in #236.


* `move_memory`: Copy data from one memory region to another region, even if overlapping
* `zero_memory`: Set all bytes in a memory region to zero
* `set_memory`: Set all bytes in a memory region to a given byte
Member

Is it useful to distinguish zero?

Member Author

Not sure, but I figured I'd include both since both were discussed.

### Memset and Memcpy Operators

Copying and clearing large memory regions is very common. This can be done in
the MVP via `i32.load` and `i32.store`, but this is not very fast. The
Member

I'm not sure the speed argument is well stated: a fast mem* is dependent on architecture, and the load/store approach requires more bytecode while forcing VMs to recognize the loop.

Overall the reasons to have mem* aren't huge, but they seem at least worth considering.

I'd also mention that some things are expected from dev-side compilers such as LLVM. I don't expect to see small mem* operations in WebAssembly.

Member Author

Done.
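
For reference, here is a minimal sketch (not part of this PR) of what the load/store approach from the comment above looks like. The names are illustrative, a real producer would copy a word at a time rather than a byte at a time, and the point is that a VM has to recognize this whole loop to turn it back into a single fast copy:

```wast
;; Hypothetical byte-wise copy using only MVP operators (2017-era text
;; format, with get_local/set_local). Illustrative only, not from the PR.
(func $copy_bytes (param $dst i32) (param $src i32) (param $len i32)
  (block $done
    (loop $continue
      ;; stop once every byte has been copied
      (br_if $done (i32.eqz (get_local $len)))
      ;; copy one byte from src to dst
      (i32.store8 (get_local $dst) (i32.load8_u (get_local $src)))
      ;; advance both pointers and decrement the remaining length
      (set_local $dst (i32.add (get_local $dst) (i32.const 1)))
      (set_local $src (i32.add (get_local $src) (i32.const 1)))
      (set_local $len (i32.sub (get_local $len) (i32.const 1)))
      (br $continue))))
```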

the MVP via `i32.load` and `i32.store`, but this is not very fast. The
following operators can be added to improve performance:

* `move_memory`: Copy data from one memory region to another region, even if overlapping
Member

"even if" doesn't state what happens if it is overlapping.

Member Author

Done.

@RyanLamansky

Would there be two variants of these: one which takes the size as an immediate, and another from the stack? move_memory is likely to be used for moving constant-sized data structures around. Loading a constant from the stack can be expected to be optimized by compilers, but it would save a byte to have its own opcode.

@binji (Member Author) commented May 9, 2017

We haven't done that sort of thing with any other operators, so it feels a bit out of place to start doing it now. I'm guessing this wouldn't be a very common opcode anyway, so the byte savings may not make much of a difference.

to recognize the loops as well. The following operators can be added to improve
performance:

* `move_memory`: Copy data from one memory region to another region, copying backward if the regions overlap
Member

Instead of saying "copying backward", I think we should say what memmove() docs say which is that it works as-if a temporary buffer was used. (If there is an overlap and dst > src, that does entail copying backward, but if src > dst, then you want to copy forward.)
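
To make the direction point concrete, a rough sketch of the choice a naive in-place implementation would make to match the as-if-a-temporary-buffer behaviour. This is purely illustrative: `$copy_forward` and `$copy_backward` are assumed byte-loop helpers, and no operator shape is being proposed here.

```wast
;; Hypothetical sketch only. When dst > src and the regions overlap, a
;; forward copy would clobber source bytes before they are read, so copy
;; backward; otherwise a forward copy is safe.
(func $move_bytes (param $dst i32) (param $src i32) (param $len i32)
  (if (i32.gt_u (get_local $dst) (get_local $src))
    (then (call $copy_backward (get_local $dst) (get_local $src) (get_local $len)))
    (else (call $copy_forward (get_local $dst) (get_local $src) (get_local $len)))))
```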

@lukewagner (Member)

@RyanLamansky We're expecting compilers to have optimized small/constant-sized memcpy/set (into unrolled loads/stores), saving the builtin operators only for the bulk operations.
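
As an illustration of the unrolling point (an assumed example, not from the PR): a small constant-sized copy, say 8 bytes, would simply be emitted by the producer as direct loads and stores, with no loop and no bulk operator involved.

```wast
;; Hypothetical: an 8-byte constant-size copy unrolled into plain MVP
;; loads/stores by the producing compiler.
(func $copy8 (param $dst i32) (param $src i32)
  (i32.store (get_local $dst) (i32.load (get_local $src)))
  (i32.store offset=4 (get_local $dst) (i32.load offset=4 (get_local $src))))
```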


* `move_memory`: Copy data from one memory region to another region, copying backward if the regions overlap
* `zero_memory`: Set all bytes in a memory region to zero
* `set_memory`: Set all bytes in a memory region to a given byte
Member

If we have set_memory (which I think we should), then I don't think there's much reason for a specialized zero form; we're all just going to be calling memset() anyhow.

@lukewagner (Member)

Thanks, lgtm

@jfbastien (Member)

I'm not sure this is good though... how it's implemented has observable effects with shared memory. There should be a note that we need to fix this, then lgtm.

@lukewagner (Member)

Since these operators would be performing a bunch of non-atomic reads and writes, it doesn't seem like specifying the order in more detail would allow one to write a correct program any more than specifying as if a temporary copy was made.

@binji (Member Author) commented May 11, 2017

Well, this is for FutureFeatures.md, not an actual proposal just yet. So saying we're not sure here makes sense, I suppose.
