System/standard library for bignumbers #189

Open

axic opened this issue Apr 25, 2019 · 19 comments

Comments

@axic
Member

axic commented Apr 25, 2019

The EVM supports 256-bit operations "natively". While that precision is not needed in many cases, it is very useful for certain operations in Ethereum, such as handling token balances.

With wasm, one would need to bundle a bignumber library to do that, potentially in a lot of contracts. This suggests a good opportunity to define a standard bignumber library, which contracts can use but are not required to.

Due to the design of "host" functions in wasm (namespaces are separated), it is possible to design this in a forward-compatible manner, yet implement it right now in clients such as Hera.

Let's define a new namespace stdbn (just kidding: bignum or stdbignum) with the following:

  • mul256(a: i32ptr, b: i32ptr, out: i32ptr)
  • mulmod256(a: i32ptr, b: i32ptr, mod: i32ptr, out: i32ptr)
  • others TBD

All of the arguments are 32-bit pointers to a memory location containing a 256-bit little-endian number. The pointers can overlap or be the same.
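For illustration, here is a minimal sketch of what the contract side could look like, assuming a clang/LLVM wasm32 toolchain; the import attributes and the wrapper names (bignum_mul256, bignum_mulmod256, example) are my own, only the bignum namespace and the host function names come from the proposal above:

// Sketch of contract-side declarations for the proposed host functions.
// On wasm32 the pointer parameters lower to the i32ptr arguments described above.
#include <stdint.h>

__attribute__((import_module("bignum"), import_name("mul256")))
void bignum_mul256(const uint8_t *a, const uint8_t *b, uint8_t *out);

__attribute__((import_module("bignum"), import_name("mulmod256")))
void bignum_mulmod256(const uint8_t *a, const uint8_t *b, const uint8_t *mod, uint8_t *out);

// 256-bit little-endian values stored in the contract's linear memory.
static uint8_t x[32], y[32], m[32], result[32];

void example(void) {
    bignum_mul256(x, y, result);        // result = x * y (written as a 256-bit LE buffer)
    bignum_mulmod256(x, y, m, result);  // result = (x * y) % m
}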

In the future this could also be implemented as a (system) library: #17.

The main benefits of a "system library" are the predefined address it lives at and the client's ability to decide whether to use a wasm blob or implement it natively.

@chfast
Collaborator

chfast commented Apr 25, 2019

I'm not sure the std prefix makes any difference here.

The name mul256 is fine because it's the same for signed and unsigned numbers. But the second one should rather be umulmod256.

Naming conventions are not very consistent elsewhere, but I believe there are 3 options:

  • assembly-like: no prefix means signed or sign-neutral, u prefix means unsigned.
  • EVM-like: no prefix means unsigned or sign-neutral, s prefix means signed.
  • pedantic: no prefix means sign-neutral, u prefix means unsigned, s prefix means signed.

@axic
Member Author

axic commented Apr 26, 2019

I like the pedantic option.

@axic
Member Author

axic commented May 10, 2019

I think we should add all 256-bit methods to the host interface: add, sub, mul, div, mulmod, cmp. We also need to consider support for signed numbers.

One could argue that some of them are fast enough when implemented in Wasm, but the secondary goal here is to reduce code duplication, and this helps with that.

Any comments?

@jakelang
Member

Personally, I feel it would complicate the design if the bignum library only contained the arithmetic methods that are not fast enough in wasm.

Reducing code duplication is definitely another big benefit of having a "complete" bignum lib. As long as we can minimize host function overhead, this is great for reducing contract bloat.

@axic
Member Author

axic commented Jun 8, 2019

Potentially we could make use of the reference types proposal for Wasm. Let's assume the following example is for fixed 256-bit bignums:

bignum.load(memOffset: i32) -> anyref
bignum.store(memOffset: i32, v: anyref)
bignum.mul(a: anyref, b: anyref)
bignum.add(a: anyref, b: anyref)
...

This assumes that opaque references (which is what anyref is) could be passed around efficiently in interpreters.

@axic
Member Author

axic commented Sep 30, 2019

Let's agree on this first basic API, which does not use a bignum stack:

  • addu256(a: i32ptr, b: i32ptr, c: i32ptr) where c = a + b
  • subu256(a, b, c) where c = a - b
  • mulu256(a, b, c) where c = a * b
  • divu256(a, b, c) where c = a / b
  • mulmodu256(a, b, c, d) where d = (a * b) % c
  • cmpu256(a: i32ptr, b: i32ptr) -> i32 where the returned value is less than zero if a < b, zero if a == b, or greater than zero if a > b
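To make the semantics concrete, here is a non-normative C sketch of addu256 and cmpu256, assuming the 256-bit values are stored as four little-endian 64-bit limbs (the limb representation is my assumption; only the overall little-endian layout is specified above):

#include <stdint.h>

// c = a + b (mod 2^256); values are four little-endian uint64 limbs.
void addu256(const uint64_t *a, const uint64_t *b, uint64_t *c) {
    unsigned __int128 acc = 0;
    for (int i = 0; i < 4; ++i) {
        acc += (unsigned __int128)a[i] + b[i];
        c[i] = (uint64_t)acc;   // low 64 bits of the running sum
        acc >>= 64;             // carry into the next limb
    }
}

// Returns <0, 0 or >0 for a < b, a == b, a > b (unsigned comparison).
int32_t cmpu256(const uint64_t *a, const uint64_t *b) {
    for (int i = 3; i >= 0; --i) {  // most significant limb first
        if (a[i] < b[i]) return -1;
        if (a[i] > b[i]) return 1;
    }
    return 0;
}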

@chfast @s1na what do you think?

@poemm
Contributor

poemm commented Sep 30, 2019

How about making the first argument the output, so that the output position is consistent across different numbers of arguments, for example binary_op(output, input1, input2) and unary_op(output, input)?

@axic
Member Author

axic commented Sep 30, 2019

I thought about having the output as the first operand, but wasn't sure of the merit. Some POSIX functions do that, but I find some of them confusing.

What benefit would we gain? Note that I expect this to change during the upcoming months. We also need to decide whether to replace this with (or add) a stack-based version.

@poemm
Contributor

poemm commented Sep 30, 2019

What benefit would we gain?

Convention. GNU Multiple Precision (GMP) and lots of crypto libraries pass the output first. But it doesn't matter.
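For reference, this is what the output-first convention looks like in GMP (a small self-contained example, unrelated to the proposed host API):

// build: cc gmp_example.c -lgmp
#include <gmp.h>

int main(void) {
    mpz_t a, b, sum;
    mpz_init_set_ui(a, 12345);
    mpz_init_set_ui(b, 67890);
    mpz_init(sum);
    mpz_add(sum, a, b);          // output (rop) comes first, then the operands
    gmp_printf("%Zd\n", sum);
    mpz_clear(a);
    mpz_clear(b);
    mpz_clear(sum);
    return 0;
}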

We also need to decide whether to replace this with (or add) a stack-based version.

Not sure which algorithms benefit from the stack-based version, other than pathological benchmarks. But it may be simple enough to implement: bin_op256_stack() can just call bin_op256(stack_ptr - 32, stack_ptr - 32, stack_ptr) and update the stack pointer.
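A rough sketch of that wrapper, assuming the output-first convention and a stack of 32-byte slots where stack_ptr points at the topmost value; bn_stack, the function-pointer type and the names are illustrative only:

#include <stdint.h>

typedef void (*bin_op256_fn)(uint8_t *out, const uint8_t *a, const uint8_t *b);

// Hypothetical bignum stack of 32-byte slots; stack_ptr points at the topmost value.
static uint8_t bn_stack[32 * 64];
static uint8_t *stack_ptr = bn_stack;

// Pops the two topmost 256-bit values and pushes the result in their place,
// mirroring the bin_op256(stack_ptr - 32, stack_ptr - 32, stack_ptr) call above.
static void bin_op256_stack(bin_op256_fn op) {
    op(stack_ptr - 32, stack_ptr - 32, stack_ptr);  // result overwrites the second operand
    stack_ptr -= 32;                                // two pops, one push
}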

@jakelang
Member

I agree with @axic here. Although it's somewhat common in POSIX and other legacy conventions to use op(dst, src), it is far more linguistically intuitive to write op(src, dst) when lacking the muscle memory for the former.

@chfast
Collaborator

chfast commented Oct 1, 2019

I agree with @axic here. Although it's somewhat common in POSIX and other legacy conventions to use op(dst, src), it is far more linguistically intuitive to write op(src, dst) when lacking the muscle memory for the former.

This actually goes against many other conventions, e.g. Go and other OOP designs (to some extent): https://golang.org/pkg/math/big/, RISC assembly, the C calling convention.

Also a += 1 becomes add(a, 1, a) instead of add(a, a, 1).

@axic
Member Author

axic commented Oct 1, 2019

@chfast can you clarify which convention you referred to or which one you prefer?

@chfast
Collaborator

chfast commented Oct 1, 2019

I prefer the outputs to be the first arguments.

@s1na

s1na commented Oct 28, 2019

A couple more discussion points:

  • What about returning a carry where necessary, e.g. add256?
  • Whether Montgomery multiplication should be part of the API.
  • How to handle arbitrarily large numbers

@poemm
Contributor

poemm commented Oct 28, 2019

What about returning a carry where necessary, e.g. add256?

Makes sense to return a carry. It may also make sense to have an extra argument for carry.

Whether Montgomery multiplication should be part of the API.

It should. Modular multiplication is a bottleneck for elliptic curve crypto, and MontgomeryMultiplication256 is very popular for this. We may also want MontgomeryReduction256 to transform to/from Montgomery form and to speed up squaring. We may eventually consider BarrettMultiplication256, which is popular too. A problem is metering: some of these are faster with non-constant runtime, so we may have to meter conservatively, or break the algorithms into constant-runtime chunks and meter each chunk individually.
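To illustrate what such a primitive computes (this is not a proposed host function), here is Montgomery reduction and multiplication for a single 64-bit limb; a real MontgomeryMultiplication256 would apply the same idea across four limbs with carry propagation. The helper names and the n < 2^63 restriction are mine, the latter just keeps the 128-bit intermediate from overflowing:

#include <stdint.h>

typedef unsigned __int128 u128;

// Computes n' = -n^{-1} mod 2^64 for odd n (Newton-Hensel iteration).
static uint64_t mont_ninv(uint64_t n) {
    uint64_t inv = n;            // correct modulo 2^3 for odd n
    for (int i = 0; i < 5; ++i)
        inv *= 2 - n * inv;      // each step doubles the number of correct bits
    return (uint64_t)0 - inv;    // negate to get -n^{-1} mod 2^64
}

// REDC: given t = a*b with a, b < n, returns a*b*R^{-1} mod n, where R = 2^64.
// Assumes n is odd and n < 2^63 so the 128-bit sum below cannot overflow.
static uint64_t mont_redc(u128 t, uint64_t n, uint64_t ninv) {
    uint64_t m = (uint64_t)t * ninv;   // m = (t mod R) * n' mod R
    u128 u = (t + (u128)m * n) >> 64;  // low 64 bits of the sum are zero by construction
    return u >= n ? (uint64_t)(u - n) : (uint64_t)u;
}

// Montgomery multiplication of aR and bR (both already in Montgomery form, < n).
static uint64_t mont_mul(uint64_t aR, uint64_t bR, uint64_t n, uint64_t ninv) {
    return mont_redc((u128)aR * bR, n, ninv);
}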

How to handle arbitrarily large numbers

@chfast wisely said that anything bigger than mul256 will suffer from register pressure and a slowdown, and that many operations can be built by composing smaller algorithms. For example, we can build mul768 from mul256x256->512s. For this reason, we may eventually want mul64x64->128 and mul128x128->256 as building blocks, but this is less important for now.

@axic
Member Author

axic commented Oct 28, 2019

What about returning a carry where necessary, e.g. add256?

We kind of agreed to have another set of methods with carry.

@s1na

s1na commented Oct 31, 2019

Proposing this slightly modified API:

  • add(width: i32, a: i32ptr, b: i32ptr, c: i32ptr): bool where c = a + b, width applies to both a and b, and the return value indicates the carry
  • sub(width: i32, a, b, c): bool where c = a - b, and the return value indicates the carry (c < 0)
  • mul(width: i32, a, b, c): void where c = a * b
  • div(width: i32, a, b, q, r): void where q = a / b and r = a % b
  • mulmod(a, b, c, d): void where d = (a * b) % c (not sure if width is needed here?)
  • cmp(width: i32, a: i32ptr, b: i32ptr): i32 where the returned value is less than zero if a < b, zero if a == b, or greater than zero if a > b

Note: these changes only make sense if the overhead of additional parameters and return values is negligible.
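A sketch of the add semantics under this API, assuming width is given in bytes, is a multiple of 8, and the values are little-endian 64-bit limbs (these details are assumptions; the proposal above does not pin down the unit of width):

#include <stdint.h>

// c = a + b over width bytes; returns 1 if the addition carried out of the top limb.
int32_t bignum_add(uint32_t width, const uint64_t *a, const uint64_t *b, uint64_t *c) {
    uint32_t limbs = width / 8;
    unsigned __int128 acc = 0;
    for (uint32_t i = 0; i < limbs; ++i) {
        acc += (unsigned __int128)a[i] + b[i];
        c[i] = (uint64_t)acc;
        acc >>= 64;
    }
    return (int32_t)acc;   // final carry: 0 or 1, matching the bool return in the API
}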

@chfast
Collaborator

chfast commented Oct 31, 2019

Carry for mul is not practical. It's a bit of trouble to get, and it is not useful in higher-precision multiplication.

@s1na

s1na commented Oct 31, 2019

@chfast Thanks for the input. Removed carry for mul and mulmod.
