
Add possibility of making static buffer configurable #1471

Closed
cmichi opened this issue Nov 5, 2022 · 28 comments
Labels
A-ink_env [ink_env] work item B-enhancement New feature or request

Comments

@cmichi

cmichi commented Nov 5, 2022

We should make the size of the static buffer configurable. This is what #1279 tried to do. See here for an explanation of what the static buffer is.

The ideal approach to implementing this would be to add an associated constant to our Environment trait:

 /// The environmental types usable by contracts defined with ink!.
 pub trait Environment {
+    /// TODO
+    const CAPACITY: u8;
+

This will require adding the E: Environment bound in all kinds of places, so that accessing E::CAPACITY becomes possible:

+/// A static buffer with `E::CAPACITY` kB of capacity.
+pub struct StaticBuffer<E: Environment> {
     /// The static buffer with a total capacity of 16 kB.
-    buffer: [u8; Self::CAPACITY],
+    buffer: [u8; E::CAPACITY],
 }

Some necessary modifications:

impl OnInstance for EnvInstance {
-    fn on_instance<F, R>(f: F) -> R
+    fn on_instance<E, F, R>(f: F) -> R
     where
         F: FnOnce(&mut Self) -> R,
+        E: Environment,
     {
 /// The on-chain environment.
-pub struct EnvInstance {
+pub struct EnvInstance<E: Environment> {

There are a ton more places where we'll need to pass the type.

@cmichi cmichi added the B-enhancement New feature or request label Nov 5, 2022
@cmichi cmichi added the A-ink_env [ink_env] work item label Nov 5, 2022
@SkymanOne

@cmichi stable Rust does not support generic parameters being used in const generic expressions. I have to fall back to passing the constant explicitly across the whole codebase:

pub struct StaticBuffer<const CAPACITY: usize>
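On stable Rust this const-generic shape does compile. A minimal self-contained sketch of the idea (the `new`/`write` methods here are illustrative, not the actual ink! API):

```rust
/// A sketch of a const-generic static buffer, usable on stable Rust.
pub struct StaticBuffer<const CAPACITY: usize> {
    buffer: [u8; CAPACITY],
    len: usize,
}

impl<const CAPACITY: usize> StaticBuffer<CAPACITY> {
    pub const fn new() -> Self {
        Self { buffer: [0; CAPACITY], len: 0 }
    }

    pub const fn capacity(&self) -> usize {
        CAPACITY
    }

    /// Append bytes, failing when the fixed capacity would be exceeded.
    pub fn write(&mut self, bytes: &[u8]) -> Result<(), ()> {
        let end = self.len + bytes.len();
        if end > CAPACITY {
            return Err(());
        }
        self.buffer[self.len..end].copy_from_slice(bytes);
        self.len = end;
        Ok(())
    }
}

fn main() {
    // The capacity is now a type-level parameter chosen at the use site.
    let mut buf = StaticBuffer::<16>::new();
    assert_eq!(buf.capacity(), 16);
    assert!(buf.write(&[1u8; 16]).is_ok());
    assert!(buf.write(&[1u8]).is_err());
}
```

The downside, as noted below, is that the `CAPACITY` parameter has to be threaded through every type and function that touches the buffer.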

@SkymanOne

The problem doesn't seem to be trivial because generic_const_exprs hasn't landed in stable. We might need to use this crate as a workaround: https://crates.io/crates/generic_singleton

@SkymanOne

After digging, what we can do to tackle the issue is to use const generics for StaticBuffer:

+/// A static buffer with `CAPACITY` kB of capacity.
+pub struct StaticBuffer<const CAPACITY: usize> {
     /// The static buffer with a total capacity of 16 kB.
-    buffer: [u8; Self::CAPACITY],
+    buffer: [u8; CAPACITY],
 }

Then, upon instantiation of StaticBuffer in the on_chain module, we can wrap it inside a macro that takes E::CAPACITY from the Environment trait, extracts its value, and pastes it as a const generic argument in StaticBuffer<VALUE>:

impl<E: Environment> OnInstance<E> for EnvInstance {
    fn on_instance<F, R>(f: F) -> R
    where
        F: FnOnce(&mut Self) -> R,
    {
        // not a final version, just an idea
        extract_value!(E::CAPACITY, {
            static mut INSTANCE: EnvInstance = EnvInstance {
                // `VALUE` would be replaced with an int literal specified in `E::CAPACITY`
                buffer: StaticBuffer::<VALUE>::new(),
            };
            f(unsafe { &mut INSTANCE })
        })
    }
}

@SkymanOne

Update.
After a discussion with the ink! team, it was decided to abandon configuration via the Environment trait in favour of adding a buffer_size = <usize> parameter to #[ink::contract]

@kvinwang

We have recently encountered the 16K limitation in several scenarios. Is there a development plan to address this issue?

@SkymanOne

We have recently encountered the 16K limitation in several scenarios. Is there a development plan to address this issue?

There are currently other, more important issues on the roadmap, such as the OpenZeppelin issues. Once those are cleared, I plan to get back to tackling this issue.

Because of tight dependency coupling, most likely the buffer will have to be configured in the ink! smart contract directly via the macro. We need to assess the security implications of that design first.

@justinfrevert

There are legitimate use cases for sending around 53 kB of data to ink! contracts. Would the configuration afforded by this change support that amount?

@SkymanOne

SkymanOne commented Aug 4, 2023

I started experimenting with other possibilities to implement the feature, and there are a few points:

  • All API functions are tightly coupled with the EnvInstance struct, which is the first blocker.
  • I have attempted to generalise the API functions by adding a type bound by OnInstance to the Environment trait and then using it across the API generically.
  • This introduced further blockers:
    • The ink_storage crate depends on ink_env and is just re-exported to the contract by the main ink crate. This implies that we would need to specify Environment manually in every Mapping and Lazy definition whenever we want to modify the buffer size, which is prone to human error. The alternative would be to move the whole ink_storage crate into the codegen, but that requires reversing lots of dependencies.
    • There is a similar story with the ChainExtension where we would also need to specify the Environment trait because the ChainExtension definition lives outside the contract's module, and we cannot imply the Environment definition from the contract. The alternative to this would require some nitty-gritty codegen.
  • Overall, it seems like the ink_env crate requires a massive rework in order to allow this feature to be implemented in an idiomatic way.

@SkymanOne

After some digging, I came up with two possible PRs to address the issue. #1869 modifies the size via an environment variable, and #1872 uses features to select the size from a predefined set of values (originally used by @kvinwang).

I am interested in opinions on which one is more idiomatic.

I discarded the option of messing with the codegen. The static buffer is created with a static mut expression, and any changes to that would either impact performance or create unnecessary complexity in the codebase.

@kvinwang

kvinwang commented Aug 9, 2023

How about using a Vec<u8> instead of the static array? That way the size can be configured at runtime or link time via an external global variable or function, which could be generated by ink::contract!.

@SkymanOne

How about using a Vec<u8> instead of the static array? That way the size can be configured at runtime or link time via an external global variable or function, which could be generated by ink::contract!.

The static buffer is used in almost every part of the ink! execution pipeline, so we need to keep it optimised. Replacing it with anything else will cause performance degradation.

@kvinwang

kvinwang commented Aug 9, 2023

The static buffer is used in almost every part of the ink! execution pipeline, so we need to keep it optimised. Replacing it with anything else will cause performance degradation.

If that's the case, we can implement a buffer where the front half is a static array, and the second half is a Vec. Theoretically, this will not degrade performance.

something like the TinyVec

@kvinwang

kvinwang commented Aug 9, 2023

To be clearer: we can make a buffer with a lower bound, like LowerBoundVec<const AtLeast: usize, T>, so that when user code indexes elements below AtLeast, the compiler can emit the same optimized code.

@SkymanOne

SkymanOne commented Aug 9, 2023

we can implement a buffer where the front half is a static array, and the second half is a Vec

This sounds like a very complicated and inefficient solution. The whole idea of using static mut is that the buffer is stored in a dedicated memory section (i.e. BSS); merging it with heap-allocated data seems like a nightmare. It means that with 32 kB of data we would load 16 kB from BSS and the other 16 kB from the heap. This defeats the whole point of having a static buffer in the first place; we might as well turn the whole buffer into a heap-allocated buffer.
https://github.com/paritytech/ink/blob/6452b89bf47eb1610ee6dc8417a7584f29875198/crates/env/src/engine/on_chain/impls.rs#L289 - This is an example of where the whole scope of the buffer is used; if the size is over 16 kB, we would perform loading from both the heap and BSS.

There is another issue I described earlier: the order of dependencies. Mapping and Lazy in the ink_storage crate are tightly coupled with the ink_env API functions, which use the concrete EnvInstance with the hardcoded static buffer size. If we make EnvInstance generic over const SIZE: usize or Environment, we need to propagate that parameter across all API functions and any structure that uses them. That means we would need to specify SIZE or Environment in Mapping and Lazy.
To avoid that, we would need to move those structures into the codegen and automatically generate concrete structures with the generic values filled in, but doing so introduces some magic for the user, with Mapping and Lazy appearing out of nowhere.

@kvinwang

kvinwang commented Aug 9, 2023

I didn't mean to store the buffer in two separate locations. What I meant is that from the perspective of the type system, we can create a type like:

pub struct LowerBoundVec<const L: usize, T> {
    // Invariant: `buffer.len() >= L` must be upheld by every constructor,
    // otherwise `index` below is unsound.
    buffer: Vec<T>,
}

impl<const L: usize, T> LowerBoundVec<L, T> {
    fn index(&self, i: usize) -> &T {
        if i < L {
            // SAFETY: relies on the `buffer.len() >= L` invariant.
            unsafe { self.buffer.get_unchecked(i) }
        } else {
            // Dynamic tail: keep the bounds check.
            &self.buffer[i]
        }
    }
}

The inner buffer is logically divided into two parts: a fixed-size head and a dynamic tail, all stored in contiguous memory on the heap. When the front part is accessed, such as when decoding/encoding small fixed-size data types, the compiler has enough information to prove that an out-of-bounds access cannot occur. As a result, it can optimize away the bounds checking, leading to performance that is not much different from using a static buffer.

@SkymanOne

SkymanOne commented Aug 9, 2023

With the design you propose, in places like https://github.com/paritytech/ink/blob/02a0bf92566acc61593c1bdad6c299d0fd6ff649/crates/env/src/api.rs#L69
we would instantiate a new buffer on the heap instead of getting a reference to the same buffer memory, as described here:
https://github.com/paritytech/ink/blob/02a0bf92566acc61593c1bdad6c299d0fd6ff649/crates/env/src/engine/on_chain/mod.rs#L49

What we can do is use a lazy static with a typemap to access the same instance from the map given the same const generic parameter, but that again adds a lot of overhead.
However, as I mentioned earlier, we still have another problem with the dependency order.

@kvinwang

kvinwang commented Aug 9, 2023

What we can do is use a lazy static with a typemap to access the same instance from the map given the same const generic parameter, but that again adds a lot of overhead.

It is not necessary to use a lazy static. We can set up the static buffer with a non-const size in the two entry points, fn deploy and fn call.

However, as I mentioned earlier, we still have another problem with the dependency order.

It won't introduce any extra generic parameters. The changes stay inside the static buffer.

It just changes:

pub struct StaticBuffer {
    buffer: [u8; 16 * 1024],
}

to:

pub struct StaticBuffer {
    buffer: LowerBoundVec<{ 16 * 1024 }, u8>,
}
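In std Rust, the "initialize once at the entry points" idea could be sketched like this. Note this is only the shape of the idea: real ink! contracts are no_std and single-threaded, and every name below is illustrative, not actual ink! API.

```rust
use std::sync::OnceLock;

// The io buffer, sized once at startup rather than via a const generic.
static BUFFER: OnceLock<Vec<u8>> = OnceLock::new();

// In the real design this value could be emitted by `#[ink::contract]`
// or resolved at link time; a constant stands in for it here.
fn configured_buffer_size() -> usize {
    16 * 1024
}

// Called at the top of the `deploy` / `call` entry points.
fn init_buffer() -> &'static Vec<u8> {
    BUFFER.get_or_init(|| vec![0; configured_buffer_size()])
}

fn main() {
    let buf = init_buffer();
    assert_eq!(buf.len(), 16 * 1024);
    // Subsequent calls return the same allocation.
    assert!(std::ptr::eq(buf, init_buffer()));
}
```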

@SkymanOne

As you said about this structure:

continuous memory located on the heap

This is the main problem: it is a performance regression.

@kvinwang

kvinwang commented Aug 10, 2023

Whether the buffer is on the heap or in BSS, there shouldn't be much difference in the low-level wasm assembly. Both need to allocate a piece of zeroed memory.

However, I benchmarked a patched ink! with a Vec<u8> in the StaticBuffer and got the results below:

Using StaticBuffer with [u8; 16k]:

runtimeVersion consumed: { refTime: '1,610,612,735', proofSize: '0' } required: { refTime: '65,766,686,719', proofSize: '0' }
onBlockEndCalled consumed: { refTime: '1,879,048,191', proofSize: '0' } required: { refTime: '79,456,894,975', proofSize: '0' }
parseUsd consumed: { refTime: '4,294,967,295', proofSize: '0' } required: { refTime: '81,604,378,623', proofSize: '0' }

Using StaticBuffer with Vec<u8> (code here):

runtimeVersion consumed: { refTime: '1,610,612,735', proofSize: '0' } required: { refTime: '65,766,686,719', proofSize: '0' }
onBlockEndCalled consumed: { refTime: '1,879,048,191', proofSize: '0' } required: { refTime: '79,456,894,975', proofSize: '0' }
parseUsd consumed: { refTime: '4,294,967,295', proofSize: '0' } required: { refTime: '81,604,378,623', proofSize: '0' }

I am confused by the result. Both get exactly the same gas consumed. There must be some problem in the gas metering.

@SkymanOne

Most likely there is a mistake in your benchmark. It is practically impossible for the compiler to optimise the Vec<u8> version to the same binary as the static buffer, given the extra overhead of heap allocation and runtime bounds checks. Using Vec<u8> should also force ink! contracts to use the heap memory allocator, which adds an additional 2-4 kB.
Can you please provide the benchmark reproducer so I can verify the results?

@kvinwang

kvinwang commented Aug 11, 2023

Most likely there is a mistake in your benchmark. It is practically impossible for the compiler to optimise the Vec<u8> version to the same binary as the static buffer, given the extra overhead of heap allocation and runtime bounds checks. Using Vec<u8> should also force ink! contracts to use the heap memory allocator, which adds an additional 2-4 kB.
Can you please provide the benchmark reproducer so I can verify the results?

Yes, I finally found it was a mistake in my test.

I had just temporarily modified Phala blockchain's e2e code for testing purposes, so there is no reproducer code available.

However, I found the reason for the identical results: Phala's worker always masks the lower 5% of bits of the returned gasConsumed in the query RPC for security reasons. After turning off the masking, I got the correct results:

with [u8; 16k]:

runtimeVersion() gasConsumed: { refTime: '1,482,332,631', proofSize: '232,453' }
onBlockEndCalled() gasConsumed: { refTime: '1,777,971,855', proofSize: '46,729' }
parseUsd() gasConsumed: { refTime: '4,176,439,803', proofSize: '101,109' }

with vec![0; 16k]:

runtimeVersion() gasConsumed: { refTime: '1,521,554,426', proofSize: '236,008' }
onBlockEndCalled() gasConsumed: { refTime: '1,822,112,992', proofSize: '47,440' }
parseUsd() gasConsumed: { refTime: '4,205,264,073', proofSize: '101,820' }

where runtimeVersion and onBlockEndCalled are lightweight, flipper-like methods:

        #[ink(message)]
        pub fn runtime_version(&self) -> (u32, u32) {
            pink::ext().runtime_version()
        }
        #[ink(message)]
        pub fn on_block_end_called(&self) -> bool {
            self.on_block_end_called
        }

and parseUsd is a bit heavier: it parses a piece of JSON with the content {"usd":1.23} using serde_json_core.

There is indeed some expected overhead, but the results look acceptable for our projects.

We can place the dynamic buffer under a cargo feature gate named dynamic-io-buffer, so that most contracts will still use the default 16k static buffer. Contracts that require a larger buffer are likely already performing heavy tasks, so the small overhead would be trivial.
This can be an alternative to #1872. I don't favor #1872 simply because it doesn't truly address the problem. People might desire more fine-grained control over the number. For example, 1.2MB would be the proper value for our contract quickjs; that value cannot be set with #1872.
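The dynamic-io-buffer feature-gate idea could be sketched as follows. This is a hedged illustration: the feature name comes from this thread, but the type layout, DEFAULT_CAPACITY, and the methods are assumptions, not actual ink! code.

```rust
const DEFAULT_CAPACITY: usize = 16 * 1024;

// Without the feature, the buffer stays a zero-overhead static array.
#[cfg(not(feature = "dynamic-io-buffer"))]
pub struct StaticBuffer {
    buffer: [u8; DEFAULT_CAPACITY],
}

// With the feature, the buffer lives on the heap and could be sized
// from a contract-level configuration instead of a constant.
#[cfg(feature = "dynamic-io-buffer")]
pub struct StaticBuffer {
    buffer: Vec<u8>,
}

impl StaticBuffer {
    #[cfg(not(feature = "dynamic-io-buffer"))]
    pub fn new() -> Self {
        Self { buffer: [0; DEFAULT_CAPACITY] }
    }

    #[cfg(feature = "dynamic-io-buffer")]
    pub fn new() -> Self {
        Self { buffer: vec![0; DEFAULT_CAPACITY] }
    }

    pub fn len(&self) -> usize {
        self.buffer.len()
    }
}

fn main() {
    // With the feature disabled, this is the plain 16 kB static array.
    let buf = StaticBuffer::new();
    assert_eq!(buf.len(), DEFAULT_CAPACITY);
}
```

The appeal of this shape is that downstream code only sees `StaticBuffer`, so no generic parameters leak into Mapping, Lazy, or the API functions.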

As for #1869: if the environment variable were hidden behind cargo-contract, it would be fine. Maybe cargo-contract could read it from a custom section in Cargo.toml rather than introducing a new .env file. However, the problem is that cargo test won't read that config.
Otherwise, if it were exposed to users for direct configuration via an environment variable, it would feel strange and inappropriate and has tons of drawbacks for a pure Rust project.

@SkymanOne

People might desire more fine-grained control over the number.

This lies on the more subjective side of things. What I fear with introducing custom tuning of things like StaticBuffer is that ink! may become an "over-configurable" framework like Substrate. The whole idea of ink! is simplicity of usage and its opinionated nature. If you need precisely a 1.2MB buffer, why can't you opt in to a 16MB buffer?

It is kind of similar to buying a laptop: if you run an application that consumes 18GB of RAM, you don't request a custom build with 20GB of RAM from the manufacturer; you just buy a laptop with 32GB of RAM and have extra RAM for any other future use cases.

@kvinwang

If you need precisely a 1.2MB buffer, why can't you opt in to a 16MB buffer?

Because on Phala, a contract can use up to 4MB of RAM.

It is kind of similar to buying a laptop: if you run an application that consumes 18GB of RAM, you don't request a custom build with 20GB of RAM from the manufacturer; you just buy a laptop with 32GB of RAM and have extra RAM for any other future use cases.

Can you imagine someone wanting to create a ramdisk on a laptop with 4GB of RAM, where the only available choices are 512MB, 1GB, 16GB...?

@SkymanOne

Because on Phala, a contract can use up to 4MB of RAM.

I am happy to add it as one of the configuration options, e.g.:

  • 1MB
  • 2MB
  • 4MB

@kvinwang

kvinwang commented Aug 11, 2023

I am happy to add it as one of the configuration options, e.g.:

  • 1MB
  • 2MB
  • 4MB

Adding 2MB won't help anything. The main reason we want a larger buffer is that our contract supports simple HTTP requests, and the response is passed through the static buffer. If we configure the buffer to 2MB, the heap gets less than 2MB, and then the contract won't even be able to parse an 800K response, because the raw response itself has to be copied to the heap, and parsing it needs even more memory.

@SkymanOne

Sorry, I fail to understand the logic here. Why do you need exactly 1.2MB and not 2MB? If 2MB is too small for you, why not configure the buffer to be 4MB?

@kvinwang

kvinwang commented Aug 11, 2023

Sorry, I fail to understand the logic here. Why do you need exactly 1.2MB and not 2MB? If 2MB is too small for you, why not configure the buffer to be 4MB?

2MB is too large because it would shrink the heap size to less than 2MB.

Suppose we have the following code in an ink! message:

let response = http_get(url_of_the_json)?; // How large a response do we support?
let data: Data = Decode::decode(&mut &response.body[..])?;

If we configure the buffer to 2MB, the entire 4MB of RAM would have the following layout:

                   the stack
The io buffer       ↓↓↓↓↓↓   the heap
↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓    ↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓
|<----------------4MB----------------->|
|*********          |    |*********----|
 ^^^^^^^^^                ^^^^^^^^^
    |     ^^^^^^^^^^        |      ^^^^^
    |         |             |     the space used to parse the response (too small)
 the response |             |
              |          the response in the heap
           The wasted space

With this layout:

  • there is a gap of wasted space in the IO buffer
  • there isn't enough space to parse the response if it is 1MB in size.

If we configure the buffer to 1MB, we have this layout:

              the stack
The io buffer |     
          ↓↓↓↓↓↓    the heap
↓↓↓↓↓↓↓↓↓↓↓    ↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓
|<----------------4MB----------------->|
|*********|   |*********---------      |
 ^^^^^^^^^              ^^^^^^^^^
    |          ^^^^^^^^^   |     ^^^^^^^
    |            |         |     wasted space here
 the response    |         |
                 |      the space used to parse the response (enough)
              the response in the heap

With this layout, there is wasted space in the heap.

This is why we set the io buffer size to about 1.2MB: it supports an HTTP response of at most about 1.2MB.
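The trade-off in the two layouts above can be checked with back-of-the-envelope arithmetic. This is only a sanity check of the reasoning described here (figures approximate, stack ignored), not actual Phala measurements:

```rust
fn main() {
    const TOTAL_RAM: usize = 4 * 1024 * 1024; // 4 MB contract memory
    let response = 1_200 * 1024; // ~1.2 MB HTTP response

    // Parsing roughly needs the raw copy plus comparable workspace,
    // i.e. about 2x the response size on the heap (an assumption).
    let parse_need = response * 2;

    // 2 MB io buffer: the heap gets less than 2 MB, which is not
    // enough to hold the copied response plus parsing workspace.
    let heap_with_2mb_buffer = TOTAL_RAM - 2 * 1024 * 1024;
    assert!(heap_with_2mb_buffer < parse_need);

    // ~1.2 MB io buffer: the remaining heap can hold the response
    // copy and still leave enough workspace for parsing.
    let heap_with_1_2mb_buffer = TOTAL_RAM - response;
    assert!(heap_with_1_2mb_buffer >= parse_need);
}
```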

@SkymanOne

Resolved by #1869
