Conversation

Contributor

@anzoman anzoman commented Oct 1, 2025

Oasis Core 25.5 added support for stateless client nodes, which are particularly suitable for ROFL nodes.
This PR documents them.

@anzoman anzoman requested a review from peternose October 1, 2025 18:48
netlify bot commented Oct 1, 2025

Deploy Preview for oasisprotocol-docs ready!

Name Link
🔨 Latest commit 8b4be35
🔍 Latest deploy log https://app.netlify.com/projects/oasisprotocol-docs/deploys/68ff79e2682c4300085d2a63
😎 Deploy Preview https://deploy-preview-1472--oasisprotocol-docs.netlify.app

@anzoman anzoman requested a review from gw0 October 3, 2025 09:55
@anzoman anzoman force-pushed the anzoman/run-your-node-client-stateless branch from 42b2095 to fb19dcf Compare October 6, 2025 10:50
@anzoman anzoman self-assigned this Oct 6, 2025
Contributor

@peternose peternose left a comment

Looks good, plain and simple. We will update this later once we implement levels of statelessness (consensus yes/no, runtimes yes/no).

```yaml
mode: client-stateless
# ... sections not relevant are omitted ...
consensus:
```
Contributor

You also need to define the trust root. We already have this documented somewhere, not sure where 🤔

Contributor

@martintomazic Where did you put the docs for the trust root and the suggested trust period?

Contributor

https://docs.oasis.io/node/run-your-node/advanced/sync-node-using-state-sync (src)

Fixed some stuff here recently. As promised yesterday, I will add a CLI command that calculates the suggested trust and update the docs again.

In the long run we could consider moving it out of "Advanced"...

Contributor

We should probably keep the example to a simple config, e.g. `consensus: light_client: trust: # visit light client for light client configuration`, plus a link to the light client config, so that we don't repeat this stuff in multiple places.
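
A minimal sketch of the simplified example proposed here, assuming `providers` and `light_client` both sit directly under `consensus` (the nesting is inferred from the snippets in this thread, not confirmed). The comment stands in for the link to the light client configuration page, and the trust values are intentionally left out.

```yaml
mode: client-stateless
# ... sections not relevant are omitted ...
consensus:
  providers:
    - <node-address-1>
  light_client:
    trust:
      # Visit the light client docs for how to configure the trust root.
      # The trust fields are intentionally not repeated here.
```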

Contributor Author

@anzoman anzoman Oct 27, 2025

Sounds good, added a rough light client config part and mentioned state sync docs for more info.

Contributor

@martintomazic martintomazic Nov 3, 2025

I experimented a bit with the stateless client and don't think this is the appropriate reference here, as it tells you how to set a sufficiently old trust so that consensus checkpoints are available.

With the stateless client, there is no need for a "sufficiently old" trust since the node is stateless (i.e. no checkpoints will be used).

Further experiments with the paratime client using the stateless client consensus backend showed that, with a trust around 10 days old, the runtime history reindex takes a few hours (5h+). Strangely, I had to restart in between. Given that your runtime client worker is ready only after the reindex has finished, and that with the stateless client the runtime storage worker is disabled (so no checkpoints on the runtime side either), there is little point in using this configuration.

So my suggestion is to take the latest, or a 1000 blocks old, trust height for the stateless client node. Update: the only reason for a non-latest height is if your provider is behind... Otherwise, as finality with CometBFT is instant, the latest is fine.
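
To illustrate the suggestion, here is a rough sketch of a trust section pinned to a recent block. The field names `period`, `height` and `hash` are an assumption based on the state sync docs linked earlier in this thread, and all values are placeholders to be filled in from a source you trust.

```yaml
consensus:
  light_client:
    trust:
      # Assumed field names, following the state sync docs.
      period: <trust-period>
      # A recent block: the latest height, or roughly 1000 blocks behind it
      # if the provider may be lagging.
      height: <recent-block-height>
      # Hash of the block at the height above.
      hash: "<block-hash-at-that-height>"
```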

Contributor

Related comment.

Nevertheless, my suggestion is to reflect this requirement under "Setting trust root for stateless client node": "the configured trust should be younger than your provider's last retained height, taking their pruning and your speed of reindexing into account". See the existing discussion in #1472 (comment). Realistically, if you configure a recent/latest trust as in #1472 (comment), this is not a problem at all.

We may also want to note that the stateless client is still at an early stage, and thus not suitable for use cases where you need 99.9%+ uptime/liveness?

# ... sections not relevant are omitted ...
consensus:
  providers:
    - <node-address-1>
Member

@ptrus ptrus Oct 23, 2025

Should probably mention some examples of how to obtain these addresses.

Contributor

Maybe we could also mention that the address could be an IP of a node on the network, or a path to the socket of a local node.

Member

@ptrus ptrus Oct 23, 2025

Should we also enable consensus.WatchBlocks on our public endpoints (e.g. testnet.grpc.oasis.io)? Currently it's not whitelisted: https://github.com/oasisprotocol/internal-ops/blob/9af8ae8e752e3e26b60df6c96049629fc84f4b54/ansible/roles/oasis-node/defaults/main.yml#L616-L630

Probably no reason not to whitelist it; for example, runtime.WatchBlocks is allowed: https://github.com/oasisprotocol/internal-ops/blob/9af8ae8e752e3e26b60df6c96049629fc84f4b54/ansible/roles/oasis-node/defaults/main.yml#L672

Contributor

I think this is intentional, as the producer (server) stores per-subscriber data in memory. This means that with many slow subscribers it becomes prone to memory leaks... oasisprotocol/oasis-core#6315 is somewhat related. Not sure why we don't deem runtime watch blocks "dangerous".

Member

@ptrus ptrus Oct 23, 2025

Should we update the stateless client mode to not rely on WatchBlocks then (or have a fallback to polling if it's not available)?

Member

@ptrus ptrus Oct 23, 2025

> Complementary to that is to start running dedicated "provider" nodes ourselves

I don't think we need separate dedicated "provider" nodes. We already run public nodes that provide the data.

> drop policy (e.g. max 1h of blocks in memory per subscriber

No need to worry about this for now. Since our public nodes are behind Cloudflare, there is a hard connection limit of ~100 seconds, so clients will be dropped and need to reconnect way sooner.

Contributor

@martintomazic martintomazic Oct 23, 2025

Hm, then we should check that everything works as expected with disconnections/missing block gaps. I believe the initial versions didn't take this into consideration, but maybe this was fixed already. Update: or again use polling, so this is not an issue at all.

Contributor

@peternose peternose Oct 23, 2025

I wouldn’t optimize prematurely. Let’s expose the WatchBlocks method, and if things go south, we can disable/limit it and then optimize.

Contributor Author

We'll do that in https://github.com/oasisprotocol/internal-ops/pull/1182.

I have updated the part of the docs about addresses.
Is there a nice way to obtain node addresses that can be providers for stateless clients?

Member

@ptrus ptrus Oct 27, 2025

> Is there a nice way to obtain node addresses that can be providers for stateless clients?

Could use existing OPF hosted nodes at grpc.oasis.io/testnet.grpc.oasis.io* (are there any other known public nodes run by anyone else?). Otherwise, the user can also use their own nodes.

* once we update them to expose the WatchBlocks method, and test it to ensure it works.

@anzoman anzoman requested review from peternose and ptrus October 27, 2025 13:47
@anzoman anzoman force-pushed the anzoman/run-your-node-client-stateless branch from 1f87108 to 1f475a1 Compare October 27, 2025 13:49
@anzoman anzoman force-pushed the anzoman/run-your-node-client-stateless branch from 1f475a1 to 8b4be35 Compare October 27, 2025 13:55
hash for the light client.

To ensure compatibility, all provider nodes specified must be running the latest version of Oasis Core. The provider
address contains the `oasis1` prefix. It can also be an IP of a node on the network, or a path to the socket of a local
Member

> The provider address contains the oasis1 prefix

I don't think this is true?

Contributor Author

Ok, I see, will drop that part.

We can also state the (only?) two possible ways of specifying addresses:

  • {{ node_entity_id }}@{{ node_ip_address }}
  • /node/run/internal.sock

Does that seem correct?

Member

> {{ node_entity_id }}@{{ node_ip_address }}

I don't think it's the entity ID? I think it's the public key of the TLS certificate. It's also optional: if included, it specifies the expected peer identity for TLS authentication.
Also, it doesn't need to be an IP address. Addresses like grpc.oasis.io:443 also work.
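
Putting the address forms mentioned in this thread side by side, a hedged sketch of the `providers` list could look like the following. The `<tls-public-key>`, `<host>` and `<port>` placeholders are hypothetical, the socket path is the one proposed above and has not been confirmed here, and the public endpoint would only work once it exposes the WatchBlocks method.

```yaml
consensus:
  providers:
    # Host name and port only, e.g. a public gRPC endpoint.
    - grpc.oasis.io:443
    # Optional TLS public key prefix that pins the expected peer identity.
    - <tls-public-key>@<host>:<port>
    # Path to the socket of a local node (form not confirmed in this thread).
    - /node/run/internal.sock
```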

6 participants