docs/node/run-your-node: Document use of client-stateless mode #1472
✅ Deploy Preview for oasisprotocol-docs ready!
42b2095 to
fb19dcf
Compare
Looks good, plain and simple. We will update this later once we implement levels of statelessness (consensus yes/no, runtimes yes/no).
| ```yaml | ||
| mode: client-stateless | ||
| # ... sections not relevant are omitted ... | ||
| consensus: |
You also need to define the trust root. We already have this documented somewhere, not sure where 🤔
@martintomazic Where did you put docs for trust root and suggested trust period?
https://docs.oasis.io/node/run-your-node/advanced/sync-node-using-state-sync (src)
Fixed some stuff here recently. As promised yesterday, I will add a CLI command that calculates the suggested trust and augment the docs again.
In the long run we could consider moving it out of "Advanced"...
We should probably have in the example only simple config, e.g. consensus: light_client: trust: # visit light client for light client configuration, and a link to light client config, so that we don't repeat this stuff multiple times.
Sounds good, added a rough light client config part and mentioned state sync docs for more info.
I experimented a bit with the stateless client and don't think this is the appropriate reference here, as it tells you how to set a sufficiently old trust so that consensus checkpoints are available.
With a stateless client, there is no need for a "sufficiently old" trust since the node is stateless (i.e. no checkpoints will be used).
Further experiments with the paratime client using the stateless client consensus backend showed that with a trust around 10 days old, the runtime history reindex takes a few hours (5h+). Strangely, I had to restart in between. Given that your runtime client worker is ready only after the reindex finishes, and that with a stateless client the runtime storage worker is disabled (no checkpoints on the runtime side either), there is little point in using this configuration.
So my suggestion is to take the latest, or a 1000-block-old, trust height for the stateless client node. Update: the only reason for a non-latest height is if your provider is behind. Otherwise, as finality with CometBFT is instant, the latest is fine.
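To make the suggestion concrete, here is a minimal Python sketch of picking the trust height. The function name and the `lag_blocks` parameter are hypothetical and not part of Oasis Core or its CLI; it only shows the arithmetic of "latest, or some number of blocks behind latest".

```python
def pick_trust_height(latest_height: int, lag_blocks: int = 0) -> int:
    """Return a trust height `lag_blocks` behind the latest block.

    Hypothetical helper for illustration only. Use lag_blocks=0 when the
    provider is fully synced, or e.g. 1000 when the provider may be behind.
    The result is clamped so it never drops below height 1.
    """
    if latest_height < 1:
        raise ValueError("latest_height must be a positive block height")
    if lag_blocks < 0:
        raise ValueError("lag_blocks must not be negative")
    return max(1, latest_height - lag_blocks)
```

The block hash for the chosen height would still need to be fetched from a source you trust; the sketch does not cover that step.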
Related comment.
Nevertheless, my suggestion is to reflect this requirement under Setting trust root for stateless client node - "configured trust should be younger than your provider's last retained height, taking their pruning and your speed of reindexing into account". See existing discussion #1472 (comment). Realistically, if you configure a recent/latest trust as in #1472 (comment), this is not a problem at all.
We may also want to note that the stateless client is still at an early stage, and thus not suitable for use cases where you need 99.9%+ uptime/liveness?
| # ... sections not relevant are omitted ... | ||
| consensus: | ||
| providers: | ||
| - <node-address-1> |
We should probably mention some examples of how to obtain these addresses.
Maybe we could also mention that the address could be an IP of a node on the network, or a path to the socket of a local node.
Should we also enable consensus.WatchBlocks on our public endpoints (e.g. testnet.grpc.oasis.io)? Currently it's not whitelisted: https://github.com/oasisprotocol/internal-ops/blob/9af8ae8e752e3e26b60df6c96049629fc84f4b54/ansible/roles/oasis-node/defaults/main.yml#L616-L630
Probably no reason not to whitelist it, for example the runtime.WatchBlocks is allowed: https://github.com/oasisprotocol/internal-ops/blob/9af8ae8e752e3e26b60df6c96049629fc84f4b54/ansible/roles/oasis-node/defaults/main.yml#L672
I think this is intentional, as the producer (server) stores per-subscriber data in memory. This means that with many slow subscribers it becomes prone to memory leaks... oasisprotocol/oasis-core#6315 is somewhat related. Not sure why we don't deem runtime watch blocks "dangerous".
Should we update the stateless client mode to not rely on WatchBlocks then? (Or have a fallback to polling if it's not available.)
> Complementary to that is to start running dedicated "provider" nodes ourselves
I don't think we need separate dedicated "provider" nodes. We already run public nodes that provide the data.
> drop policy (e.g. max 1h of blocks in memory per subscriber
No need to worry about this for now. Our public nodes are behind Cloudflare, which has a hard connection limit of ~100 seconds, so clients will be dropped and need to reconnect way sooner.
Hm, then we should check that everything works as expected with disconnections/missing block gaps. I believe the initial versions didn't take this into consideration, but maybe this was fixed already. Update: or again use polling, so this is not an issue at all.
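For reference, the gap check after a reconnect could look roughly like the Python sketch below. This is not how oasis-core handles it; `missing_heights` is a hypothetical helper that only shows which heights would have to be backfilled (for example by polling) when the stream resumes at a later block.

```python
def missing_heights(last_seen: int, received: int) -> list[int]:
    """Return the block heights skipped between two observed blocks.

    Hypothetical helper for illustration only. `last_seen` is the height of
    the last block processed before the disconnect, `received` is the height
    of the first block delivered after reconnecting. Consecutive, duplicate
    or older blocks produce an empty list.
    """
    return list(range(last_seen + 1, received))
```

A client would fetch each returned height from the provider before processing the newly received block, so that blocks are always handled in order.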
I wouldn’t optimize prematurely. Let’s expose the WatchBlocks method, and if things go south, we can disable/limit it and then optimize.
We'll do that in https://github.com/oasisprotocol/internal-ops/pull/1182.
I have updated the part of the docs about addresses.
Is there a nice way to obtain node addresses that can be providers for stateless clients?
> Is there a nice way to obtain node addresses that can be providers for stateless clients?
Could use existing OPF hosted nodes at grpc.oasis.io/testnet.grpc.oasis.io* (are there any other known public nodes run by anyone else?). Otherwise, the user can also use their own nodes.
* once we update them to expose the WatchBlocks method, and test it to ensure it works.
1f87108 to
1f475a1
Compare
1f475a1 to
8b4be35
Compare
| hash for the light client. | ||
|
|
||
| To ensure compatibility, all provider nodes specified must be running the latest version of Oasis Core. The provider | ||
| address contains the `oasis1` prefix. It can also be an IP of a node on the network, or a path to the socket of a local |
> The provider address contains the `oasis1` prefix
I don't think this is true?
Ok, I see, will drop that part.
We can also state the (only?) two possible ways of specifying addresses:
- {{ node_entity_id }}@{{ node_ip_address }}
- /node/run/internal.sock
Does that seem correct?
> {{ node_entity_id }}@{{ node_ip_address }}
I don't think it's the entity ID? I think it's the public key of the TLS certificate. And it's optional: if included, it specifies the expected peer identity for TLS authentication.
It also doesn't need to be an IP address. Addresses like grpc.oasis.io:443 also work.
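To check my understanding of the address forms discussed above, here is a rough Python sketch of how they could be told apart. The parser, its type, and its field names are hypothetical and not oasis-core's actual parsing code; it assumes an optional `<tls-public-key>@` prefix and treats any target starting with `/` as a local socket path.

```python
from typing import NamedTuple, Optional


class ProviderAddress(NamedTuple):
    """Hypothetical parsed form of a provider address (illustration only)."""
    tls_public_key: Optional[str]  # expected peer identity, if given
    target: str                    # host:port, IP:port or socket path
    is_local_socket: bool


def parse_provider_address(raw: str) -> ProviderAddress:
    raw = raw.strip()
    if not raw:
        raise ValueError("empty provider address")
    # Split on the last "@" so the optional key prefix is separated from the
    # target. Assumption: socket paths and hostnames never contain "@".
    key, sep, rest = raw.rpartition("@")
    if sep and not key:
        raise ValueError("'@' present but no public key before it")
    target = rest if sep else raw
    if not target:
        raise ValueError("missing target after '@'")
    return ProviderAddress(key if sep else None, target, target.startswith("/"))
```

With this sketch, `grpc.oasis.io:443` parses as a remote target with no pinned key, and `/node/run/internal.sock` parses as a local socket.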
Oasis Core 25.5 has brought support for stateless client nodes, which are particularly suitable for ROFL nodes.
We are documenting that here.