Expose additional statistics about validators in gaia-light #1506

Closed
jackzampolin opened this issue Jul 2, 2018 · 11 comments

@jackzampolin
Member

When building explorers, there is a bunch of additional data that Tendermint core tracks that would be useful to expose to clients. Currently, if you want to track uptime for validators, you need to parse through whole chain and store data for each block. This is how figment.network provides that extra data: they have built a separate database that tracks it.

It would be nice if Tendermint, gaiad or gaia-light exposed this data to third-party clients. It would make explorers significantly easier to build and let them offer more features. Ideally we want to expose this in ICS0 (TendermintAPI), since this will be common across all Tendermint-based chains, whether PoS or PoA.

cc @nylira @adrianbrink @ebuchman

@rfunduk

rfunduk commented Jul 3, 2018

Hey! Ryan from Figment here 👋

We currently use the RPC server (plus /validators/stake from gaia-light, to get the moniker and other metadata) to process every block in the chain. We do this every minute (simply because cron's finest granularity is 1 minute; I think we'd like to do it more frequently in the future) to bring our local height up to the node's height.

For reference on:

you need to parse through whole chain and store data for each block

We don't necessarily need to store every block locally, and what we do store is very light compared to what is in the RPC data. We just need very recent blocks so that we can do things like build hourly uptime snapshots. The rest of what we calculate happens right after syncing, so we could easily discard the blocks immediately. We could even architect it so that we don't store them at all and simply query them from RPC as needed. There is a 20-block page limit, though, so this is kinda cumbersome. It sure would be nice to be able to configure this somehow, so that the local node for this application could allow an unlimited/huge number of blocks per call.
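To make the paging concrete, here is a minimal sketch (not our actual code) of splitting a height range into pages of at most 20 blocks. The function name `height_pages` is made up for illustration, and feeding each pair to an RPC call as its minimum and maximum height is an assumption about how a client would use it.

```python
from typing import Iterator, Tuple

PAGE_LIMIT = 20  # the per-call block limit mentioned above


def height_pages(first: int, last: int,
                 page_size: int = PAGE_LIMIT) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (min_height, max_height) pairs covering first..last."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    start = first
    while start <= last:
        # Clamp the final page so it never runs past the requested range.
        end = min(start + page_size - 1, last)
        yield (start, end)
        start = end + 1
```

Each yielded pair would become one request, so catching up over 45 blocks costs three calls instead of one.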

This issue seems to mostly be about validator statistics, and so here's a rough list of high level stuff we want to know (and currently calculate via block/commit/validators RPC endpoints):

  • Every change to validator voting power
  • Every time a validator enters or exits the active set
  • How many blocks were produced between two timestamps, and how many of those a given validator precommitted on (we store these separately so we can make hourly and daily averages)
  • Validator uptime for last N blocks (like above, # precommitted on out of N to get 'current' snapshot)
  • If a validator has not precommitted on N of the last M blocks (rolling window of M blocks, with 'latch' so that you have to exit this state before it can be triggered again)
  • If a validator has not precommitted on N consecutive blocks (again with a 'latch')
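As a rough illustration of the rolling-window items in the list above, here is a minimal sketch of "missed N of the last M blocks" with a latch, plus the uptime snapshot over the same window. The class and method names are hypothetical, and it assumes you already know, block by block, whether the validator precommitted.

```python
from collections import deque
from typing import Optional


class MissedBlockAlert:
    """Tracks one validator over a rolling window of the last `window` blocks."""

    def __init__(self, window: int, threshold: int) -> None:
        self.window = deque(maxlen=window)  # True = precommitted on that block
        self.threshold = threshold          # misses needed to trigger the alert
        self.latched = False                # set while in the alert state

    def record(self, precommitted: bool) -> bool:
        """Record one block; return True only when the alert newly fires."""
        self.window.append(precommitted)
        missed = sum(1 for signed in self.window if not signed)
        if missed >= self.threshold:
            if self.latched:
                return False  # still in the alert state, do not fire again
            self.latched = True
            return True
        self.latched = False  # dropped below the threshold, so re-arm
        return False

    def uptime(self) -> Optional[float]:
        """Fraction of blocks in the window that were precommitted on."""
        if not self.window:
            return None
        return sum(self.window) / len(self.window)
```

The latch is what keeps a validator that stays down from producing an alert on every new block: it fires once, then has to recover before it can fire again.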

I'm not sure how gaia-light can provide some of this in a real-time way that would be useful. It's not really appealing to be hammering the API every second to try and keep up with voting power changes. We experimented with the consensus_state endpoint on RPC, for instance, but it's just too churny. We also want to be able to get historical info. Essentially, if we can't recreate our database from scratch using API data, then that's a big concern for us.

With the block-by-block approach we just consider all the new blocks and record/do the right things, then wait a bit and do it again for new blocks. If we mess up, we just reprocess those blocks. It feels pretty elegant. With pruning of old blocks we no longer need, it doesn't feel clunky at all.
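The loop described above can be sketched roughly like this. It is a simplified illustration rather than our real sync code: `fetch_signers` is a hypothetical callback standing in for whatever RPC calls return the addresses that precommitted at a height, and the store is just a dict.

```python
from typing import Callable, Dict, Iterable, Set

Store = Dict[int, Set[str]]  # height -> addresses that precommitted


def sync(store: Store, local_height: int, node_height: int,
         fetch_signers: Callable[[int], Iterable[str]]) -> int:
    """Process every block after local_height up to node_height."""
    for height in range(local_height + 1, node_height + 1):
        # Overwriting by height makes reprocessing a block harmless.
        store[height] = set(fetch_signers(height))
    return max(local_height, node_height)


def prune(store: Store, keep_last: int, tip: int) -> None:
    """Drop blocks older than the most recent keep_last heights."""
    for height in [h for h in store if h <= tip - keep_last]:
        del store[height]
```

Because each height is written idempotently, recovering from a mistake is just a matter of calling `sync` again with a lower `local_height`.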

What would be helpful for us to build more would be:

  • info about slashing of validators
  • info about specific individual bondings/unbondings
  • more details about voting (prevote/precommit/etc)
  • uniform addresses/public keys in gaia-light vs RPC so we can more easily map these data sources together

I hope this is helpful feedback on this issue. Let me know here or ping me on Riot if I can elaborate or clarify anything.

@faboweb
Contributor

faboweb commented Jul 11, 2018

Brain dump:

  • Slashings are transactions that will be indexed by nodes and can be queried (in progress).
  • Bondings/unbondings will be queryable transactions and are already on the latest develop (I disagree with the endpoint structure and hope it will still change).
  • prevote/precommit/etc: We can provide this data in the light client. (How about creating an individual issue to keep the scope small?)
  • Uniform addresses: The goal is that external integrators use only the gaia light client as an interface to the network. The gaia light client should only show bech32-encoded addresses for safety. We are in the conversion phase for this change. If you find places where addresses delivered by the light client are not encoded in bech32, I encourage you to create an issue to change this. For now you can use bech32 libraries (like https://github.com/bitcoinjs/bech32, used here: https://github.com/cosmos/voyager/blob/develop/app/src/renderer/scripts/b32.js) to convert the addresses to hex addresses.
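For anyone who wants to see what that conversion involves without pulling in a library, here is a minimal Python sketch of bech32 encoding and decoding as I understand it from BIP-173. It is for illustration only and is not the code Voyager uses: the function names are made up, the prefix is simply whatever you pass in, and unlike the reference implementation it lowercases the input instead of rejecting mixed case.

```python
from typing import List, Tuple

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def _polymod(values: List[int]) -> int:
    """Bech32 checksum polynomial over 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def _hrp_expand(hrp: str) -> List[int]:
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def _convert_bits(data, from_bits: int, to_bits: int, pad: bool) -> List[int]:
    """Regroup a sequence of from_bits-wide ints into to_bits-wide ints."""
    acc, bits, out = 0, 0, []
    max_value = (1 << to_bits) - 1
    for value in data:
        acc = (acc << from_bits) | value
        bits += from_bits
        while bits >= to_bits:
            bits -= to_bits
            out.append((acc >> bits) & max_value)
    if pad and bits:
        out.append((acc << (to_bits - bits)) & max_value)
    elif not pad and (bits >= from_bits or ((acc << (to_bits - bits)) & max_value)):
        raise ValueError("invalid padding")
    return out


def hex_to_bech32(hrp: str, hex_address: str) -> str:
    """Encode a hex address under the given human-readable prefix."""
    data = _convert_bits(bytes.fromhex(hex_address), 8, 5, pad=True)
    polymod = _polymod(_hrp_expand(hrp) + data + [0] * 6) ^ 1
    checksum = [(polymod >> (5 * (5 - i))) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)


def bech32_to_hex(address: str) -> Tuple[str, str]:
    """Return (prefix, hex address); raise ValueError if the string is invalid."""
    address = address.lower()
    pos = address.rfind("1")
    if pos < 1 or pos + 7 > len(address):
        raise ValueError("malformed bech32 string")
    hrp = address[:pos]
    data = [CHARSET.index(c) for c in address[pos + 1:]]  # ValueError on bad chars
    if _polymod(_hrp_expand(hrp) + data) != 1:
        raise ValueError("bad checksum")
    return hrp, bytes(_convert_bits(data[:-6], 5, 8, pad=False)).hex()
```

In practice a maintained library is still the better choice. The point of the sketch is that the hex form and the bech32 form carry the same bytes, so the two data sources can be joined on either one.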

fedekunze added the lcd label Aug 21, 2018
@fedekunze
Collaborator

Hi @rfunduk, we are in the process of updating our Gaia-lite API on the SDK side (see #2113). Do you think we should include anything else?

@rfunduk

rfunduk commented Sep 6, 2018

Seems pretty good to me. One thing I'm not sure I see here is info on validators being revoked/unrevoked (unjail?). That would be useful for us. We don't handle it at all right now, and as I understand it, we would have to more or less guess that a validator was revoked from its voting power going to 0.
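To show what that guessing looks like, here is a minimal sketch that diffs the validator set at two consecutive heights. The function name and the address-to-voting-power dict shape are assumptions for illustration, not an existing API.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str, int, int]  # (kind, address, power_before, power_after)


def diff_validator_sets(prev: Dict[str, int], curr: Dict[str, int]) -> List[Event]:
    """Compare address -> voting power maps taken at two heights."""
    events: List[Event] = []
    for address in sorted(set(prev) | set(curr)):
        before = prev.get(address, 0)  # absent from the set counts as power 0
        after = curr.get(address, 0)
        if before == after:
            continue
        if before == 0:
            kind = "entered"
        elif after == 0:
            kind = "left"  # revoked? unbonded? the diff alone cannot tell
        else:
            kind = "power_changed"
        events.append((kind, address, before, after))
    return events
```

A "left" event only says the power went to 0. It cannot distinguish a revocation from any other reason for dropping out of the set, which is why explicit revoke/unrevoke info would help.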

@alexanderbez
Contributor

@rfunduk you can use the stake/validators/{address} endpoint to get a specific validator. That result will have a Jailed boolean attribute. Does this help you?

@rfunduk

rfunduk commented Sep 6, 2018

Unfortunately no, I don't think so. I don't think that endpoint takes a block height? We want our system to be resilient to downtime, missed syncs, etc., so we effectively need our database to be recreatable from scratch at any time using RPC/LCD.

@alexanderbez
Contributor

I see. I believe the CLI allows for this. I guess the LCD endpoints should be more flexible in their query functionality.

@rfunduk

rfunduk commented Sep 6, 2018

Yeah, we can't call the CLI because the node isn't even on the same instance (we sync various chains/networks, and our web-related servers are separate).

@jackzampolin
Member Author

cc @fedekunze, any thoughts on adding this information to other endpoints?

@fedekunze
Collaborator

@rfunduk thanks for the feedback. There's actually an open issue for that (#2202). I will sync with @jackzampolin to increase the priority of that issue once we merge #2249.

@fedekunze
Collaborator

Moving what's left to #2202. Please open a separate issue if someone feels that we're missing something


6 participants