Skip to content
This repository has been archived by the owner on Apr 15, 2024. It is now read-only.

feat: gossiping the latest valset in P2P layer in case they get pruned #560

Merged
merged 28 commits into from
Oct 27, 2023

Conversation

rach-id
Copy link
Member

@rach-id rach-id commented Oct 25, 2023

Overview

Closes #556

Checklist

  • New and updated code has appropriate documentation
  • New and updated code has new and/or updated testing
  • Required CI checks are passing
  • Visual proof for any user facing features like CLI or documentation updates
  • Linked issues closed with keywords

@rach-id rach-id added p2p p2p network related orchestrator orchestrator related relayer relayer related labels Oct 25, 2023
@rach-id rach-id self-assigned this Oct 25, 2023
@rach-id rach-id requested a review from evan-forbes as a code owner October 25, 2023 23:34
p2p/dht.go Outdated Show resolved Hide resolved
p2p/dht.go Outdated Show resolved Hide resolved
p2p/dht.go Outdated Show resolved Hide resolved
p2p/dht.go Outdated Show resolved Hide resolved
p2p/dht.go Outdated Show resolved Hide resolved
p2p/validators.go Outdated Show resolved Hide resolved
p2p/validators_test.go Outdated Show resolved Hide resolved
p2p/validators_test.go Outdated Show resolved Hide resolved
p2p/validators_test.go Outdated Show resolved Hide resolved
types/latest_valset.go Outdated Show resolved Hide resolved
@@ -3,6 +3,8 @@ package orchestrator
import (
"context"

types2 "github.com/celestiaorg/celestia-app/x/qgb/types"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the standard thing is naming this qgbtypes instead of types2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely not blocking, just for future reference, celestiatypes is better, but imo we should be specific with our imports to improve readability. for instance, what happens if we need to import "github.com/celestiaorg/celestia-app/x/blob/types" or "github.com/celestiaorg/celestia-core/types" as well? personally I'm not a fan of naming everything types, but the sdk and tendermint don't agree lol

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concerning this, what do you think of making it a rule in CI? I looked for existing tools that can do this but found none, so maybe we can do a simple bash script that takes all the imports in the project, filter those we want to have the same name across all repos, then check whether the right name is used or not.
I don't think it's worth it tbh, it would just make it harder to commit code, but if there is a good reason for it, I'm in

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I'm 100% down for adding CI to look at import names. For packages with types I think we could define a linter that is generic enough to be useful. Not sure about other package names that frequently require renaming

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p2p/dht.go Outdated
Comment on lines 181 to 194
// GetLatestValset looks for the latest valset in the DHT.
// The key will be returned by the `GetValsetKey` method.
// Returns an error if it fails.
func (q BlobstreamDHT) GetLatestValset(ctx context.Context) (types2.Valset, error) {
encoded, err := q.GetValue(ctx, GetLatestValsetKey()) // this is a blocking call, we should probably use timeout and channel
if err != nil {
return types2.Valset{}, err
}
valset, err := types.UnmarshalValset(encoded)
if err != nil {
return types2.Valset{}, err
}
return valset, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc, you mentioned that the the main concern with this approach is that we aren't verifying the valdiator set that is gossipped. Would that be possible though (in a futuer PR ofc)? We should be able to query the current validator set, and from that we should be able to measure the diff in power between the one we recieve that this one afaiu.

does that make sense or am I missing something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we could, but I didn't want to add that check because it will be checked implicitly when sending the update to Blobstream. So, no need to do it explicitly here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mind pointing out that check really quickly for my own understanding? thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but is it possible to gossip and invalid value? also, isn't the issue for folks deploying a new contract? would that check still be hit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, an invalid value can be gossiped, but will be rejected by the contract when relaying events.

For the deployment part, I didn't implement the ability to use the P2P valset to deploy because it's super risky and there is no way to trust it.

types/latest_valset_test.go Outdated Show resolved Hide resolved
@evan-forbes
Copy link
Member

only some minor comments, but otherwise LGTM

@rach-id
Copy link
Member Author

rach-id commented Oct 26, 2023

Actually, I am facing an issue with the serialization of date to json. If you check the CI, you will see the tests failing, however, they're not failing locally. But when I use a different environment, they still fail expecting different time. So, I am thinking of actually omitting the time during serialization since it's not needed to create the hash or calculate the threshold etc.

@rach-id
Copy link
Member Author

rach-id commented Oct 26, 2023

@evan-forbes I did 755ff25 to fix the issue of serialization via creating a struct that doesn't have the time field which was troublesome.

The time field is only used for pruning in the state machine. However, it's not part of the hash nor the threshold calculations. So, it made sense to remove it instead of dealing with serialization issues.

What do you think?

evan-forbes
evan-forbes previously approved these changes Oct 27, 2023
Copy link
Member

@evan-forbes evan-forbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still thinking about the extra gossiping check. I think we might need something like it (see that thread), but that shouldn't block this PR

@evan-forbes
Copy link
Member

the getting rid of the time makes sense.

in the past with serializing time that needed to be kept, I've often had to overwrite marshaling methods to fix that one annoying type.

@rach-id
Copy link
Member Author

rach-id commented Oct 27, 2023

I've often had to overwrite marshaling methods to fix that one annoying type.

Can you link me to some code that does this? I'm curious

@rach-id rach-id mentioned this pull request Oct 27, 2023
@rach-id
Copy link
Member Author

rach-id commented Oct 27, 2023

Added the changes discussed async. Will wait until we test this using an archive node to merge.

@codecov-commenter
Copy link

Codecov Report

Merging #560 (e5a05a5) into main (77134ee) will increase coverage by 0.56%.
Report is 4 commits behind head on main.
The diff coverage is 37.12%.

@@            Coverage Diff             @@
##             main     #560      +/-   ##
==========================================
+ Coverage   34.23%   34.80%   +0.56%     
==========================================
  Files          26       27       +1     
  Lines        2100     2221     +121     
==========================================
+ Hits          719      773      +54     
- Misses       1295     1354      +59     
- Partials       86       94       +8     
Files Coverage Δ
p2p/keys.go 83.33% <100.00%> (+0.98%) ⬆️
p2p/validators.go 86.25% <100.00%> (+4.25%) ⬆️
orchestrator/broadcaster.go 82.35% <40.00%> (-17.65%) ⬇️
evm/evm_client.go 50.31% <0.00%> (-1.95%) ⬇️
p2p/querier.go 72.85% <0.00%> (-2.15%) ⬇️
rpc/app_querier.go 0.00% <0.00%> (ø)
p2p/dht.go 65.78% <62.50%> (-6.26%) ⬇️
orchestrator/orchestrator.go 2.04% <0.00%> (+0.12%) ⬆️
types/latest_valset.go 28.94% <28.94%> (ø)
relayer/relayer.go 0.00% <0.00%> (ø)

@rach-id
Copy link
Member Author

rach-id commented Oct 27, 2023

I will merge this and test it once we have an archive node ready for testing. And if something is broken, we can fix it in a separate PR.

@rach-id rach-id merged commit 791453d into celestiaorg:main Oct 27, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
orchestrator orchestrator related p2p p2p network related relayer relayer related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deploying the Blobstream contract if no valset is in store
3 participants