Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: Fork Detector + Evidence (proof of fork) for discussion. #479

Merged
merged 27 commits into from
Aug 7, 2020

Conversation

josef-widder
Copy link
Member

@josef-widder josef-widder commented Jul 27, 2020

This document (rendered) can be a basis for follow-up discussions contains the following sections:

  • the outcome of recent discussion
  • a sketch of the light client supervisor to provide the context in
    which fork detection happens
  • a discussion about lightstore semantics
  • a draft of the light node fork detection including "proof of fork"
    definition, that is, the data structure to submit evidence to full
    nodes.

Addresses point 2. in #461.
Defines "proof of fork" as required by tendermint/tendermint#5149

@OStevan OStevan mentioned this pull request Jul 28, 2020
5 tasks
@cmwaters
Copy link

Hey josef, I just skimmed through the document yesterday. I think the biggest question I have outstanding is in regards to the the PoF with the lightblock traces. I feel there's no much mentioned of the reasoning behind how the PoF actually proves that there is a fork (this might be discussed in more depth in a document that I haven't yet seen) or more specifically what I feel is the second half of the problem which is what the full node does with the PoF, how it verifies it and extracts out the respective evidence that it will need to commit on the chain (assuming that is the ultimate goal of fork accountability). I thought I'd just drop my thoughts in this PR now so that we can discuss this for our call later today.

@josef-widder
Copy link
Member Author

Thanks, @cmwaters for looking at the document! Indeed, this document does not capture the questions you raised. The plan is that that together with @milosevic we will address these questions in a different specification. We already created the following issue for that tendermint/tendermint#5149.

Copy link
Member

@ebuchman ebuchman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just initial comments from starting review. Will keep working through it.

> to the secondary. However, assuming that in this scenario the
> primary is faulty it may not respond.

As the main goal is to catch misbehavior of the primary,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be indented?

Also why do we say main goal is to catch the primary? Don't we want to catch any of them, even if primary is correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to catch all of them. I guess this remark came out of discussion with @milosevic that also underly the comments in #494: we may use whatever is verified, and roll back if we discover a problem. In this view, the attack on the light client we have to worry about is done by the primary

docs/spec/lightclient/detection/detection.md Outdated Show resolved Hide resolved
different from the one the full node knows (we do not send the trace
the primary gave us back to the primary)

- The light client attack is via the primary. Thus we try to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confusing. the primary may be correct and the attack may be coming from one or many of the secondaries

Copy link
Member Author

@josef-widder josef-widder Aug 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said above, it may only lead to a safety issue, if it comes from the primary, and we gave the verified header to the user. If the verified header that comes from the primary is OK, the secondary may just force the lightclient to shut down and report evidence, but it will not make the light client have a bad verified block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key property detection should ensure is the following: if a bad light block is "consumed" (either verified in the light node case or installed on chain in IBC case), it will be detected within bounded time. In case of the light node, this is ensured by checking verified block against witnesses, while in IBC case it done by a correct relayer checking every installed header. This ensures that exposed misbehaviour is always slashable which incentivise correct behaviour. In the scenario with correct primary and fork coming from the secondary, we will as Josef pointed out, also submit fork and shut down, but this scenario is less severe as "consumed" block is from the correct branch.

it with a flag "trusted" that can be used
- internally for efficiency reasons (to maintain
[LCD-INV-TRUSTED-AGREED.1] until a fork is detected)
- by light client based on the "one correct full node" assumption
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also comment in #494 (comment) about introducing a local time parameter which the client waits before trusting (where during this time its cross referencing the header with existing and new peers, assuming a more dynamic peer set and peer exchange)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea a lot. In some sense, it makes the requirements about the trusting and unbonding periods more concrete. The unbonding period must contain the following times:

  • download and verify a block
  • cross check a block (= the time parameter you are suggesting)
  • compute and report proof of fork
  • process proof of fork, and punish misbehavior.

Right now, all this is somewhat hidden in the assumption that the trusting period << unbonding period.

Copy link
Contributor

@milosevic milosevic Aug 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this time parameter should be part of the model and API. In that sense it does not make StateTrusted obsolete: we just need to change its semantic. The right semantic could be the following: light block is in StateTrusted if it was StateVerified and Fork_Detection_Timeout has passed without finding a valid fork. Note that one possible implementation of this is what we currently have with always assuming correct node in the peer list as it provides bounded detection time. Other possible implementation would involve dynamic peer set, and assumptions on periodic presence of the correct nodes in the peer set. For efficiency reasons we could imagine also probabilistic variants where not every light block will be checked but only the random one.

> [LCV-DATA-POF.1] mirrors the definition [TMBC-SIGN-FORK.1]:
> *TrustedBlock* corresponds to *a*, and *PrimaryTrace* and *SecondaryTrace*
> are traces to two blocks *b* and *c*. The traces establish that both
> *skip-root(a,b,t)* and *skip-root(a,c,t)* are satisfied.
Copy link

@cmwaters cmwaters Aug 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to enforce that the heights used in verification via skipping must be the same for both skip-root(a,b,t) and skip-root(a,c,t)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Just the blocks b and c need to be the same height. The other blocks on the traces need not be at the same height. Due to the dynamic changes in the validator set, in general skipping via the same heights might not work on different branches.

Actually, if there would be earlier two conflicting blocks with the same height on two different traces, we could use those earlier blocks as proof of fork. Therefore, attackers will most likely try to avoid earlier conflicting block.

@@ -169,7 +357,7 @@ false* then we have a *fork on the chain*.
> fork.


#### **[TMBC-LC-FORK]**
#### **[TMBC-LC-FORK.1]**
Let *a*, *b*, *c*, be light blocks and *t* a time. We define
*light-client-fork(a,b,c,t)* iff
- *sign-skip-match(a,b,c,t) = false* and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the condition should be slightly different: sign-skip-match(a,b,c,t) = true, sequ-rooted(b), b is unique and b != c.

@josef-widder
Copy link
Member Author

I have added a file (rendered) with sketches of functions for fork detection in the IBC context. They contain suggestions to updates to the on-chain IBC component, and for the relayer.

@xla xla changed the title Fork Detector + Evidence (proof of fork) for discussion. spec: Fork Detector + Evidence (proof of fork) for discussion. Aug 5, 2020
@xla xla added enhancement New feature or request spec Specifications labels Aug 5, 2020

> The above requires us that before we pick a new secondary, we have to
> query the secondary for the header of height
> *lightStore.LatestTrusted().Height*.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this doesn't ensure that trusted store of the secondary is consistent with the primary, i.e., only the latest trusted light block must be consistent. Could this potentially be problematic if secondary (with invalid store except latest trusted light block) is later promoted to a primary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is subtle. First, this current supervisor is simplified as it considers only going forward (VerifyToTarget only goes forward). In this case, it is not a problem.

If we want the supervisor to be more general it need to use Main instead of VerifyToTarget. Then we would need to query the (new) primary for an older trusted block as reference. This older trusted block then also needs to be passed into Forkdetection (which also assumes going forward). This is less critical as we discuss to anyway only pass in one block in the future. @OStevan has prepared a Rust code snippet and will create a PR on data structures and what information to pass around.

generated during detection.

```go
type PoFStore struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need this; using array of LightNodeProofOfFork should be sufficient.

```
- Expected postcondition
- returns a lightstore that contains all lightblocks *b* from *ls*
that satisfy: *from < b.Header.Height <= to*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need returned light blocks to satisfy that every next block is verified from the previous one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, actually subtrace should be replaced by something more clever, as you suggest. The problem with consequent blocks not supporting each other, occurred to me only later, and I didn't want to change anything before we agree on (that there is a problem) and the solution.

**TODO:** formalize conditions
- Implementation remark
- the primary is replaced by a secondary, and lightblocks above
trusted blocks are removed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe those blocks just need to go through fork detection procedure instead of removing them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that. It will just make the control flow a bit more complicated.

result := VerifyToTarget(s, auxLS, sh.Header.Height)
if result = (_,ResultSuccess) || (_,EXPIRED) {
LS,result := VerifyToTarget(secondary, auxLS, sh.Header.Height)
if (result = ResultSuccess || result = EXPIRED) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case result = EXPIRED that means that secondary is not responsive. But as this does not allow us to verify if conflicting header is a fork, we shouldn't report it. Note that this is not a problem as the important scenario is faulty-primary, correct secondary, in which case we will not have timeout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, EXPIRED referred to some older version of verification.md. The Error code does not exist anymore, and we should remove it and just check for ResultSuccess

milosevic
milosevic previously approved these changes Aug 7, 2020
@josef-widder josef-widder merged commit 0657d27 into master Aug 7, 2020
@josef-widder josef-widder deleted the josef/light-evidence branch August 7, 2020 14:04
@konnov
Copy link
Contributor

konnov commented Aug 7, 2020

Dr. Widder held a gun against my head and asked me to approve it %)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request spec Specifications
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants