Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[breacharbiter] Split justice tx in case of delayed confirmation. #5036

Merged

Conversation

halseth
Copy link
Contributor

@halseth halseth commented Feb 17, 2021

This PR attempts to solve the theoretical pinning issue that can occur for anchor type channels in the breach case: lightning/bolts#803

If 4 blocks pass without the justie tx confirming (it is created using a confirmation target of 2), we will attempt to sweep the commitment outputs separately from the HTLC outputs.

TODO

  • Additional test

@halseth halseth added safety General label for issues/PRs related to the safety of using the software anchors v0.13 labels Feb 17, 2021
@halseth halseth added this to the 0.13.0 milestone Feb 17, 2021
@Roasbeef Roasbeef added the P1 MUST be fixed or reviewed label Feb 17, 2021
@Roasbeef Roasbeef self-requested a review February 17, 2021 19:37
breacharbiter.go Show resolved Hide resolved
breacharbiter.go Show resolved Hide resolved
breacharbiter.go Show resolved Hide resolved
breacharbiter.go Outdated Show resolved Hide resolved
breacharbiter.go Show resolved Hide resolved
breacharbiter.go Outdated Show resolved Hide resolved
breacharbiter.go Outdated Show resolved Hide resolved
breacharbiter.go Show resolved Hide resolved
breacharbiter.go Show resolved Hide resolved
@halseth halseth force-pushed the breacharbiter-justice-splitting branch from de91e70 to 3990efa Compare March 3, 2021 12:21
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on this! The high level algo is much clearer from my PoV after another pass.

I definitely think we should bolster our test coverage to exercise these cases just to ensure we're not missing an edge case. It also might be helpful to document the state/transaction tree structure if something exceptional does happen.

With the current param of a delay of 4 blocks, I'd expect this to in practice be the default way lnd handles breaches. We end up spending more on chain fees, but could say the extra assurance that no funny businesses is happening may be worth it....

breacharbiter.go Show resolved Hide resolved
breacharbiter.go Outdated
for _, inp := range breachInfo.breachedOutputs {
// If the spent outpoint is not among the ouputs that
// were breached , we can ignore it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extra space before the comma.

breacharbiter.go Show resolved Hide resolved
Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solid changes! only thing i see with this PR is that it may never actually broadcast the split justice transaction if lnd doesn't stay up long enough to observe 4 blocks

breacharbiter.go Outdated Show resolved Hide resolved
breacharbiter.go Show resolved Hide resolved
breacharbiter.go Outdated Show resolved Hide resolved
breacharbiter.go Outdated
// was published with a 2-block fee estimate, so it's
// not unexpected that four blocks without confirmation
// can pass.
if blocksPassed < 4 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that we may never pass this check on mobile, since lnd needs to observe 4 block consecutively without restarting. not sure what the best way to handle that is, is there a way we can load the broadcast height on startup?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah this is a good point...since we no longer save the finalized transaction, we can't use the locktime there to mark our last broadcast height. We do have the CloseHeight as part of the close summary though, which tracks the height the breaching transaction was confirmed. I think that should work as a replacement?

@Roasbeef
Copy link
Member

Tests still failing (to compile):

Error: ./breacharbiter_test.go:416:15: frs.rs.Finalize undefined (type RetributionStore has no field or method Finalize)
Error: ./breacharbiter_test.go:425:2: not enough arguments to return
Error: ./breacharbiter_test.go:425:15: frs.rs.GetFinalizedTxn undefined (type RetributionStore has no field or method GetFinalizedTxn)

@halseth halseth force-pushed the breacharbiter-justice-splitting branch 3 times, most recently from 8b1bc9d to 9818b6e Compare April 20, 2021 19:39
breacharbiter.go Outdated Show resolved Hide resolved
@halseth halseth force-pushed the breacharbiter-justice-splitting branch from 9818b6e to c2e9a7d Compare April 21, 2021 11:55
@Roasbeef
Copy link
Member

Do we need #5229 if the first two commits are now included in this diff?

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice iteration between the current and prior diff! Need to do one additional round on my end to review the diff as a hole to fully grok the new possible state transitinos in the main brar handing sate machine.

}

log.Infof("Breached channel=%v marked pending-closed",
c.cfg.chanState.FundingOutpoint)
// Hand the retribution info over to the breach arbiter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah this makes total sense in retrospect: we always need to ensure that we've persisted any state that might affect the downstream consumer of our message before we send it.

breacharbiter_test.go Outdated Show resolved Hide resolved
signDesc.DoubleTweak.PubKey(),
).SerializeCompressed()

// If the HTLC output was spent using the revocation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen otherwise, we'd have some empty iterations in the sweeper until it removed the input due to hitting the failed sweep counter ceiling?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I now from below that this will be a new case since we'll at times just sweep the HTLCs in isolation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier we were only calling this if we detected a spend that wasn't ours, now we use it for our own spends as well, which is why we need this.

breacharbiter.go Outdated Show resolved Hide resolved
// published with a 2-block fee estimate, so it's not
// unexpected that four blocks without confirmation can
// pass.
splitHeight := breachInfo.breachHeight +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// smaller variants of the justice tx. We do this to mitigate an attack
// the channel peer can do by pinning the HTLC outputs of the
// commitment with low-fee HTLC transactions.
blocksPassedSplitPublish = 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels hard to properly tune these values in a way that can handle sustained fee bursts like we're seeing now, but nevertheless, this is a start to patch an unlikely, but possible attack vector.

@@ -1798,6 +1798,190 @@ func testBreachSpends(t *testing.T, test breachTest) {
assertBrarCleanup(t, brar, alice.ChanPoint, alice.State().Db)
}

// TestBreachDelayedJusticeConfirmation tests that the breach arbiter will
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@halseth halseth force-pushed the breacharbiter-justice-splitting branch 2 times, most recently from 597d415 to a0d7fae Compare April 26, 2021 13:33
@halseth
Copy link
Contributor Author

halseth commented Apr 26, 2021

Do we need #5229 if the first two commits are now included in this diff?

Not really, I just separated it in case we wanted to merge it in isolation, as it is really an existing issue.

@halseth
Copy link
Contributor Author

halseth commented Apr 26, 2021

Added an itest in a separate PR, since we depend on a change to btcd to drive the test: #5242

PTAL

@halseth halseth requested a review from Roasbeef April 26, 2021 13:39
@halseth
Copy link
Contributor Author

halseth commented May 5, 2021

ping @cfromknecht @Roasbeef

@Roasbeef
Copy link
Member

Roasbeef commented May 6, 2021

Needs a rebase w/ the first two commits merged in!

Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did another pass, looks pretty good. Only outstanding question was this one re: mobile #5036 (comment)

Any ideas of how to handle this?

if len(txIn.Witness) == 3 &&
bytes.Equal(txIn.Witness[1], revokeKey) {
break
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this section about use the IsHtlcSpendRevoke method created in the last commit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i see, the commits are out of order on GH it seems


if err == lnwallet.ErrDoubleSpend {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still want to silence the log message if the error is ErrDoubleSpend?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still want to log it? (we also logged any error before, even double spend)

brarLog.Infof("Attempting another justice tx "+
"with %d inputs",
len(breachInfo.breachedOutputs))

wg.Wait()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason we wait for this towards the end? or can it also be done as soon as the channel fires?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's because for the case where we break the loop, we go straight into wg.Wait(). In this particular case we are using the goto, so we add the wait before we go back to launch a new go routine.

breacharbiter_test.go Outdated Show resolved Hide resolved
breacharbiter_test.go Outdated Show resolved Hide resolved
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💿

breacharbiter.go Outdated Show resolved Hide resolved
breacharbiter.go Show resolved Hide resolved
@Roasbeef
Copy link
Member

Roasbeef commented May 7, 2021

Did another pass, looks pretty good. Only outstanding question was this one re: mobile #5036 (comment)

The latest diff now uses the stored breach height compared to the current block height, and proceed w/ the new logic if that delta exceeds our target time window.

@halseth halseth force-pushed the breacharbiter-justice-splitting branch 2 times, most recently from 86cf235 to 303ce6d Compare May 7, 2021 10:17
@halseth
Copy link
Contributor Author

halseth commented May 7, 2021

Did another pass, looks pretty good. Only outstanding question was this one re: mobile #5036 (comment)

The latest diff now uses the stored breach height compared to the current block height, and proceed w/ the new logic if that delta exceeds our target time window.

Yeah, exactly.

Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🏆

halseth added 12 commits May 12, 2021 12:32
We split the method waitForSpendEvent into two, such that we can reuse
it in case the commitment is spent by various transactions.
inputs

Since we want to test more complex combinations of spends of the
breached outputs, we use two maps tracking

1. which transaction will spend the outpoint
2. which outpoints we expect the breacharbiter to include in the justice
   tx

This let us trigger spends of the individual outputs, and depending on
what we want to test check whether the breacharbiter sweeps the expected
outpoints.
Since we want to potentially broadcast multiple versions of the justice
TX, instead of waiting for confirmation of a specific TXID, we instead
wait for the breached outputs to be spent.
Now that we don't rely on the justice tx TXID anymore, we can remove
finalization of it. Instead we'll recreate the transaction when needed
from the persisted retribution info.
We define a new struct justiceTxVariants, which holds three different
justice transactions:

1. The "normal" justice tx that spends all breached outputs
2. A tx that spends only the breached to_local output and to_remote output
   (can be nil if none of these exist)
3. A tx that spends all the breached HTLC outputs (can be nil if no HTLC
   outputs exist)

This will later be used to sweep the time sensitive outputs separately,
in case the normal justice tx doesn't confirm in time.
confirming

In case 4 block passes without our justice tx confirming, we'll "split"
it up, and separately sweep the commitment outs, and HTLC outs.
Since we also must count revoked funds swept from second level revoked
outputs, we move the funds counting into the updateBreachInfo method,
where we already are checking whether the spend is by us or the remote.

We also clean up the logs a bit, to log the incremental sweep of funds
that now can happen.
@halseth halseth force-pushed the breacharbiter-justice-splitting branch from 303ce6d to 02268b8 Compare May 12, 2021 10:32
@Roasbeef Roasbeef merged commit c0acdd8 into lightningnetwork:master May 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
anchors P1 MUST be fixed or reviewed safety General label for issues/PRs related to the safety of using the software v0.13
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants