-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[breacharbiter] Split justice tx in case of delayed confirmation. #5036
[breacharbiter] Split justice tx in case of delayed confirmation. #5036
Conversation
de91e70
to
3990efa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work on this! The high level algo is much clearer from my PoV after another pass.
I definitely think we should bolster our test coverage to exercise these cases just to ensure we're not missing an edge case. It also might be helpful to document the state/transaction tree structure if something exceptional does happen.
With the current param of a delay of 4 blocks, I'd expect this to in practice be the default way lnd handles breaches. We end up spending more on chain fees, but could say the extra assurance that no funny businesses is happening may be worth it....
breacharbiter.go
Outdated
for _, inp := range breachInfo.breachedOutputs { | ||
// If the spent outpoint is not among the ouputs that | ||
// were breached , we can ignore it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: extra space before the comma.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
solid changes! only thing i see with this PR is that it may never actually broadcast the split justice transaction if lnd doesn't stay up long enough to observe 4 blocks
breacharbiter.go
Outdated
// was published with a 2-block fee estimate, so it's | ||
// not unexpected that four blocks without confirmation | ||
// can pass. | ||
if blocksPassed < 4 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that we may never pass this check on mobile, since lnd needs to observe 4 block consecutively without restarting. not sure what the best way to handle that is, is there a way we can load the broadcast height on startup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah this is a good point...since we no longer save the finalized transaction, we can't use the locktime there to mark our last broadcast height. We do have the CloseHeight
as part of the close summary though, which tracks the height the breaching transaction was confirmed. I think that should work as a replacement?
Tests still failing (to compile):
|
8b1bc9d
to
9818b6e
Compare
9818b6e
to
c2e9a7d
Compare
Do we need #5229 if the first two commits are now included in this diff? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice iteration between the current and prior diff! Need to do one additional round on my end to review the diff as a hole to fully grok the new possible state transitinos in the main brar handing sate machine.
contractcourt/chain_watcher.go
Outdated
} | ||
|
||
log.Infof("Breached channel=%v marked pending-closed", | ||
c.cfg.chanState.FundingOutpoint) | ||
// Hand the retribution info over to the breach arbiter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah this makes total sense in retrospect: we always need to ensure that we've persisted any state that might affect the downstream consumer of our message before we send it.
signDesc.DoubleTweak.PubKey(), | ||
).SerializeCompressed() | ||
|
||
// If the HTLC output was spent using the revocation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen otherwise, we'd have some empty iterations in the sweeper until it removed the input due to hitting the failed sweep counter ceiling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I now from below that this will be a new case since we'll at times just sweep the HTLCs in isolation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earlier we were only calling this if we detected a spend that wasn't ours, now we use it for our own spends as well, which is why we need this.
// published with a 2-block fee estimate, so it's not | ||
// unexpected that four blocks without confirmation can | ||
// pass. | ||
splitHeight := breachInfo.breachHeight + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
// smaller variants of the justice tx. We do this to mitigate an attack | ||
// the channel peer can do by pinning the HTLC outputs of the | ||
// commitment with low-fee HTLC transactions. | ||
blocksPassedSplitPublish = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels hard to properly tune these values in a way that can handle sustained fee bursts like we're seeing now, but nevertheless, this is a start to patch an unlikely, but possible attack vector.
@@ -1798,6 +1798,190 @@ func testBreachSpends(t *testing.T, test breachTest) { | |||
assertBrarCleanup(t, brar, alice.ChanPoint, alice.State().Db) | |||
} | |||
|
|||
// TestBreachDelayedJusticeConfirmation tests that the breach arbiter will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
597d415
to
a0d7fae
Compare
Not really, I just separated it in case we wanted to merge it in isolation, as it is really an existing issue. |
Added an itest in a separate PR, since we depend on a change to btcd to drive the test: #5242 PTAL |
ping @cfromknecht @Roasbeef |
Needs a rebase w/ the first two commits merged in! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did another pass, looks pretty good. Only outstanding question was this one re: mobile #5036 (comment)
Any ideas of how to handle this?
if len(txIn.Witness) == 3 && | ||
bytes.Equal(txIn.Witness[1], revokeKey) { | ||
break | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this section about use the IsHtlcSpendRevoke
method created in the last commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh i see, the commits are out of order on GH it seems
|
||
if err == lnwallet.ErrDoubleSpend { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still want to silence the log message if the error is ErrDoubleSpend
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we still want to log it? (we also logged any error before, even double spend)
brarLog.Infof("Attempting another justice tx "+ | ||
"with %d inputs", | ||
len(breachInfo.breachedOutputs)) | ||
|
||
wg.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason we wait for this towards the end? or can it also be done as soon as the channel fires?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's because for the case where we break the loop, we go straight into wg.Wait()
. In this particular case we are using the goto, so we add the wait before we go back to launch a new go routine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 💿
The latest diff now uses the stored breach height compared to the current block height, and proceed w/ the new logic if that delta exceeds our target time window. |
86cf235
to
303ce6d
Compare
Yeah, exactly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🏆
We split the method waitForSpendEvent into two, such that we can reuse it in case the commitment is spent by various transactions.
inputs Since we want to test more complex combinations of spends of the breached outputs, we use two maps tracking 1. which transaction will spend the outpoint 2. which outpoints we expect the breacharbiter to include in the justice tx This let us trigger spends of the individual outputs, and depending on what we want to test check whether the breacharbiter sweeps the expected outpoints.
Since we want to potentially broadcast multiple versions of the justice TX, instead of waiting for confirmation of a specific TXID, we instead wait for the breached outputs to be spent.
Now that we don't rely on the justice tx TXID anymore, we can remove finalization of it. Instead we'll recreate the transaction when needed from the persisted retribution info.
We define a new struct justiceTxVariants, which holds three different justice transactions: 1. The "normal" justice tx that spends all breached outputs 2. A tx that spends only the breached to_local output and to_remote output (can be nil if none of these exist) 3. A tx that spends all the breached HTLC outputs (can be nil if no HTLC outputs exist) This will later be used to sweep the time sensitive outputs separately, in case the normal justice tx doesn't confirm in time.
confirming In case 4 block passes without our justice tx confirming, we'll "split" it up, and separately sweep the commitment outs, and HTLC outs.
Since we also must count revoked funds swept from second level revoked outputs, we move the funds counting into the updateBreachInfo method, where we already are checking whether the spend is by us or the remote. We also clean up the logs a bit, to log the incremental sweep of funds that now can happen.
303ce6d
to
02268b8
Compare
This PR attempts to solve the theoretical pinning issue that can occur for anchor type channels in the breach case: lightning/bolts#803
If 4 blocks pass without the justie tx confirming (it is created using a confirmation target of 2), we will attempt to sweep the commitment outputs separately from the HTLC outputs.
TODO