
Enable watchtower plugins via the commitment_revocation hook #3601

Closed
cdecker wants to merge 11 commits

Conversation

@cdecker cdecker (Member) commented Mar 23, 2020

Implements the commitment_revocation hook for watchtowers. Commitments are
forwarded from openingd and channeld, and tracked in lightningd. Upon
receiving a revocation lightningd generates the penalty transaction and
calls the hook so plugins can postprocess it, e.g., encrypt it, and then push
it to the watchtower.

The result of the hook is currently not used, and we don't actually wait for
the hook call to complete before telling channeld that it can continue. This
matches the save path in the DB, since we'd simply be replaying the
commitment/revocation if we reconnect anyway. Should synchronous operation be
required we can move the channel_got_revoke_reply into the hook callback, but
for the sake of simplicity I left it where it is for the time being.

The penalty transaction is generated in lightningd, not channeld. This
requires forwarding information about the commitment transaction from
channeld to lightningd, and requires lightningd to call out to hsmd for the
signature instead of channeld. The decision to have lightningd create the
penalty transaction has the following reasons:

  • channeld may not have all the information about the previous commit,
    i.e., in case we come from openingd the first commitment is negotiated in
    openingd so we'd have to funnel that information through lightningd
    anyway.
  • lightningd will also deliver the hook call, and it can avoid building and
    signing the penalty tx if there are no subscribers, avoiding the tx
    generation and the request to hsmd altogether (not yet implemented).

Included in the commitment_revocation hook call is the fully signed penalty
transaction, as well as the commitment transaction hash the penalty
transaction is spending from, i.e., the transaction whose broadcast should
trigger the release of the penalty transaction. The commitment transaction
hash could also be extracted from the penalty transaction (it's its only input
after all), but this allows treating the penalty transaction as an opaque blob
and avoids parsing the transaction in the plugin.
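
To make that concrete, a payload along the following lines would be handed to
subscribers; the field names here are an assumption based on the description
above, not a fixed interface:

```python
# Hypothetical hook payload; field names and values are illustrative only.
example_payload = {
    # txid of the revoked commitment the penalty spends from; lets the plugin
    # use it as a trigger/locator without parsing the penalty transaction
    "commitment_txid": "<64 hex characters>",
    # fully signed penalty transaction, hex-encoded, treated as an opaque blob
    "penalty_tx": "<raw transaction hex>",
}
```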

Not included are the htlc_success and htlc_failure transactions that
may also be spending commitment_tx outputs. This is because these
transactions are much more dynamic and have a predictable timeout, allowing
wallets to ensure a quick check-in when the CLTV of the HTLC is about to
expire.

Open Questions

  • Naming: as always finding good names is hard, and I wanted to make the hook
    name descriptive for what it does, not what it can be used for, so calling
    it watchtower seemed inappropriate. I'm open to better suggestions though
    😉
  • What extra information would be good to have?
  • Would it make sense to also include htlc_success and htlc_timeout
    transactions, despite not having the full information regarding fees and
    aggregations available?
  • Since we don't use any of the specialties of hooks in this case, would it
    make sense to define a notification instead? I went with the hook for now
    mainly so we can eventually make it synchronous.

Closes #3422
Closes #1353
Closes #3645
Closes #3659

@cdecker cdecker (Member Author) commented Mar 23, 2020

This is an initial proposal to brainstorm around, and I am happy to see what else is missing from this.

Ping @ZmnSCPxj, @NicolasDorier and @sr-gi since you were involved in either #1353 or #3422.

doc/PLUGINS.md
The `commitment_revocation` hook is a chained hook, i.e., multiple plugins can
register it, and they will be called in the order they were registered in.
Plugins should always return `{"result": "continue"}`, otherwise subsequent
hook subscribers would not get called.
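
As a minimal sketch of such a subscriber (assuming the pyln-client plugin
library and the payload fields described earlier in this PR):

```python
#!/usr/bin/env python3
# Minimal sketch of a commitment_revocation subscriber. The payload field
# names are assumptions based on this PR's description, not a fixed API.
from pyln.client import Plugin

plugin = Plugin()


@plugin.hook("commitment_revocation")
def on_commitment_revocation(plugin, **payload):
    # A watchtower plugin would encrypt payload.get("penalty_tx") and hand it
    # to its tower here.
    plugin.log("penalty tx for commitment {}".format(
        payload.get("commitment_txid")))
    # Always return "continue" so subscribers later in the chain get called.
    return {"result": "continue"}


plugin.run()
```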
Collaborator:

What would the use-case be for interactivity of hooks? Is there any reason for a plugin to somehow affect state update? Would not htlc_accepted (and maybe a future ptlc_accepted) be sufficient as a hook, since we will largely be manipulating HTLC/PTLCs in the future only? It seems to me more sensible to use a notification for commitment_revocation event, and use a hook for htlc_accepted.

We can only emit this event (whether notification or hook) after the counterparty sends a revoke_and_ack to us. This is pretty much the last step in the update dance (I think we send our own revoke_and_ack even before we know of the counterparty revoke_and_ack, as otherwise it would be a chicken-and-egg problem), so I do not see what a hook can affect for this state: it would be "too late" at this point to do anything about this state, so what can a plugin affect for this state? Even if we disconnect at this point, as far as the counterparty is concerned the new state is now the latest (and it will re-send the revocation data on channel re-establish).

@ZmnSCPxj ZmnSCPxj (Collaborator) Mar 24, 2020:

Hmm.

What it could affect would be dependent processing: i.e. we do not consider any HTLCs introduced by the new state to be "irrevocably committed" until the most recent commitment_revocation hook returns. This implies adding a new state in the database: we know the revocation of a previous state, but have not yet confirmed with commitment_revocation that all the plugins have confirmed they have processed this revocation correctly, in which case on a restart we will re-invoke commitment_revocation on all plugins. It seems safer to implement this first if we are going to make this a hook anyway.

So processing goes this way:

  • We receive revoke_and_ack.
  • We set the channel to state "revocation known but commitment_revocation not yet done."
  • Send commitment_revocation down the hook chain.
  • Once all hooked plugins say {"result": "continue"}, set the state to the "new state" and handle HTLC forwardings and HTLC incoming payments as before.

In case of a hard crash while commitment_revocation is still being propagated/handled by plugin(s), then on restart we redo the commitment_revocation to all plugins, to ensure that it is handled.

I am unsure what we should do in case any hooked plugin says anything that is not {"result": "continue"}... maybe fail all incoming HTLCs on the affected channel with temporary_node_failure instead of doing any further processing?

@ZmnSCPxj ZmnSCPxj (Collaborator) Mar 24, 2020:

So, in case of a failure in commitment_revocation (i.e. it returns anything other than a non-error result {"result": "continue"}), then what we should do is to store the failed commitment_revocation on-database, then set the channel to "always fail" mode for now.

Then on future updates to the channel, we (see the sketch below):

  • We receive revoke_and_ack.
  • Append the new commitment_revocation to the list in the database.
  • For every commitment_revocation in the in-database list (in order of appending!):
    • Send commitment_revocation to the plugin hook chain.
    • If it succeeds, remove that from the in-database list and continue with the next item in the list.
    • If it fails, exit this loop and mark all live incoming HTLCs as failing with temporary_node_failure.
  • If the above loop completely succeeded, the in-database list would be empty.
  • If the above loop completely succeeded, then we can safely handle (forward or claim) any incoming HTLCs.

On startup we perform the above loop as well, and handle incoming HTLCs as normal if the loop completes.

This allows us to keep the node "alive" in some way if its watchtower dies.

If we would want to send out over that channel while its list of pending commitment_revocations is not empty, then we refuse to perform that send out and fail it immediately (temporary_node_failure).
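
A rough pseudocode sketch of the proposed loop (the names db, channel and
call_hook_chain are hypothetical stand-ins, not actual lightningd internals):

```python
# Pseudocode sketch of the replay loop proposed above; all names are
# hypothetical stand-ins, not actual lightningd internals.
def process_pending_revocations(db, channel, call_hook_chain):
    # Walk the in-database list in order of appending.
    for pending in db.pending_commitment_revocations(channel):
        result = call_hook_chain("commitment_revocation", pending)
        if result == {"result": "continue"}:
            db.remove_pending(channel, pending)
        else:
            # Any other result: stop and fail live incoming HTLCs.
            channel.fail_incoming_htlcs("temporary_node_failure")
            return False
    # List is now empty: safe to forward or claim incoming HTLCs again.
    return True
```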

cdecker (Member Author):

What would the use-case be for interactivity of hooks? Is there any reason for a plugin to somehow affect state update? Would not htlc_accepted (and maybe a future ptlc_accepted) be sufficient as a hook, since we will largely be manipulating HTLC/PTLCs in the future only? It seems to me more sensible to use a notification for commitment_revocation event, and use a hook for htlc_accepted.

The reason I like it being a hook is that if we decide that notifying watchtowers is as critical as it's assumed to be, we can make sure that the watchtower gets the penalty_tx before we make progress. This can be achieved by splitting the peer_got_revoke function, then sending the towire_channel_got_revoke_reply and saving to the DB only after the watchtower plugin gave us the all-clear. If the watchtower is currently unreachable it is perfectly fine for a plugin to return a {"result": "fail"}, signalling that we must abort, since proceeding could mean that our counterparty publishes this specific state when we're offline, and the watchtower cannot save us. Aborting and restarting means we get another chance later, since our peer will resend the last round of commitment and revocation, at which point the watchtower hopefully is operational again.

I'm not sure myself whether the watchtower hook is that important, but it's nice to have the option :-) I'm not proposing any other state modification, since those are better suited for htlc_accepted and ptlc_accepted, as you correctly point out.

Contributor:

There's no reason for an intermediate state. If we receive the revocation and don't complete all the hooks, we will ask the peer for it again, just as if we'd never received it.

@sr-gi sr-gi (Contributor) commented Mar 24, 2020

[...] but this allows treating the penalty transaction as an opaque blob
and avoids parsing the transaction in the plugin. [...]

What extra information would be good to have?

If we are treating the penalty transaction as a black box, then the to_self_delay requested of the peer would be nice to have. It may be useful for defining early rejections if accountability is to be offered. I'm open to discussing this as well, since a minimum to_self_delay could be defined by the protocol and everything below that won't count towards accountability, allowing us to get rid of this field altogether.

- Would it make sense to also include htlc_success and htlc_timeout
transactions, despite not having the full information regarding fees and
aggregations available?

I had two approaches in mind regarding second stage htlc transactions:

  • Sending them as an additional field when sending an appointment to the tower (part of the encrypted blob).

If we sent them within the encrypted blob then the tower could start monitoring them after a breach + decryption. The main issue I can see with this approach is that a big chunk of data will be potentially replicated in the tower. Every new commitment will have its pending htlcs attached to it, so several commitments may share htlcs. This makes the user waste some of the tower space he paid for.

  • Treating them in the same way commitment transactions are.

Second-stage htlc transactions could be sent to the tower following the same approach as commitment-penalty transaction pairs. The htlc_success / htlc_timeout transaction would play the commitment_tx role, whereas a transaction spending from it (refund_tx?) would play the penalty_tx role. Therefore, the refund_tx can be encrypted using a locator derived from the htlc_success / htlc_timeout, and the tower will react if that's seen on the chain.

This approach has the advantage that each htlc is backed up only once.

@ZmnSCPxj ZmnSCPxj (Collaborator) commented Mar 24, 2020

This approach has the advantage that each htlc is backed up only once.

Not quite: since the htlc_timeout / htlc_success commits to a particular commitment_tx via its input spend, the htlc_timeout/htlc_success transactions will change per commitment transaction, and if you do not want to leak too much info to the watchtower then we would have to re-give them at each commitment tx as well, so ---

This disappears if we are able to switch to SIGHASH_NOINPUT/SIGHASH_ANYPREVOUT for the second-stage HTLCs, but now you need to change your locator to be based on what SIGHASH_* signs instead of the txid that appears onchain.

@sr-gi sr-gi (Contributor) commented Mar 24, 2020

Not quite: since the htlc_timeout / htlc_success commits to a particular commitment_tx via its input spend, the htlc_timeout/htlc_success transactions will change per commitment transaction,

Oh yeah, you're right. I forgot about the htlc_sigs exchanged on every update.

The first approach also has the issue of how the blobs grow between different updates, which may leak the number of htlcs in the blob, though.

and if you do not want to leak too much info to the watchtower then we would have to re-give them at each commitment tx as well, so ---

Not sure I follow you here; this seems an issue for the first approach given how the blob grows, but not for this one. Even if an htlc is added/removed between updates and the user sends one appointment more/less in the next round than it did in the last round, given that all locators will be new, that should be indistinguishable from multiple channels being updated at the same time.

@ZmnSCPxj ZmnSCPxj (Collaborator) commented:

Not sure I follow you here

I was still commenting on your "This approach has the advantage that each htlc is backed up only once.", i.e. you still need to re-send the HTLC-timeout / HTLC-success at each commitment update, so you still need to re-send a lot of data per update, regardless of whether each HTLC has its own appointment or not.

In any case, giving each HTLC-commitment pair its own appointment seems better than putting them all in a single var-length appointment; we can give an upper bound on blob size and the watchtower can charge based on that.

I was trying to devise a safe alternative where we use synthetic locators for the HTLC blobs as well, but there is no safe alternative, sigh; it would require giving the watchtower the ability to sign for any penalty on the HTLC, which would allow the watchtower to be bribed by the counterparty by making the penalty on the HTLC split between the counterparty and the watchtower, thus not trustless.

@cdecker cdecker (Member Author) commented Mar 24, 2020

If we are treating the penalty transaction as a blackbox, then the to_self_delay requested to the peer would be nice to have. It may be useful if accountability is to be offered to define early rejections. I'm open to discuss this as well since a minimum to_self_delay could be defined by the protocol and everything below that won't count towards accountability, allowing us to get rid of this field altogether.

Is having access to this really useful at all? It's a deadline by which the transaction has to be broadcast, and there is little to no leeway in how the transaction is being broadcast (there is no way of adding more fees at the moment). So isn't broadcasting as soon as the revoked commitment is being seen the only thing that can possibly be done? Even if it's too late, what else can a watchtower do?

I had two approaches in mind regarding second stage htlc transactions:

  • Sending them as an additional field when sending an appointment to the tower (part of the encrypted blob).

From this I gather that your watchtower would indeed work if we provide only the penalty_tx for now, and we can discuss including the htlc_* second stage txs in a later update, correct? I'm happy to do this as a first simple step, and then add incremental changes as we learn more about the uses and behaviours.

This disappears if we are able to switch to SIGHASH_NOINPUT/SIGHASH_ANYPREVOUT for the second-stage HTLCs, but now you need to change your locator to be based on what SIGHASH_* signs instead of the txid that appears onchain.

If we have either of those we get almost perfect watchtowers for free. eltoo's update_tx are constant in size, and allow late fee-binding and fee-bumping. You'll know this, but I just thought I'd reiterate this for the benefit of the casual reader 😉

We could also emulate SIGHASH_NOINPUT by just sending a new signature along with a reference to a prior htlc_* appointment. The watchtower could then just rebind the transaction and plug in the signature. It's not free like the SIGHASH_NOINPUT variant, but at least we'd minimize the transferred and stored data.

The first approach also has the issue of how the blobs grow between different updates, that may leak the number of htlcs in the blob though.

A simple "solution" would be to pad. However that's uncompressible data that the watchtower needs to store in addition to the simple penalty. We could always just treat htlc_success and htlc_penalty transactions identically to the penalty_tx, assuming the watchtower just treats the penalty_tx as a pre-canned reaction to a commitment tx (htlc_timeout would require a delay to be passed along though).

@sr-gi sr-gi (Contributor) commented Mar 24, 2020

Is having access to this really useful at all? It's a deadline by which the transaction has to be broadcast, and there is little to no leeway in how the transaction is being broadcast (there is no way of adding more fees at the moment). So isn't broadcasting as soon as the revoked commitment is being seen the only thing that can possibly be done? Even if it's too late, what else can a watchtower do?

Yeah, not much to do for now. I was especially thinking of fee bumping and priority queues here, but I guess we could leave that for the future.

A simple "solution" would be to pad. However that's uncompressible data that the watchtower needs to store in addition to the simple penalty. We could always just treat htlc_success and htlc_penalty transactions identically to the penalty_tx, assuming the watchtower just treats the penalty_tx as a pre-canned reaction to a commitment tx (htlc_timeout would require a delay to be passed along though).

Agreed, I think that treating them in the same way as commitment/penalties is the best way to go for now.

@cdecker cdecker (Member Author) commented Mar 24, 2020

Yeah, not much to do for now. I was especially thinking of fee bumping and priority queues here, but I guess we could leave that for the future.

If you expect it to be set I'll just add it, since it is quite easy to add anyway. Just seemed like a rather random fact to use in this interface 😉

Agreed, I think that treating them in the same way as commitment/penalties is the best way to go for now.

If you're using the commitment txid as a locator and decryption key, all HTLCs could still get collated this way, but we could spread the HTLCs over a number of watchtowers, since they're likely less valuable than the their_unilateral/to_them output which'd be claimed by the penalty tx.

@ZmnSCPxj ZmnSCPxj (Collaborator) commented:

If we have either of those we get almost perfect watchtowers for free. eltoo's update_tx are constant in size, and allow late fee-binding and fee-bumping. You'll know this, but I just thought I'd reiterate this for the benefit of the casual reader 😉

Yes, but some might strongly prefer the punitive Poon-Dryja mechanism rather than the Decker-Russell-Osuntokun mechanism even post SIGHASH_ANYPREVOUT.

We could also emulate SIGHASH_NOINPUT by just sending a new signature along with a reference to a prior htlc_* appointment. The watchtower could then just rebind the transaction and plug in the signature. It's not free like the SIGHASH_NOINPUT variant, but at least we'd minimize the transferred and stored data.

That seems like a workable approach as well.

For a consistent interface, we could divide blobs into two classes:

  • Commitment-tx blobs.
  • HTLC-timeout/-success tx blobs.

An HTLC-timeout/-success tx blob would:

  • Have a randomly-generated 128-bit locator, which is a truncation of a randomly-generated 256-bit encryption key.
    • The node can use a hash of the HTLC details (hash, timelock, amount) as the encryption key, then truncate to 128 bits for the locator.
  • Contain an HTLC-timeout/-success tx with a single input:
    • A "dummy" input txid of 00...
    • A "dummy" input txout of 65535 (or whatever maximum number of outputs is allowed on the commitment tx).
    • A "dummy" input witness stack containing a DER-encoding of a (0xFF..., 0xFFF...) invalid signature (or something that can be DER-compressed-encoded) plus the HTLC script.
  • Contain a penalty tx with a single input:
    • A "dummy" input txid of 00...
    • This will be filled in with the HTLC-timeout/-success txid.
    • A 0 input txnum.
    • A "dummy" input witness stack containing an invalid signature, plus the script output of the HTLC-timeout/HTLC-success.

A commitment-tx blob would:

  • Have a 128-bit locator which is the truncation of a 256-bit encryption key, which is the commitment txid.
  • Contain a penalty tx blah blah.
  • Contain a set of HTLC references, each of which is:
    • A 256-bit random encryption key that matches an existing HTLC-success/timeout blob locator.
    • A signature that would make the corresponding HTLC-success/timeout transaction valid.
    • A signature that would make the corresponding HTLC-success/timeout penalty transaction valid.

Still, this is less elegant than the simple transaction-reaction that we could do with the proposed scheme where every commitment-tx and every HTLC-success/timeout-tx gets its own blob stored in the watchtower.
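
For concreteness, a sketch of the two blob classes and the locator derivation
described above (the layout is purely illustrative, not a proposed wire
format):

```python
# Illustrative sketch of the two proposed blob classes; the field layout
# follows the lists above and is not a concrete wire format.
from dataclasses import dataclass, field
from typing import List


def locator_from_key(key: bytes) -> bytes:
    """128-bit locator = truncation of a 256-bit encryption key."""
    assert len(key) == 32
    return key[:16]


@dataclass
class HtlcTxBlob:
    encryption_key: bytes    # e.g. a hash of (payment hash, timelock, amount)
    htlc_tx: bytes           # HTLC-timeout/-success tx with dummy input fields
    penalty_tx: bytes        # penalty spending the HTLC tx, dummy input txid

    @property
    def locator(self) -> bytes:
        return locator_from_key(self.encryption_key)


@dataclass
class HtlcRef:
    encryption_key: bytes    # matches an existing HTLC blob's locator
    htlc_tx_sig: bytes       # signature making the HTLC tx valid
    htlc_penalty_sig: bytes  # signature making its penalty tx valid


@dataclass
class CommitmentTxBlob:
    commitment_txid: bytes   # doubles as the 256-bit encryption key
    penalty_tx: bytes
    htlc_refs: List[HtlcRef] = field(default_factory=list)

    @property
    def locator(self) -> bytes:
        return locator_from_key(self.commitment_txid)
```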

@sr-gi sr-gi (Contributor) commented Mar 25, 2020

Still, this is less elegant than the simple transaction-reaction that we could do with the proposed scheme where every commitment-tx and every HTLC-success/timeout-tx gets its own blob stored in the watchtower.

Agreed. Furthermore, from the tower's perspective using the same approach for both makes the design cleaner, since any kind of transaction goes through the same logic, and triggers (either old commitments or htlc-success/timeout) are treated the same way.

@cdecker cdecker (Member Author) commented Mar 26, 2020

Yes, but some might strongly prefer the punitive Poon-Dryja mechanism rather
than the Decker-Russell-Osuntokun mechanism even post SIGHASH_ANYPREVOUT.

I've definitely heard a few people expressing that preference; however, I think
it is rather short-sighted, since the cost for all participants is complexity
(state updates need to be carefully choreographed, and fees committed months
ahead of time), safety (backups are dangerous with LN-penalty), and inherent
limitations (limited to 2 parties). But I guess it's not you I need to be
convincing here :-)

@cdecker cdecker force-pushed the watchtower branch 2 times, most recently from 51b0edc to e087459 on March 30, 2020 18:53
@cdecker cdecker marked this pull request as ready for review March 30, 2020 18:53
@cdecker cdecker requested a review from ZmnSCPxj March 30, 2020 18:53
@cdecker cdecker (Member Author) commented Mar 30, 2020

Ok, since no objections have been raised I think we can consider this a working solution. Rebased on top of master and marked ready for review 😉

@cdecker cdecker added this to the 0.8.2 milestone Mar 30, 2020
@cdecker cdecker force-pushed the watchtower branch 3 times, most recently from 80396ce to c27e01c on April 1, 2020 12:29
@rustyrussell rustyrussell (Contributor) commented:

I was playing around with trying to neaten this a little, and I think your implementation is wrong?

Seems like openingd needs to tell lightningd two things for the penalty tx: the remote tx, and which output is the to-self output for that. However, you didn't add the remote tx, you're still just giving lightningd the local tx, which doesn't work.

The penalty tx creation just needs remote commit txid, outpoint, and amount, and we need to add this to the db if we want to replay penalty creation (or the penalty tx directly, but saving this information is cheaper).

  1. I simplified your infra work: we only need the to-local out which makes things simpler.
  2. I implemented a penalty_base struct which we can hand around.

From here:

  1. Get openingd to return that struct (iff remote has an output).
  2. Hand it to channeld.
  3. Have channeld return that struct (iff remote has an output) for each revoke_and_ack.
  4. Save it to the database.

Now, once we've done all this, we might as well get channeld to generate the penalties as well, which means it needs the penalty feerate (and it needs to be kept up-to-date like the normal feerate).

This avoids some of your patches, and I think keeps it simple.

I'll push my work for you to look at...
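
For reference, a sketch (in Python for readability; the real thing would be a
C struct) of the information such a penalty_base bundle would carry, going by
the description above:

```python
# Illustrative only: the information a penalty_base bundle would need to
# carry, per the description above (not the actual C struct).
from dataclasses import dataclass


@dataclass
class PenaltyBase:
    commit_txid: bytes   # txid of the remote commitment transaction
    to_them_outnum: int  # index of the output the penalty tx would sweep
    to_them_sat: int     # amount of that output, in satoshis
```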

@rustyrussell rustyrussell (Contributor) left a comment:

I've based this on your work; it's unfinished, but I think the result looks neater?

https://github.com/rustyrussell/lightning/tree/guilt/pr-3601

@@ -95,6 +95,7 @@ struct bitcoin_tx *commit_tx(const tal_t *ctx,
const struct htlc **htlcs,
const struct htlc ***htlcmap,
u64 obscured_commitment_number,
int output_index[NUM_SIDES],
Contributor:

In the commit which adds this, you don't actually populate it. Why?

channeld/commit_tx.c
* next finds the previous one to build the penalty transaction. */
tal_free(channel->prev_commitment);
channel->prev_commitment = channel->next_commitment;
channel->next_commitment = tal(channel, struct commitment_state);
Contributor:

Weird indent (spaces instead of tabs)?

But this commit leaves next_commitment non-NULL but invalid, which is weird.

@@ -109,6 +109,9 @@ msgdata,channel_got_funding_locked,next_per_commit_point,pubkey,

# When we send a commitment_signed message, tell master.
msgtype,channel_sending_commitsig,1020
msgdata,channel_sending_commitsig,commit_txid,bitcoin_txid,
msgdata,channel_sending_commitsig,commit_to_them_outnum,s16,
msgdata,channel_sending_commitsig,commit_to_them_amount,amount_sat,
Contributor:

Should use ?u16 and ?amount_msat here IMHO, to mean optional fields?

@cdecker cdecker (Member Author) commented Apr 3, 2020

Seems like openingd needs to tell lightningd two things for the penalty tx: the remote tx, and which output is the to-self output for that. However, you didn't add the remote tx, you're still just giving lightningd the local tx, which doesn't work.

That's a dumb mistake on my end, I'll fix that asap.

The penalty tx creation just needs remote commit txid, outpoint, and amount, and we need to add this to the db if we want to replay penalty creation (or the penalty tx directly, but saving this information is cheaper).

I agree that wrapping this information in a struct would cut down on the changes in the inter-daemon protocol and it'd be a much nicer bundle to hand around. I'll cherry pick the struct into the PR.

I'm not sure whether limiting the output categorization to the to_local output is too specific, and whether we'd need to re-add the to_remote categorization. Would it maybe make sense to encapsulate the commit_tx into a struct that fully annotates the transaction, so we can just unwrap the transaction once we don't need that information anymore? That would have avoided all the state drilling through function layers I had to do to surface these annotations to where we needed them. I'm ok with going the ad-hoc way for now, but let's keep the wrapper in mind should we need to shuttle more information to the outside.

From here:

  1. Get openingd to return that struct (iff remote has an output).
  2. Hand it to channeld.
  3. Have channeld return that struct (iff remote has an output) for each revoke_and_ack.
  4. Save it to the database.

Now, once we've done all this, we might as well get channeld to generate the penalties as well, which means it needs the penalty feerate (and it needs to be kept up-to-date like the normal feerate).

Like I wrote in the PR message I chose to construct the penalty tx in lightningd for two reasons:

  • Let information flow up the daemon tree, aggregating it in lightningd where we can make an informed decision about whether the penalty tx needs to be built (and signed) in the first place. This potentially saves us interactions with the hsmd if there is no watchtower plugin. It also minimizes shuttling information back and forth.
  • Don't have channeld require the hsmd capability to sign on-chain transactions.

Also if we need to store the commitment to_local outpoint we'd need to pass that back to channeld on init.

Re: storing the information in the database: I was unsure whether we'd require this, but I see now that there is a gap in my coverage, namely when the remote peer sent the revoke but we didn't process it: in that case we won't get the previous commitment again on reconnect, which is what I had mistakenly assumed we would. Too bad, I was hoping to avoid having to drag state around further.

@cdecker cdecker (Member Author) commented Apr 3, 2020

Clean rebase on top of master, now cherry-picking things from Rusty's proposal (keeping penalty creation on lightningd for now, while we discuss the tradeoffs).

@cdecker cdecker force-pushed the watchtower branch 2 times, most recently from 3a56d42 to 06d4e0b on April 8, 2020 11:48
@cdecker cdecker (Member Author) commented Apr 8, 2020

Rebased on top of #3626 since that broke some of the commit_txs calls.

Should be good to go now, tested it against all configurations in my build matrix, no new issues (just the ones addressed in #3636).

@ZmnSCPxj ZmnSCPxj (Collaborator) commented Apr 9, 2020

The problem with this is that we only ever get a half-signed commitment TX

We could provide the commitment tx without witness data; all we needed was to see the commitment tx outputs in detail.

@rustyrussell rustyrussell (Contributor) left a comment:

I think this is confusing, because we do too much work in lightningd. Moving the work into channeld makes things simpler, IMHO.

But we definitely need a restart test to make sure...

msg, msg, &pbase))
channel_internal_error(channel,
"bad channel_resending_commitsig %s",
tal_hex(channel, msg));
Contributor:

I think technically it's OK if we continue after calling channel_internal_error, but I'd prefer a return here...

# When we reconnect and resend the commitment we also need to tell the master
# the commitment details so it can create the penalty once we get the revoke.
msgtype,channel_resending_commitsig,1130
msgdata,channel_resending_commitsig,pbase,?penalty_base,
Contributor:

No, this is in the wrong place, leading to us having to keep two values inside lightningd.

See 74c8a21

Have channeld keep the two values, give it at the right time (when we receive the revoke), and we don't need that logic in lightningd. We just need lightningd to save (the single, last pbase) to the db so it can hand it back to channeld when channeld restarts.

@cdecker cdecker removed this from the 0.8.2 milestone Apr 9, 2020
@cdecker cdecker (Member Author) commented Apr 9, 2020

Removed the milestone since this isn't ready yet.

@cdecker cdecker (Member Author) commented Apr 14, 2020

There is now an alternative implementation of this (without changing the public interface) in PR #3645

@cdecker cdecker force-pushed the watchtower branch 2 times, most recently from 44505c3 to 171d017 on April 21, 2020 14:43
@cdecker cdecker added this to the Next Release milestone Apr 27, 2020
The main daemon, and only the main daemon, needs the ability to sign a
transaction on behalf of a channel for the penalty transaction creation.
Changelog-Added: plugin: Added a new `commitment_revocation` hook that provides the plugin with penalty transactions for all revoked transactions.
Added a couple of invariants:

 - Commitment numbers are incremented by 1 on every commitment, so either two
   consecutive commitments differ by exactly 1 or one of the commitments isn't
   set (due to lack of funds on the other side).
 - Revocations must match the prev_commitment number if one was stashed away.
There might be a way we can get away with storing just one, but I couldn't
convince myself that either can be dropped.
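
A sketch of those two invariants as checks (names hypothetical):

```python
# Sketch of the two invariants above; prev/new are hypothetical commitment
# records carrying a commitment number, or None if not set.
def check_commitment_numbers(prev, new):
    # Consecutive commitments differ by exactly 1, unless one isn't set
    # (e.g. due to lack of funds on the other side).
    if prev is not None and new is not None:
        assert new.number == prev.number + 1


def check_revocation(revoked_number, prev):
    # A revocation must match the stashed prev_commitment, if there is one.
    if prev is not None:
        assert revoked_number == prev.number
```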