Conversation

@thomash-acinq (Member) commented Jul 21, 2023

We do not drop HTLCs yet; the purpose is to collect data first.

We add:

  1. An endorsement bit in `UpdateAddHtlc`. This follows blip-0004: experimental endorsement signaling in update_add_htlc (lightning/blips#27).
  2. A local reputation system: for each pair (origin node, endorsement value), we compute its reputation as the total fees that were paid divided by the total fees that would have been paid if all HTLCs had been fulfilled. When considering an HTLC to relay, we only forward it if the reputation of its source is higher than the occupancy of the outgoing channel (see the sketch after this list).
  3. A limit on the number of small HTLCs per channel. We allow only very few small HTLCs per channel, so that it's not possible to block large HTLCs using only small ones (similar to "Add a channel congestion control mechanism" #2330, but continuous).
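
To make item 2 concrete, here's a minimal sketch of the reputation rule in Scala. It is illustrative only: names like `Reputation`, `attempt` and `shouldRelay` are not the actual eclair API, and the real implementation also accounts for how long each HTLC is held, which is omitted here.

```scala
// Reputation of one (origin node, endorsement value) pair: the ratio of
// fees actually earned to fees we would have earned if every HTLC
// (including the ones still pending) had been fulfilled.
case class Reputation(paidFees: Double, totalFees: Double) {
  def score: Double = if (totalFees == 0) 0.0 else paidFees / totalFees

  // A new relay attempt immediately counts in the denominator, so
  // pending HTLCs lower the confidence of subsequent ones.
  def attempt(fee: Double): Reputation = copy(totalFees = totalFees + fee)

  // On settlement, only a fulfilled HTLC adds its fee to the numerator;
  // a failed HTLC keeps weighing on the denominator.
  def settle(fee: Double, fulfilled: Boolean): Reputation =
    if (fulfilled) copy(paidFees = paidFees + fee) else this
}

// Forwarding rule from item 2: relay only if the source's reputation is
// higher than the occupancy (fraction of slots/liquidity already used)
// of the outgoing channel.
def shouldRelay(reputation: Reputation, outgoingOccupancy: Double): Boolean =
  reputation.score > outgoingOccupancy
```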

@codecov-commenter commented Jul 21, 2023

Codecov Report

Merging #2716 (deab085) into master (12adf87) will increase coverage by 0.04%.
Report is 1 commit behind head on master.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #2716      +/-   ##
==========================================
+ Coverage   85.82%   85.86%   +0.04%     
==========================================
  Files         216      218       +2     
  Lines       18126    18209      +83     
  Branches      771      749      -22     
==========================================
+ Hits        15556    15636      +80     
- Misses       2570     2573       +3     
| Files | Coverage Δ |
| --- | --- |
| ...re/src/main/scala/fr/acinq/eclair/NodeParams.scala | 93.47% <100.00%> (+0.08%) ⬆️ |
| ...ir-core/src/main/scala/fr/acinq/eclair/Setup.scala | 75.29% <100.00%> (+0.14%) ⬆️ |
| ...in/scala/fr/acinq/eclair/channel/ChannelData.scala | 100.00% <ø> (ø) |
| ...in/scala/fr/acinq/eclair/channel/Commitments.scala | 96.93% <100.00%> (+0.11%) ⬆️ |
| ...ain/scala/fr/acinq/eclair/channel/Monitoring.scala | 96.15% <100.00%> (+0.23%) ⬆️ |
| ...in/scala/fr/acinq/eclair/channel/fsm/Channel.scala | 85.80% <100.00%> (+0.15%) ⬆️ |
| ...ain/scala/fr/acinq/eclair/payment/Monitoring.scala | 98.30% <100.00%> (+0.09%) ⬆️ |
| .../scala/fr/acinq/eclair/payment/PaymentPacket.scala | 90.82% <100.00%> (ø) |
| ...a/fr/acinq/eclair/payment/relay/ChannelRelay.scala | 96.03% <100.00%> (+0.16%) ⬆️ |
| ...fr/acinq/eclair/payment/relay/ChannelRelayer.scala | 100.00% <100.00%> (ø) |
| ... and 9 more |  |

... and 3 files with indirect coverage changes

@t-bast (Member) left a comment

Thanks, it's now clearer to me how we assign reputation to our peers. I've made a few comments on the code itself, some of which (the easy ones) I fixed in #2893.

The reputation algorithm itself looks good to me; let's try it out and see what results we get in practice and during simulations.

However, I don't think the way we interact with the reputation recorder makes the most sense.
You are storing a relay attempt as soon as we start relaying, before we know whether we actually send HTLCs out or not.
This leads to the weird `CancelRelay` command and an inconsistency between channel relay and trampoline relay.
In the trampoline case, if we can't find a route or can't send outgoing HTLCs, we will treat this as a failure, which is incorrect.
This can probably even be used to skew our reputation algorithm.
It's also pretty invasive, especially in the NodeRelay component...

It seems to me that it would make more sense if we implemented the following flow:

  1. Once we start relaying (`ChannelRelay` / `NodeRelay`), we obtain the confidence value with `GetConfidence` and will include it in `CMD_ADD_HTLC`.
  2. At that point, we DON'T update the reputation to take this payment into account, because we don't know yet if it will be relayed.
  3. In `Channel.scala`, when we actually send an outgoing `UpdateAddHtlc`, we emit an `OutgoingHtlcAdded` event to the event stream, which contains the outgoing HTLC and its `Origin.Hot`.
  4. In `Channel.scala`, when an outgoing HTLC is failed or fulfilled, we emit an `OutgoingHtlcFailed` / `OutgoingHtlcFulfilled` event to the event stream.
  5. The reputation recorder listens to those events and updates the internal reputation state accordingly.
  6. We don't use the `relayId` but rather the outgoing `channel_id` and `htlc_id`, combined with the origin, to group HTLCs.
  7. For trampoline payments, since the reputation recorder has the `Origin` information, it can wait for all outgoing HTLCs to be settled to correctly account for the fees / timestamps.

I believe this better matches what we're trying to accomplish: the only thing the reputation recorder actually needs to know to update reputation is when outgoing HTLCs are sent and when they're settled.
It also provides more accurate relay data to ensure we're updating the reputation correctly, and has much less impact on the ChannelRelay / NodeRelay actors (which should simplify testing).
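
A rough sketch of steps 3–5 in Scala, reusing the toy `Reputation` from the sketch in the PR description; the event and actor shapes below are simplified stand-ins for the eclair types, not the real ones:

```scala
import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

// Step 6: HTLCs are grouped by (outgoing channel_id, htlc_id), not relayId.
case class HtlcKey(channelId: String, htlcId: Long)

// Steps 3 and 4: events the channel publishes on the event stream.
case class OutgoingHtlcAdded(key: HtlcKey, originNode: String, endorsed: Boolean, fee: Double)
case class OutgoingHtlcFulfilled(key: HtlcKey)
case class OutgoingHtlcFailed(key: HtlcKey)

sealed trait Command
final case class WrappedAdded(e: OutgoingHtlcAdded) extends Command
final case class WrappedFulfilled(e: OutgoingHtlcFulfilled) extends Command
final case class WrappedFailed(e: OutgoingHtlcFailed) extends Command

// Step 5: the recorder subscribes to those events and is the only place
// where reputation state is updated.
object ReputationRecorder {
  def apply(pending: Map[HtlcKey, OutgoingHtlcAdded],
            reputations: Map[(String, Boolean), Reputation]): Behavior[Command] =
    Behaviors.receiveMessage {
      case WrappedAdded(e) =>
        val k = (e.originNode, e.endorsed)
        val r = reputations.getOrElse(k, Reputation(0, 0)).attempt(e.fee)
        apply(pending + (e.key -> e), reputations + (k -> r))
      case WrappedFulfilled(e) => settle(pending, reputations, e.key, fulfilled = true)
      case WrappedFailed(e)    => settle(pending, reputations, e.key, fulfilled = false)
    }

  private def settle(pending: Map[HtlcKey, OutgoingHtlcAdded],
                     reputations: Map[(String, Boolean), Reputation],
                     key: HtlcKey, fulfilled: Boolean): Behavior[Command] =
    pending.get(key) match {
      case Some(e) =>
        val k = (e.originNode, e.endorsed)
        val r = reputations.getOrElse(k, Reputation(0, 0)).settle(e.fee, fulfilled)
        apply(pending - key, reputations + (k -> r))
      case None => Behaviors.same
    }
}
```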

Can you try that, or let me know if you think that it wouldn't be as good as the currently implemented flow?

@thomash-acinq (Member, Author) commented
The reason for the weird `CancelRelay` is that we need to take pending HTLCs into account in the reputation. If we receive two HTLCs at once, we don't want both of them to enjoy the same reputation: the second one should be penalized. If we only update the reputation after we have decided to relay, we can get a data race.
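
Concretely, with toy numbers and the illustrative `Reputation` sketch from the PR description, accounting for the attempt at decision time means a second simultaneous HTLC is judged against a lower score:

```scala
val r0 = Reputation(paidFees = 80, totalFees = 100) // score = 0.80
// First HTLC is accepted: its fee goes into the denominator right away.
val r1 = r0.attempt(fee = 25)                       // score = 80 / 125 = 0.64
// A second simultaneous HTLC is now judged against 0.64, not 0.80, so it
// no longer enjoys the same reputation as the first one.
```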

@t-bast (Member) commented Aug 2, 2024

> The reason for the weird `CancelRelay` is that we need to take pending HTLCs into account in the reputation. If we receive two HTLCs at once, we don't want both of them to enjoy the same reputation: the second one should be penalized. If we only update the reputation after we have decided to relay, we can get a data race.

But you're not doing this for trampoline relay, so that race can already be exploited anyway? I don't think this matters much in practice though, because:

  • that race is hard to exploit, because between the call to the `ReputationRecorder` and the outgoing HTLC there will be at most a few milliseconds
  • exploiting that race requires ensuring that we receive the incoming `update_add_htlc` at exactly the same time, and network delays cannot be trivially manipulated
  • at some point we will add a randomized delay before forwarding HTLCs, because it's good for privacy (and was discussed in Oakland), which will make this race almost impossible to exploit

@thomash-acinq (Member, Author) commented

> 1. Once we start relaying (`ChannelRelay` / `NodeRelay`), we obtain the confidence value with `GetConfidence` and will include it in `CMD_ADD_HTLC`.
> 2. At that point, we DON'T update the reputation to take this payment into account, because we don't know yet if it will be relayed.
> 3. In `Channel.scala`, when we actually send an outgoing `UpdateAddHtlc`, we emit an `OutgoingHtlcAdded` event to the event stream, which contains the outgoing HTLC and its `Origin.Hot`.
> 4. In `Channel.scala`, when an outgoing HTLC is failed or fulfilled, we emit an `OutgoingHtlcFailed` / `OutgoingHtlcFulfilled` event to the event stream.
> 5. The reputation recorder listens to those events and updates the internal reputation state accordingly.
> 6. We don't use the `relayId` but rather the outgoing `channel_id` and `htlc_id`, combined with the origin, to group HTLCs.
> 7. For trampoline payments, since the reputation recorder has the `Origin` information, it can wait for all outgoing HTLCs to be settled to correctly account for the fees / timestamps.

I've tried doing that in #2897.
For channel relays it works fine; for trampoline, however, I'm running into some problems:

  • We can't wait for outgoing HTLCs to be settled to compute the fees: we need to update the reputation as soon as we start relaying, and for that we need to know the fees.
  • Even when all HTLCs associated with a trampoline relay fail, it's not necessarily the end of this relay, because we may retry.

It seems to me that solving this would require adding more complexity than this refactoring was removing.

@t-bast (Member) commented Aug 9, 2024

> We can't wait for outgoing HTLCs to be settled to compute the fees: we need to update the reputation as soon as we start relaying, and for that we need to know the fees.

It seems to me that we're trying to make trampoline fit into a box where it doesn't actually fit. One important aspect of trampoline is that the sender does not choose the outgoing channels and does not choose the fees: they allocate a total fee budget, and the trampoline node tries to relay within that budget. The trampoline node will ensure that it earns at least its usual channel routing fees, otherwise it won't relay the payment. If the trampoline node is well connected, or the sender over-allocated fees, the trampoline node earns more than its usual routing fees: but I'm not sure that this extra fee should count towards reputation?

So I think we could handle trampoline relays in a simplified way that gets rid of those issues, by using the channel routing fees instead of trying to take the extra trampoline fees into account: when sending an outgoing HTLC with a trampoline origin, the fee we allocate to it in the reputation algorithm should just be this outgoing channel's routing fee (which can be included in the relay event, since we have access to our channel update in the channel data).
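
Concretely, the fee credited to each outgoing HTLC would just be the standard `channel_update` routing fee; a sketch with illustrative values:

```scala
// Standard channel_update fee parameters (values are illustrative).
case class RelayFees(feeBaseMsat: Long, feeProportionalMillionths: Long)

// fee = fee_base_msat + amount * fee_proportional_millionths / 1_000_000
def routingFee(fees: RelayFees, amountMsat: Long): Long =
  fees.feeBaseMsat + amountMsat * fees.feeProportionalMillionths / 1000000L

// For a trampoline relay, each outgoing HTLC is credited with its own
// channel's routing fee, regardless of the sender's total fee budget.
val fees = RelayFees(feeBaseMsat = 1000, feeProportionalMillionths = 100)
val allocated = routingFee(fees, amountMsat = 50000000L) // 1000 + 5000 = 6000 msat
```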

If the payment succeeds and we want to grant a reputation bonus when we earned more than our channel routing fees, that should be easy to do as well, by splitting the extra fee between all the outgoing channels. But I'm not sure we should do this: we can't really match an outgoing HTLC to a specific incoming channel one-to-one, so it's probably better to just count our channel routing fees?

Do you think that model would make sense, or am I missing something?

@thomash-acinq (Member, Author) commented

That seems like a good solution indeed; I'll try it.

@thomash-acinq force-pushed the endorse-htlc branch 2 times, most recently from 527cc06 to 59d312b on June 23, 2025 14:47
@thomash-acinq requested a review from t-bast on June 23, 2025 14:48
@t-bast (Member) left a comment

I haven't looked at the logic inside `Reputation.scala` and `ReputationRecorder.scala` yet, but I've reviewed the interaction with the existing actors, and it's nicely non-invasive; looks mostly good to me 👍

@t-bast (Member) left a comment

Looks good on the concept; comments are mostly about code and architecture. During yesterday's spec meeting, Carla asked that you write a gist detailing the steps of your reputation tracking algorithm in English / pseudo-code, which will let them compare it to what they're doing and verify that the implementation correctly matches the high-level algorithm. Can you create a public gist for this?

@thomash-acinq requested a review from t-bast on July 9, 2025 16:30
@thomash-acinq merged commit 43c3986 into master on Jul 11, 2025 (1 of 2 checks passed)
@t-bast deleted the endorse-htlc branch on July 31, 2025 09:22