bugfix payment lifecycle payment attempts#10125

Merged
guggero merged 4 commits into lightningnetwork:master from ziggie1984:fix-payment-send-local-chan
Aug 6, 2025

Conversation

@ziggie1984
Collaborator

@ziggie1984 ziggie1984 commented Aug 2, 2025

Problem description:

Looking through the logs of some larger node runners, I saw a lot of these messages:

 policy for local forward not satisfied

While testing locally I realized that LND can easily end up in a loop that runs for about 1 minute, retrying the same channel over and over, because local failures are currently not registered in Mission Control.

See here:

 2025-08-02 16:04:49.472 [ERR] HSWC: Link 182:1:0 policy for local forward not satisfied
2025-08-02 16:04:49.472 [ERR] CRTR: Failed sending attempt 1724 for payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745 to switch: htlc exceeds maximum policy amount
2025-08-02 16:04:49.472 [ERR] CRTR: Channel update of ourselves received
2025-08-02 16:04:49.472 [WRN] CRTR: Routing failure for local channel {<nil> 200111116320768} occurred
2025-08-02 16:04:49.472 [WRN] CRTR: Attempt 1724 for payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745 failed: htlc exceeds maximum policy amount
2025-08-02 16:04:49.534 [DBG] CRTR: Payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745: status=In Flight, active_shards=0, rem_value=200000000 mSAT, fee_limit=10000000 mSAT
2025-08-02 16:04:49.534 [DBG] CRTR: PaymentSession(c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745): pathfinding for amt=200000000 mSAT
2025-08-02 16:04:49.535 [DBG] CRTR: Pathfinding absolute attempt cost: 300 sats
2025-08-02 16:04:49.535 [DBG] CRTR: Found route: probability=0.95, hops=1, fee=0 mSAT
2025-08-02 16:04:49.535 [DBG] CRTR: Pathfinding perf metrics: nodes=1, edges=1, time=99.334µs
2025-08-02 16:04:49.576 [DBG] CRTR: Sending HTLC attempt(id=1725, total_amt=200000000 mSAT, first_hop_amt={200000000}) for payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745
2025-08-02 16:04:49.576 [WRN] HSWC: ChannelLink(ff0e5465034e8184a841bb7fe910d532100cd9a95952355896b0a702016d55fb:0): outgoing htlc(c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745) is too large: max_htlc=20000000 mSAT, htlc_value=200000000 mSAT
2025-08-02 16:04:49.576 [ERR] HSWC: Link 182:1:0 policy for local forward not satisfied
2025-08-02 16:04:49.576 [ERR] CRTR: Failed sending attempt 1725 for payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745 to switch: htlc exceeds maximum policy amount
2025-08-02 16:04:49.576 [ERR] CRTR: Channel update of ourselves received
2025-08-02 16:04:49.576 [WRN] CRTR: Routing failure for local channel {<nil> 200111116320768} occurred
2025-08-02 16:04:49.576 [WRN] CRTR: Attempt 1725 for payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745 failed: htlc exceeds maximum policy amount
2025-08-02 16:04:49.640 [DBG] CRTR: Payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745: status=In Flight, active_shards=0, rem_value=200000000 mSAT, fee_limit=10000000 mSAT
2025-08-02 16:04:49.640 [DBG] CRTR: PaymentSession(c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745): pathfinding for amt=200000000 mSAT
2025-08-02 16:04:49.640 [DBG] CRTR: Pathfinding absolute attempt cost: 300 sats
2025-08-02 16:04:49.640 [DBG] CRTR: Found route: probability=0.95, hops=1, fee=0 mSAT
2025-08-02 16:04:49.640 [DBG] CRTR: Pathfinding perf metrics: nodes=1, edges=1, time=96.125µs
2025-08-02 16:04:49.683 [DBG] CRTR: Sending HTLC attempt(id=1726, total_amt=200000000 mSAT, first_hop_amt={200000000}) for payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745
2025-08-02 16:04:49.683 [WRN] HSWC: ChannelLink(ff0e5465034e8184a841bb7fe910d532100cd9a95952355896b0a702016d55fb:0): outgoing htlc(c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745) is too large: max_htlc=20000000 mSAT, htlc_value=200000000 mSAT

The problem was introduced in #9049, but the bug can only be triggered since #8390 (LND 19).

Code part:

if bandwidthHints.firstHopCustomBlob().IsNone() &&

In code terms, the following happens:

  1. The payment is sent without checking the channel policy, because the check sits behind this guard:
    if bandwidthHints.firstHopCustomBlob().IsNone() &&

     The guard never passes, because the endorsement bit is always set in the payment's first-hop data here:

if r.ShouldSetExpEndorsement() {
	if payIntent.FirstHopCustomRecords == nil {
		payIntent.FirstHopCustomRecords = make(
			map[uint64][]byte,
		)
	}

	t := uint64(lnwire.ExperimentalEndorsementType)
	if _, set := payIntent.FirstHopCustomRecords[t]; !set {
		payIntent.FirstHopCustomRecords[t] = []byte{
			lnwire.ExperimentalUnendorsed,
		}
	}
}

  2. When the payment fails, we do not update Mission Control, so we keep trying the same channel again and again:

func (i *interpretedResult) processPaymentOutcomeSelf(rt *mcRoute,
	failure lnwire.FailureMessage) {

	switch failure.(type) {
	// We receive a malformed htlc failure from our peer. We trust ourselves
	// to send the correct htlc, so our peer must be at fault.
	case *lnwire.FailInvalidOnionVersion,
		*lnwire.FailInvalidOnionHmac,
		*lnwire.FailInvalidOnionKey:

		i.failNode(rt, 1)

		// If this was a payment to a direct peer, we can stop trying.
		if len(rt.hops.Val) == 1 {
			i.finalFailureReason = &reasonError
		}

	// Any other failure originating from ourselves should be temporary and
	// caused by changing conditions between path finding and execution of
	// the payment. We just retry and trust that the information locally
	// available in the link has been updated.
	default:
		log.Warnf("Routing failure for local channel %v occurred",
			rt.hops.Val[0].channelID)
	}
}

  3. The payment eventually fails once the timeout period has elapsed:

req.TimeoutSeconds = DefaultPaymentTimeout
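
The three steps above can be sketched as a small, self-contained simulation. This is not LND code: `endorsementKey`, `withEndorsement`, `edgeUsable`, `linkAccepts` and `pay` are hypothetical names invented for illustration, and `maxAttempts` stands in for the payment timeout so that the run is deterministic. The `recordLocal` switch only shows why remembering the local failure would end the loop; it is not necessarily the fix adopted in this PR.

```go
package main

import "fmt"

// endorsementKey is a placeholder record type used only in this sketch.
const endorsementKey uint64 = 1

// withEndorsement mirrors the snippet above: it adds an "unendorsed"
// marker unless the caller has already set one.
func withEndorsement(records map[uint64][]byte) map[uint64][]byte {
	if records == nil {
		records = make(map[uint64][]byte)
	}
	if _, set := records[endorsementKey]; !set {
		records[endorsementKey] = []byte{0}
	}
	return records
}

// edgeUsable is a hypothetical pathfinding guard: the max-HTLC check only
// runs when no first-hop custom records are present.
func edgeUsable(records map[uint64][]byte, amt, maxHTLC uint64) bool {
	if len(records) == 0 && amt > maxHTLC {
		return false
	}
	return true
}

// linkAccepts models the link-level check, which always enforces maxHTLC.
func linkAccepts(amt, maxHTLC uint64) bool {
	return amt <= maxHTLC
}

// pay returns the number of attempts made. recordLocal toggles whether a
// local failure is remembered; maxAttempts stands in for the timeout.
func pay(records map[uint64][]byte, amt, maxHTLC uint64,
	recordLocal bool, maxAttempts int) int {

	failed := false
	attempts := 0
	for attempts < maxAttempts {
		// Pathfinding: give up if the channel is unusable or penalized.
		if failed || !edgeUsable(records, amt, maxHTLC) {
			break
		}
		attempts++

		// The link accepts the HTLC, so the loop ends.
		if linkAccepts(amt, maxHTLC) {
			break
		}

		// Local failure: only remembered when recordLocal is set.
		if recordLocal {
			failed = true
		}
	}
	return attempts
}

func main() {
	const amt, maxHTLC = 200_000_000, 20_000_000

	fmt.Println(pay(nil, amt, maxHTLC, false, 500))                  // 0
	fmt.Println(pay(withEndorsement(nil), amt, maxHTLC, false, 500)) // 500
	fmt.Println(pay(withEndorsement(nil), amt, maxHTLC, true, 500))  // 1
}
```

Without custom records the oversized amount is filtered out during pathfinding and nothing is sent. With the endorsement record present the guard is skipped, the link rejects every attempt, and the loop only stops when the attempt budget runs out.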

This is just a preliminary commit to discuss potential fixes:

  1. Make sure we only skip the amount check when the TrafficShaper signals it
  2. Remove the policy check for sends
  3. Other options?
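
As a rough illustration of option 1, the sketch below skips the amount check only when a traffic shaper claims the HTLC, and otherwise enforces the channel's maximum even if other custom records (such as the endorsement marker) are present. The `shaper` interface, `assetShaper`, the record keys and `amountAllowed` are all assumptions made up for this example; they are not LND's actual traffic shaper API.

```go
package main

import "fmt"

// Placeholder record types used only in this sketch.
const (
	endorsementKey uint64 = 1
	assetKey       uint64 = 2
)

// shaper is a hypothetical interface: it reports whether it wants to
// handle an HTLC based on the HTLC's first-hop custom records.
type shaper interface {
	shouldHandle(records map[uint64][]byte) bool
}

// assetShaper claims every HTLC that carries its asset record.
type assetShaper struct{}

func (assetShaper) shouldHandle(records map[uint64][]byte) bool {
	_, ok := records[assetKey]
	return ok
}

// amountAllowed enforces maxHTLC unless the shaper claims the HTLC.
// The mere presence of custom records no longer disables the check.
func amountAllowed(s shaper, records map[uint64][]byte,
	amt, maxHTLC uint64) bool {

	if s != nil && s.shouldHandle(records) {
		return true
	}
	return amt <= maxHTLC
}

func main() {
	const amt, maxHTLC = 200_000_000, 20_000_000

	endorsed := map[uint64][]byte{endorsementKey: {0}}
	custom := map[uint64][]byte{endorsementKey: {0}, assetKey: {1}}

	fmt.Println(amountAllowed(assetShaper{}, endorsed, amt, maxHTLC)) // false
	fmt.Println(amountAllowed(assetShaper{}, custom, amt, maxHTLC))   // true
	fmt.Println(amountAllowed(nil, custom, amt, maxHTLC))             // false
}
```

The design choice here is that the decision to bypass the policy belongs to the component that understands the custom HTLC, rather than being inferred from whether any custom record exists.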

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @ziggie1984, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a bug in the payment lifecycle related to payment attempts. The core change involves modifying the HTLC (Hashed Timelock Contract) forwarding logic to bypass a specific policy check for locally forwarded payments, which was presumably causing incorrect behavior or failures.

Highlights

  • HTLC Policy Check Bypass: I've identified that the CheckHtlcTransit policy validation for outgoing HTLCs on local links within htlcswitch/switch.go has been commented out. This change, affecting lines 925-935, effectively disables the outgoing channel policy check for locally forwarded HTLCs. This is likely a targeted fix to address a bug where this specific check was causing issues or was incorrectly applied to local payment attempts.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request comments out a block of code responsible for validating locally-initiated payments against the outgoing channel's policy. While the title suggests this is a bugfix, the lack of a description and the use of commented-out code makes the intent unclear. My primary concern is that this change weakens validation and could lead to inefficient payment attempts that are rejected by peers. I've left a comment asking for clarification and suggesting that commented-out code should be avoided in favor of removal with explanation.

@ziggie1984 ziggie1984 added this to the v0.19.3 milestone Aug 2, 2025
@ziggie1984 ziggie1984 self-assigned this Aug 2, 2025
@ziggie1984 ziggie1984 requested review from Roasbeef and guggero August 2, 2025 15:25
@ziggie1984 ziggie1984 force-pushed the fix-payment-send-local-chan branch from 9c8ae52 to 7b03809 Compare August 2, 2025 15:34
@ziggie1984
Collaborator Author

We need to find a proper way to fix this. I am not sure when Tapd needs to skip this check, so I am open to suggestions on how to carve out the check here. cc @guggero

@ziggie1984 ziggie1984 marked this pull request as ready for review August 2, 2025 15:37
@ziggie1984 ziggie1984 force-pushed the fix-payment-send-local-chan branch 3 times, most recently from 403478d to e981bb5 Compare August 5, 2025 15:23
@saubyk saubyk linked an issue Aug 5, 2025 that may be closed by this pull request
@ziggie1984 ziggie1984 force-pushed the fix-payment-send-local-chan branch from e981bb5 to a2506e4 Compare August 5, 2025 16:52
@ziggie1984 ziggie1984 requested a review from guggero August 5, 2025 16:52
@ziggie1984 ziggie1984 force-pushed the fix-payment-send-local-chan branch from a2506e4 to 51147e6 Compare August 5, 2025 16:54
@ziggie1984
Collaborator Author

@gemini-code-assist review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request fixes a bug where LND could enter an endless loop trying to send a payment that violates a local channel policy, such as max_htlc. The fix involves refactoring the HTLC amount validation to correctly identify and skip policy checks for custom HTLCs, preventing them from being caught in this loop. A new integration test is added to cover this scenario. My review focuses on style guide adherence and potential improvements in error handling. I've found a minor style guide violation in a function signature, a typo in a comment, and a case of silent error handling that could be improved with logging.

@ziggie1984 ziggie1984 force-pushed the fix-payment-send-local-chan branch 2 times, most recently from b691344 to 4ed1231 Compare August 5, 2025 17:47
@Roasbeef
Member

Roasbeef commented Aug 6, 2025

2025-08-02 16:04:49.576 [WRN] CRTR: Attempt 1725 for payment c5c98d82996f4a03d73aa85e10831ef6c0e2bf3b498af455fd073cf6d5c3d745 failed: htlc exceeds maximum policy amount

Shouldn't this be prevented as we check up front if we can transit a link before sending?

EDIT: nvm that's only called after we send a payment down into the link

@Roasbeef
Member

Roasbeef commented Aug 6, 2025

Re the above, perhaps this is caused by a divergence between the local max HTLC value and the value that's set with the routing policy?

We'll skip adding an edge if it exceeds the max amt policy for a given channel:

// Skip channels for which this htlc is too large.
if u.policy.MessageFlags.HasMaxHtlc() &&
	amt > u.policy.MaxHTLC {

	log.Tracef("Exceeds policy's MaxHTLC: amt=%v, MaxHTLC=%v",
		amt, u.policy.MaxHTLC)

	return false
}

So I think my hypothesis above may be correct. The channel update max HTLC can be changed, but the link level version can't until something like dynamic commitments exists.

@Roasbeef
Member

Roasbeef commented Aug 6, 2025

because the EndorsementBit is always set in the payment firstHop data here:

Ah ok, I missed this bit (🥁).

Member

@Roasbeef Roasbeef left a comment

LGTM 🐡

Collaborator

@guggero guggero left a comment

Looks good, thanks a lot for the fix! Just a few small suggestions.

@ziggie1984 ziggie1984 force-pushed the fix-payment-send-local-chan branch from 4ed1231 to ddb21a9 Compare August 6, 2025 07:35
@ziggie1984 ziggie1984 force-pushed the fix-payment-send-local-chan branch from ddb21a9 to a19e7f2 Compare August 6, 2025 07:37
Collaborator

@guggero guggero left a comment

LGTM 🎉

@guggero guggero merged commit e512770 into lightningnetwork:master Aug 6, 2025
38 of 39 checks passed