Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection and channel handshake fails with non-batching transactions #1971

Closed
5 tasks
ancazamfir opened this issue Mar 16, 2022 · 10 comments · Fixed by #2163
Closed
5 tasks

Connection and channel handshake fails with non-batching transactions #1971

ancazamfir opened this issue Mar 16, 2022 · 10 comments · Fixed by #2163
Assignees
Labels
A: bug Admin: something isn't working
Milestone

Comments

@ancazamfir
Copy link
Collaborator

Summary of Bug

With single message per Tx (i.e. max_msg_num = 1 in config) connection and channel creation fails

$ hermes create connection ibc-0 ibc-1
...
2022-03-16T15:43:14.842710Z ERROR ThreadId(35) send_tx_commit{id=ConnectionOpenTry}:send_tx{id=ibc-1}:estimate_gas: failed to simulate tx. propagating error to caller: gRPC call failed with status: status: InvalidArgument, message: "failed to execute message; message index: 0: connection handshake open try failed: failed connection state verification for client (07-tendermint-2): client state height < proof height ({0 799} < {0 801}), please ensure the client has been updated: invalid height: invalid request", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
2022-03-16T15:43:14.842869Z ERROR ThreadId(01) failed ConnTry ConnectionSide { chain: BaseChainHandle { chain_id: ChainId { id: "ibc-1", version: 1 }, runtime_sender: Sender { .. } }, client_id: ClientId("07-tendermint-2"), connection_id: None }: failed during a transaction submission step to chain id ibc-1: gRPC call failed with status: status: InvalidArgument, message: "failed to execute message; message index: 0: connection handshake open try failed: failed connection state verification for client (07-tendermint-2): client state height < proof height ({0 799} < {0 801}), please ensure the client has been updated: invalid height: invalid request", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }

Packet relaying still works as we check for this particular failure during simulation.

Version

master and all recent ones

Steps to Reproduce

Acceptance Criteria


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate milestone (priority) applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@soareschen
Copy link
Contributor

I can reproduce this error with #1980.

I have also tried checking if it is CachingChainHandle that caused the issue, but it does not appear to be so. Therefore the error might be from elsewhere.

@soareschen soareschen added the A: bug Admin: something isn't working label Mar 17, 2022
@soareschen soareschen added this to the v0.14.0 milestone Mar 17, 2022
@adizere
Copy link
Member

adizere commented Mar 29, 2022

Should we simply make max_msg_num a domain type and enforce that it has a value > 1 ?

@ancazamfir
Copy link
Collaborator Author

ancazamfir commented May 2, 2022

Should we simply make max_msg_num a domain type and enforce that it has a value > 1 ?

Why can't we fix it in the same way we do for packet case? I am asking because i sometime use max_msg_num = 1 for testing purposes.

@adizere adizere moved this from Needs PR to Needs Review in IBC-rs: the road to v1 May 3, 2022
@romac
Copy link
Member

romac commented May 3, 2022

Why can't we fix it in the same way we do for packet case?

What do you mean by that?

@ancazamfir
Copy link
Collaborator Author

ancazamfir commented May 3, 2022

Why can't we fix it in the same way we do for packet case?

What do you mean by that?

For packet relaying we see the same error if using gaia v6.0.x. I think it's not there for gaia v7.0.x.
We used to handle the packet case by checking the simulation error, see: https://github.com/informalsystems/ibc-rs/blob/9bad6e2dedd462bd8e35538c9f0540e7f443e36f/relayer/src/chain/cosmos/estimate.rs#L148
For that specific client failure we would use the default gas. So my comment was to use the same approach. But...

I just checked master and this is not the case anymore so if a set of packet messages is split in two Tx-es, the second will fail. Packets will never be cleared. I also noticed that the connection setup also hits the same check above. But for some reason this doesn't catch the error anymore. I think if we fix this we solve both the connection/channel and packet problems. And we don't have to restrict the max_msg_num as in #2163

@ancazamfir
Copy link
Collaborator Author

These diffs seem to fix the issues:

diff --git a/relayer/src/error.rs b/relayer/src/error.rs
index 9fa4b160..2234ad26 100644
--- a/relayer/src/error.rs
+++ b/relayer/src/error.rs
@@ -513,7 +513,7 @@ impl GrpcStatusSubdetail {
     /// - status: InvalidArgument
     /// - message: verification failed: ... failed packet acknowledgement verification for client: client state height < proof height ...
     pub fn is_client_state_height_too_low(&self) -> bool {
-        if self.status.code() != tonic::Code::InvalidArgument {
+        if self.status.code() != tonic::Code::Unknown {
             return false;
         }

Still need to figure out if this would work for gaia versions before v6.0.x.

@romac
Copy link
Member

romac commented May 4, 2022

@ancazamfir Neat! Can you go into a bit more detail as to why this solves the problem?

@adizere
Copy link
Member

adizere commented May 4, 2022

These diffs seem to fix the issues:

While testing #2167 I'm encountering a similar problem that has this same fix. It seems gaiad v6.0.1 uses tonic::Code::InvalidArgument whereas v6.0.4 uses tonic::Code::Unknown.

More concretely, with v6.0.4 we get:

2022-05-04T09:51:57.482661Z ERROR ThreadId(11) send_tx_commit{id=nWl0l6enWK}:send_tx_with_account_sequence_retry{id=ibc-0}:estimate_gas: failed to simulate tx. propagating error to caller: gRPC call failed with status: status: Unknown, message: "failed to execute message; message index: 0: receive packet verification failed: couldn't verify counterparty packet commitment: failed packet commitment verification for client (07-tendermint-0): client state height < proof height ({1 12} < {1 19}), please ensure the client has been updated: invalid height", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "20"} }
Error: link error: failed with underlying error: [skipped]

With v6.0.1 we get:

2022-05-04T09:46:19.064742Z WARN ThreadId(11) send_tx_commit{id=h5m5kQx361}:send_tx_with_account_sequence_retry{id=ibc-0}:estimate_gas: failed to simulate tx, falling back on default gas because the error is potentially recoverable: gRPC call failed with status: status: InvalidArgument, message: "failed to execute message; message index: 0: receive packet verification failed: couldn't verify counterparty packet commitment: failed packet commitment verification for client (07-tendermint-0): client state height < proof height ({1 15} < {1 34}), please ensure the client has been updated: invalid height: invalid request", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
2022-05-04T09:46:19.065096Z DEBUG ThreadId(11) send_tx_commit{id=h5m5kQx361}:send_tx_with_account_sequence_retry{id=ibc-0}: send_tx: using 10000000 gas, fee Fee { amount: "100001stake", gas_limit: 10000000 } id=ibc-0

I will do these changes Anca highlighted in the diff above directly in PR #2167.

Gaia v7.0.1 uses the same as 6.0.1.

Furthermore, we noticed a similar change from InvalidArgument -> Unknown in the account sequence mismatch case. Ref: https://github.com/informalsystems/ibc-rs/pull/1349/files#diff-d18b8c452b8118832b4112538d034e9490cf2c30ab35c42d2e19d2f842131e14

More details about upstream changes in Gaia/SDK

Gaiad v6.0.1 uses sdk 0.44.5
Gaiad v6.0.4 uses sdk 0.44.6
that's a patch version change. I had a quick look through the changelog but didn't find anything.

@ancazamfir
Copy link
Collaborator Author

Maybe in the fix we should relax the self.status.code() check so we cover multiple gaia versions?

@adizere
Copy link
Member

adizere commented May 5, 2022

Maybe in the fix we should relax the self.status.code() check so we cover multiple gaia versions?

This is the approach I took 7fd032b. Part of #2167.
Let me know if you think something else would be more appropriate.

romac added a commit that referenced this issue May 11, 2022
… (#2163)

* Enforce a min bound of 2 on `max_msg_num`

* Add a constructor for `MaxMsgNum`

* Add constructors for `MaxTxSize` and `Memo`

* Add changelog entry

* Add parsing tests for config domain types

* Relax min bound on MaxMsgNum to 1

* Remove useless dead_code allow

* Added workaround to tonic::code inconsistencies.

Ref:
#1971 (comment)

* Update changelog entries

* Re-add allow dead_code + comment

* Update changelog

Co-authored-by: Adi Seredinschi <[email protected]>
Repository owner moved this from Needs Review to Closed in IBC-rs: the road to v1 May 11, 2022
adizere added a commit that referenced this issue May 13, 2022
* Added docs for unreceived_acknowledgements and unreceived_packets

* Flipped order of sentences

* First iteration for #2087; packet-recv works, packet-ack left

* Incremental processing of packet-ack relaying

* Added consensus_state_height_bounded with optimistic path

* Nits

* Simplified progress tracking

* Comment cleanup

* fmt and better logs

* Document the alternative path in consensus_state_height_bounded

* renamed method to make it more explicit h/t SeanC

* better var name

* Added workaround to tonic::code inconsistencies.

Ref:
#1971 (comment)

* refactored the packet query methods

* Refactored CLIs for incremental relaying

* Experiment

* Fix for packet send for non-zero delay

* Using app status instead of network status

* fmt

* Undoing unnecessary chenges on foreign_client.rs

* Removed outdated comments

* Apply suggestions from code review

Co-authored-by: Anca Zamfir <[email protected]>

* Small refactor in cli.rs

* Address review comments

* Remove duplicate code

* Undo one-chain change

* Fix documentation

* Changelog

Co-authored-by: Anca Zamfir <[email protected]>
Co-authored-by: Anca Zamfir <[email protected]>
hu55a1n1 pushed a commit to hu55a1n1/hermes that referenced this issue Sep 13, 2022
…ormalsystems#1971 (informalsystems#2163)

* Enforce a min bound of 2 on `max_msg_num`

* Add a constructor for `MaxMsgNum`

* Add constructors for `MaxTxSize` and `Memo`

* Add changelog entry

* Add parsing tests for config domain types

* Relax min bound on MaxMsgNum to 1

* Remove useless dead_code allow

* Added workaround to tonic::code inconsistencies.

Ref:
informalsystems#1971 (comment)

* Update changelog entries

* Re-add allow dead_code + comment

* Update changelog

Co-authored-by: Adi Seredinschi <[email protected]>
hu55a1n1 pushed a commit to hu55a1n1/hermes that referenced this issue Sep 13, 2022
* Added docs for unreceived_acknowledgements and unreceived_packets

* Flipped order of sentences

* First iteration for informalsystems#2087; packet-recv works, packet-ack left

* Incremental processing of packet-ack relaying

* Added consensus_state_height_bounded with optimistic path

* Nits

* Simplified progress tracking

* Comment cleanup

* fmt and better logs

* Document the alternative path in consensus_state_height_bounded

* renamed method to make it more explicit h/t SeanC

* better var name

* Added workaround to tonic::code inconsistencies.

Ref:
informalsystems#1971 (comment)

* refactored the packet query methods

* Refactored CLIs for incremental relaying

* Experiment

* Fix for packet send for non-zero delay

* Using app status instead of network status

* fmt

* Undoing unnecessary chenges on foreign_client.rs

* Removed outdated comments

* Apply suggestions from code review

Co-authored-by: Anca Zamfir <[email protected]>

* Small refactor in cli.rs

* Address review comments

* Remove duplicate code

* Undo one-chain change

* Fix documentation

* Changelog

Co-authored-by: Anca Zamfir <[email protected]>
Co-authored-by: Anca Zamfir <[email protected]>
hu55a1n1 pushed a commit to cosmos/ibc-rs that referenced this issue Sep 29, 2022
* Added docs for unreceived_acknowledgements and unreceived_packets

* Flipped order of sentences

* First iteration for #2087; packet-recv works, packet-ack left

* Incremental processing of packet-ack relaying

* Added consensus_state_height_bounded with optimistic path

* Nits

* Simplified progress tracking

* Comment cleanup

* fmt and better logs

* Document the alternative path in consensus_state_height_bounded

* renamed method to make it more explicit h/t SeanC

* better var name

* Added workaround to tonic::code inconsistencies.

Ref:
informalsystems/hermes#1971 (comment)

* refactored the packet query methods

* Refactored CLIs for incremental relaying

* Experiment

* Fix for packet send for non-zero delay

* Using app status instead of network status

* fmt

* Undoing unnecessary chenges on foreign_client.rs

* Removed outdated comments

* Apply suggestions from code review

Co-authored-by: Anca Zamfir <[email protected]>

* Small refactor in cli.rs

* Address review comments

* Remove duplicate code

* Undo one-chain change

* Fix documentation

* Changelog

Co-authored-by: Anca Zamfir <[email protected]>
Co-authored-by: Anca Zamfir <[email protected]>
CoreFlux0x00 added a commit to CoreFlux0x00/cosmos-ibc-rust that referenced this issue Jan 14, 2025
* Added docs for unreceived_acknowledgements and unreceived_packets

* Flipped order of sentences

* First iteration for #2087; packet-recv works, packet-ack left

* Incremental processing of packet-ack relaying

* Added consensus_state_height_bounded with optimistic path

* Nits

* Simplified progress tracking

* Comment cleanup

* fmt and better logs

* Document the alternative path in consensus_state_height_bounded

* renamed method to make it more explicit h/t SeanC

* better var name

* Added workaround to tonic::code inconsistencies.

Ref:
informalsystems/hermes#1971 (comment)

* refactored the packet query methods

* Refactored CLIs for incremental relaying

* Experiment

* Fix for packet send for non-zero delay

* Using app status instead of network status

* fmt

* Undoing unnecessary chenges on foreign_client.rs

* Removed outdated comments

* Apply suggestions from code review

Co-authored-by: Anca Zamfir <[email protected]>

* Small refactor in cli.rs

* Address review comments

* Remove duplicate code

* Undo one-chain change

* Fix documentation

* Changelog

Co-authored-by: Anca Zamfir <[email protected]>
Co-authored-by: Anca Zamfir <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: bug Admin: something isn't working
Projects
No open projects
Status: Closed
Development

Successfully merging a pull request may close this issue.

4 participants