fix(ci): flaky Fabric 2.x run tx endpoint test #718
Labels
bug
Something isn't working
Comments
petermetz
added a commit
to petermetz/cacti
that referenced
this issue
Sep 3, 2021
Epic facepalm once again. Turns out the default restart try count of supervisord is too low, which leads to race conditions. Increasing the retry count from 4 to 20 should do it; this way the fabric-network process (see supervisord.conf file) should be 5 times as "patient" waiting for the docker daemon to launch within the AIO container.

What was happening before is that the fabric-network script tried launching itself in parallel with the docker daemon, but it would time out before the docker daemon could come online.

Published these images as
ghcr.io/hyperledger/cactus-fabric2-all-in-one:2021-09-02--fix-876-supervisord-retries
and
ghcr.io/hyperledger/cactus-fabric-all-in-one:2021-09-02--fix-876-supervisord-retries

Fixes hyperledger-cacti#718
Fixes hyperledger-cacti#876
Fixes hyperledger-cacti#320
Fixes hyperledger-cacti#319

Signed-off-by: Peter Somogyvari <[email protected]>
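To make the retry arithmetic in the commit message concrete, here is a minimal TypeScript sketch of the race. It is a toy model only: the function name `simulateStart`, the fixed one-second retry interval, and the eight-second daemon startup time are all assumptions made for illustration, and the model does not reproduce supervisord's real retry scheduling or the actual contents of supervisord.conf.

```typescript
// Hypothetical toy model of "a process retried N times while its
// dependency is still starting". Not supervisord's actual algorithm.
interface RetryOutcome {
  started: boolean;
  attemptsUsed: number;
}

/**
 * Attempt k (1-based) happens at time (k - 1) * retryIntervalMs.
 * An attempt succeeds only if the dependency (e.g. the docker daemon)
 * is already online at that moment.
 */
function simulateStart(
  maxAttempts: number,
  retryIntervalMs: number,
  dependencyReadyAtMs: number,
): RetryOutcome {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const nowMs = (attempt - 1) * retryIntervalMs;
    if (nowMs >= dependencyReadyAtMs) {
      return { started: true, attemptsUsed: attempt };
    }
  }
  // Every attempt ran before the dependency was ready: give up for good.
  return { started: false, attemptsUsed: maxAttempts };
}

// Assumed numbers: the daemon needs 8 s to come online, retries are 1 s apart.
const impatient = simulateStart(4, 1000, 8000);
const patient = simulateStart(20, 1000, 8000);

console.log(impatient); // { started: false, attemptsUsed: 4 }
console.log(patient); // { started: true, attemptsUsed: 9 }
```

Under these assumed numbers, four attempts are exhausted while the daemon is still starting, whereas twenty attempts leave enough headroom for the ninth one to succeed.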
petermetz
added a commit
that referenced
this issue
Sep 7, 2021
RafaelAPB
pushed a commit
to RafaelAPB/blockchain-integration-framework
that referenced
this issue
Mar 9, 2022
Describe the bug
The test below fails occasionally, and only on the GitHub Actions (GHA) CI servers; unfortunately it is not reproducible on development machines.
packages/cactus-plugin-ledger-connector-fabric/src/test/typescript/integration/fabric-v2-2-x/run-transaction-endpoint-v1.test.ts
To Reproduce
Keep submitting PRs and you'll hit this issue every once in a while, forcing you to re-run the CI and then have it pass...
Expected behavior
Tests are as stable as possible.
Logs/Stack traces
Screenshots
N/A
Cloud provider or hardware configuration:
GitHub Actions Runner
Operating system name, version, build:
Ubuntu 18/20 LTS
Hyperledger Cactus release version or commit (git rev-parse --short HEAD):
main @ fea547f
Hyperledger Cactus Plugins/Connectors Used
Fabric
Additional context
This seems to be the last bug standing after several other ones were fixed recently as part of the effort to make #656 possible.
Guess on what might be going wrong: the Fabric 2.x AIO container may be hanging at boot because of a race condition that it cannot recover from. This is more likely than the hardware simply being too slow, because usually 3 out of 4 runs in the CI matrix pass in 30 minutes, but then the 4th one takes over an hour and still times out...
cc: @takeutak @sfuji822 @hartm @jonathan-m-hamilton @AzaharaC @jordigiam @kikoncuo @jagpreetsinghsasan