fix(test-tooling): failing fabric AIO container launch #320
Comments
After a little more investigation, I think this is most likely due to the fixed host ports we use. The CI VM runs the CI script twice, against both NodeJS 12 and 14, so one of them gets knocked out when it tries to launch the AIO container while the other instance of the CI script is doing the same thing and already holds the host port they both want. This means the issue will most likely be fixed by #279 or anything else that gives us
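The collision described above can be reproduced in miniature without Docker. The sketch below is only an illustration under assumptions: the helper names `listen` and `demoCollision` are made up for this example, and plain TCP servers stand in for the two CI runs that both try to claim the same host port.

```typescript
import { createServer, Server } from "net";

// Hypothetical helper: start a TCP server on the given port and resolve
// once it is listening, or reject with the bind error.
function listen(port: number): Promise<Server> {
  return new Promise((resolve, reject) => {
    const server = createServer();
    server.once("error", reject);
    server.listen(port, "127.0.0.1", () => resolve(server));
  });
}

// Hypothetical demo: the first "CI run" grabs a port, the second one
// asks for the very same port and is expected to fail.
async function demoCollision(): Promise<string> {
  const first = await listen(0); // port 0 lets the OS pick any free port
  const address = first.address();
  if (address === null || typeof address === "string") {
    first.close();
    throw new Error("Expected a TCP address with a port number");
  }
  try {
    const second = await listen(address.port);
    second.close();
    return "no collision";
  } catch (err) {
    // Return the error code reported for the failed bind.
    return (err as NodeJS.ErrnoException).code ?? "unknown";
  } finally {
    first.close();
  }
}
```

Whichever process binds second loses, which would match the observation that only one of the two CI runs fails.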
Related: MiniFabric might provide a solution idea that we can reuse, or we could just flat out make our own Fabric AIO image inherit from the MiniFabric image. Not sure yet, but I've made some inquiries here: hyperledger-labs/minifabric#105
Unfortunately, MiniFabric does not support pulling up multiple ledgers, only multiple channels within the same ledger, which is not a good fit for our tests. It is still worth evaluating as a workaround if we cannot make DinD work within a reasonable time.
This is pretty unbounded in complexity because the Fabric NodeJS SDK is missing a feature we need: support for discovery on non-standard/customizable ports. One option would be to monkey patch the Fabric SDK; if that works, this would actually be pretty easy to solve. We haven't done the exploration yet to check this, though.
Epic facepalm once again. Turns out the default restart try count of supervisord is too low, which leads to race conditions. Increasing the retry count from 4 to 20 should do it; this way the fabric-network process (see supervisord.conf file) should be 5 times as "patient" waiting for the docker daemon to launch within the AIO container.

What was happening before is that the fabric-network script tried launching itself in parallel with the docker daemon, but it would time out before the docker daemon could come online.

Published these images as
ghcr.io/hyperledger/cactus-fabric2-all-in-one:2021-09-02--fix-876-supervisord-retries
and
ghcr.io/hyperledger/cactus-fabric-all-in-one:2021-09-02--fix-876-supervisord-retries

Fixes hyperledger-cacti#718
Fixes hyperledger-cacti#876
Fixes hyperledger-cacti#320
Fixes hyperledger-cacti#319

Signed-off-by: Peter Somogyvari <[email protected]>
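To illustrate why a larger retry budget removes the race, here is a minimal sketch of a bounded retry loop. It is an analogy only, not supervisord's actual implementation and not code from the AIO image: `retryUntilReady` and `makeSlowDaemon` are hypothetical names, and the simulated daemon that becomes ready on the 10th probe is an assumed number chosen for the example.

```typescript
// Hypothetical helper: probe a dependency up to maxAttempts times,
// sleeping delayMs between attempts. Resolves with the attempt number
// that succeeded, or rejects once the budget is exhausted.
async function retryUntilReady(
  probe: () => Promise<boolean>,
  maxAttempts: number,
  delayMs: number,
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await probe()) {
      return attempt;
    }
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Not ready after ${maxAttempts} attempts`);
}

// Hypothetical stand-in for the docker daemon: reports "ready" only
// from the given probe number onwards.
function makeSlowDaemon(readyOnAttempt: number): () => Promise<boolean> {
  let calls = 0;
  return async () => ++calls >= readyOnAttempt;
}
```

With a daemon that needs 10 probes, `retryUntilReady(makeSlowDaemon(10), 4, 100)` gives up, while the same call with a budget of 20 succeeds on attempt 10. The commit applies the same idea by raising the retry count in supervisord.conf.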
Describe the bug
CI tests are failing in the test tooling package that verifies that the Fabric all-in-one (AIO) image works as expected.
To Reproduce
Appears to be flaky.
Run the CI script to attempt to reproduce:
npm run run-ci
Expected behavior
Test should be consistently failing or succeeding.
Logs/Stack traces
https://travis-ci.org/github/hyperledger/cactus/jobs/735204836
Cloud provider or hardware configuration:
Travis CI virtual machine
Operating system name, version, build:
Details are in the linked logs above.
Hyperledger Cactus release version or commit (git rev-parse --short HEAD):
a74a7ed
Hyperledger Cactus Plugins/Connectors Used
Fabric
Additional context
The issue appears to be about ports that are already allocated. The Fabric AIO image has a limitation: it does not randomly assign published ports, because we have not yet finished the Docker-in-Docker (DinD) support that the peer containers need. Without DinD we must bind to specific host ports instead of random ones, and for now that's just how we do it.
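For reference, one common way to avoid fixed host ports in NodeJS test code is to ask the operating system for a free port by binding to port 0. The sketch below shows that general technique only; `getFreePort` is a hypothetical helper and not something the Fabric AIO image or the Cactus test tooling is known to provide.

```typescript
import { createServer } from "net";

// Hypothetical helper: bind to port 0 so the OS assigns a free port,
// read the assigned port number, then release it again.
function getFreePort(): Promise<number> {
  return new Promise((resolve, reject) => {
    const server = createServer();
    server.once("error", reject);
    server.listen(0, "127.0.0.1", () => {
      const address = server.address();
      if (address === null || typeof address === "string") {
        server.close();
        reject(new Error("Expected a TCP address with a port number"));
        return;
      }
      const { port } = address;
      // Release the port before handing the number to the caller.
      server.close((err) => (err ? reject(err) : resolve(port)));
    });
  });
}
```

Note that this approach is itself racy: another process can claim the port between the `close` call and the moment the container is started with it, so it narrows the window for collisions rather than eliminating it.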