[Bug]: Container logging exits prematurely with error #2328
Comments
As an update, I've noticed that the amount of time the log consumer works for is the same as the "log production timeout" setting (e.g. by default, the log consumer stops working after five seconds). Is this what the log production timeout setting is intended to control? It's not super clear to me from the docs. |
Hi @stevenh, thanks for pointing me to your PR. Unfortunately I don't think it solves my particular problem, the issue persists even using your branch. I've re-attached the container tarball for repro to this comment in case you or anyone else is interested. |
I tried to repro locally but can't. Here's my archive:
# build
(cd container; docker build -t localhost/log_repro:latest .)
# run test
go test -v -race
Output:
|
I tried adding |
Tried getting podman working but it's just way too flaky; the VM totally hangs and then it just won't work for me on WSL:
|
Finally got podman working after a kernel upgrade but still can't reproduce:
=== RUN TestLogIssue
2024/03/13 20:22:37 github.com/testcontainers/testcontainers-go - Connected to docker:
Server Version: 4.9.3
API Version: 1.41
Operating System: fedora
Total Memory: 15836 MB
Resolved Docker Host: unix:///run/user/1000/podman/podman.sock
Resolved Docker Socket Path: /run/user/1000/podman/podman.sock
Test SessionID: c63aa4e8af35d970a47b07194bf27bbbaaac5dfb929678b458c1e515fc310db4
Test ProcessID: ab2fd2cd-f65f-4e99-9e3a-958ccd19d80b
2024/03/13 20:22:37 🐳 Creating container for image testcontainers/ryuk:0.6.0
2024/03/13 20:22:37 ✅ Container created: 1db0a21708f0
2024/03/13 20:22:37 🐳 Starting container: 1db0a21708f0
2024/03/13 20:22:38 ✅ Container started: 1db0a21708f0
2024/03/13 20:22:38 🚧 Waiting for container id 1db0a21708f0 image: testcontainers/ryuk:0.6.0. Waiting for: &{Port:8080/tcp timeout:<nil> PollInterval:100ms}
2024/03/13 20:22:38 🔔 Container is ready: 1db0a21708f0
2024/03/13 20:22:38 🐳 Creating container for image localhost/log_repro:latest
2024/03/13 20:22:38 ✅ Container created: 4e83a135fa90
2024/03/13 20:22:38 🐳 Starting container: 4e83a135fa90
2024/03/13 20:22:38 ✅ Container started: 4e83a135fa90
2024/03/13 20:22:38 🔔 Container is ready: 4e83a135fa90
2024/03/13 20:22:38 tick 0
2024/03/13 20:22:39 tick 1
2024/03/13 20:22:40 tick 2
2024/03/13 20:22:41 tick 3
2024/03/13 20:22:42 tick 4
2024/03/13 20:22:44 tick 6
2024/03/13 20:22:45 tick 7
2024/03/13 20:22:46 tick 8
2024/03/13 20:22:47 tick 9
test over
2024/03/13 20:22:48 🐳 Terminating container: 4e83a135fa90
2024/03/13 20:22:50 🚫 Container terminated: 4e83a135fa90
--- PASS: TestLogIssue (12.54s)
PASS
ok log-test 12.546s |
Thanks for the effort to reproduce this. I noticed you're using a newer version of Podman based on your latest logs, I'll try that out on my end and see if it helps (the reason for the old version I'm using is it's the default available on |
@stevenh Looks like my reproducible example works fine with Podman v4.9.3 both with and without your change. Unfortunately, after updating, the codebase where I originally found this bug exhibits different buggy behavior. Using
I can see if I can come up with a reproducible example for this new issue. |
Alright, I was able to reproduce this with a pretty simple example. I've attached a new image tar; this container runs a binary that does nothing and simply busy-loops forever. The test code itself is also fairly minimal:

package container_test

import (
	"context"
	"fmt"
	"testing"

	tc "github.com/testcontainers/testcontainers-go"
)

// StdoutLogConsumer prints every container log entry to stdout.
type StdoutLogConsumer struct{}

func (lc *StdoutLogConsumer) Accept(l tc.Log) {
	fmt.Printf("%s", l.Content)
}

func TestLogIssue(t *testing.T) {
	req := tc.ContainerRequest{
		Image: "localhost/log_repro:latest",
		LogConsumerCfg: &tc.LogConsumerConfig{
			Consumers: []tc.LogConsumer{&StdoutLogConsumer{}},
		},
	}

	ctx := context.Background()
	container, err := tc.GenericContainer(ctx, tc.GenericContainerRequest{
		ContainerRequest: req,
		Started:          true,
	})
	if err != nil {
		t.Fatal(err)
	}

	t.Cleanup(func() {
		if err := container.Terminate(ctx); err != nil {
			t.Fatal(err)
		}
	})

	fmt.Println("test over")
}

The output I get:
|
Can you post the build for the container so we don't have to rely on images? |
@stevenh Sure - I just wrote up a build file and simple Makefile to run everything. I've attached the sources as a tarball, untar and run |
Thanks @nmoroze, this looks like a bug in podman to me: if the container doesn't generate any log entries, the request for log entries just hangs with podman, whereas docker returns an empty log. I would suggest raising an issue with the podman team. |
Thanks for taking a look! That makes sense, I'll close this issue for now and follow up on that side. |
Testcontainers version
0.29.1
Using the latest Testcontainers version?
Yes
Host OS
Linux
Host arch
x86
Go version
1.22
Docker version
Docker info
What happened?
I've found that LogConsumers sometimes stop receiving logs prematurely and dump an error.
I've built a minimal reproducible example of the specific issue. It consists of a very small container image based on distroless/base. The container executes the following program:
I've attached a prebuilt container to the issue as a tarball.
In order to reproduce the issue, first load the container tarball like so:
And then compile and run the following test:
I'd expect this test to print container logs showing 9-10 ticks and exit cleanly. Instead, it consistently stops displaying output after 6 ticks, and always outputs an error message:
I did notice some interesting things about this while trying to debug, but I'm not sure what to make of them. In the main loop of docker.startLogProduction(), it seems like the "use of closed network connection" error block consistently triggers before the EOF error is triggered. In addition, it seems like attempting to restart the connection when EOF is received allows the log consumer to get a few more logs, e.g.:
Relevant log output
No response
Additional information
container.tar.gz