Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix nil pointer panic for monitor #6830

Merged
merged 1 commit into from
Apr 20, 2022
Merged

Conversation

junnplus
Copy link
Member

@junnplus junnplus commented Apr 20, 2022

Signed-off-by: Ye Sijun [email protected]

build main and run containerd failed.

INFO[2022-04-20T05:35:40.153277386Z] loading plugin "io.containerd.grpc.v1.sandbox-controllers"...  type=io.containerd.grpc.v1
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x98 pc=0x55d06830b312]

goroutine 45 [running]:
github.com/containerd/containerd/runtime/restart/monitor.(*monitor).monitor(0xc00011a050, {0x55d0693c0a40, 0xc00003a0c0})
        /home/jun/Documents/workspace/containerd/runtime/restart/monitor/monitor.go:226 +0x272
github.com/containerd/containerd/runtime/restart/monitor.(*monitor).reconcile.func1()
        /home/jun/Documents/workspace/containerd/runtime/restart/monitor/monitor.go:187 +0xab
created by github.com/containerd/containerd/runtime/restart/monitor.(*monitor).reconcile
        /home/jun/Documents/workspace/containerd/runtime/restart/monitor/monitor.go:184 +0x191

@@ -220,11 +220,13 @@ func (m *monitor) monitor(ctx context.Context) ([]change, error) {
}
desiredStatus := containerd.ProcessStatus(labels[restart.StatusLabel])
task, err := c.Task(ctx, nil)
if err != nil && desiredStatus == containerd.Stopped {
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the logic some, is there a case this was trying to account for when Task may return not found and the switch statement will handle it when desired state is running?

Copy link
Member Author

@junnplus junnplus Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the case when Task returns not found and the restart desired to stop, maybe we need to continue to stop the container.

I change the logic to be the same as before .

@theopenlab-ci
Copy link

theopenlab-ci bot commented Apr 20, 2022

Build succeeded.

if err != nil {
logrus.WithError(err).Error("monitor")
if desiredStatus == containerd.Stopped {
goto desiredSwitch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see if we can avoid goto and rewrite the logic a bit more clearly.

var status containerd.Status
if task, err := c.Task(ctx, nil); err == nil {
	status, err = task.Status(ctx)
	if err == nil {
		if desiredStatus == status.Stopped {
			continue
		}
	} else {
		logrus.WithError(err).Error("monitor")
		if desiredStatus == containerd.Stopped {
			continue
		}
	}
} else {
	logrus.WithError(err).Error("monitor")
	if desiredStatus == containerd.Stopped {
		continue
	}
}

This allows error cases where desiredStatus is equal to containerd.Running to fall through to the switch statement while avoiding the scope clarity issues with goto.

Copy link
Member Author

@junnplus junnplus Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this:

var (
	task   containerd.Task
	status containerd.Status
	err    error
)
if task, err = c.Task(ctx, nil); err == nil {
	if status, err = task.Status(ctx); err == nil {
		if desiredStatus == status.Status {
			continue
		}
	}
}

// Task or Status return error, only desired to stop
if err != nil {
	logrus.WithError(err).Error("monitor")
	if desiredStatus != containerd.Stopped {
		continue
	}
}

@theopenlab-ci
Copy link

theopenlab-ci bot commented Apr 20, 2022

Build succeeded.

@fuweid fuweid merged commit d85ac56 into containerd:main Apr 20, 2022
@AkihiroSuda
Copy link
Member

Seems still broken

        Apr 20 08:50:22 4e20dff0b6af containerd[42938]: time="2022-04-20T08:50:22.180313832Z" level=info msg="Start streaming server"
        Apr 20 08:50:22 4e20dff0b6af containerd[42938]: time="2022-04-20T08:50:22.180140133Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:50:22 4e20dff0b6af systemd[1]: Started containerd container runtime.
        Apr 20 08:50:32 4e20dff0b6af containerd[42938]: time="2022-04-20T08:50:32.181733644Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:50:42 4e20dff0b6af containerd[42938]: time="2022-04-20T08:50:42.182933645Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:50:52 4e20dff0b6af containerd[42938]: time="2022-04-20T08:50:52.183684441Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:51:02 4e20dff0b6af containerd[42938]: time="2022-04-20T08:51:02.184665351Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:51:12 4e20dff0b6af containerd[42938]: time="2022-04-20T08:51:12.185633691Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:51:22 4e20dff0b6af containerd[42938]: time="2022-04-20T08:51:22.186395414Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:51:32 4e20dff0b6af containerd[42938]: time="2022-04-20T08:51:32.187320246Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:51:42 4e20dff0b6af containerd[42938]: time="2022-04-20T08:51:42.187712793Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        Apr 20 08:51:52 4e20dff0b6af containerd[42938]: time="2022-04-20T08:51:52.188632092Z" level=error msg=monitor error="no running task found: task db0d630812d37a9985ff327be5f41f7d78075147bc3e085a3ddf44d79b388021 not found: not found"
        
    run_restart_linux_test.go:87: ==========
    run_restart_linux_test.go:88: the container does not seem to be restarted
--- FAIL: TestRunRestart (100.13s)

https://github.com/containerd/nerdctl/runs/6091620240?check_suite_focus=true

junnplus pushed a commit to junnplus/containerd that referenced this pull request Apr 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Regression on main] Segfaults at runtime/restart/monitor/monitor.go:226
5 participants