runc "checkpoint --lazy-pages and restore" test failed #1338

Closed
kolyshkin opened this issue Jan 16, 2021 · 1 comment

kolyshkin commented Jan 16, 2021

I have only seen this once (here), but I guess it's worth filing to keep track of it.

Here's the relevant part of a test log:

ok 12 checkpoint and restore 
ok 13 checkpoint and restore (cgroupns) # skip test requires cgroups_v1
ok 14 checkpoint --pre-dump and restore
not ok 15 checkpoint --lazy-pages and restore
# (from function `fail' in file tests/integration/helpers.bash, line 275,
#  from function `runc_restore_with_pipes' in file tests/integration/checkpoint.bats, line 84,
#  in test file tests/integration/checkpoint.bats, line 208)
#   `runc_restore_with_pipes ./image-dir test_busybox_restore --lazy-pages' failed
# runc spec (status=0):
# 
# uffd-noncoop is supported
# runc state test_busybox (status=0):
# {
#   "ociVersion": "1.0.2-dev",
#   "id": "test_busybox",
#   "pid": 29402,
#   "status": "running",
#   "bundle": "/tmp/busyboxtest",
#   "rootfs": "/tmp/busyboxtest/rootfs",
#   "created": "2021-01-15T23:01:28.208600788Z",
#   "owner": ""
# }
# __runc restore test_busybox_restore failed (status: 1)
# time="2021-01-15T23:01:28Z" level=error msg="criu failed: type NOTIFY errno 0\nlog file: image-dir/restore.log"
# CRIU restore log errors (if any):
# (00.074834) 29463 (native) is going to execute the syscall 11, required is 15
# (00.074843) 29463 was trapped
# (00.074845) `- Expecting exit
# (00.074852) 29463 was trapped
# (00.074855) 29463 (native) is going to execute the syscall 18446744073709551615, required is 15
# (00.074862) Error (compel/src/lib/infect.c:1513): Task 29463 is in unexpected state: f7f
# (00.074878) Error (compel/src/lib/infect.c:1520): Task stopped with 15: Terminated
# (00.075508) Error (criu/cr-restore.c:2416): Can't stop all tasks on rt_sigreturn
# (00.075511) Error (criu/cr-restore.c:2454): Killing processes because of failure on restore.
# The Network was unlocked so some data or a connection may have been lost.
# (00.077109) Error (criu/mount.c:3396): mnt: Can't remove the directory /tmp/.criu.mntns.hDc1hY: No such file or directory
# (00.077116) Error (criu/cr-restore.c:2483): Restoring FAILED.
# runc restore failed
ok 16 checkpoint and restore in external network namespace
ok 17 checkpoint and restore with container specific CRIU config

In particular, this line looks shady:

(00.074855) 29463 (native) is going to execute the syscall 18446744073709551615, required is 15
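
For what it's worth, the odd number is just 2^64 - 1, i.e. -1 printed as an unsigned 64-bit value, which is what ptrace-based tools often see when the task is stopped outside of any syscall. Likewise, the "unexpected state: f7f" looks like a raw wait status: a stopped task with stop signal 15 (SIGTERM), which matches the "Task stopped with 15: Terminated" line. On x86-64, syscall 15 is rt_sigreturn and 11 is munmap. Below is a minimal sketch of that decoding; it assumes the usual Linux wait-status layout, the variable names are made up for illustration, and it does not explain why the SIGTERM arrived.

```python
import ctypes

# Value copied from the restore log line above.
logged_syscall = 18446744073709551615

# Reinterpret the unsigned 64-bit value as a signed 64-bit integer.
as_signed = ctypes.c_int64(logged_syscall).value

# Wait status copied from "Task 29463 is in unexpected state: f7f".
wait_status = 0xF7F

# Assumed Linux wait-status layout: a low byte of 0x7f marks a stopped
# task, and the next byte carries the signal that stopped it.
is_stopped = (wait_status & 0xFF) == 0x7F
stop_signal = (wait_status >> 8) & 0xFF

print(as_signed, is_stopped, stop_signal)
```

If this reading is right, the task was hit by SIGTERM while CRIU was waiting for it to reach rt_sigreturn, so the "syscall -1" is a symptom rather than the cause.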

This was criu-3.15 on Fedora 33 (amd64), running in a Vagrant VirtualBox VM on Mac OS X. Sorry, I can't provide more details at the moment.

@github-actions

A friendly reminder that this issue had no activity for 30 days.
