Conversation

@aojea commented Jun 6, 2019

kind uses prow for its CI; the jobs are defined in https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubernetes-sigs/kind/kind-presubmits.yaml
kind also has scripts in its repo that run the e2e tests and deal with all the necessary plumbing.

Openlab offers the possibility of testing on ARM64 architectures.
This PR adapts the Openlab jobs to use the kind scripts, aligning the testing across the different CIs, reducing divergence and simplifying maintenance.

xref: kubernetes-sigs/kind#188
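For readers unfamiliar with this setup: a job of this shape essentially just checks out kind and delegates everything to its hack/ci/e2e.sh script. A hypothetical sketch in Zuul's job syntax (the job name and playbook path here are illustrative assumptions, not the actual Openlab configuration in this PR):

```yaml
# Hypothetical sketch only; the job name and playbook path are assumptions.
- job:
    name: kind-e2e-kubernetes-arm64
    description: Build kind and run the Kubernetes e2e tests on ARM64.
    run: playbooks/kind-e2e.yaml   # playbook that ends up invoking hack/ci/e2e.sh
```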

theopenlab-ci bot commented Jun 6, 2019

Build succeeded.

@ZhengZhenyu (Contributor)

@aojea Thanks for doing this. This is now a periodic job that runs at 06:00 and 18:00 UTC each day. I can also trigger a manual job, but it won't have its logs posted on Openlab, since it is missing some parameters for the log upload.

@aojea (Author) commented Jun 6, 2019

@ZhengZhenyu feel free to "steal" the PR and try it, or adapt it so we can trigger it manually and check the logs to see if it works.

@ZhengZhenyu (Contributor)

@aojea thanks, I will do it. We are now on a 3-day holiday, so I may reply a little late.

@ZhengZhenyu (Contributor) commented Jun 10, 2019

@aojea @BenTheElder Hi, I've tested this today; the results are:

  1. go get kubernetes@master does not seem to work; it fails with "error loading module requirements".
  2. The test script e2e.sh cannot be used directly, as the default base image is built for x86 (I guess), so it cannot be used on the ARM64 architecture.
  3. I manually built a base image, modified the script to use it, and the script errored at:
zuul@ubuntu:~/go/pkg/mod/sigs.k8s.io$  cd $GOPATH/src/k8s.io/kubernetes && $GOPATH/src/sigs.k8s.io/kind/hack/ci/e2e.sh
+ main
+ ARTIFACTS=/home/zuul/go/src/k8s.io/kubernetes/_artifacts
+ mkdir -p /home/zuul/go/src/k8s.io/kubernetes/_artifacts
+ export ARTIFACTS
+ trap cleanup EXIT
+ install_kind
++ mktemp -d
+ TMP_DIR=/tmp/tmp.WqyiW31WJi
+ mkdir -p /tmp/tmp.WqyiW31WJi/bin
+ local script_dir
++ dirname /home/zuul/go/src/sigs.k8s.io/kind/hack/ci/e2e.sh
+ script_dir=/home/zuul/go/src/sigs.k8s.io/kind/hack/ci
+ make -C /home/zuul/go/src/sigs.k8s.io/kind/hack/ci/../.. install INSTALL_PATH=/tmp/tmp.WqyiW31WJi/bin
make: Entering directory '/home/zuul/go/pkg/mod/sigs.k8s.io/[email protected]'
+ Ensuring build cache volume exists
docker volume create kind-build-cache
kind-build-cache
+ Ensuring build output directory exists
mkdir -p /home/zuul/go/pkg/mod/sigs.k8s.io/[email protected]/bin
+ Building kind binary
docker run \
	--rm \
	-v kind-build-cache:/go \
	-e GOCACHE=/go/cache \
	-v /home/zuul/go/pkg/mod/sigs.k8s.io/[email protected]/bin:/out \
	-v /home/zuul/go/pkg/mod/sigs.k8s.io/[email protected]:/src/kind \
	-w /src/kind \
	-e GO111MODULE=on \
	-e GOPROXY=https://proxy.golang.org \
	-e CGO_ENABLED=0 \
	-e GOOS=linux \
	-e GOARCH=arm64 \
	--user 1000:1000 \
	golang:1.12.5 \
	go build -v -o /out/kind .
+ Built kind binary to /home/zuul/go/pkg/mod/sigs.k8s.io/[email protected]/bin/kind
+ Copying kind binary to INSTALL_DIR
install /home/zuul/go/pkg/mod/sigs.k8s.io/[email protected]/bin/kind /home/zuul/go/bin/kind
make: Leaving directory '/home/zuul/go/pkg/mod/sigs.k8s.io/[email protected]'
+ PATH=/tmp/tmp.WqyiW31WJi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/go/bin:/home/zuul/go/bin
+ export PATH
+ build
+ BAZEL_REMOTE_CACHE_ENABLED=false
+ [[ false == \t\r\u\e ]]
++ go env GOPATH
+ kind build node-image --type=bazel '--kube-root=/home/zuul/go/src/k8s.io/kubernetes --base-image=kindest/base:latest --loglevel=debug'
INFO: Build option --platforms has changed, discarding analysis cache.
INFO: Analysed 4 targets (1 packages loaded, 22850 targets configured).
INFO: Found 4 targets...
INFO: Elapsed time: 171.636s, Critical Path: 131.82s
INFO: 420 processes: 420 linux-sandbox.
INFO: Build completed successfully, 434 total actions
ERRO[09:09:12] Failed to build Kubernetes: failed to write version file: open /home/zuul/go/src/k8s.io/kubernetes --base-image=kindest/base:latest --loglevel=debug/_output/git_version: no such file or directory 
Error: error building node image: failed to build kubernetes: failed to write version file: open /home/zuul/go/src/k8s.io/kubernetes --base-image=kindest/base:latest --loglevel=debug/_output/git_version: no such file or directory
+ cleanup
+ kind export logs /home/zuul/go/src/k8s.io/kubernetes/_artifacts/logs
Error: unknown cluster "kind"
+ true
+ [[ '' = true ]]
+ rm -f _output/bin/e2e.test
+ [[ -n /tmp/tmp.WqyiW31WJi ]]
+ rm -rf /tmp/tmp.WqyiW31WJi
  4. I also tested the old way with bazel, on v0.3.0/master, with Kubernetes versions master/1.15/1.14; the problems seem similar. I also noticed that the built images for the kube-* services have a -dirty suffix; will that matter?
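Side note on the trace in item 3: the failure there looks like a shell-quoting issue rather than an ARM-specific one. In the traced command, the three flags were passed to kind as one single quoted string, so everything after --kube-root= (including " --base-image=kindest/base:latest --loglevel=debug") was treated as part of the path, hence the "no such file or directory" on that strange path. This may have been introduced by the local modification mentioned in item 3. A minimal demonstration of the quoting difference (the flag values are illustrative):

```shell
# Demonstrates how quoting changes argument splitting; values are illustrative.
count_args() { echo $#; }

FLAGS="--kube-root=/home/zuul/go/src/k8s.io/kubernetes --base-image=kindest/base:latest --loglevel=debug"

count_args "$FLAGS"   # prints 1: one big argument, as in the failing trace
count_args $FLAGS     # prints 3: three separate flags (word splitting applies)
```

If that is what happened, the fix is simply to pass the flags unquoted as separate words (or via a shell array).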

@aojea (Author) commented Jun 10, 2019

Can you try to build the image with kind build node-image --type docker instead of --type bazel?

When you use go get, do you have the environment variable GO111MODULE=on set?

@ZhengZhenyu (Contributor)

@aojea yes, I used GO111MODULE=on when calling go get.
I will try --type docker today, but I think there was a problem before, and that's why we used bazel; I forgot the details, so I will try again.

@ZhengZhenyu (Contributor)

@aojea here is the test result:
zuul@ubuntu:~/go/src/k8s.io/kubernetes$ kind build node-image --type docker --base-image kindest/base:latest --loglevel debug
INFO[08:20:24] Starting to build Kubernetes
DEBU[08:20:24] Running: build/run.sh [build/run.sh make all WHAT=cmd/kubeadm cmd/kubectl cmd/kubelet KUBE_BUILD_PLATFORMS=linux/arm64 KUBE_VERBOSE=0]
+++ [0611 08:20:24] Verifying Prerequisites....
+++ [0611 08:20:26] Building Docker image kube-build:build-5c6a6c6b02-5-v1.12.5-1
+++ Docker build command failed for kube-build:build-5c6a6c6b02-5-v1.12.5-1

Sending build context to Docker daemon 8.704kB
Step 1/16 : FROM k8s.gcr.io/kube-cross:v1.12.5-1
v1.12.5-1: Pulling from kube-cross
(... image layer pull and extraction progress elided ...)
Digest: sha256:465ec84ec3a24e739e5a43c615c27d3b35742d1c2ec80e9d12432126ff3b20a4
Status: Downloaded newer image for k8s.gcr.io/kube-cross:v1.12.5-1
---> 834eab288e26
Step 2/16 : RUN touch /kube-build-image
---> Running in 371f96134c16
standard_init_linux.go:207: exec user process caused "exec format error"
The command '/bin/sh -c touch /kube-build-image' returned a non-zero code: 1

To retry manually, run:

docker build -t kube-build:build-5c6a6c6b02-5-v1.12.5-1 --pull=false /home/zuul/go/src/k8s.io/kubernetes/_output/images/kube-build:build-5c6a6c6b02-5-v1.12.5-1

!!! [0611 08:23:33] Call tree:
!!! [0611 08:23:33] 1: build/run.sh:31 kube::build::build_image(...)
!!! Error in build/../build/common.sh:462
Error in build/../build/common.sh:462. '((i<3-1))' exited with status 1
Call stack:
1: build/../build/common.sh:462 kube::build::build_image(...)
2: build/run.sh:31 main(...)
Exiting with status 1
ERRO[08:23:33] Failed to build Kubernetes: failed to build binaries: exit status 1
Error: error building node image: failed to build kubernetes: failed to build binaries: exit status 1

I remember there was an issue about this, but I cannot find it.
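The "exec format error" in the trace above is the classic symptom of running a binary built for a different CPU architecture: the k8s.gcr.io/kube-cross:v1.12.5-1 image pulled here appears to be amd64-only, so even the touch in step 2/16 cannot execute on an arm64 host. A quick sanity check is to compare the host architecture with the image's. A small helper for the uname-to-GOARCH mapping (a diagnostic sketch, not part of kind or the kubernetes build scripts):

```shell
# Map a `uname -m` machine string to the GOARCH/Docker architecture name.
# Diagnostic sketch only; not part of kind or the kubernetes build scripts.
go_arch() {
  case "$1" in
    x86_64)        echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    armv7l)        echo arm ;;
    *)             echo unknown ;;
  esac
}

go_arch "$(uname -m)"
# Compare against: docker image inspect --format '{{.Architecture}}' <image>
```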

@dims (Contributor) commented Jun 11, 2019

@aojea @ZhengZhenyu kubernetes/kubernetes#75058

@aojea (Author) commented Jun 11, 2019

It seems there is also a problem with the images; kind can't create a cluster with the standard images either:


root@node:~/go/bin# docker ps -a
CONTAINER ID        IMAGE                  COMMAND                  CREATED              STATUS                     PORTS               NAMES
2d93c6996498        kindest/node:v1.14.2   "/usr/local/bin/entr…"   9 seconds ago        Exited (1) 6 seconds ago                       kind-control-plane
4bf68a20b6f5        nginx                  "nginx -g 'daemon of…"   About a minute ago   Up About a minute          80/tcp              elegant_shamir
226a859583f9        kindest/node:v1.14.2   "bash"                   3 minutes ago        Exited (1) 3 minutes ago                       suspicious_ganguly
root@node:~/go/bin# docker logs 2d93c6996498
standard_init_linux.go:207: exec user process caused "exec format error"

@ZhengZhenyu (Contributor)

> It seems there is also a problem with the images; kind can't create a cluster with the standard images either:
>
> root@node:~/go/bin# docker ps -a
> CONTAINER ID        IMAGE                  COMMAND                  CREATED              STATUS                     PORTS               NAMES
> 2d93c6996498        kindest/node:v1.14.2   "/usr/local/bin/entr…"   9 seconds ago        Exited (1) 6 seconds ago                       kind-control-plane
> 4bf68a20b6f5        nginx                  "nginx -g 'daemon of…"   About a minute ago   Up About a minute          80/tcp              elegant_shamir
> 226a859583f9        kindest/node:v1.14.2   "bash"                   3 minutes ago        Exited (1) 3 minutes ago                       suspicious_ganguly
> root@node:~/go/bin# docker logs 2d93c6996498
> standard_init_linux.go:207: exec user process caused "exec format error"

Where does this come from :) ?

@aojea (Author) commented Jun 12, 2019

@ZhengZhenyu I managed to get access to an ARM64 server ;)

@kiwik (Contributor) commented Jun 25, 2019

@ZhengZhenyu Any updates about this issue?

@BenTheElder

Using the bazel build mode might (??) help while we wait for the patch to the make build to land upstream.

@ZhengZhenyu (Contributor)

@kiwik I'm planning to make this particular job a [email protected] + [email protected] job, since that's what most of the other jobs in testgrid look like (testing against stable versions). I've tested the workflow and downgraded bazel to 0.23.2, since k8s 1.14 could not be built with a newer bazel (there was a bug, but I cannot find the link at the moment). The test process still fails because the k8s cluster could not come up, but I think we can first put this version up on testgrid and then debug the details.

@BenTheElder sorry I didn't quite get what you mean

@ZhengZhenyu (Contributor)

@aojea @BenTheElder as aojea mentioned before, the images for the kube-* services do not seem to be correctly included in the node image; I noticed that during the build process those images have a -dirty tag:

INFO[2019-06-26T02:57:58.001141763Z] ImageCreate event &ImageCreate{Name:k8s.gcr.io/coredns:1.3.1,Labels:map[string]string{},}
INFO[2019-06-26T02:58:00.486173536Z] ImageCreate event &ImageCreate{Name:sha256:7e8edeee9a1e73cdd4a1209eaa12aee15933456c7b6c0eb7d6758c8e1a078d0a,Labels:map[string]string{io.cri-containerd.image: managed,},}
INFO[2019-06-26T02:58:00.488179783Z] ImageUpdate event &ImageUpdate{Name:k8s.gcr.io/coredns:1.3.1,Labels:map[string]string{io.cri-containerd.image: managed,},}
INFO[2019-06-26T02:58:01.515949925Z] ImageCreate event &ImageCreate{Name:docker.io/kindest/kindnetd:0.5.0,Labels:map[string]string{},}
INFO[2019-06-26T02:58:03.949139107Z] ImageCreate event &ImageCreate{Name:sha256:cdabfe761f136a9d3be0847b71cea1477d877e8d3da08120296e7d577aad5b40,Labels:map[string]string{io.cri-containerd.image: managed,},}
INFO[2019-06-26T02:58:03.951185014Z] ImageCreate event &ImageCreate{Name:k8s.gcr.io/kube-scheduler:v1.14.3-dirty,Labels:map[string]string{},}
INFO[2019-06-26T02:58:04.039681918Z] ImageUpdate event &ImageUpdate{Name:docker.io/kindest/kindnetd:0.5.0,Labels:map[string]string{io.cri-containerd.image: managed,},}
INFO[2019-06-26T02:58:04.041740396Z] ImageCreate event &ImageCreate{Name:k8s.gcr.io/kube-controller-manager:v1.14.3-dirty,Labels:map[string]string{},}
INFO[2019-06-26T02:58:04.093156584Z] ImageCreate event &ImageCreate{Name:sha256:582a9728dd169dce863fa42f2d3295a49e6c536dc7cb4f40c50e6cd3ed3d51dc,Labels:map[string]string{io.cri-containerd.image: managed,},}
INFO[2019-06-26T02:58:04.095104559Z] ImageCreate event &ImageCreate{Name:k8s.gcr.io/kube-apiserver:v1.14.3-dirty,Labels:map[string]string{},}
INFO[2019-06-26T02:58:04.121447549Z] ImageUpdate event &ImageUpdate{Name:k8s.gcr.io/kube-scheduler:v1.14.3-dirty,Labels:map[string]string{io.cri-containerd.image: managed,},}
INFO[2019-06-26T02:58:04.123270129Z] ImageCreate event &ImageCreate{Name:k8s.gcr.io/kube-proxy:v1.14.3-dirty,Labels:map[string]string{},}

Could this be the problem?
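For context on where a -dirty tag comes from: the kubernetes build derives its version from the git state of the source tree and marks the build dirty when the tree has modified or untracked files. A minimal sketch of that kind of check, assuming only git is available (it uses a throwaway repository, not the kubernetes tree; kubernetes' own version detection works along these lines):

```shell
# Sketch: classify a working tree as clean or dirty, the way a build's
# version stamping typically does. Throwaway repo; assumes git is installed.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=ci@example.invalid -c user.name=ci \
    commit -q --allow-empty -m init

tree_state() {
  if [ -z "$(git status --porcelain)" ]; then echo clean; else echo dirty; fi
}

tree_state        # prints: clean
touch stray-file  # any stray or modified file flips the state
tree_state        # prints: dirty
```

So a CI step that writes stray files into the checkout before the image build would explain the -dirty tags seen here.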

@ZhengZhenyu (Contributor) commented Jun 28, 2019

@aojea @BenTheElder Hi, I've been testing over the last few days, and it turns out the problem might be related to containerd: I tested [email protected], 0.3.0, 0.4.0 and master, and 0.2.0 is OK while all the others, which use containerd, are not. The error is actually:

https://gist.github.com/ZhengZhenyu/69226fc7bdebc2025d0fcb827fd75e47

So it is this command:
https://github.com/kubernetes-sigs/kind/blob/3d35585e745c3426dbb3b9b97cb5e5e409796fad/pkg/build/node/node.go#L503

that failed to execute; the image import fails, and there is no *.tar file in the container's /run/containerd/io/containerd.runtime.v1.linux/ directory, so the k8s cluster won't come up.

I've refactored the current job; since it is a periodic job, I will update the logs tomorrow and maybe report an issue in kind.

@ZhengZhenyu (Contributor)

Test updated: #576
&
Issue reported: kubernetes-sigs/kind#676

@ZhengZhenyu closed this Jul 1, 2019
@BenTheElder

> @aojea @BenTheElder as aojea mentioned before, the images for the kube-* services do not seem to be correctly included in the node image; I noticed that during the build process those images have a -dirty tag:

"-dirty" means something has improperly modified the kubernetes source tree and the kubernetes build scripts are detecting this. We should not do this.

@BenTheElder

> It seems there is also a problem with the images; kind can't create a cluster with the standard images either.

kindest/node as published on Docker Hub is currently single-arch while the logistics are sorted out; this one will be a huge pain to cross-compile (because of kubernetes etc.).

The rest (like the CNI images) are generally multi-arch. The node image can be built on ARM (which we are doing here) and then run.

> @BenTheElder sorry I didn't quite get what you mean

Er, what I mean is: use kind build node-image --type=bazel, which uses kubernetes's bazel-based build instead, since there are known issues with the make/docker/bash build that have outstanding PRs upstream (not kind-related).

@BenTheElder

The containerd storage should be in the content store, by layer, and it mostly appears to be working as expected. I am pretty sure the issue is CNI-related, which also changed in the same time frame; many things changed between 0.2 and 0.3.

I'll try to get onto an ARM server soon and look at this...

@aojea deleted the kindarm branch July 4, 2019 07:49