What happened in your environment?
We found a potential overlaybd bug: it returned incorrect data while the network was down. This can lead to application failures; in our case, Java failed to load a class.
What did you expect to happen?
When the network is down, class loading should block completely until the network recovers. However, we currently see "Exception: java.lang.NoClassDefFoundError" and "error reading zip file" after retrying for 3+ minutes.
We suspect a bug in overlaybd: it returns an unexpected result when it should instead block until the network recovers. This suspicion is based on the following experiments:
We ran systemctl stop overlaybd-tcmu, after which the jar command hung until overlaybd-tcmu recovered.
With a normal jar stored on a device-mapper block device, if we suspended IO on the DM device, the jar command hung until the IO suspension was removed.
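The device-mapper experiment can be sketched roughly as below. This is only an illustration: the device name `repro-dm` and the `RUN` dry-run switch are hypothetical, and the helpers print the `dmsetup` commands by default instead of executing them.

```shell
#!/bin/sh
# Hypothetical device-mapper name; replace it with the DM device holding the jar.
DM_NAME="${DM_NAME:-repro-dm}"

# RUN defaults to "echo", so the helpers only print the commands (dry run).
# Set RUN="" and run as root to actually execute them.
RUN="${RUN-echo}"

# Suspend IO on the device: reads against it should block, not fail.
suspend_io() { $RUN dmsetup suspend "$DM_NAME"; }

# Resume IO on the device: blocked reads should then complete.
resume_io() { $RUN dmsetup resume "$DM_NAME"; }
```

With IO suspended, running `jar vft` on a jar stored on that device is expected to hang until `resume_io` is called, which is the behavior we would also expect from overlaybd while the network is down.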
How can we reproduce it?
Step 1: build, convert, and push a repro image using the following Dockerfile
FROM ubuntu:18.04
RUN apt-get update \
# TODO: upgrade to JAVA 11 in the next sprint
&& apt-get install -y openjdk-8-jdk git vim \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN git clone https://github.com/macagua/example.java.helloworld.git && cd example.java.helloworld && javac HelloWorld/Main.java && jar cfme Main.jar Manifest.txt HelloWorld.Main HelloWorld/Main.class
RUN echo 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' >> ~/.bashrc
Step 2: rpull the image and start a bash shell in the container
/opt/overlaybd/snapshotter/ctr -n k8s.io rpull -u $USERNAME:$PASSWORD $IMAGE_REF
ctr -n k8s.io run --snapshotter=overlaybd --rm -t $IMAGE_REF test-jar bash
# Inside the shell, run the `jar` command to load the binary
Step 3: shut down the network. We did this by turning off the security group of the VM
Step 4: inside the bash shell, run
jar vft ./example.java.helloworld/Main.jar
# After several minutes, we see the "error reading zip file" error
root@ip-10-0-0-134:/# jar vft ./example.java.helloworld/Main.jar
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar: error reading zip file
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar: error reading zip file
Exception in thread "main"
Exception: java.lang.NoClassDefFoundError thrown from the UncaughtExceptionHandler in thread "main"
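To tell a blocked read apart from a failed one in a scripted repro, a small helper like the following could be used. This is a sketch, not part of overlaybd: `classify_read` is a hypothetical name, and it assumes GNU coreutils `timeout`, which exits with status 124 when the time limit is reached.

```shell
#!/bin/sh
# classify_read SECONDS CMD...: report whether CMD succeeds, fails, or is
# still running when the time limit expires.
# Caveat: a process stuck in uninterruptible IO may ignore the signal sent by
# timeout, in which case this helper itself keeps waiting.
classify_read() {
    limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    case "$status" in
        0)   echo "ok" ;;
        124) echo "blocked" ;;  # GNU timeout: the command hit the time limit
        *)   echo "failed" ;;   # the command returned an error on its own
    esac
}
```

For example, `classify_read 600 jar vft ./example.java.helloworld/Main.jar` run inside the container while the network is down would be expected to print `blocked` (or keep waiting); a result of `failed` would correspond to the behavior reported here.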
What is the version of your Accelerated Container Image?
overlaybd 0.6.10
What is your OS environment?
ubuntu
Are you willing to submit PRs to fix it?
Yes, I am willing to fix it.