agent: Add support for ephemeral volumes #236
Conversation
Corresponding PR in runtime: kata-containers/runtime#307
Great idea - thanks for the PR, Harshal. Couple questions/comments.
@amshinde PTAL?
mount.go
Outdated
absSource, err = filepath.EvalSymlinks(source)
if err != nil {
	return grpcStatus.Errorf(codes.Internal, "Could not resolve symlink for source %v", source)
if fsType == typeTmpFs {
Since we have more than a couple of cases for the fsType now, I wonder if it'd warrant throwing this into a switch case? WDYT?
Yes, a switch case makes more sense here.
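For illustration, a rough sketch of what that switch could look like, reusing the names already in mount.go (the branch bodies are only indicative of the current behaviour, not a definitive rewrite):

```go
// Neither the 9p tag nor the literal "tmpfs" source is a host path,
// so only the remaining filesystem types need symlink resolution.
switch fsType {
case type9pFs, typeTmpFs:
	absSource = source
default:
	absSource, err = filepath.EvalSymlinks(source)
	if err != nil {
		return grpcStatus.Errorf(codes.Internal, "Could not resolve symlink for source %v", source)
	}
}
```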
grpc.go
Outdated
if len(splitSourceSlice) > 1 {
	k8sStorageType := splitSourceSlice[len(splitSourceSlice)-2]
	if k8sStorageType == "emphemeral" {
		// Only mount tmpfs once on the host, and reuse the the mount destination
Just my own ignorance likely -- can you clarify "mount tmpfs once on the host"? The agent is running in the guest.
Maybe we shouldn't mount tmpfs on the host, and instead mount it directly in the guest VM.
@egernst Sorry, I will update that comment in the code to avoid the confusion. It should read: "Only mount tmpfs once inside the VM, and reuse the mount destination".
Ephemeral Volumes live and die with the pod. They are created and mounted on the host by the kubelet and then bind mounted into every container in the pod that wants to use the volume. In the case of Kata, the pod is our VM sandbox, so the volume has to be confined within the VM context. The tmpfs has to be created inside the VM, hence the code to create tmpfs in the agent. This way we increase the isolation Kata provides to the containers: containers that want to share data using ephemeral volumes don't have to leak their data to the host, because the backing volume stays within VM memory.

@jshachm By the time control comes to Kata, the kubelet has already provisioned tmpfs on the host. We can choose to ignore that and provision our own tmpfs inside the guest, which is what this patch is trying to achieve.
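To make that flow concrete, here is a minimal, self-contained sketch of the tmpfs creation the agent performs inside the guest (the path and function name are illustrative; the real logic lives in the agent's mount handling):

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mountEphemeralTmpfs backs an ephemeral volume with tmpfs inside the guest.
// Every container in the pod later bind-mounts this directory, so the shared
// data never leaves VM memory.
func mountEphemeralTmpfs(mountPoint string) error {
	if err := os.MkdirAll(mountPoint, 0750); err != nil {
		return fmt.Errorf("could not create mount point %s: %v", mountPoint, err)
	}
	return syscall.Mount("tmpfs", mountPoint, "tmpfs", 0, "")
}

func main() {
	// "/ephemeral/cache-volume" mirrors the example path used later in this
	// discussion; mounting requires root inside the guest.
	if err := mountEphemeralTmpfs("/ephemeral/cache-volume"); err != nil {
		fmt.Println("mount failed:", err)
	}
}
```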
Can you create some tests for this change (either unit tests here, or maybe raise an issue in https://github.com/kata-containers/tests for a different type of test)?
grpc.go
Outdated
splitSourceSlice := strings.Split(ociMnt.Source, "/")
if len(splitSourceSlice) > 1 {
	k8sStorageType := splitSourceSlice[len(splitSourceSlice)-2]
	if k8sStorageType == "emphemeral" {
- Is this a reliable test? This appears to be looking for the word "emphemeral" in the ociMnt.Source path (and that "magic tag" might change at some point).
- This block uses deep indentation. You could save one level by restructuring slightly (sketch below): if k8sStorageType != "emphemeral" { continue } ...
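A self-contained sketch of the flatter loop being suggested (mock mount entries stand in for ociSpec.Mounts, and "emphemeral" is the literal tag used by this revision of the runtime patch):

```go
package main

import (
	"fmt"
	"strings"
)

// mount mimics the fields of an OCI spec mount entry that matter here.
type mount struct {
	Source      string
	Destination string
}

func main() {
	mounts := []mount{
		{Source: "/emphemeral/cache-volume", Destination: "/cache"},
		{Source: "/run/kata-containers/shared/containers/rootfs", Destination: "/"},
	}

	for idx, m := range mounts {
		parts := strings.Split(m.Source, "/")
		if len(parts) <= 1 {
			continue
		}
		// Bailing out early saves one level of indentation compared to
		// nesting the whole ephemeral-handling block inside the check.
		if parts[len(parts)-2] != "emphemeral" {
			continue
		}
		fmt.Printf("mount %d (%s) is an ephemeral volume\n", idx, m.Source)
	}
}
```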
@jodh-intel It is very reliable to check for the k8sStorageType == "emphemeral" condition. The string emphemeral is introduced by this patch: https://github.com/kata-containers/runtime/pull/307/files#diff-8a56f5a1fa5971cc43e650ffa24f1e2bR376

That patch in the runtime loops over the given OCI mounts to look for the mounts used by k8s for ephemeral storage. It then replaces those mount paths on the host with mount paths in the guest by adding "/ephemeral" to the path. So the existence of that string is very much in our control. Right now, without this patch, an ephemeral volume is passed to the guest via 9p.

e.g. A typical ephemeral volume path looks like /var/lib/kubelet/pods/366c3a75-4869-11e8-b479-507b9ddd5ce4/volumes/kubernetes.io~empty-dir/cache-volume, but this resides on the host. We want that volume to be provisioned inside our VM. We can keep the same path, or we can make it simpler since we have total control over what happens inside the sandbox VM. So I replace the original source path with the much simpler "/ephemeral". Inside the VM the above path becomes /ephemeral/cache-volume.
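As a standalone illustration of that rewrite (not the actual runtime code from PR #307; the helper name and the empty-dir marker check are assumptions for the example):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// rewriteEphemeralSource maps a kubelet empty-dir host path onto the simpler
// guest path, keeping only the volume name. It returns the rewritten path and
// whether the mount was recognised as ephemeral.
func rewriteEphemeralSource(hostPath string) (string, bool) {
	if !strings.Contains(hostPath, "kubernetes.io~empty-dir") {
		return hostPath, false
	}
	return filepath.Join("/ephemeral", filepath.Base(hostPath)), true
}

func main() {
	src := "/var/lib/kubelet/pods/366c3a75-4869-11e8-b479-507b9ddd5ce4/volumes/kubernetes.io~empty-dir/cache-volume"
	guestPath, ok := rewriteEphemeralSource(src)
	fmt.Println(guestPath, ok) // /ephemeral/cache-volume true
}
```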
@jodh-intel I will add some unit test cases.
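A rough sketch of what such a test might look like, assuming the ephemeralStorageHandler(storage pb.Storage, s *sandbox) signature shown later in this review; since the handler ends up mounting tmpfs, the test would need root (the skip helper below is purely illustrative):

```go
func TestEphemeralStorageHandler(t *testing.T) {
	skipUnlessRoot(t) // hypothetical helper: mounting tmpfs needs privileges

	tmpDir, err := ioutil.TempDir("", "ephemeral-test")
	if err != nil {
		t.Fatal(err)
	}
	defer os.RemoveAll(tmpDir)

	storage := pb.Storage{
		Source:     "tmpfs",
		Fstype:     "tmpfs",
		MountPoint: filepath.Join(tmpDir, "cache-volume"),
	}

	if _, err := ephemeralStorageHandler(storage, &sandbox{}); err != nil {
		t.Fatalf("ephemeralStorageHandler failed: %v", err)
	}
	defer syscall.Unmount(storage.MountPoint, 0)

	if _, err := os.Stat(storage.MountPoint); err != nil {
		t.Fatalf("mount point was not created: %v", err)
	}
}
```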
grpc.go
Outdated
for idx, ociMnt := range ociSpec.Mounts {
	splitSourceSlice := strings.Split(ociMnt.Source, "/")
	if len(splitSourceSlice) > 1 {
		k8sStorageType := splitSourceSlice[len(splitSourceSlice)-2]
You could create a variable for len(splitSourceSlice), as you've computed that length twice here.
sure.
Instead of checking k8sStorageType, we should add an ephemeral storage driver type and let ociSpec.Mounts just reference it. For example, let req.Storage have an entry for the ephemeral volume, and in ociSpec.Mounts let ociSpec.Mounts[i] have a mount that points at it. The main change is in addStorages, where we create the tmpfs mount point.
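A rough sketch of the shape being suggested (the driver name and values are guesses based on this discussion, not the exact snippets originally posted):

```go
// req.Storage gains an entry with its own driver type for the ephemeral
// volume; the agent's addStorages() creates the tmpfs mount point from it.
ephemeralStorage := &pb.Storage{
	Driver:     "ephemeral", // hypothetical driver name
	Source:     "tmpfs",
	Fstype:     "tmpfs",
	MountPoint: "/ephemeral/cache-volume",
}

// The container's OCI mount then just bind-mounts that guest path, with no
// "magic tag" parsing of the source string.
ociMount := specs.Mount{
	Destination: "/cache",
	Source:      ephemeralStorage.MountPoint,
	Type:        "bind",
	Options:     []string{"rbind"},
}
```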
@bergwolf Sounds good, I will give it a shot and update the PR.
@harche just to clarify about ephemeral volumes. Do you expect anything to be present on the host at a location like the kubelet path above? I ask because if this is basically an empty volume, then great, a simple volume backed by tmpfs on the guest will do the trick. Just want to make sure we're on the same page!
@sboeuf There isn't any data initially. Ephemeral volumes are used only for sharing data between the containers of the same pod. So the volume is empty to begin with, unless a container from the respective pod writes something to it.
@harche okay, thanks for the clarification ;)
@sboeuf @bergwolf @jodh-intel I have updated this PR as well as the corresponding PR in runtime.
@harche Thanks for the change.
LGTM. Once CI is green, it can be merged.
mount.go
Outdated
_, err = commonStorageHandler(storage)
return "", err
} else {
You can delete this else to make CI happy.
sure.
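For reference, the lint complaint goes away by letting the alternate path simply follow the if block; a generic, hypothetical sketch of the shape involved (the condition and elided bodies are placeholders):

```go
// gometalinter flags "if <cond> { ...; return ... } else { ... }".
// Since the first branch returns, the else wrapper can simply be dropped:
if cond {
	_, err = commonStorageHandler(storage)
	return "", err
}
// ...code that was inside the else now follows naturally...
```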
Codecov Report
@@ Coverage Diff @@
## master #236 +/- ##
==========================================
+ Coverage 43.42% 43.84% +0.42%
==========================================
Files 14 14
Lines 2091 2178 +87
==========================================
+ Hits 908 955 +47
- Misses 1070 1103 +33
- Partials 113 120 +7
Continue to review full report at Codecov.
Thanks @harche! lgtm
One comment!
Looks good to me otherwise.
mount.go
Outdated
func ephemeralStorageHandler(storage pb.Storage, s *sandbox) (string, error) {
	if _, err := os.Stat(storage.MountPoint); os.IsNotExist(err) {
		os.MkdirAll(storage.MountPoint, os.ModePerm)
Please test the return of MkdirAll().
Good catch @sboeuf!
@harche You do not even need to check if the directory exists before calling MkdirAll. From the docs, "If path is already a directory, MkdirAll does nothing and returns nil."
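For reference, the error-checked form of that call is just the following fragment (whether the surrounding os.Stat guard stays is a separate question, discussed below):

```go
// MkdirAll returns nil if the directory already exists, so only a real
// failure propagates.
if err := os.MkdirAll(storage.MountPoint, os.ModePerm); err != nil {
	return "", err
}
```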
@amshinde Well, I think you need to check this because in case it does exist, you don't need to call into commonStorageHandler().
And why are we not calling commonStorageHandler() then? It may happen that the directory exists, but we skip mounting altogether.
@amshinde The tmpfs mount point has to be mounted on the guest only once; the rest of the containers then bind mount that directory.

e.g. If all the containers in the pod need to have an ephemeral volume, they will bind mount, say, the /sharedMount directory in the VM. But before that, /sharedMount has to be mounted in the VM backed by tmpfs (only once).

The above code is tested with a k8s YAML with 3 containers sharing the mount point:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  annotations:
    io.kubernetes.cri.untrusted-workload: "true"
  labels:
    app: myapp
spec:
  containers:
    # Application Container
    - name: myapp-container
      volumeMounts:
        - mountPath: /cache
          name: cache-volume
      imagePullPolicy: Never
      image: app_image:latest
      command: ['sh', '-c', 'cat /cache/test && sleep 3600']
  initContainers:
    # Copy Data container
    - name: data-container
      volumeMounts:
        - mountPath: /cache
          name: cache-volume
      imagePullPolicy: Never
      image: data_image:latest
      command: ['sh', '-c', 'cp /test /cache/']
    # Append Data container
    - name: init-append-data
      volumeMounts:
        - mountPath: /cache
          name: cache-volume
      imagePullPolicy: Never
      image: test_image:latest
      command: ['sh', '-c', 'echo test >> /cache/test']
  volumes:
    - name: cache-volume
      emptyDir:
        medium: "Memory"
```
@amshinde Mounting is intentionally skipped if the dir exists, because otherwise every single container ends up pointing at a separate tmpfs volume and they don't end up sharing the volume. So only once (for the first time) you have to make sure the dir exists and then mount it in the guest backed by tmpfs. The corresponding entries in all the containers' OCI specs already point at bind mounting that dir.
@harche Thanks for the explanation, my question was based on the assumption that you would be adding storage just once for the first container.
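Putting the thread together, a hypothetical sketch of the handler with the mkdir check added, keeping the existence check that makes the tmpfs mount happen only once per pod (the return values and helpers follow the fragments quoted above, not necessarily the final merged code):

```go
func ephemeralStorageHandler(storage pb.Storage, s *sandbox) (string, error) {
	// Only the first container to reference the volume creates and mounts
	// the tmpfs; later containers find the directory and just bind-mount it.
	if _, err := os.Stat(storage.MountPoint); os.IsNotExist(err) {
		if err := os.MkdirAll(storage.MountPoint, os.ModePerm); err != nil {
			return "", err
		}
		_, err := commonStorageHandler(storage)
		return "", err
	}
	return "", nil
}
```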
Adding dnm label until the mkdir check is added.
if fsType != type9pFs {
	var err error

var err error
nit: you don't need to declare err here, since you use a new instance in the inner scope.
yes. thanks.
mount.go
Outdated
var err error
switch fsType {
case typeTmpFs, type9pFs:
	if err := createDestinationDir(destination); err != nil {
@harche Are you not already creating the mount destination directory in the "ephemeral" storage handler?
Yes, I will update the PR. Thanks.
lgtm
@harche CI is failing, can you take a look so that this can be merged?
@harche please fix the unit tests.
Ephemeral volumes need to be mounted inside the VM and not passed as 9pfs mounts. Fixes: kata-containers#235 Signed-off-by: Harshal Patil <[email protected]>