
Commit 50d8e87

[Nomad] Support using external gluster cluster in Nomad (kadalu#659)
- Use Kadalu CSI as a bridge between an external gluster cluster and Nomad jobs that require persistent data

Fixes: kadalu#510
Signed-off-by: Leela Venkaiah G <[email protected]>
1 parent 68ef132 commit 50d8e87

7 files changed, +598 -0 lines changed

doc/quick-start.adoc

+2
@@ -5,6 +5,8 @@

Download the latest release of Kadalu Kubectl plugin using,

For using external gluster cluster volumes in HashiCorp Nomad, refer to the https://github.com/kadalu/kadalu/tree/devel/nomad[Nomad folder] in the Kadalu repo.

[source,console]
----

nomad/README.md

+221
@@ -0,0 +1,221 @@
# Kadalu CSI Plugin

- The configuration here is for using external gluster volumes as persistent storage in Nomad through Kadalu CSI
- Please refer to the actual job files before proceeding with the demo, change the config as required, and adapt the commands below to your config
- Locally tested against Nomad v1.1.4

## Local Development

- This section can be skipped if you already have a Nomad cluster set up

``` sh
# Clone the config repo used to create a local Nomad cluster in docker and install shipyard following its README
-> git clone https://github.com/leelavg/kadalu-nomad && cd kadalu-nomad

# After installing shipyard, create the local cluster
-> shipyard run
[...]
-> eval $(shipyard env)
-> export job_dir="$(pwd)/kadalu"
```
- End of local cluster creation and nomad config

## Demo

### Pre-requisites
- You can configure the vars mentioned in `cluster.vars` to reflect your external gluster details
- For convenience, the necessary vars are set from the CLI while running the jobs

``` sh
-> export volname="sample-pool" gluster_hosts="10.x.x.x" gluster_volname="sample-vol" job_dir="${job_dir:-$(pwd)}"
# Make sure the external gluster volume is started and quota is enabled
-> ssh $gluster_hosts "gluster volume info $gluster_volname | grep Status"
Status: Started

-> ssh $gluster_hosts "gluster volume quota $gluster_volname enable"
volume quota : success
```
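
The commands that follow depend on `volname`, `gluster_hosts` and `gluster_volname` being set in the current shell. As a minimal sketch, a small guard can fail early when one of them is missing. `require_vars` is a hypothetical helper written for illustration and is not part of the Kadalu or Nomad tooling.

```sh
# Hypothetical helper (not shipped with Kadalu or Nomad).
# Fail early if any variable the demo commands rely on is unset or empty.
# Usage: require_vars NAME...
require_vars() {
    missing=""
    for name in "$@"; do
        # Indirect lookup that also works in POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing variables:$missing" >&2
        return 1
    fi
}
```

Calling `require_vars volname gluster_hosts gluster_volname` before the first `nomad run` returns a non-zero status and names the missing variables on stderr.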

### CSI Deployment

- Controller part
``` sh
-> nomad run -var="volname=$volname" -var="gluster_hosts=$gluster_hosts" -var="gluster_volname=$gluster_volname" $job_dir/controller.nomad
==> 2021-09-20T18:23:07+05:30: Monitoring evaluation "19317b74"
    2021-09-20T18:23:07+05:30: Evaluation triggered by job "kadalu-csi-controller"
==> 2021-09-20T18:23:08+05:30: Monitoring evaluation "19317b74"
    2021-09-20T18:23:08+05:30: Evaluation within deployment: "d9ee4dd7"
    2021-09-20T18:23:08+05:30: Allocation "d55e314d" created: node "4e105698", group "controller"
    2021-09-20T18:23:08+05:30: Evaluation status changed: "pending" -> "complete"
==> 2021-09-20T18:23:08+05:30: Evaluation "19317b74" finished with status "complete"
==> 2021-09-20T18:23:08+05:30: Monitoring deployment "d9ee4dd7"
  ✓ Deployment "d9ee4dd7" successful

    2021-09-20T18:23:28+05:30
    ID          = d9ee4dd7
    Job ID      = kadalu-csi-controller
    Job Version = 0
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    controller  1        1       1        0          2021-09-20T13:03:27Z
```

- Nodeplugin part
``` sh
-> nomad run -var="volname=$volname" -var="gluster_hosts=$gluster_hosts" -var="gluster_volname=$gluster_volname" $job_dir/nodeplugin.nomad
==> 2021-09-20T18:23:53+05:30: Monitoring evaluation "bd4d95d1"
    2021-09-20T18:23:53+05:30: Evaluation triggered by job "kadalu-csi-nodeplugin"
==> 2021-09-20T18:23:54+05:30: Monitoring evaluation "bd4d95d1"
    2021-09-20T18:23:54+05:30: Allocation "4c05ab5a" created: node "4e105698", group "nodeplugin"
    2021-09-20T18:23:54+05:30: Evaluation status changed: "pending" -> "complete"
==> 2021-09-20T18:23:54+05:30: Evaluation "bd4d95d1" finished with status "complete"
```
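
The controller and nodeplugin jobs are submitted with the same three `-var` flags. A possible way to avoid repeating them is a thin wrapper, sketched here under assumptions: `run_kadalu_job` and the `NOMAD_BIN` override are hypothetical names introduced for this example, and only the `nomad run -var=...` form already shown above is used.

```sh
# Hypothetical wrapper (not part of this repo): submit a Kadalu job file with
# the three variables that the controller and nodeplugin commands pass.
# NOMAD_BIN is an assumption of this sketch; it lets the binary be swapped,
# for example with `echo`, to preview the command without submitting a job.
run_kadalu_job() {
    job_file=$1
    "${NOMAD_BIN:-nomad}" run \
        -var="volname=$volname" \
        -var="gluster_hosts=$gluster_hosts" \
        -var="gluster_volname=$gluster_volname" \
        "$job_file"
}
```

With this in place, `run_kadalu_job $job_dir/controller.nomad` followed by `run_kadalu_job $job_dir/nodeplugin.nomad` would issue the same two commands as above, and `NOMAD_BIN=echo run_kadalu_job $job_dir/controller.nomad` only prints the arguments.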

- Check CSI plugin status
``` sh
-> nomad plugin status kadalu-csi
ID                   = kadalu-csi
Provider             = kadalu
Version              = 0.8.6
Controllers Healthy  = 1
Controllers Expected = 1
Nodes Healthy        = 1
Nodes Expected       = 1

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
d55e314d  4e105698  controller  0        run      running  1m20s ago  1m ago
4c05ab5a  4e105698  nodeplugin  0        run      running  35s ago    20s ago
```
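
The plugin is ready for volume operations once the healthy counts match the expected counts for both controllers and nodes. The following sketch checks that condition from the text output shown above. `plugin_is_healthy` is a hypothetical helper, and it assumes the `Key = Value` layout of the status output stays as in this example.

```sh
# Hypothetical helper: read `nomad plugin status <id>` output on stdin and
# succeed only when the healthy controller and node counts are present and
# equal to the expected counts.
plugin_is_healthy() {
    awk -F' *= *' '
        $1 == "Controllers Healthy"  { ch = $2 }
        $1 == "Controllers Expected" { ce = $2 }
        $1 == "Nodes Healthy"        { nh = $2 }
        $1 == "Nodes Expected"       { ne = $2 }
        END {
            ok = (ch != "" && ch == ce && nh != "" && nh == ne)
            exit (ok ? 0 : 1)
        }
    '
}
```

A typical use would be `nomad plugin status kadalu-csi | plugin_is_healthy && echo "plugin ready"`.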

### Volume Ops

- We'll go through volume creation, attachment, and deletion, i.e. a typical volume life-cycle

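The volume creation step fills the `POOL`, `GHOST` and `GVOL` placeholders of `volume.hcl` with `sed` before piping the result to `nomad volume create -`. The sketch below isolates that substitution. `render_volume_spec` is a hypothetical helper, and choosing `|` as the `sed` delimiter is a change from the walkthrough's `/`, made so that a value containing `/` does not terminate the expression early.

```sh
# Hypothetical helper: substitute the POOL, GHOST and GVOL placeholders in a
# template read from stdin, using the shell variables set in Pre-requisites.
# `|` is the sed delimiter here (the walkthrough uses `/`). Values containing
# `|`, `&` or `\` would still need escaping.
render_volume_spec() {
    sed -e "s|POOL|$volname|" \
        -e "s|GHOST|$gluster_hosts|" \
        -e "s|GVOL|$gluster_volname|"
}
```

Under these assumptions, `render_volume_spec < $job_dir/volume.hcl | nomad volume create -` would be equivalent to the creation command in the walkthrough.
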
``` sh
# Creation of nomad volume
-> sed -e "s/POOL/$volname/" -e "s/GHOST/$gluster_hosts/" -e "s/GVOL/$gluster_volname/" $job_dir/volume.hcl | nomad volume create -
Created external volume csi-test with ID csi-test

# Attach the volume to a sample app
-> nomad run $job_dir/app.nomad
==> 2021-09-20T18:28:28+05:30: Monitoring evaluation "e6dd3129"
    2021-09-20T18:28:28+05:30: Evaluation triggered by job "sample-pv-check"
==> 2021-09-20T18:28:29+05:30: Monitoring evaluation "e6dd3129"
    2021-09-20T18:28:29+05:30: Evaluation within deployment: "814e328c"
    2021-09-20T18:28:29+05:30: Allocation "64745b25" created: node "4e105698", group "apps"
    2021-09-20T18:28:29+05:30: Evaluation status changed: "pending" -> "complete"
==> 2021-09-20T18:28:29+05:30: Evaluation "e6dd3129" finished with status "complete"
==> 2021-09-20T18:28:29+05:30: Monitoring deployment "814e328c"
  ✓ Deployment "814e328c" successful

    2021-09-20T18:28:58+05:30
    ID          = 814e328c
    Job ID      = sample-pv-check
    Job Version = 0
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    apps        1        1       1        0          2021-09-20T13:08:56Z

# The allocation ID (64745b25) comes from the output above
-> export app=64745b25

# Verify that the CSI volume is accessible
-> nomad alloc exec $app bash /kadalu/script.sh
This is a sample application
# df -h
Filesystem                         Size    Used  Available  Use%  Mounted on
<gluster_hosts>:<gluster_volname>  181.2M  0     181.2M     0%    /mnt/pv
# mount
Write/Read test on PV mount
Mon Sep 20 12:59:34 UTC 2021
SUCCESS

# Let's write some data on the volume
-> nomad alloc exec $app bash -c 'cd /mnt/pv; for i in {1..10}; do cat /dev/urandom | tr -dc [:space:][:print:] | head -c 1m > file$i; done;'

# Checksum of written data
-> nomad alloc exec $app bash -c 'ls /mnt/pv; find /mnt/pv -type f -exec md5sum {} + | cut -f1 -d" " | sort | md5sum'
file1   file2   file4   file6   file8
file10  file3   file5   file7   file9
6776dd355c0f2ba5a1781b9831e5c174  -

# We'll stop the sample app and run it again to check data persistence
-> nomad status
ID                     Type     Priority  Status   Submit Date
kadalu-csi-controller  service  50        running  2021-09-20T18:23:07+05:30
kadalu-csi-nodeplugin  system   50        running  2021-09-20T18:23:53+05:30
sample-pv-check        service  50        running  2021-09-20T18:28:28+05:30

-> nomad stop sample-pv-check
==> 2021-09-20T18:36:47+05:30: Monitoring evaluation "eecc0c00"
    2021-09-20T18:36:47+05:30: Evaluation triggered by job "sample-pv-check"
==> 2021-09-20T18:36:48+05:30: Monitoring evaluation "eecc0c00"
    2021-09-20T18:36:48+05:30: Evaluation within deployment: "814e328c"
    2021-09-20T18:36:48+05:30: Evaluation status changed: "pending" -> "complete"
==> 2021-09-20T18:36:48+05:30: Evaluation "eecc0c00" finished with status "complete"
==> 2021-09-20T18:36:48+05:30: Monitoring deployment "814e328c"
  ✓ Deployment "814e328c" successful

    2021-09-20T18:36:48+05:30
    ID          = 814e328c
    Job ID      = sample-pv-check
    Job Version = 0
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    apps        1        1       1        0          2021-09-20T13:08:56Z

-> nomad run $job_dir/app.nomad
==> 2021-09-20T18:37:49+05:30: Monitoring evaluation "e04b4549"
    2021-09-20T18:37:49+05:30: Evaluation triggered by job "sample-pv-check"
==> 2021-09-20T18:37:50+05:30: Monitoring evaluation "e04b4549"
    2021-09-20T18:37:50+05:30: Evaluation within deployment: "66d246ee"
    2021-09-20T18:37:50+05:30: Allocation "526d5543" created: node "4e105698", group "apps"
    2021-09-20T18:37:50+05:30: Evaluation status changed: "pending" -> "complete"
==> 2021-09-20T18:37:50+05:30: Evaluation "e04b4549" finished with status "complete"
==> 2021-09-20T18:37:50+05:30: Monitoring deployment "66d246ee"
  ✓ Deployment "66d246ee" successful

    2021-09-20T18:38:10+05:30
    ID          = 66d246ee
    Job ID      = sample-pv-check
    Job Version = 2
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    apps        1        1       1        0          2021-09-20T13:18:08Z

# The job gets a new allocation ID after the stop and run, and the md5sum matches the earlier value
-> export app=526d5543
-> nomad alloc exec $app bash -c 'ls /mnt/pv; find /mnt/pv -type f -exec md5sum {} + | cut -f1 -d" " | sort | md5sum'
file1  file10  file2  file3  file4  file5  file6  file7  file8  file9
6776dd355c0f2ba5a1781b9831e5c174  -

# Cleanup
-> nomad stop sample-pv-check
-> nomad volume delete csi-test
-> nomad stop kadalu-csi-nodeplugin
-> nomad stop kadalu-csi-controller

# Destroying local cluster
-> shipyard destroy
```
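
Two steps of the walkthrough are done by hand: copying the allocation ID out of the `nomad run` output, and comparing the aggregate checksum before and after the restart. The sketch below shows one way both could be scripted. `alloc_id_from_run` and `dir_checksum` are hypothetical helpers, and the first one assumes the `Allocation "<id>" created` line keeps the format shown above.

```sh
# Hypothetical helper: print the first allocation ID found in `nomad run`
# output read on stdin, i.e. the quoted value in a line such as
#   ... Allocation "64745b25" created: node "4e105698", group "apps"
alloc_id_from_run() {
    sed -n 's/.*Allocation "\([^"]*\)" created.*/\1/p' | head -n 1
}

# Hypothetical helper: print one checksum for all regular files under a
# directory with the same pipeline as the walkthrough: hash each file, keep
# only the hashes, sort them, and hash the sorted list. The result depends on
# file contents only, not on file names or listing order.
dir_checksum() {
    find "$1" -type f -exec md5sum {} + | cut -f1 -d" " | sort | md5sum
}
```

Because only the sorted hashes are hashed again, the two `ls` listings above can differ in layout while the final checksum stays the same. For example, `app=$(nomad run $job_dir/app.nomad | alloc_id_from_run)` could replace the manual `export app=...` step, provided the piped output still contains the allocation line.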

## Contact

- For any further information or feature requests regarding Kadalu CSI, please raise an issue against the kadalu [repo](https://github.com/kadalu/kadalu)
- For any further information regarding the local Nomad dev setup for CSI, please raise an issue against the kadalu-nomad [repo](https://github.com/leelavg/kadalu-nomad)
- Depending on demand and feature requests, we may work on supporting an internal gluster deployed and managed by Nomad itself (feature parity with current Kubernetes deployments)
- If this folder isn't updated frequently, updated jobs can be found in the kadalu nomad [folder](https://github.com/kadalu/kadalu/tree/devel/nomad)

nomad/app.nomad

+45
@@ -0,0 +1,45 @@
variable "cn_network" {
  default = "dc1"
}

variable "vol-id" {
  default = "csi-test"
}

job "sample-pv-check" {
  datacenters = ["${var.cn_network}"]

  group "apps" {
    volume "test" {
      type            = "csi"
      source          = "${var.vol-id}"
      access_mode     = "multi-node-multi-writer"
      attachment_mode = "file-system"
    }

    task "sample" {
      # To verify volume is mounted correctly and accessible, please run
      # 'nomad alloc exec <alloc_id> bash /kadalu/script.sh'
      # after this job is scheduled and running on a nomad client
      driver = "docker"

      config {
        image      = "kadalu/sample-pv-check-app:latest"
        force_pull = false

        entrypoint = [
          "tail",
          "-f",
          "/dev/null",
        ]
      }

      volume_mount {
        volume = "test"

        # Script in this image looks for PV mounted at '/mnt/pv'
        destination = "/mnt/pv"
      }
    }
  }
}

nomad/cluster.vars

+25
@@ -0,0 +1,25 @@
# client_nodes is only applicable for local dev environment
client_nodes=0

# Below are the variables, with defaults, that the corresponding jobs accept

/* # controller.nomad */
/* cn_network = "dc1" */
/* volname = "sample-pool" */
/* gluster_hosts = "ghost.example.com" */
/* gluster_volname = "dist" */
/* gluster_user = "root" */
/* kadalu_version = "0.8.6" */
/* ssh_priv_path = "/root/.ssh/id_rsa" */

/* # nodeplugin.nomad */
/* cn_network = "dc1" */
/* volname = "sample-pool" */
/* gluster_hosts = "ghost.example.com" */
/* gluster_volname = "dist" */
/* gluster_user = "root" */
/* kadalu_version = "0.8.6" */

/* # app.nomad */
/* cn_network = "dc1" */
/* vol-id = "csi-test" */
