
Feat dragonfly-based Object Storage. #4057

Merged
merged 13 commits into from
Nov 22, 2023

Conversation


@XDTD XDTD commented Sep 20, 2023

This PR aims to implement Dragonfly-based object storage in JuiceFS. Currently, I am participating in the open-source summer camp organized by CCF, and the topic is integrating Fluid with Dragonfly to accelerate data distribution. A detailed description is as follows:

Fluid is an open source Kubernetes-native Distributed Dataset Orchestrator and Accelerator for data-intensive applications, such as big data and AI applications. It is hosted by the Cloud Native Computing Foundation (CNCF) as a sandbox project.

Dragonfly provides efficient, stable and secure file distribution and image acceleration based on p2p technology to be the best practice and standard solution in cloud native architectures.

The goal of this project is to integrate Fluid with Dragonfly to fully leverage the advantages of P2P distribution in addressing potential bandwidth bottlenecks during Fluid's distribution process. Currently, Fluid supports multiple caching runtimes, one of which is JuiceFSRuntime. We have decided to implement Dragonfly-based object storage in JuiceFS to achieve integration between Fluid and Dragonfly.

Basic functionality development based on the ObjectStorage interface has been completed. Unit tests have been added in objectstorage_test.go, and the supported functions are as follows:

  • String()
  • Create()
  • Head()
  • Get()
  • Put()
  • Copy()
  • Delete()
  • List()
  • ListAll()

Architecture


Dragonfly becomes a new cache layer between JuiceFS and the object storage, with optimizations for both reading and writing. When reading, if there is no hit in the JuiceFS cache, the traffic is forwarded to a Dragonfly Peer. This eliminates the bandwidth limit of the object storage through P2P technology, thereby accelerating file downloading. When writing, you can configure async writing to the object storage together with sync writing to the P2P network to increase write speed.

Install JuiceFS with Dragonfly

Dragonfly Kubernetes Cluster Setup

Setup Kubernetes Cluster

Kind is recommended if no Kubernetes cluster is available for testing.
Create the kind multi-node cluster configuration file kind-config.yaml with the following content:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker

Create a kind multi-node cluster using the configuration file:

kind create cluster --config kind-config.yaml

Kind loads dragonfly image

Pull dragonfly latest images:

docker pull dragonflyoss/scheduler:latest
docker pull dragonflyoss/manager:latest
docker pull dragonflyoss/dfdaemon:latest

Kind cluster loads dragonfly latest images:

kind load docker-image dragonflyoss/scheduler:latest
kind load docker-image dragonflyoss/manager:latest
kind load docker-image dragonflyoss/dfdaemon:latest

Create dragonfly cluster based on helm charts

Create the helm charts configuration file charts-config.yaml and set manager.config.objectStorage to configure the object storage. The configuration content is as follows:

scheduler:
  replicas: 1
  metrics:
    enable: true
  config:
    verbose: true
    pprofPort: 18066

seedPeer:
  replicas: 1
  metrics:
    enable: true
  config:
    verbose: true
    pprofPort: 18066
    objectStorage:
      enable: true

dfdaemon:
  metrics:
    enable: true
  config:
    verbose: true
    pprofPort: 18066
    objectStorage:
      enable: true    

manager:
  replicas: 1
  metrics:
    enable: true
  config:
    verbose: true
    pprofPort: 18066
    objectStorage:
      # Enable object storage.
      enable: true
      # Name is the object storage type; it can be s3 or oss.
      name: 'your_storage'
      # Region is storage region.
      region: 'your_region'
      # Endpoint is datacenter endpoint.
      endpoint: 'your_endpoint'
      # AccessKey is access key ID.
      accessKey: 'your_access_key'
      # SecretKey is access key secret.
      secretKey: 'your_secret_key'
      # s3ForcePathStyle sets force path style for s3, true by default.
      # Set this to `true` to force the request to use path-style addressing,
      # i.e., `http://s3.amazonaws.com/BUCKET/KEY`. By default, the S3 client
      # will use virtual hosted bucket addressing when possible
      # (`http://BUCKET.s3.amazonaws.com/KEY`).
      # Refer to https://github.com/aws/aws-sdk-go/blob/main/aws/config.go#L118.
      s3ForcePathStyle: false

jaeger:
  enable: true

Create a dragonfly cluster using the configuration file:

$ helm repo add dragonfly https://dragonflyoss.github.io/helm-charts/
$ helm install --wait --create-namespace --namespace dragonfly-system dragonfly dragonfly/dragonfly -f charts-config.yaml
NAME: dragonfly
LAST DEPLOYED: Thu Sep 28 17:35:49 2023
NAMESPACE: dragonfly-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Get the scheduler address by running these commands:
  export SCHEDULER_POD_NAME=$(kubectl get pods --namespace dragonfly-system -l "app=dragonfly,release=dragonfly,component=scheduler" -o jsonpath={.items[0].metadata.name})
  export SCHEDULER_CONTAINER_PORT=$(kubectl get pod --namespace dragonfly-system $SCHEDULER_POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
  kubectl --namespace dragonfly-system port-forward $SCHEDULER_POD_NAME 8002:$SCHEDULER_CONTAINER_PORT
  echo "Visit http://127.0.0.1:8002 to use your scheduler"

2. Get the dfdaemon port by running these commands:
  export DFDAEMON_POD_NAME=$(kubectl get pods --namespace dragonfly-system -l "app=dragonfly,release=dragonfly,component=dfdaemon" -o jsonpath={.items[0].metadata.name})
  export DFDAEMON_CONTAINER_PORT=$(kubectl get pod --namespace dragonfly-system $DFDAEMON_POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
  You can use $DFDAEMON_CONTAINER_PORT as a proxy port in Node.

3. Configure runtime to use dragonfly:
  https://d7y.io/docs/getting-started/quick-start/kubernetes/


4. Get Jaeger query URL by running these commands:
  export JAEGER_QUERY_PORT=$(kubectl --namespace dragonfly-system get services dragonfly-jaeger-query -o jsonpath="{.spec.ports[0].port}")
  kubectl --namespace dragonfly-system port-forward service/dragonfly-jaeger-query 16686:$JAEGER_QUERY_PORT
  echo "Visit http://127.0.0.1:16686/search?limit=20&lookback=1h&maxDuration&minDuration&service=dragonfly to query download events"

Check that dragonfly is deployed successfully:

$ kubectl get po -n dragonfly-system
NAME                                 READY   STATUS    RESTARTS        AGE
dragonfly-dfdaemon-65rz7             1/1     Running   5 (6m17s ago)   8m43s
dragonfly-dfdaemon-rnvsj             1/1     Running   5 (6m23s ago)   8m43s
dragonfly-jaeger-7d58dfcfc8-qmn8c    1/1     Running   0               8m43s
dragonfly-manager-6f8b4f5c66-qq8sd   1/1     Running   0               8m43s
dragonfly-mysql-0                    1/1     Running   0               8m43s
dragonfly-redis-master-0             1/1     Running   0               8m43s
dragonfly-redis-replicas-0           1/1     Running   0               8m43s
dragonfly-redis-replicas-1           1/1     Running   0               7m33s
dragonfly-redis-replicas-2           1/1     Running   0               5m50s
dragonfly-scheduler-0                1/1     Running   0               8m43s
dragonfly-seed-peer-0                1/1     Running   3 (5m56s ago)   8m43s

Expose Dragonfly Dfstore's Object Storage service port

Create the dfstore.yaml configuration to expose the port on which the Dragonfly Dfstore listens. The default port is 65004, so set targetPort to 65004.

kind: Service
apiVersion: v1
metadata:
  name: dfstore
spec:
  selector:
    app: dragonfly
    component: dfdaemon
    release: dragonfly

  ports:
  - protocol: TCP
    port: 65004
    targetPort: 65004
 
  type: NodePort

Create service:

kubectl --namespace dragonfly-system apply -f dfstore.yaml

Forward request to Dragonfly Dfstore:

kubectl --namespace dragonfly-system port-forward service/dfstore 65004:65004

Install JuiceFS

For detailed installation documentation, please refer to the JuiceFS document. For Linux and macOS systems, you can use a one-click installation script that automatically downloads and installs the latest version of the JuiceFS client based on your hardware architecture. (Note that this change hasn't been merged into the main branch yet, so you'll need to compile the client from this PR manually.)

# /usr/local/bin
curl -sSL https://d.juicefs.com/install | sh -

After installation, you can specify Dragonfly Dfstore as the object storage when executing commands such as juicefs format and juicefs config:

$ juicefs format \
    --storage dragonfly \
    --bucket "http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2" \
    redis://192.168.1.6:6379/1 \
    myjfs-dragonfly

The bucket parameters are passed in the query string, and the endpoint is set to the exposed Dragonfly Dfstore object storage service. The details of the parameters are as follows:

| Param | Type | Description | Value | Required |
| :---- | :--- | :---------- | :---- | :------- |
| mode | string | Write mode. WriteBack represents sync writing to the backend object storage, while AsyncWriteBack represents async writing to the backend object storage. | 0 represents AsyncWriteBack, 1 represents WriteBack; the default value is 1. | N |
| maxReplicas | string | The maximum number of replicas to be written to the P2P network. | [0,1000] | N |

Verify the created file system status:

$ juicefs status redis://192.168.1.6:6379/1
2023/10/17 19:09:35.738635 juicefs[2273224] <INFO>: Meta address: redis://localhost:6379/1 [interface.go:498]
2023/10/17 19:09:35.739344 juicefs[2273224] <WARNING>: AOF is not enabled, you may lose data if Redis is not shutdown properly. [info.go:84]
2023/10/17 19:09:35.739407 juicefs[2273224] <INFO>: Ping redis latency: 22.384µs [redis.go:3572]
{
  "Setting": {
    "Name": "myjfs-dragonfly",
    "UUID": "316d39df-a7ba-4cde-8cc7-5568a7a0f745",
    "Storage": "dragonfly",
    "Bucket": "http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2",
    "BlockSize": 4096,
    "Compression": "none",
    "EncryptAlgo": "aes256gcm-rsa",
    "TrashDays": 1,
    "MetaVersion": 1,
    "MinClientVersion": "1.1.0-A",
    "DirStats": true
  },
  "Sessions": [],
  "Statistic": {
    "UsedSpace": 0,
    "AvailableSpace": 1125899906842624,
    "UsedInodes": 0,
    "AvailableInodes": 10485760
  }
}

When using other JuiceFS commands, you can also specify Dragonfly Dfstore as the object storage. For detailed JuiceFS commands documentation, please refer to the document.

Verify

juicefs objbench \
    --storage dragonfly \
    "http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2"

The endpoint is http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2. It should pass the storage unit tests:

+----------+---------------------+--------------------------------------------------+
| CATEGORY |         TEST        |                      RESULT                      |
+----------+---------------------+--------------------------------------------------+
|    basic |     create a bucket |                                             pass |
|    basic |       put an object |                                             pass |
|    basic |       get an object |                                             pass |
|    basic |       get non-exist |                                             pass |
|    basic |  get partial object | failed to get object with the offset out of r... |
|    basic |      head an object |                                             pass |
|    basic |    delete an object |                                             pass |
|    basic |    delete non-exist |                                             pass |
|    basic |        list objects |                                             pass |
|    basic |         special key | put encode file failed: bad response status 4... |
|     sync |    put a big object |                                             pass |
|     sync | put an empty object |                                             pass |
|     sync |    multipart upload |                                      not support |
|     sync |  change owner/group |                                      not support |
|     sync |   change permission |                                      not support |
|     sync |        change mtime |                                      not support |
+----------+---------------------+--------------------------------------------------+

Multi-Node Read Performance Testing

Hit JuiceFS Cache

Test the caching performance of JuiceFS. The configured object storage needs to be the same as in Dragonfly.

juicefs format \
    --storage s3 \
    --bucket https://myjuicefs.s3.us-east-2.amazonaws.com \
    --access-key your_access_key \
    --secret-key your_secret_key \
    redis://192.168.1.6:6379/2 \
    myjfs

Mount the file system using the juicefs mount command:

juicefs mount redis://192.168.1.6:6379/2  /mnt/jfs

Create a 1GB file in the mounted directory:

$ time dd if=/dev/zero of=/mnt/jfs/test.txt bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 10.7013 s, 98.0 MB/s
dd if=/dev/zero of=/mnt/jfs/test.txt bs=1M count=1000  0.00s user 0.33s system 3% cpu 10.711 total

For the first read, JuiceFS triggers back-to-source download and it takes 11.356 seconds.

$ time cp /mnt/jfs/test.txt /dev/null                                                         
cp /mnt/jfs/test.txt /dev/null  0.00s user 0.29s system 2% cpu 11.356 total

Clear the page cache and read again. The read hits JuiceFS's cache, taking 0.347 seconds.

$ sync && echo 3 > /proc/sys/vm/drop_caches
$ time cp /mnt/jfs/test.txt /dev/null
cp /mnt/jfs/test.txt /dev/null  0.00s user 0.30s system 86% cpu 0.347 total

Hit Dragonfly Cache

Test the performance of the Dragonfly cache, hitting both the local peer cache and the remote peer cache. Expose the Dragonfly Peer's 65004 port.

export dragonfly_dfdaemon_name=$(kubectl get po -n dragonfly-system | grep dragonfly-dfdaemon- | tail -n 1 | awk '{print $1}')
kubectl --namespace dragonfly-system port-forward $dragonfly_dfdaemon_name   65004:65004

Initialize the file system based on Dragonfly:

juicefs format \
    --storage dragonfly \
    --bucket "http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2" \
    redis://192.168.1.6:6379/1 \
    myjfs-dragonfly

Mount the file system and disable JuiceFS's cache:

juicefs mount redis://192.168.1.6:6379/1  /mnt/jfs --cache-size=0

Create a 1GB file in the mounted directory:

$ time dd if=/dev/zero of=/mnt/jfs/test.txt bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 10.2689 s, 102 MB/s
dd if=/dev/zero of=/mnt/jfs/test.txt bs=1M count=1000  0.00s user 0.38s system 3% cpu 10.271 total

For the first read, there are no cache hits in JuiceFS or Dragonfly, so it triggers a back-to-source download, taking 11.147 seconds.

$ time cp /mnt/jfs/test.txt /dev/null 
cp /mnt/jfs/test.txt /dev/null  0.00s user 0.30s system 2% cpu 11.147 total

Clear the cache of the file system and read again. The read hits the cache of Dragonfly's Local Peer, taking 1.554 seconds.

$ sync && echo 3 > /proc/sys/vm/drop_caches
$ time cp /mnt/jfs/test.txt /dev/null
cp /mnt/jfs/test.txt /dev/null  0.00s user 0.32s system 20% cpu 1.554 total

To test the cache speed when hitting a Dragonfly Remote Peer, delete the Peer:

$ kubectl delete po $dragonfly_dfdaemon_name -n dragonfly-system
$ export dragonfly_dfdaemon_name=$(kubectl get po -n dragonfly-system --sort-by=.metadata.creationTimestamp | grep "dragonfly-dfdaemon-" | awk '{print $1}' | tail -n 1)
$ kubectl get po -n dragonfly-system
NAME                                 READY   STATUS    RESTARTS   AGE
dragonfly-dfdaemon-5q4r8             1/1     Running   0          30s # new pod
dragonfly-dfdaemon-nhzcc             1/1     Running   0          19m
dragonfly-jaeger-c7947b579-q4hr4     1/1     Running   0          19m
dragonfly-manager-5dc5fbf548-zrf7d   1/1     Running   0          19m
dragonfly-mysql-0                    1/1     Running   0          19m
dragonfly-redis-master-0             1/1     Running   0          19m
dragonfly-redis-replicas-0           1/1     Running   0          19m
dragonfly-redis-replicas-1           1/1     Running   0          18m
dragonfly-redis-replicas-2           1/1     Running   0          18m
dragonfly-scheduler-0                1/1     Running   0          19m
dragonfly-seed-peer-0                1/1     Running   0          19m

Port-forward to the recreated Pod:

kubectl --namespace dragonfly-system port-forward $dragonfly_dfdaemon_name 65004:65004

Clear the cache of the file system and read again. The newly created Pod has no cache, so the read hits the cache of the Remote Peer, taking 1.937 seconds.

$ sync && echo 3 > /proc/sys/vm/drop_caches
$ time cp /mnt/jfs/test.txt /dev/null
cp /mnt/jfs/test.txt /dev/null  0.01s user 0.32s system 16% cpu 1.937 total

Analysis


The test results show that integrating JuiceFS with Dragonfly can effectively reduce file download time. Because the absolute download time depends on the machine's own network environment, what matters is not the absolute time but the ratio between the download times in the different scenarios.

Single-Node Performance Testing

Use juicefs format to create the file system backed by object storage and the file systems backed by Dragonfly.
JuiceFS:

juicefs format \
    --storage s3 \
    --bucket https://myjuicefs.s3.us-east-2.amazonaws.com \
    --access-key your_access_key \
    --secret-key your_secret_key \
    redis://192.168.1.6:6379/2 \
    myjfs

Dragonfly Sync Write:

juicefs format \
    --storage dragonfly \
    --bucket "http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2" \
    redis://192.168.1.6:6379/1 \
    myjfs-dragonfly

Dragonfly Async Write:

juicefs format \
    --storage dragonfly \
    --bucket "http://127.0.0.1:65004/your_bucket?mode=0&maxReplicas=2" \
    redis://192.168.1.6:6379/1 \
    myjfs-dragonfly

Clear the cache and wait for the Pods to be recreated before each test:

$ sync && echo 3 > /proc/sys/vm/drop_caches
$ rm -rf /var/jfsCache
$ kubectl delete pod -n dragonfly-system -l component=dfdaemon
$ kubectl delete pod -n dragonfly-system -l component=seed-peer
$ kubectl get po -n dragonfly-system
NAME                                 READY   STATUS    RESTARTS   AGE
dragonfly-dfdaemon-5q4r8             1/1     Running   0          30s 
dragonfly-dfdaemon-nhzcc             1/1     Running   0          30s
dragonfly-jaeger-c7947b579-q4hr4     1/1     Running   0          19m
dragonfly-manager-5dc5fbf548-zrf7d   1/1     Running   0          19m
dragonfly-mysql-0                    1/1     Running   0          19m
dragonfly-redis-master-0             1/1     Running   0          19m
dragonfly-redis-replicas-0           1/1     Running   0          19m
dragonfly-redis-replicas-1           1/1     Running   0          18m
dragonfly-redis-replicas-2           1/1     Running   0          18m
dragonfly-scheduler-0                1/1     Running   0          19m
dragonfly-seed-peer-0                1/1     Running   0          30s

Forward request to Dragonfly Dfstore:

kubectl --namespace dragonfly-system port-forward service/dfstore 65004:65004

Then, mount them using juicefs mount and execute the test commands.

Big File Sequential Read


fio --name=sequential-read --directory=/mnt/jfs --rw=read --refill_buffers --bs=256k --size=4G

Big File Sequential Write


fio --name=sequential-write --directory=/mnt/jfs --rw=write --refill_buffers --bs=256k --size=4G --end_fsync=1

Big File Random Read


fio --name=big-file-rand-read \
    --directory=/mnt/jfs \
    --rw=randread --refill_buffers \
    --size=4G --filename=randread.bin \
    --bs=256k --pre_read=1
sync && echo 3 > /proc/sys/vm/drop_caches
fio --name=big-file-rand-read \
    --directory=/mnt/jfs \
    --rw=randread --refill_buffers \
    --size=4G --filename=randread.bin \
    --bs=256k  

Big File Random Write


fio --name=big-file-random-write \                                                             
    --directory=/mnt/jfs \
    --rw=randwrite --refill_buffers \
    --size=4G --bs=256k   

Analysis

JuiceFS integrated with Dragonfly shows no performance degradation for large-file reads and writes, and accelerates repeated reads of the same large file.

Install Fluid & JuiceFS Runtime with Dragonfly

Dragonfly Kubernetes Cluster Setup

The Dragonfly cluster setup is identical to the one in the Install JuiceFS with Dragonfly section above: create the kind multi-node cluster, load the Dragonfly images into kind, install the Dragonfly Helm chart with charts-config.yaml, expose Dfstore's object storage port with dfstore.yaml, and forward service/dfstore to local port 65004.

Install Fluid

Create Fluid cluster based on helm charts

For detailed installation documentation, please refer to the Fluid document.
Create namespace:

$ kubectl create ns fluid-system

Install Fluid using the Helm chart:

$ helm repo add fluid https://fluid-cloudnative.github.io/charts
$ helm repo update
$ helm install fluid fluid/fluid
NAME: fluid
LAST DEPLOYED: Thu Oct 12 21:54:34 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

Check that Fluid is deployed successfully:

$ kubectl get po -n fluid-system 
NAME                                   READY   STATUS    RESTARTS   AGE
csi-nodeplugin-fluid-nq65p             2/2     Running   0          7m12s
csi-nodeplugin-fluid-nrwbt             2/2     Running   0          7m12s
csi-nodeplugin-fluid-q565r             2/2     Running   0          7m12s
dataset-controller-5f5f46d969-lpsc7    1/1     Running   0          7m11s
fluid-webhook-75f489c7b5-whzjz         1/1     Running   0          7m12s
fluidapp-controller-54975849ff-w272h   1/1     Running   0          7m12s
Create Dataset

Create the secret.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: jfs-secret
type: Opaque
stringData:
  name: dragonfly
  metaurl: redis://127.0.0.1:6379/3

Create a secret using the configuration:

$ kubectl create -f secret.yaml

Create dataset.yaml:

$ cat<<EOF >dataset.yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: jfsdemo
spec:
  mounts:
    - name: dragonfly
      mountPoint: "juicefs:///"
      options:
        bucket: "'http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2'"
        storage: "dragonfly"
      encryptOptions:
        - name: metaurl
          valueFrom:
            secretKeyRef:
              name: jfs-secret
              key: metaurl
EOF

Where:

  • mountPoint: The directory where users store data in the JuiceFS file system, starting with juicefs://. For example, juicefs:///demo is a subdirectory /demo of the JuiceFS file system.
  • storage: The type of object storage.
  • bucket: The endpoint of the exposed Dragonfly Dfstore object storage service; the parameters are passed in its query string.

The details of the endpoint parameters are as follows:

| Param | Type | Description | Value | Required |
| :---- | :--- | :---------- | :---- | :------- |
| mode | string | Write mode. WriteBack represents sync writing to the backend object storage, while AsyncWriteBack represents async writing to the backend object storage. | 0 represents AsyncWriteBack, 1 represents WriteBack; the default value is 1. | N |
| maxReplicas | string | The maximum number of replicas to be written to the P2P network. | [0,1000] | N |


Create a Dataset:

$ kubectl create -f dataset.yaml
dataset.data.fluid.io/jfsdemo created

Create JuiceFS Runtime

Create runtime.yaml:
$ cat<<EOF >runtime.yaml
apiVersion: data.fluid.io/v1alpha1
kind: JuiceFSRuntime
metadata:
  name: jfsdemo
spec:
  replicas: 1
  fuse:
    image: dragonflyoss/juicefs-fuse
    imageTag: 0.1.0
    imagePullPolicy: IfNotPresent
  juicefsVersion:
    image: dragonflyoss/juicefs-fuse
    imageTag: 0.1.0
    imagePullPolicy: IfNotPresent
  tieredstore:
    levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 40Gi
        low: "0.1"
EOF

Create JuiceFS Runtime:

$ kubectl create -f runtime.yaml
juicefsruntime.data.fluid.io/jfsdemo created

Wait for the JuiceFS Runtime to start successfully:

$ kubectl get po |grep jfs
jfsdemo-worker-0                                          1/1     Running   0          4m2s

Check the Dataset status; it has been bound to the JuiceFS Runtime:

$ kubectl get dataset jfsdemo
NAME      UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
jfsdemo   0.00B            0.00B    40.00GiB         0.0%                Bound   21h

Fluid has created PV and PVC with the same name as the dataset.

$ kubectl get pv | grep jfs                             
default-jfsdemo       100Pi      ROX            Retain           Bound    default/jfsdemo                                          fluid                   16h
$ kubectl get pvc
NAME      STATUS   VOLUME            CAPACITY   ACCESS MODES   STORAGECLASS   AGE
jfsdemo   Bound    default-jfsdemo   100Pi      ROX            fluid          21h

Verify

Create an application to use the dataset:

$ cat<<EOF >app.yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: demo
      image: nginx
      volumeMounts:
        - mountPath: /demo
          name: demo
  volumes:
    - name: demo
      persistentVolumeClaim:
        claimName: jfsdemo
EOF

Create application:

$ kubectl create -f app.yaml  

Check that the Pod has been created:

$ kubectl get pod
NAME                 READY   STATUS    RESTARTS   AGE
demo-app             1/1     Running   0          40s
jfsdemo-fuse-vfqgt   1/1     Running   0          40s
jfsdemo-worker-0     1/1     Running   0          103s

The Pod has been created successfully and JuiceFS's FUSE component has also started successfully.

Reference

  1. JuiceFS Install
  2. JuiceFS Performance Testing
  3. JuiceFS Command Reference
  4. Dragonfly Quick Start
  5. Dragonfly Helm chart configuration file
  6. How to Use JuiceFS in Fluid

@CLAassistant

CLAassistant commented Sep 20, 2023

CLA assistant check
All committers have signed the CLA.

@gaius-qi

Hi, @SandyXSD @zhijian-pro ✋!
Thanks for maintaining this repository!

I am a maintainer of dragonflyoss. This PR aims to implement Dragonfly-based object storage in JuiceFS, and we would like to get it merged👍!

@SandyXSD
Contributor

@XDTD It changes a lot of dependencies, are these all necessary?

@gaius-qi

@XDTD It changes a lot of dependencies, are these all necessary?

@SandyXSD
Because Dragonfly depends on the latest versions of these libraries, they are updated together.

@zhijian-pro
Contributor

d7y.io/dragonfly/v2 (Go 1.20) has a higher minimum Go version requirement than juicefs (Go 1.18).

@XDTD XDTD force-pushed the main branch 2 times, most recently from b982850 to fdca2b7 on September 26, 2023 06:21
@XDTD
Contributor Author

XDTD commented Sep 26, 2023

d7y.io/dragonfly/v2 (Go 1.20) has a higher minimum Go version requirement than juicefs (Go 1.18).

@zhijian-pro We have removed the dependency on Dragonfly. Could you please review the code again?

@zhijian-pro
Contributor

@XDTD I tested it locally and it didn't pass the unit test TestDragonfly.

@zhijian-pro
Contributor

@XDTD The dragonfly endpoint format should follow the same style as the other object storages: dragonfly://127.0.0.1:8080/bucketname?param1=value1&param2=value2. Environment variables should be complementary rather than the primary way of passing parameters.

You can join our Slack to discuss this PR in a more timely manner.
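As a sketch of the suggested endpoint style (this is not the code in this PR; the `parseEndpoint` helper and its signature are purely illustrative), such an endpoint could be split into host, bucket, and parameters with Go's standard net/url package:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseEndpoint splits an endpoint of the form
// dragonfly://host:port/bucket?param1=value1&param2=value2
// into its host, bucket name, and query parameters.
func parseEndpoint(endpoint string) (host, bucket string, params url.Values, err error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", "", nil, err
	}
	if u.Scheme != "dragonfly" {
		return "", "", nil, fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	// Path is "/bucketname"; trim the surrounding slashes to get the bucket.
	return u.Host, strings.Trim(u.Path, "/"), u.Query(), nil
}

func main() {
	host, bucket, params, err := parseEndpoint("dragonfly://127.0.0.1:8080/bucketname?param1=value1&param2=value2")
	if err != nil {
		panic(err)
	}
	fmt.Println(host, bucket, params.Get("param1"), params.Get("param2"))
}
```

With this style, query parameters carry the configuration, and environment variables can remain a complementary fallback.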

@XDTD
Contributor Author

XDTD commented Sep 27, 2023

@XDTD The dragonfly endpoint format should follow the same style as the other object storages: dragonfly://127.0.0.1:8080/bucketname?param1=value1&param2=value2. Environment variables should be complementary rather than the primary way of passing parameters.

You can join our Slack to discuss this PR in a more timely manner.

Thanks for your advice; we will modify the code later. By the way, we cannot reproduce the unit test failure locally, which may be related to how Dragonfly is deployed. We are currently writing comprehensive usage documentation and will provide everything together once it is complete, expected by tomorrow.

@zhijian-pro
Contributor

@gaius-qi @XDTD Sorry, we don't know enough about the practical uses of this PR and the practical problems it can solve. So, can you join the Slack to discuss this PR?
This matters more than the code itself working properly, because we want this to be a genuinely useful solution rather than a brainstormed idea.

@davies
Contributor

davies commented Sep 27, 2023

Dragonfly is a p2p cache rather than an object store, so we'd like to have more experience on that before merging into upstream.

@gaius-qi

@davies We will provide a documentation guide, including benchmarks, next week.

@zhijian-pro zhijian-pro removed their request for review October 24, 2023 08:42
@XDTD
Contributor Author

XDTD commented Nov 4, 2023

@gaius-qi @XDTD Sorry, we don't know enough about the practical uses of this PR and the practical problems it can solve. So, can you join the Slack to discuss this PR? This matters more than the code itself working properly, because we want this to be a genuinely useful solution rather than a brainstormed idea.

Hello, we have completed the documentation, which covers the architecture and testing. Both English and Chinese versions are available. Please review.

@davies
Contributor

davies commented Nov 4, 2023

Can you post the doc here (in this PR or a gist)?

@XDTD
Contributor Author

XDTD commented Nov 5, 2023

Can you post the doc here (in this PR or a gist)?

I have posted the doc in this PR; please review.

@XDTD
Contributor Author

XDTD commented Nov 7, 2023

Can you post the doc here (in this PR or a gist)?

The PDF version of the doc is available here:
Fluid & JuiceFS & Dragonfly-ZH.pdf
Fluid & JuiceFS & Dragonfly-EN.pdf


codecov bot commented Nov 8, 2023

Codecov Report

Attention: 334 lines in your changes are missing coverage. Please review.

Comparison is base (628b846) 55.53% compared to head (f7e7884) 55.14%.
Report is 4 commits behind head on main.

Files Patch % Lines
pkg/object/dragonfly.go 0.59% 334 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4057      +/-   ##
==========================================
- Coverage   55.53%   55.14%   -0.40%     
==========================================
  Files         153      154       +1     
  Lines       38906    39301     +395     
==========================================
+ Hits        21608    21673      +65     
- Misses      14875    15213     +338     
+ Partials     2423     2415       -8     


Two review threads on pkg/object/dragonfly.go (outdated, resolved).
@zhijian-pro
Contributor

According to the latest service and code tests, there is still a 500 error code.

@zhijian-pro
Contributor

According to the latest service and code tests, there is still a 500 error code.

This has been solved.

Contributor

@zhijian-pro zhijian-pro left a comment

LGTM

@zhijian-pro zhijian-pro requested a review from SandyXSD November 14, 2023 02:29
@SandyXSD SandyXSD merged commit e242ca5 into juicedata:main Nov 22, 2023
35 of 37 checks passed
@changweige

Do we have any performance metrics compared with JuiceFS's dedicated cache cluster?

@lidaohang
Contributor

mark
