Feat dragonfly-based Object Storage. #4057
Hi, @SandyXSD @zhijian-pro ✋! I am maintainer of the dragonflyoss, this PR aims to implement Dragonfly-based object storage in
@XDTD It changes a lot of dependencies, are these all necessary?
b982850 to fdca2b7
@zhijian-pro We have removed the dependency on Dragonfly. Could you please review the code again?
@XDTD I tested it locally and it didn't pass the unit test
@XDTD The Dragonfly endpoint format should keep the same style as the other object storages. You can join our Slack to discuss this PR in a more timely manner.
Thx for ur advice. We will modify the code later. Btw, we cannot reproduce the unit test error locally, which may be related to the deployment of Dragonfly. We are currently in the process of writing comprehensive usage documentation, and we will provide it all together once it's completed, expected to be by tomorrow.
@gaius-qi @XDTD Sorry, we don't know enough about the practical uses of this PR and the practical problems it can solve. So, can you join the Slack to discuss this PR?
Dragonfly is a p2p cache rather than an object store, so we'd like to have more experience on that before merging into upstream.
@davies We will provide documentation guide including benchmark next week.
Hello, we have completed the documentation, which includes the architecture and testing. There are both English and Chinese versions available. Plz review.
Can you post the doc here (in this PR or a gist)?
I have posted the doc in this PR, plz review.
The PDF version of the doc is available here:
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4057      +/-   ##
==========================================
- Coverage   55.53%   55.14%   -0.40%
==========================================
  Files         153      154       +1
  Lines       38906    39301     +395
==========================================
+ Hits        21608    21673      +65
- Misses      14875    15213     +338
+ Partials     2423     2415       -8
Signed-off-by: XDTD <[email protected]>
Signed-off-by: Gaius <[email protected]>
Signed-off-by: XDTD <[email protected]>
Signed-off-by: XDTD <[email protected]>
Signed-off-by: XDTD <[email protected]>
Signed-off-by: XDTD <[email protected]>
Signed-off-by: XDTD <[email protected]>
Signed-off-by: XDTD <[email protected]>
Signed-off-by: XDTD <[email protected]>
Signed-off-by: Gaius <[email protected]>
refactor: dragonfly object storage
LGTM
Signed-off-by: Gaius <[email protected]>
feat: add license header to dragonfly.go
Do we have any performance metrics compared with JuiceFS's dedicated cache cluster?
mark
This PR aims to implement Dragonfly-based object storage in JuiceFS. Currently, I am participating in the open-source summer camp organized by CCF, and the topic is integrating Fluid with Dragonfly to accelerate data distribution. A detailed description is as follows:
Fluid is an open source Kubernetes-native Distributed Dataset Orchestrator and Accelerator for data-intensive applications, such as big data and AI applications. It is hosted by the Cloud Native Computing Foundation (CNCF) as a sandbox project.
Dragonfly provides efficient, stable and secure file distribution and image acceleration based on p2p technology to be the best practice and standard solution in cloud native architectures.
The goal of this project is to integrate Fluid with Dragonfly so that P2P distribution can relieve potential bandwidth bottlenecks during Fluid's data distribution. Currently, Fluid supports multiple caching runtimes, one of which is JuiceFSRuntime. We have decided to implement Dragonfly-based object storage in JuiceFS to achieve integration between Fluid and Dragonfly.
Basic functionality based on the ObjectStorage interface has already been implemented. Unit tests have been added in objectstorage_test.go, and the supported functions are as follows:
String()
Create()
Head()
Get()
Put()
Copy()
Delete()
List()
ListAll()
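To make the list above more concrete, here is a minimal in-memory sketch of a storage type exposing the same operation names. It is only an illustration: `SimpleStorage`, `memStore` and `errNotFound` are hypothetical names, the signatures are deliberately simplified and are not the real JuiceFS `ObjectStorage` signatures, and `ListAll()` is left out for brevity.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"sort"
	"strings"
)

// SimpleStorage is a simplified, hypothetical stand-in for the operations
// listed above. The real JuiceFS ObjectStorage signatures differ.
type SimpleStorage interface {
	String() string
	Create() error
	Head(key string) (size int64, err error)
	Get(key string) (io.ReadCloser, error)
	Put(key string, in io.Reader) error
	Copy(dst, src string) error
	Delete(key string) error
	List(prefix string) ([]string, error)
}

var errNotFound = errors.New("object not found")

// memStore keeps objects in a map so the sketch runs without any backend.
type memStore struct {
	bucket  string
	objects map[string][]byte
}

func newMemStore(bucket string) *memStore {
	return &memStore{bucket: bucket, objects: map[string][]byte{}}
}

func (m *memStore) String() string { return "mem://" + m.bucket + "/" }

// Create has nothing to provision for an in-memory bucket.
func (m *memStore) Create() error { return nil }

func (m *memStore) Head(key string) (int64, error) {
	data, ok := m.objects[key]
	if !ok {
		return 0, errNotFound
	}
	return int64(len(data)), nil
}

func (m *memStore) Get(key string) (io.ReadCloser, error) {
	data, ok := m.objects[key]
	if !ok {
		return nil, errNotFound
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}

func (m *memStore) Put(key string, in io.Reader) error {
	data, err := io.ReadAll(in)
	if err != nil {
		return err
	}
	m.objects[key] = data
	return nil
}

func (m *memStore) Copy(dst, src string) error {
	data, ok := m.objects[src]
	if !ok {
		return errNotFound
	}
	// Copy the bytes so the two keys do not share a backing array.
	m.objects[dst] = append([]byte(nil), data...)
	return nil
}

func (m *memStore) Delete(key string) error {
	delete(m.objects, key)
	return nil
}

// List returns the keys with the given prefix in sorted order.
func (m *memStore) List(prefix string) ([]string, error) {
	var keys []string
	for k := range m.objects {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	var s SimpleStorage = newMemStore("demo")
	_ = s.Create()
	_ = s.Put("a/1", strings.NewReader("hello"))
	_ = s.Copy("a/2", "a/1")
	keys, _ := s.List("a/")
	size, _ := s.Head("a/2")
	fmt.Println(s, keys, size)
}
```

A Dragonfly-backed implementation would presumably replace the map with HTTP calls to the Dfstore endpoint; that part is not shown here.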
Architecture
Dragonfly becomes a new cache layer between JuiceFS and object storage, and it optimizes both reads and writes. On a read that misses the JuiceFS cache, the request is forwarded to a Dragonfly peer; P2P distribution removes the bandwidth limit of the object storage and thereby accelerates file downloads. On a write, you can write synchronously to the P2P network and asynchronously to the object storage to increase write speed.
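The read path described above can be pictured as a lookup that falls through three tiers. The following is a toy model under stated assumptions, not code from this PR: the `tier` type, the tier names and the rule that a hit in a lower tier populates the tiers above it are all illustrative.

```go
package main

import "fmt"

// tier is one layer of the read path, modelled as a simple key/value map.
type tier struct {
	name string
	data map[string]string
}

// read walks the tiers from fastest to slowest and reports which tier
// served the key. On a hit, the value is copied into every tier above
// the one that served it (an assumption of this sketch).
func read(tiers []*tier, key string) (value, servedBy string, ok bool) {
	for i, t := range tiers {
		if v, hit := t.data[key]; hit {
			for _, upper := range tiers[:i] {
				upper.data[key] = v
			}
			return v, t.name, true
		}
	}
	return "", "", false
}

// newTiers builds the three layers: JuiceFS cache, Dragonfly peer and
// the backing object storage, which already holds one object.
func newTiers() []*tier {
	return []*tier{
		{name: "juicefs-cache", data: map[string]string{}},
		{name: "dragonfly-peer", data: map[string]string{}},
		{name: "object-storage", data: map[string]string{"chunk-1": "payload"}},
	}
}

func main() {
	tiers := newTiers()
	_, first, _ := read(tiers, "chunk-1")
	_, second, _ := read(tiers, "chunk-1")
	fmt.Println("first read served by:", first)
	fmt.Println("second read served by:", second)
}
```

Emptying the first tier's map before a read mimics mounting with the JuiceFS cache disabled: the request then lands on the Dragonfly peer tier instead.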
Install JuiceFS with Dragonfly
Dragonfly Kubernetes Cluster Setup
Setup Kubernetes Cluster
Kind is recommended if no Kubernetes cluster is available for testing.
Create the kind multi-node cluster configuration file kind-config.yaml with the following content:
Create a kind multi-node cluster using the configuration file:
kind create cluster --config kind-config.yaml
Kind loads dragonfly image
Pull dragonfly latest images:
Kind cluster loads dragonfly latest images:
Create dragonfly cluster based on helm charts
Create the helm charts configuration file charts-config.yaml and set manager.config.objectStorage to change the configuration of the object storage. The configuration content is as follows:
Create a dragonfly cluster using the configuration file:
Check that dragonfly is deployed successfully:
Expose Dragonfly Dfstore's Object Storage service port
Create the dfstore.yaml configuration to expose the port on which the Dragonfly Dfstore listens. The default port is 65004; set targetPort to 65004.
Create service:
Forward request to Dragonfly Dfstore:
Install JuiceFS
For detailed installation instructions, please refer to the JuiceFS documentation. For Linux and macOS systems, you can use a one-click installation script that automatically downloads and installs the latest version of the JuiceFS client based on your hardware architecture. (Note that this feature hasn't been merged into the main branch yet, so you'll need to compile manually from this PR.)
After installation, you can specify Dragonfly Dfstore as the object storage when executing commands such as juicefs format and juicefs config:
$ juicefs format \
    --storage dragonfly \
    --bucket "http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2" \
    redis://192.168.1.6:6379/1 \
    myjfs-dragonfly
The bucket parameters are added to the query string, and the endpoint is set to the exposed Dragonfly Dfstore object storage service. The details of the parameters are as follows:
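As a rough illustration of how a bucket string of this shape breaks down into an endpoint, a bucket name and query parameters, the sketch below uses Go's standard `net/url` package. It is not the parser from this PR: `bucketConfig`, `intParam` and `parseBucket` are hypothetical names, and treating an absent parameter as 0 is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// bucketConfig holds the pieces of a Dragonfly bucket string.
// The field names are illustrative and do not come from the PR.
type bucketConfig struct {
	Endpoint    string // scheme and host of the Dfstore service
	Bucket      string // URL path with surrounding slashes removed
	Mode        int    // value of the "mode" query parameter
	MaxReplicas int    // value of the "maxReplicas" query parameter
}

// intParam reads an integer query parameter, returning 0 when it is absent.
func intParam(q url.Values, name string) (int, error) {
	v := q.Get(name)
	if v == "" {
		return 0, nil
	}
	n, err := strconv.Atoi(v)
	if err != nil {
		return 0, fmt.Errorf("invalid %s %q: %w", name, v, err)
	}
	return n, nil
}

// parseBucket splits a bucket string into endpoint, bucket and parameters.
func parseBucket(raw string) (bucketConfig, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return bucketConfig{}, err
	}
	if u.Scheme == "" || u.Host == "" {
		return bucketConfig{}, fmt.Errorf("bucket %q needs a scheme and host", raw)
	}
	cfg := bucketConfig{
		Endpoint: u.Scheme + "://" + u.Host,
		Bucket:   strings.Trim(u.Path, "/"),
	}
	q := u.Query()
	if cfg.Mode, err = intParam(q, "mode"); err != nil {
		return bucketConfig{}, err
	}
	if cfg.MaxReplicas, err = intParam(q, "maxReplicas"); err != nil {
		return bucketConfig{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseBucket("http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Keeping the parameters in the query string means the bucket value stays a single URL, which is the same shape the `--bucket` flag takes in the command above.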
Verify the created file system status:
When using other JuiceFS commands, you can also specify Dragonfly Dfstore as the object storage. For detailed documentation of the JuiceFS commands, please refer to the JuiceFS documentation.
Verify
The endpoint is http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2.
It should pass the unit test for teststorage.
Multi-Node Read Performance Testing
Hit JuiceFS Cache
Test the caching performance of JuiceFS. The configured object storage needs to be the same as in Dragonfly.
Mount the file system using the juicefs mount command:
Create a 1GB file in the mounted directory:
On the first read, JuiceFS triggers a back-to-source download, which takes 11.356 seconds.
Clear the page cache and read again. The read hits JuiceFS's cache and takes 0.347 seconds.
Hit Dragonfly Cache
Test the performance of the Dragonfly cache, hitting both the local peer cache and the remote peer cache. Expose port 65004 of the Dragonfly peer.
Initialize the file system based on Dragonfly:
Mount the file system and disable JuiceFS's cache:
Create a 1GB file in the mounted directory:
On the first read, neither the JuiceFS cache nor the Dragonfly cache is hit, so a back-to-source download is triggered, taking 11.147 seconds.
Clear the cache of the file system and read again. The read hits the cache of Dragonfly's local peer and takes 1.554 seconds.
To test the speed of a cache hit on a Dragonfly remote peer, delete the peer:
Recreate the pod:
Clear the cache of the file system and read again. The newly created Pod has no cache, so the read hits the cache of the remote peer and takes 1.937 seconds.
Analysis
The test results show that integrating JuiceFS with Dragonfly effectively reduces file download time. Because the absolute download times depend on the network environment of the test machine, the ratios between the download times in the different scenarios matter more than the absolute values.
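Since the ratios are what matter, they can be computed directly from the timings reported in the two read tests. The sketch below does only that arithmetic; the `speedup` helper is a hypothetical name, and the four timings are the ones quoted in this section.

```go
package main

import "fmt"

// speedup returns how many times faster a cached read is than the
// back-to-source read. It returns 0 for a non-positive cached time.
func speedup(backToSource, cached float64) float64 {
	if cached <= 0 {
		return 0
	}
	return backToSource / cached
}

func main() {
	// Timings in seconds, copied from the multi-node read tests.
	fmt.Printf("JuiceFS cache hit:          %.1fx\n", speedup(11.356, 0.347))
	fmt.Printf("Dragonfly local peer hit:   %.1fx\n", speedup(11.147, 1.554))
	fmt.Printf("Dragonfly remote peer hit:  %.1fx\n", speedup(11.147, 1.937))
}
```

With these numbers, a JuiceFS cache hit is roughly 33 times faster than the back-to-source read, a Dragonfly local peer hit roughly 7 times faster, and a remote peer hit a little under 6 times faster.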
Single-Node Performance Testing
Use juicefs format to format the file system based on object storage and the file system based on Dragonfly.
JuiceFS:
Dragonfly Sync Write:
juicefs format \
    --storage dragonfly \
    --bucket "http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2" \
    redis://192.168.1.6:6379/1 \
    myjfs-dragonfly
Dragonfly Async Write:
juicefs format \
    --storage dragonfly \
    --bucket "http://127.0.0.1:65004/your_bucket?mode=1&maxReplicas=2" \
    redis://192.168.1.6:6379/1 \
    myjfs-dragonfly
Clear the cache and wait for Pod to be recreated before each test:
Forward request to Dragonfly Dfstore:
Then, mount them using juicefs mount and execute the test commands.
Big File Sequential Read
Big File Sequential Write
Big File Random Read
Big File Random Write
fio --name=big-file-random-write \
    --directory=/mnt/jfs \
    --rw=randwrite --refill_buffers \
    --size=4G --bs=256k
Analysis
Integrating Dragonfly into JuiceFS causes no performance degradation for large-file reads and writes, and it gives better acceleration when the same large file is read repeatedly.
Install Fluid & JuiceFS Runtime with Dragonfly
Dragonfly Kubernetes Cluster Setup
Setup Kubernetes Cluster
Kind is recommended if no Kubernetes cluster is available for testing.
Create the kind multi-node cluster configuration file kind-config.yaml with the following content:
Create a kind multi-node cluster using the configuration file:
kind create cluster --config kind-config.yaml
Kind loads dragonfly image
Pull dragonfly latest images:
Kind cluster loads dragonfly latest images:
Create dragonfly cluster based on helm charts
Create the helm charts configuration file charts-config.yaml and set manager.config.objectStorage to change the configuration of the object storage. The configuration content is as follows:
Create a dragonfly cluster using the configuration file:
Check that dragonfly is deployed successfully:
Expose Dragonfly Dfstore's Object Storage service port
Create the dfstore.yaml configuration to expose the port on which the Dragonfly Dfstore listens. The default port is 65004; set targetPort to 65004.
Create service:
Forward request to Dragonfly Dfstore:
Install Fluid
Create Fluid cluster based on helm charts
For detailed installation instructions, please refer to the Fluid documentation.
Create namespace:
Create a Fluid cluster:
Check that Fluid is deployed successfully:
Create Dataset
Create a secret using the configuration:
Create dataset.yaml:
Where:
mountPoint: The directory where users store data in the JuiceFS file system, starting with juicefs://. For example, juicefs:///demo is the subdirectory /demo of the JuiceFS file system.
storage: The type of object storage.
bucket: The parameters are added to the query string, and the endpoint is set to the exposed Dragonfly Dfstore object storage service. The details of the endpoint parameters are as follows:
Verify the created file system status:
Create a Dataset:
Create JuiceFS Runtime
Create JuiceFS Runtime:
Wait for JuiceFS Runtime to start successfully:
$ kubectl get po |grep jfs
jfsdemo-worker-0   1/1   Running   0   4m2s
Check the dataset status; it has been bound to JuiceFS Runtime:
Fluid has created PV and PVC with the same name as the dataset.
$ kubectl get pv | grep jfs
default-jfsdemo   100Pi   ROX   Retain   Bound   default/jfsdemo   fluid   16h
$ kubectl get pvc
NAME      STATUS   VOLUME            CAPACITY   ACCESS MODES   STORAGECLASS   AGE
jfsdemo   Bound    default-jfsdemo   100Pi      ROX            fluid          21h
Verify
Create an application to use the dataset:
Create application:
Check that the Pod has been created:
The Pod has been created successfully and JuiceFS's FUSE component has also started successfully.