Integrate Fluid with Kubeflow Pipeline #3694
Signed-off-by: wang-mask <[email protected]>
Signed-off-by: ZhangXiaozheng <[email protected]>
Quality Gate passed
The SonarCloud Quality Gate passed, but some issues were introduced: 2 new issues.
from kfp import dsl, compiler

# Create a Fluid dataset which contains data in S3.
@dsl.component(packages_to_install=['git+https://github.com/fluid-cloudnative/fluid-client-python.git'])
Actually, we should avoid using packages_to_install in a production environment (more details). But for now, the only image that installs the Fluid Python SDK is the one in our own image repo. I think this image should be maintained by the Fluid community, so this may need to be updated once we have an official base image.
Sure. We can handle this in the next PR. We have also optimized the Fluid SDK since then; please check this doc for reference: https://github.com/fluid-cloudnative/fluid-client-python/blob/master/examples/ml_pipeline/pipeline.ipynb
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #3694   +/-   ##
=======================================
  Coverage   64.46%   64.47%
=======================================
  Files         471      471
  Lines       28153    28153
=======================================
+ Hits        18149    18151    +2
+ Misses       7848     7847    -1
+ Partials     2156     2155    -1

☔ View full report in Codecov by Sentry.
- datasets
- datasets/status
- alluxioruntimes
- alluxioruntimes/status
Do you need access to dataload? Also, I think the verbs can be restricted to create/delete/get.
Yes, we do need dataload. I will change the verbs and re-test it.
@@ -0,0 +1,27 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
I don't think this should be a ClusterRole. It should be a Role in a specified namespace.
Sure, that would be safer. For a standalone KFP deployment, we can probably just set this to the namespace in which KFP was deployed. More details about the namespace in Kubeflow.
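For reference, the two review suggestions above (a namespaced Role instead of a ClusterRole, and verbs limited to create/delete/get, with dataload included) could be sketched as follows. This is only an illustration, not the manifest from this PR: the namespace `kubeflow`, the Role name `fluid-kfp-access`, the API group `data.fluid.io`, and the plural resource name `dataloads` are assumptions, and whether these three verbs are sufficient still needs the re-test mentioned above.

```python
import json

# Assumed values for illustration only; none of them are taken from this PR.
NAMESPACE = "kubeflow"          # assumed namespace of a standalone KFP deployment
ROLE_NAME = "fluid-kfp-access"  # hypothetical Role name


def build_fluid_role(namespace: str, name: str) -> dict:
    """Build a namespaced Role manifest restricted to the Fluid resources under review."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",  # namespaced, unlike ClusterRole
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                # Assumed API group of the Fluid custom resources.
                "apiGroups": ["data.fluid.io"],
                "resources": [
                    "datasets",
                    "datasets/status",
                    "alluxioruntimes",
                    "alluxioruntimes/status",
                    "dataloads",  # assumed plural name for dataload
                ],
                # Verbs restricted as suggested in the review.
                "verbs": ["create", "delete", "get"],
            }
        ],
    }


role = build_fluid_role(NAMESPACE, ROLE_NAME)
print(json.dumps(role, indent=2))
```

Since `kubectl apply -f` accepts JSON manifests as well as YAML, the printed output could be saved to a file and applied directly. A matching RoleBinding for the service account that runs the pipeline would still be needed.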
@zhang-x-z I can merge this PR first. Please create another PR to continue the enhancements according to my comments. Thanks.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cheyang
Ⅰ. Describe what this PR does
This PR provides a demo that shows how to integrate Fluid with Kubeflow Pipelines (KFP).
We build some Fluid KFP components to
We use these components to load and preheat the FashionMNIST dataset, which is then used to train and test a simple CNN. Users can either use the YAML files we have already compiled, or modify the code we provide and regenerate the YAML files.
Ⅱ. Does this pull request fix one issue?
None
Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.
We have run this demo in our KFP environment.
Ⅳ. Describe how to verify it
Ⅴ. Special notes for reviews