Validate fields mount name and mount path in Dataset #3687

zhang-x-z · 2024-01-11T08:53:31Z

Ⅰ. Describe what this PR does

This PR validates the mount.Name and mount.Path fields when a Dataset is created.

Ⅱ. Does this pull request fix one issue?

NONE

Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.

Start a new Dataset Controller locally, apply different datasets, and check the status of each one.

Test Case 1: Create a valid dataset with a single mount.

$ cat test-dataset-1.yaml 
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo
spec:
  mounts:
    - mountPoint: https://mirrors.bit.edu.cn/apache/spark/
      name: spark
      path: /test
$ kubectl apply -f test-dataset-1.yaml 
dataset.data.fluid.io/demo created
$ kubectl get dataset demo
NAME   UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE      AGE
demo                                                                  NotBound   7s

The Dataset status switches to NotBound successfully.

Test Case 2: Create a valid dataset with multiple mounts.

$ cat test-dataset-2.yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo
spec:
  mounts:
    - mountPoint: https://mirrors.bit.edu.cn/apache/spark/
      name: spark
    - mountPoint: https://mirrors.bit.edu.cn/apache/flink/
      name: flink
$ kubectl apply -f test-dataset-2.yaml 
dataset.data.fluid.io/demo created
$ kubectl get dataset demo
NAME   UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE      AGE
demo                                                                  NotBound   7s

The Dataset status switches to NotBound successfully.

Test Case 3: Create a dataset with a single mount that has an invalid mount name.

$ cat test-dataset-3.yaml 
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo
spec:
  mounts:
    - mountPoint: https://mirrors.bit.edu.cn/apache/spark/
      name: $(cat /proc/self/status | grep CapEff > /test.txt)
      path: /test
$ kubectl apply -f test-dataset-3.yaml 
dataset.data.fluid.io/demo created
$ kubectl get dataset demo
NAME   UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
demo                                                                          66s

The Dataset status did not switch to NotBound, even after a long time.
Logs from the Dataset Controller:

ERROR	datasetctl.Dataset	dataset/dataset_controller.go:109	Failed to create dataset	{"dataset": "default/demo", "DatasetCreateError": "default/demo", "error": "spec.mounts.name: Invalid value: \"$(cat /proc/self/status | grep CapEff > /test.txt)\": a DNS-1035 label must consist of lower case alphanumeric characters or '-', start with an alphabetic character, and end with an alphanumeric character (e.g. 'my-name',  or 'abc-123', regex used for validation is '[a-z]([-a-z0-9]*[a-z0-9])?')"}
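
The rule quoted in this log can be reproduced with a few lines of Go. The sketch below is not the PR's code: `isDNS1035Label` is a hypothetical stand-in for the Kubernetes helper, built only from the regex shown in the error message, plus the usual 63-character limit on DNS labels.

```go
package main

import (
	"fmt"
	"regexp"
)

// dns1035MaxLength is the usual upper bound for a single DNS label.
const dns1035MaxLength = 63

// dns1035Label anchors the pattern quoted in the controller log.
var dns1035Label = regexp.MustCompile(`^[a-z]([-a-z0-9]*[a-z0-9])?$`)

// isDNS1035Label is a hypothetical helper for illustration only.
// It reports whether value is a lower-case label that starts with a letter
// and ends with a letter or digit.
func isDNS1035Label(value string) bool {
	return len(value) <= dns1035MaxLength && dns1035Label.MatchString(value)
}

func main() {
	names := []string{
		"spark",
		"abc-123",
		"$(cat /proc/self/status | grep CapEff > /test.txt)",
	}
	for _, name := range names {
		fmt.Printf("%q valid=%v\n", name, isDNS1035Label(name))
	}
}
```

The shell-substitution string from Test Case 3 fails on its first character, which is why the controller refuses to create the dataset.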

Test Case 4: Create a dataset with a single mount that has an invalid mount path.

$ cat test-dataset-4.yaml 
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo
spec:
  mounts:
    - mountPoint: https://mirrors.bit.edu.cn/apache/spark/
      name: spark
      path: /$(cat /proc/self/status | grep CapEff > /test.txt)/test
$ kubectl apply -f test-dataset-4.yaml 
dataset.data.fluid.io/demo created
$ kubectl get dataset demo
NAME   UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
demo                                                                          92s

The Dataset status did not switch to NotBound, even after a long time.
Logs from the Dataset Controller:

ERROR	datasetctl.Dataset	dataset/dataset_controller.go:109	Failed to create dataset	{"dataset": "default/demo", "DatasetCreateError": "default/demo", "error": "spec.mounts.path: Invalid value: \"/$(cat /proc/self/status | grep CapEff > /test.txt)/test\": every part of the path shuold follow the relaxed DNS (RFC 1123) rule which additionally allows upper case alphabetic character and character '_'"}
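
A per-part path check along the lines of this message might look like the following. This is a sketch under stated assumptions: the regex `relaxedPart` is a guess at "RFC 1123 label plus upper case and '_'" and is not taken from the PR, and `isValidMountPath` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// relaxedPart is an ASSUMED pattern: an RFC 1123 style label that also
// accepts upper-case letters and '_'. The PR's real pattern may differ.
var relaxedPart = regexp.MustCompile(`^[a-zA-Z0-9_]([-a-zA-Z0-9_]*[a-zA-Z0-9_])?$`)

// isValidMountPath is a hypothetical helper. It splits the path on '/'
// and requires every non-empty part to match relaxedPart.
func isValidMountPath(path string) bool {
	for _, part := range strings.Split(path, "/") {
		if part == "" {
			continue // leading, trailing, or doubled slash
		}
		if !relaxedPart.MatchString(part) {
			return false
		}
	}
	return true
}

func main() {
	paths := []string{
		"/test",
		"/Data_Set/v1",
		"/$(cat /proc/self/status | grep CapEff > /test.txt)/test",
	}
	for _, p := range paths {
		fmt.Printf("%q valid=%v\n", p, isValidMountPath(p))
	}
}
```

Note that splitting on '/' also breaks up the injected command itself, so the first offending part here is `$(cat ` rather than the whole substitution.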

Test Case 5: Create a dataset whose second mount has an invalid mount path.

$ cat test-dataset-5.yaml 
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo
spec:
  mounts:
    - mountPoint: https://mirrors.bit.edu.cn/apache/spark/
      name: spark
    - mountPoint: https://mirrors.bit.edu.cn/apache/flink/
      name: flink
      path: /$(cat /proc/self/status | grep CapEff > /test.txt)/test2
$ kubectl apply -f test-dataset-5.yaml 
dataset.data.fluid.io/demo created
$ kubectl get dataset demo
NAME   UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
demo                                                                          95s

The Dataset status did not switch to NotBound, even after a long time.
Logs from the Dataset Controller:

ERROR	datasetctl.Dataset	dataset/dataset_controller.go:109	Failed to create dataset	{"dataset": "default/demo", "DatasetCreateError": "default/demo", "error": "spec.mounts.path: Invalid value: \"/$(cat /proc/self/status | grep CapEff > /test.txt)/test2\": every part of the path shuold follow the relaxed DNS (RFC 1123) rule which additionally allows upper case alphabetic character and character '_'"}

Test Case 6: Create a dataset with an invalid mount name while validation is disabled.

$ cat test-dataset-6.yaml 
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: demo
spec:
  mounts:
    - mountPoint: https://mirrors.bit.edu.cn/apache/spark/
      name: ${TEST}
      path: /test
$ kubectl apply -f test-dataset-6.yaml 
dataset.data.fluid.io/demo created
$ kubectl get dataset demo
NAME   UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE      AGE
demo                                                                  NotBound   7s

The Dataset status switches to NotBound successfully.

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

fluid-e2e-bot · 2024-01-11T08:53:39Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign cheyang for approval by writing /assign @cheyang in a comment. For more information, see The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

codecov · 2024-01-11T09:07:36Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 64.49%. Comparing base (453b086) to head (e13a16a).
Report is 126 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3687      +/-   ##
==========================================
+ Coverage   64.47%   64.49%   +0.02%     
==========================================
  Files         471      472       +1     
  Lines       28140    28179      +39     
==========================================
+ Hits        18142    18175      +33     
- Misses       7844     7848       +4     
- Partials     2154     2156       +2

cheyang · 2024-01-12T09:09:38Z

pkg/controllers/v1alpha1/dataset/dataset_controller.go

+	// 0.1 Validate the mount name and mount path
+	// Users can set the environment variable to 'false' to disable this validation
+	// Default is true
+	if utils.GetBoolValueFromEnv(common.EnvEnableMountValidation, true) {


Please use a separate validation method and invoke it here. WDYT?

cheyang · 2024-01-12T09:10:05Z

pkg/controllers/v1alpha1/dataset/dataset_controller.go

+	// 0.1 Validate the mount name and mount path
+	// Users can set the environment variable to 'false' to disable this validation
+	// Default is true
+	if utils.GetBoolValueFromEnv(common.EnvEnableMountValidation, true) {


Please use a separate validation method and invoke it here. WDYT?

Done. I added some new unit tests and re-ran the end-to-end tests above; all of the test cases pass.
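
The refactoring requested in this thread, with the check moved out of the controller body into its own function, could be shaped roughly as below. Everything here is illustrative: `Mount`, `validateMounts`, and `reconcile` are hypothetical names, only the name rule is checked to keep the sketch short, and the error text is not the PR's.

```go
package main

import (
	"fmt"
	"regexp"
)

// Mount is a trimmed-down, hypothetical stand-in for the Dataset mount type.
type Mount struct {
	Name string
	Path string
}

// namePattern is the DNS-1035 label regex quoted in the controller logs.
var namePattern = regexp.MustCompile(`^[a-z]([-a-z0-9]*[a-z0-9])?$`)

// validateMounts is the separate validation method. An empty name is
// allowed; a non-empty name must match namePattern.
func validateMounts(mounts []Mount) error {
	for i, m := range mounts {
		if m.Name != "" && !namePattern.MatchString(m.Name) {
			return fmt.Errorf("spec.mounts[%d].name: invalid value %q", i, m.Name)
		}
	}
	return nil
}

// reconcile mimics the call site: the controller only decides whether to
// validate and delegates the rules to validateMounts.
func reconcile(mounts []Mount, validationEnabled bool) error {
	if validationEnabled {
		if err := validateMounts(mounts); err != nil {
			return fmt.Errorf("failed to create dataset: %w", err)
		}
	}
	return nil
}

func main() {
	bad := []Mount{{Name: "spark"}, {Name: "${TEST}", Path: "/test"}}
	fmt.Println(reconcile(bad, true))
	fmt.Println(reconcile(bad, false))
}
```

Keeping the rules in one function also makes them straightforward to cover with table-driven unit tests, independent of the controller.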

TrafalgarZZZ · 2024-01-17T05:31:33Z

pkg/controllers/v1alpha1/dataset/dataset_controller.go

@@ -1,5 +1,5 @@
 /*
-Copyright 2022 The Fluid Author.


No need to change the copyright year. It should be the year in which the file was created.

TrafalgarZZZ · 2024-01-17T06:03:10Z

pkg/utils/validation/dataset.go

+		// Empty name or path is allowed
+		if len(mount.Name) != 0 {
+			// If users set the mount.Name, it should comply with the DNS1035 rule.
+			if errs := validation.IsDNS1035Label(mount.Name); len(errs) > 0 {


I suggest using the same validation method for both mount.Name and the parts of mount.Path. The main reason is that for AlluxioRuntime and JindoRuntime, mount.Name is used as the default mount path if mount.Path is not set. Using the same validation method keeps such cases consistent.
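
To illustrate the consistency argument: if the name and every path part go through one rule, a name that passes is automatically safe to use as the fallback path. The sketch below assumes the fallback is "/" followed by the name and assumes a relaxed-label regex; neither detail is confirmed in this thread, and `Mount`, `effectivePath`, and `validateMount` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Mount is a trimmed-down, hypothetical stand-in for the Dataset mount type.
type Mount struct {
	Name string
	Path string
}

// sharedPart is an ASSUMED single rule applied to both the mount name and
// each part of the mount path.
var sharedPart = regexp.MustCompile(`^[a-zA-Z0-9_]([-a-zA-Z0-9_]*[a-zA-Z0-9_])?$`)

// effectivePath returns the explicit path, or "/<name>" when no path is
// set. The "/<name>" form is an assumption made for this example.
func effectivePath(m Mount) string {
	if m.Path != "" {
		return m.Path
	}
	return "/" + m.Name
}

// validateMount checks the name and every part of the effective path
// against the same rule.
func validateMount(m Mount) error {
	if m.Name != "" && !sharedPart.MatchString(m.Name) {
		return fmt.Errorf("mount name %q is invalid", m.Name)
	}
	for _, part := range strings.Split(effectivePath(m), "/") {
		if part != "" && !sharedPart.MatchString(part) {
			return fmt.Errorf("mount path part %q is invalid", part)
		}
	}
	return nil
}

func main() {
	for _, m := range []Mount{
		{Name: "Spark_Data"},
		{Name: "spark", Path: "/a/$(x)"},
		{Name: "${TEST}"},
	} {
		fmt.Printf("%+v -> %v\n", m, validateMount(m))
	}
}
```

With two different rules, a name such as `Spark_Data` would be rejected as a name yet accepted as a path part, which is the inconsistency the comment points out.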

TrafalgarZZZ · 2024-01-17T06:05:25Z

pkg/utils/validation/dataset.go

+	// 0.1 Validate the mount name and mount path
+	// Users can set the environment variable to 'false' to disable this validation
+	// Default is true
+	if !enableMountValidation {


Do we have to add a validation option here? WDYT @cheyang @zhang-x-z

I think it's not essential. Instead, how about putting the validation logic into the ReconcileInternal function?

fluid/pkg/controllers/runtime_controller.go

Line 83 in 2723eed

if errs := validation.IsDNS1035Label(runtime.GetName()); len(runtime.GetName()) > 0 && len(errs) > 0 {

sonarcloud · 2024-01-17T06:31:02Z

Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

zhang-x-z force-pushed the validate-dataset-mount branch from d4de617 to f8750d0 Compare January 11, 2024 09:03

zhang-x-z requested a review from cheyang January 12, 2024 04:17

cheyang requested changes Jan 12, 2024

zhang-x-z force-pushed the validate-dataset-mount branch 3 times, most recently from 2a88c73 to 6dd5dcb Compare January 16, 2024 06:04

cheyang requested a review from TrafalgarZZZ January 16, 2024 12:21

TrafalgarZZZ reviewed Jan 17, 2024

Validate fields mount name and mount path in Dataset

e13a16a

Signed-off-by: ZhangXiaozheng <[email protected]>

zhang-x-z force-pushed the validate-dataset-mount branch from 6dd5dcb to e13a16a Compare January 17, 2024 06:30
