Fix GID allocator by RomanBednar · Pull Request #850 · kubernetes-sigs/aws-efs-csi-driver

RomanBednar · 2022-12-05T14:51:19Z

Is this a bug fix or adding new feature?
Bug fix.

What is this PR about? / Why do we need it?

Dynamically provisioning an EFS volume creates an access point (AP), for which the driver allocates a GID from the default range or from a range defined in the StorageClass (SC) parameters.

The GID allocator did not check which GIDs were already in use. For example, suppose an EFS file system has an access point for the path /mydir, which is owned by GID 1000 with permissions 770, granting rwx access to group 1000. Then we create another access point with the same path or a "higher" one (it can also be /) and the same GID of 1000. Whoever mounts the second access point now has rwx access to /mydir because the GID matches. The old code does not prevent this, so a volume created by the driver might get an access point with a GID that is already in use, and a pod that mounts it might gain access to other pods' data.
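To make the intended behavior concrete, here is a minimal sketch of an allocator that consults the GIDs of existing access points before handing one out. It is not the code from this PR: the `accessPoint` type, the `nextFreeGID` function, and the highest-first scan order are assumptions chosen for illustration (the scan order mirrors the GIDs observed in the manual tests in this thread).

```go
package main

import (
	"errors"
	"fmt"
)

// accessPoint is a hypothetical, trimmed-down view of an EFS access point.
type accessPoint struct {
	ID  string
	GID int64 // POSIX group ID enforced by the access point
}

var errNoFreeGID = errors.New("no free GID left in the requested range")

// nextFreeGID returns the highest GID in [gidMin, gidMax] that is not used
// by any access point already present on the file system.
func nextFreeGID(existing []accessPoint, gidMin, gidMax int64) (int64, error) {
	used := make(map[int64]struct{}, len(existing))
	for _, ap := range existing {
		used[ap.GID] = struct{}{}
	}
	// Scan from the top of the range (an assumption, see the lead-in).
	for gid := gidMax; gid >= gidMin; gid-- {
		if _, taken := used[gid]; !taken {
			return gid, nil
		}
	}
	return 0, errNoFreeGID
}

func main() {
	existing := []accessPoint{
		{ID: "ap-dynamic", GID: 7000000}, // created by the driver
		{ID: "ap-manual", GID: 6999999},  // created outside the driver
	}
	gid, err := nextFreeGID(existing, 50000, 7000000)
	if err != nil {
		fmt.Println("allocation failed:", err)
		return
	}
	fmt.Println("allocated GID:", gid) // prints "allocated GID: 6999998"
}
```

Because the function derives the used set from the current list of access points on every call, a GID becomes available again as soon as the access point that held it is deleted.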

Additionally, when a user changed the SC parameters, the change was not reflected in the driver (controller pod): the GID allocator kept allocating from the range it saw first for a given FS ID. For GID range changes to take effect, the pod had to be restarted.
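One way to avoid the stale range is to read the range from the parameters of every provisioning request instead of caching it per FS ID. The sketch below assumes that approach; `parseGID` and `gidRange` are hypothetical helpers, and the default values are the ones mentioned later in this thread (50000 and 7000000), not values verified against the driver source.

```go
package main

import (
	"fmt"
	"strconv"
)

// Defaults assumed for illustration, taken from the discussion in this thread.
const (
	defaultGIDMin int64 = 50000
	defaultGIDMax int64 = 7000000
)

// parseGID reads one optional numeric parameter, falling back to a default.
func parseGID(params map[string]string, key string, fallback int64) (int64, error) {
	v, ok := params[key]
	if !ok {
		return fallback, nil
	}
	n, err := strconv.ParseInt(v, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid %s %q: %w", key, v, err)
	}
	return n, nil
}

// gidRange derives the GID range from the StorageClass parameters carried by
// a single request, so a changed StorageClass is honored on the next request.
func gidRange(params map[string]string) (gidMin, gidMax int64, err error) {
	if gidMin, err = parseGID(params, "gidRangeStart", defaultGIDMin); err != nil {
		return 0, 0, err
	}
	if gidMax, err = parseGID(params, "gidRangeEnd", defaultGIDMax); err != nil {
		return 0, 0, err
	}
	if gidMin > gidMax {
		return 0, 0, fmt.Errorf("gidRangeStart %d is greater than gidRangeEnd %d", gidMin, gidMax)
	}
	return gidMin, gidMax, nil
}

func main() {
	// Two consecutive requests for the same file system; the second one
	// comes from a StorageClass that was recreated with an explicit range.
	requests := []map[string]string{
		{"fileSystemId": "fs-example"},
		{"fileSystemId": "fs-example", "gidRangeStart": "1000", "gidRangeEnd": "2000"},
	}
	for i, params := range requests {
		gidMin, gidMax, err := gidRange(params)
		if err != nil {
			fmt.Println("request", i, "rejected:", err)
			continue
		}
		fmt.Printf("request %d allocates from [%d, %d]\n", i, gidMin, gidMax)
	}
}
```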

What testing is done?

current unit tests
new unit test added
manual testing

RomanBednar · 2022-12-14T13:44:22Z

Here is what I've tested manually to verify the patches:

Test that GID allocator avoids used GIDs

Create one PVC dynamically with OCP and one access point directly in AWS with a GID of 6999999.

$ aws efs describe-access-points --file-system-id fs-0c1bf51c4cc8a5771
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-0b38ad3470381476b    fsap-0b38ad3470381476b  pvc-a0e372b1-2f63-4561-a117-56df9df8d693        fs-0c1bf51c4cc8a5771    available                     269733383066
POSIXUSER       7000000 7000000
ROOTDIRECTORY   /dynamic_provisioning/pvc-a0e372b1-2f63-4561-a117-56df9df8d693
CREATIONINFO    7000000 7000000 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-051deabd137c8ea32    fsap-051deabd137c8ea32  console-4d9cd57f-8192-4e14-8e2d-eb4144c8d5fb    fs-0c1bf51c4cc8a5771    available             test-ap 269733383066
POSIXUSER       6999999 6999999
SECONDARYGIDS   0
ROOTDIRECTORY   /
TAGS    Name    test-ap

Create a new PVC with dynamic provisioning on the same FS ID.
Observe that the used GID was detected and the next available one (6999998) was assigned.

$ aws efs describe-access-points --file-system-id fs-0c1bf51c4cc8a5771
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-0b38ad3470381476b    fsap-0b38ad3470381476b  pvc-a0e372b1-2f63-4561-a117-56df9df8d693        fs-0c1bf51c4cc8a5771    available                     269733383066
POSIXUSER       7000000 7000000
ROOTDIRECTORY   /dynamic_provisioning/pvc-a0e372b1-2f63-4561-a117-56df9df8d693
CREATIONINFO    7000000 7000000 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-051deabd137c8ea32    fsap-051deabd137c8ea32  console-4d9cd57f-8192-4e14-8e2d-eb4144c8d5fb    fs-0c1bf51c4cc8a5771    available             test-ap 269733383066
POSIXUSER       6999999 6999999
SECONDARYGIDS   0
ROOTDIRECTORY   /
TAGS    Name    test-ap
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-05066babbad2648ac    fsap-05066babbad2648ac  pvc-1e332980-9c4f-41d4-9f03-aa47dd3feb37        fs-0c1bf51c4cc8a5771    available                     269733383066
POSIXUSER       6999998 6999998
ROOTDIRECTORY   /dynamic_provisioning/pvc-1e332980-9c4f-41d4-9f03-aa47dd3feb37
CREATIONINFO    6999998 6999998 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned

Test that GID allocator can re-use released GIDs

Remove the extra AP (test-ap) that was created manually and create another PVC with dynamic provisioning. Now that the GID (6999999) is released we can see this GID is assigned to the new dynamically provisioned AP because it is now the next available GID.

$ aws efs describe-access-points --file-system-id fs-0c1bf51c4cc8a5771
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-0b38ad3470381476b    fsap-0b38ad3470381476b  pvc-a0e372b1-2f63-4561-a117-56df9df8d693        fs-0c1bf51c4cc8a5771    available             269733383066
POSIXUSER       7000000 7000000
ROOTDIRECTORY   /dynamic_provisioning/pvc-a0e372b1-2f63-4561-a117-56df9df8d693
CREATIONINFO    7000000 7000000 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-05066babbad2648ac    fsap-05066babbad2648ac  pvc-1e332980-9c4f-41d4-9f03-aa47dd3feb37        fs-0c1bf51c4cc8a5771    available             269733383066
POSIXUSER       6999998 6999998
ROOTDIRECTORY   /dynamic_provisioning/pvc-1e332980-9c4f-41d4-9f03-aa47dd3feb37
CREATIONINFO    6999998 6999998 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-0d5ffbeba4c86ef2d    fsap-0d5ffbeba4c86ef2d  pvc-a8329ba9-ca9c-4164-b4fb-41fcdaba6014        fs-0c1bf51c4cc8a5771    available             269733383066
POSIXUSER       6999999 6999999
ROOTDIRECTORY   /dynamic_provisioning/pvc-a8329ba9-ca9c-4164-b4fb-41fcdaba6014
CREATIONINFO    6999999 6999999 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned

Test GID Range propagation

Recreate a storage class with gidRangeStart and gidRangeEnd parameters

$ oc get sc/efs-sc -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2022-12-08T10:48:42Z"
  name: efs-sc
  resourceVersion: "102207"
  uid: 6c66b91c-1aed-4b7e-8d4e-1518748b6e29
mountOptions:
- tls
parameters:
  basePath: /dynamic_provisioning
  directoryPerms: "700"
  fileSystemId: fs-0c1bf51c4cc8a5771
  gidRangeEnd: "2000"
  gidRangeStart: "1000"
  provisioningMode: efs-ap
provisioner: efs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

Without restarting any driver pods, create another PVC and validate that the new range was propagated correctly and that the new AP has GID 2000.

$ aws efs describe-access-points --file-system-id fs-0c1bf51c4cc8a5771
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-0b38ad3470381476b    fsap-0b38ad3470381476b  pvc-a0e372b1-2f63-4561-a117-56df9df8d693        fs-0c1bf51c4cc8a5771    available             269733383066
POSIXUSER       7000000 7000000
ROOTDIRECTORY   /dynamic_provisioning/pvc-a0e372b1-2f63-4561-a117-56df9df8d693
CREATIONINFO    7000000 7000000 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-05066babbad2648ac    fsap-05066babbad2648ac  pvc-1e332980-9c4f-41d4-9f03-aa47dd3feb37        fs-0c1bf51c4cc8a5771    available             269733383066
POSIXUSER       6999998 6999998
ROOTDIRECTORY   /dynamic_provisioning/pvc-1e332980-9c4f-41d4-9f03-aa47dd3feb37
CREATIONINFO    6999998 6999998 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-0d5ffbeba4c86ef2d    fsap-0d5ffbeba4c86ef2d  pvc-a8329ba9-ca9c-4164-b4fb-41fcdaba6014        fs-0c1bf51c4cc8a5771    available             269733383066
POSIXUSER       6999999 6999999
ROOTDIRECTORY   /dynamic_provisioning/pvc-a8329ba9-ca9c-4164-b4fb-41fcdaba6014
CREATIONINFO    6999999 6999999 700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned
ACCESSPOINTS    arn:aws:elasticfilesystem:us-east-1:269733383066:access-point/fsap-04755a3ae8f8b7701    fsap-04755a3ae8f8b7701  pvc-fe36200c-bdb8-49ec-97bc-f9fb4e570627        fs-0c1bf51c4cc8a5771    available             269733383066
POSIXUSER       2000    2000
ROOTDIRECTORY   /dynamic_provisioning/pvc-fe36200c-bdb8-49ec-97bc-f9fb4e570627
CREATIONINFO    2000    2000    700
TAGS    efs.csi.aws.com/cluster true
TAGS    kubernetes.io/cluster/rbednar-mycluster-01-l8wjr        owned

RomanBednar · 2022-12-20T08:30:44Z

@wongma7 @nckturner what do you think about the change?

Ashley-wenyizha · 2022-12-22T14:21:29Z

 # See the License for the specific language governing permissions and
 # limitations under the License.

-FROM golang:1.17 as builder


Could we move the Go upgrade into a separate PR?

Sure, opened here: #867

Ashley-wenyizha · 2022-12-22T14:47:40Z

Hi Roman

Thanks for providing this fix. We on the EFS side are currently reviewing your code changes internally.

One thing we want to call out, as you also pointed out in #693 (comment), is that with this patch the driver would check all possible GIDs (currently 120) from the given range each time a volume is created. This might not scale well, as we plan to increase the limit to 1000 early next year. Internal development of the AP limit increase is complete but is waiting for deployment and official launch.

We will need to run a performance test against the new limit and get back to you. Sorry about the potential delay, but we want to let you know that we are looking into this and tracking it internally. We will provide an update once we have more performance testing data.

Thank you!

RomanBednar · 2023-01-03T09:43:34Z

/hold until #867 merged

RomanBednar · 2023-02-06T09:05:47Z

Hello @Ashley-wenyizha, just checking in - anything new regarding the performance test?

Ashley-wenyizha · 2023-08-18T20:10:46Z

Hi @RomanBednar

Testing looks good. If we don't hear back, we will fix the merge conflict and merge early next week.

Thanks for your contribution!

This code is needed to allow listing of access points. This can be used by GID allocator to avoid assigning GIDs that might already be used.

EFS has an internal limit of 120 access points per file system ID. There is no reason to check the entire GID range specified by the user if we can't allocate those GIDs anyway. Given this limit, there is no need to track GIDs with a heap structure; we can always look up the next GID by scanning a range that won't exceed 120 entries. Because the limit is relatively small, this won't impact performance.
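The idea of bounding the scan by the access point limit can be sketched as follows. This is an illustration under assumptions, not the driver's code: `clampRange` and `accessPointLimit` are hypothetical names, and keeping the upper end of the range is a choice made here to match the highest-first allocations seen in the manual tests.

```go
package main

import "fmt"

// accessPointLimit mirrors the per-file-system limit of 120 discussed above.
const accessPointLimit int64 = 120

// clampRange shrinks [gidMin, gidMax] to at most limit candidates, keeping
// the upper end of the range. Ranges that already fit are left unchanged.
func clampRange(gidMin, gidMax, limit int64) (int64, int64) {
	if gidMax-gidMin+1 > limit {
		gidMin = gidMax - limit + 1
	}
	return gidMin, gidMax
}

func main() {
	lo, hi := clampRange(50000, 7000000, accessPointLimit)
	// prints "scan [6999881, 7000000]: 120 candidates"
	fmt.Printf("scan [%d, %d]: %d candidates\n", lo, hi, hi-lo+1)
}
```

A window of exactly `limit` candidates is enough: if a new access point can still be created, fewer than `limit` access points exist, so at least one GID in the window must be free.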

RomanBednar · 2023-08-21T10:28:55Z

@Ashley-wenyizha Thank you for the update, I've resolved the conflicts now.

Ashley-wenyizha · 2023-08-21T13:02:52Z

/lgtm

Ashley-wenyizha · 2023-08-24T15:05:49Z

/approve

k8s-ci-robot · 2023-08-24T15:06:03Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ashley-wenyizha, RomanBednar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [Ashley-wenyizha]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

johnpmayer · 2023-08-25T20:01:49Z

Is the plan to leave ACCESS_POINT_PER_FS_LIMIT at the current value (120)?

I see this comment marked resolved - #850 (comment)

I'd definitely like to increase the value in my deployment. Is there still a performance concern listing large numbers of access points? Can we make this value configurable, so that users can make the trade-off themselves?

mskanth972 · 2023-08-26T00:07:03Z

@johnpmayer, no, the comment just says that the access point limit was recently increased from 120 to 1000.
https://www.amazonaws.cn/en/new/2023/amazon-efs-increases-the-maximum-number-of-access-points-per-file-system/

johnpmayer · 2023-09-08T17:15:28Z

I see this is being addressed in #1119

With kubernetes-sigs#850, the new way of allocating GIDs introduced a new call to the ListAccessPoints endpoint of the EFS API, which causes problems on systems where the EFS CSI driver is under high load (many PVCs created within a short time period). In this PR, we extract the ListAccessPoints call from gid_allocator and move it one level up. For dynamic provisioning of GIDs, we can reuse the ListAccessPoints call to check that the file system exists (thus removing the DescribeFileSystem call in that case). For a fixed UID/GID, we continue calling DescribeFileSystem and make no calls to ListAccessPoints. In addition, gidMin and gidMax have been converted to int64: kubernetes-sigs#850 made both uid and gid int64, but gidMin and gidMax were not touched. This also changes the default value of gidMin, as the value of 50000 was spamming the logs with a message from gid_allocator (i.e. range bigger than the max number of access points); the new value is 6999000 (the value that gid_allocator was already setting by subtracting 1000 from gidMax). It also removes an unused function and field from gid_allocator.

With kubernetes-sigs#850, the new way of allocating GIDs introduced a new call to the ListAccessPoints endpoint of the EFS API, which causes problems on systems where the EFS CSI driver is under high load (many PVCs created within a short time period). In this PR, we extract the ListAccessPoints call from gid_allocator and move it one level up. For dynamic provisioning of GIDs, we can reuse the ListAccessPoints call to check that the file system exists (thus removing the DescribeFileSystem call in that case). For a fixed UID/GID, we continue calling DescribeFileSystem and make no calls to ListAccessPoints. In addition, gidMin and gidMax have been converted to int64: kubernetes-sigs#850 made both uid and gid int64, but gidMin and gidMax were not touched. This also changes the default value of gidMax, as the value of 7000000 was spamming the logs with a message from gid_allocator (i.e. range bigger than the max number of access points); the new value is 51000 (the value that gid_allocator was already setting by adding 1000 to gidMin). It also removes an unused function and field from gid_allocator. (cherry picked from commit dbbc733)
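The call pattern described in that commit message can be sketched roughly as below. Everything here is hypothetical and simplified for illustration: the `cloud` interface, its method signatures, `fakeCloud`, `chooseGID`, and the lowest-first scan are assumptions and do not reproduce the driver's real types or the AWS SDK signatures.

```go
package main

import (
	"errors"
	"fmt"
)

type accessPoint struct{ GID int64 }

// cloud is a hypothetical stand-in for the driver's cloud abstraction.
type cloud interface {
	ListAccessPoints(fsID string) ([]accessPoint, error)
	DescribeFileSystem(fsID string) error
}

var errNotFound = errors.New("file system not found")

// fakeCloud records how many API calls each code path makes.
type fakeCloud struct {
	aps                      map[string][]accessPoint
	listCalls, describeCalls int
}

func (f *fakeCloud) ListAccessPoints(fsID string) ([]accessPoint, error) {
	f.listCalls++
	aps, ok := f.aps[fsID]
	if !ok {
		return nil, errNotFound
	}
	return aps, nil
}

func (f *fakeCloud) DescribeFileSystem(fsID string) error {
	f.describeCalls++
	if _, ok := f.aps[fsID]; !ok {
		return errNotFound
	}
	return nil
}

// lowestFreeGID is a pure function: it gets the listing from its caller and
// makes no API calls itself.
func lowestFreeGID(aps []accessPoint, gidMin, gidMax int64) (int64, error) {
	used := make(map[int64]bool, len(aps))
	for _, ap := range aps {
		used[ap.GID] = true
	}
	for gid := gidMin; gid <= gidMax; gid++ {
		if !used[gid] {
			return gid, nil
		}
	}
	return 0, errors.New("no free GID in range")
}

// chooseGID makes exactly one API call per request. With a fixed GID it only
// verifies that the file system exists; otherwise the access point listing
// doubles as the existence check and feeds the allocator.
func chooseGID(c cloud, fsID string, fixedGID *int64, gidMin, gidMax int64) (int64, error) {
	if fixedGID != nil {
		if err := c.DescribeFileSystem(fsID); err != nil {
			return 0, err
		}
		return *fixedGID, nil
	}
	aps, err := c.ListAccessPoints(fsID)
	if err != nil {
		return 0, err
	}
	return lowestFreeGID(aps, gidMin, gidMax)
}

func main() {
	c := &fakeCloud{aps: map[string][]accessPoint{"fs-example": {{GID: 1000}, {GID: 1001}}}}
	gid, err := chooseGID(c, "fs-example", nil, 1000, 2000)
	fmt.Println("dynamic:", gid, err, "list calls:", c.listCalls, "describe calls:", c.describeCalls)
	fixed := int64(4242)
	gid, err = chooseGID(c, "fs-example", &fixed, 1000, 2000)
	fmt.Println("fixed:", gid, err, "list calls:", c.listCalls, "describe calls:", c.describeCalls)
}
```

Passing the listing into the allocator keeps the allocator free of API calls, which makes it easier to see and test how many EFS requests a single provisioning operation issues.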

k8s-ci-robot added the do-not-merge/work-in-progress and cncf-cla: yes labels Dec 5, 2022

k8s-ci-robot requested review from nckturner and wongma7 December 5, 2022 14:51

k8s-ci-robot added the size/L label Dec 5, 2022

RomanBednar force-pushed the gid-allocator branch from 0d6264d to 378d22a December 8, 2022 13:31

k8s-ci-robot added the size/XL label and removed the size/L label Dec 8, 2022

RomanBednar force-pushed the gid-allocator branch from 378d22a to 3d65fbd December 14, 2022 13:39

k8s-ci-robot added the size/XXL label and removed the size/XL label Dec 14, 2022

RomanBednar changed the title ~~WIP: Fix GID allocator~~ Fix GID allocator Dec 14, 2022

k8s-ci-robot removed the do-not-merge/work-in-progress label Dec 14, 2022

This was referenced Dec 14, 2022

efs-plugin crash loops when a storage class is created with a fixed uid and gid, and access point creation fails #693

Closed

Skip deallocating Gid when static Gid set #733

Closed

RomanBednar mentioned this pull request Dec 19, 2022

GID range parameter change requires pod restart #735

Closed

Ashley-wenyizha reviewed Dec 22, 2022

mskanth972 reviewed Dec 29, 2022

Comment thread pkg/driver/gid_allocator.go Outdated

k8s-ci-robot added the do-not-merge/hold label Jan 3, 2023

RomanBednar force-pushed the gid-allocator branch from 3d65fbd to b1a40b9 January 3, 2023 10:06

k8s-ci-robot added the needs-rebase label Jan 7, 2023

RomanBednar force-pushed the gid-allocator branch from b1a40b9 to b193158 January 9, 2023 10:04

k8s-ci-robot removed the needs-rebase label Jan 9, 2023

k8s-ci-robot added the needs-rebase label Feb 8, 2023

RomanBednar force-pushed the gid-allocator branch from b193158 to 4fa0e4e April 17, 2023 10:26

RomanBednar added 4 commits August 21, 2023 10:47

extend Cloud interface to allow listing access points

56b35ca

add unit tests

85c9b1e

tidy && vendor

3e32348

RomanBednar force-pushed the gid-allocator branch from bbaab09 to 3e32348 August 21, 2023 09:14

k8s-ci-robot removed the needs-rebase label Aug 21, 2023

k8s-ci-robot assigned Ashley-wenyizha Aug 21, 2023

k8s-ci-robot added the lgtm label Aug 21, 2023

k8s-ci-robot added the approved label Aug 24, 2023

k8s-ci-robot merged commit 5e44f89 into kubernetes-sigs:master Aug 24, 2023

mskanth972 mentioned this pull request Aug 25, 2023

More robust GidAllocator option #1107

Closed

seanzatzdev-amazon mentioned this pull request Sep 19, 2023

Cherry-picking commits to release-1.7 branch #1142

Merged

mskanth972 mentioned this pull request Sep 22, 2023

Failed to locate a free GID with delete-access-point-root-dir=true #764

Closed

RomanBednar mentioned this pull request Oct 10, 2023

STOR-1398: OCPBUGS-9855: Rebase to v1.7.0 for OCP 4.15 openshift/aws-efs-csi-driver#48

Merged

RomanBednar mentioned this pull request Nov 9, 2023

allocate GIDs in increasing order #1182

Merged

otorreno mentioned this pull request Dec 29, 2023

Reduce calls to EFS API #1226

Merged
