support quota set on overlay snapshot #5859
Conversation
Hi @yylt. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. Once the patch is verified, the new status will be reflected by the ok-to-test label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Build succeeded.
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| */ | ||
This file seems copied from Moby. Please clarify the origin of the code and the license in code comments.
snapshots/quota/projectquota.go
Outdated
| #define Q_XGETPQUOTA QCMD(Q_XGETQUOTA, PRJQUOTA) | ||
| #endif | ||
| */ | ||
| import "C" |
Quota handling is a topic we should discuss at a high level first, and possibly a good topic for a future community meeting. However, getting library support for setting project quotas across different backing filesystems makes sense to do now.

As an end feature, we need to discuss how it will be integrated with Kubernetes and how we can use it from our own tooling and Go libraries. It seems backwards to have the CRI layer enable the quota and then the snapshotter set a static quota. I would think the quota would be set by the client adding a quota, with the snapshotter enabling it via configuration and erroring out if the backing filesystem couldn't support it when enabled.

The difficult part is figuring out, from the client perspective, whether quota is enabled, so it knows whether it can avoid expensive ephemeral-storage accounting (such as in the kubelet).
pkg/cri/config/config.go
Outdated
| NetNSMountsUnderStateDir bool `toml:"netns_mounts_under_state_dir" json:"netnsMountsUnderStateDir"` | ||
| // SupportSetQuota indicates whether to set a quota on the container's read-write snapshot; when the | ||
| // snapshotter does not support setting quotas, it does nothing. | ||
| SupportSetQuota bool `toml:"support_set_quota" json:"supportSetQuota"` |
Maybe EnableRootfsQuota or EnableSetQuota sounds better.
In Kubernetes, there is a feature gate and a check method for this.
Sorry for replying so late. That is a good suggestion. The quota setter now focuses on overlay and is implemented in pure Go. I have also updated the snapshotter a little, adding new options such as:
pacoxu
left a comment
I tried building containerd to do some testing. Too many quotas are created in my environment.
[root@daocloud ~]# xfs_quota -xc "report -a -p -h" | head -n 100
Project quota on /var/lib/kubelet (/dev/sdc)
Blocks
Project ID Used Soft Hard Warn/Grace
---------- ---------------------------------
#0 820K 0 0 00 [------]
volume1048590 0 8E 8E 00 [------]
volume1048605 0 8E 8E 00 [------]
volume1048607 0 8E 8E 00 [------]
volume1048610 0 8E 8E 00 [------]
Project quota on /var/lib/containerd (/dev/sdb)
Blocks
Project ID Used Soft Hard Warn/Grace
---------- ---------------------------------
#0 10.2G 0 0 00 [------]
#2 0 0 0 00 [------]
#3 0 15G 15G 00 [------]
#4 12K 15G 15G 00 [------]
#5 0 15G 15G 00 [------]
#6 0 15G 15G 00 [------]
#7 0 15G 15G 00 [------]
#8 0 15G 15G 00 [------]
#9 0 15G 15G 00 [------]
#10 0 15G 15G 00 [------]
#11 0 15G 15G 00 [------]
#12 0 15G 15G 00 [------]
#13 0 15G 15G 00 [------]
#14 0 15G 15G 00 [------]
#15 0 15G 15G 00 [------]
#16 0 15G 15G 00 [------]
#17 0 15G 15G 00 [------]
#18 0 15G 15G 00 [------]
#19 0 15G 15G 00 [------]
#20 0 15G 15G 00 [------]
#21 0 15G 15G 00 [------]
#22 0 15G 15G 00 [------]
#23 0 15G 15G 00 [------]
#24 4K 15G 15G 00 [------]
#25 0 15G 15G 00 [------]
#26 0 15G 15G 00 [------]
#27 0 15G 15G 00 [------]
#28 0 15G 15G 00 [------]
#29 4K 15G 15G 00 [------]
#30 0 15G 15G 00 [------]
#31 0 15G 15G 00 [------]
#32 0 15G 15G 00 [------]
#33 0 15G 15G 00 [------]
#34 0 15G 15G 00 [------]
#35 4K 15G 15G 00 [------]
#36 4K 15G 15G 00 [------]
#37 0 15G 15G 00 [------]
#38 540K 15G 15G 00 [------]
#39 0 15G 15G 00 [------]
#40 0 15G 15G 00 [------]
#41 0 15G 15G 00 [------]
#42 0 15G 15G 00 [------]
#43 0 10G 10G 00 [------]
#44 0 10G 10G 00 [------]
#45 0 10G 10G 00 [------]
#46 0 10G 10G 00 [------]
#47 0 10G 10G 00 [------]
#48 0 10G 10G 00 [------]
#49 0 10G 10G 00 [------]
#50 0 10G 10G 00 [------]
#51 0 10G 10G 00 [------]
#52 0 10G 10G 00 [------]
#53 0 10G 10G 00 [------]
#54 0 10G 10G 00 [------]
#55 0 10G 10G 00 [------]
#56 0 10G 10G 00 [------]
#57 0 10G 10G 00 [------]
#58 0 10G 10G 00 [------]
#59 0 10G 10G 00 [------]
#60 0 10G 10G 00 [------]
#61 0 10G 10G 00 [------]
#62 0 10G 10G 00 [------]
#63 0 10G 10G 00 [------]
I set the size to 10G at first and 15G later.
[root@daocloud ~]# xfs_quota -xc "report -a -p -h" | wc -l
1956507
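For reference, the runaway growth can be spotted by counting only the per-project rows of the report (the header lines don't matter). A minimal sketch with an inline sample; the real input would come from `xfs_quota -xc "report -a -p -h"`, and the values here are made up:

```shell
# Hypothetical three-line excerpt of an xfs_quota project report.
report='#0 10.2G 0 0 00 [------]
#3 0 15G 15G 00 [------]
#4 12K 15G 15G 00 [------]'

# Count lines whose first field is a #-prefixed project ID.
count=$(printf '%s\n' "$report" | awk '$1 ~ /^#[0-9]+$/' | wc -l | tr -d ' ')
echo "$count"
```

On a real system, a count in the millions (as above) suggests project quotas are not being cleaned up when snapshots are removed.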
snapshots/overlay/plugin/plugin.go
Outdated
| oOpts = append(oOpts, overlay.WithUpperdirLabel) | ||
| } | ||
| ic.Meta.Exports["root"] = root |
snapshots/overlay/overlay.go
Outdated
| fsMagic, err := overlayutils.GetFSMagic(root) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| if config.quotaSize > 0 { |
| fsMagic, err := overlayutils.GetFSMagic(root) | |
| if err != nil { | |
| return nil, err | |
| } | |
| if config.quotaSize > 0 { | |
| if config.quotaSize > 0 { | |
| fsMagic, err := overlayutils.GetFSMagic(root) | |
| if err != nil { | |
| return nil, err | |
| } |
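The suggested change above reorders initialization so the backing filesystem is only probed when a quota is actually configured. A self-contained sketch of that control flow, with the real overlayutils/quota calls replaced by stubs (names and behavior here are illustrative, not the PR's exact code):

```go
package main

import (
	"errors"
	"fmt"
)

// XFS superblock magic; stand-in for overlayutils.GetFSMagic's result.
const magicXfs = 0x58465342

// Stubbed probe of the backing filesystem (always reports XFS here).
func getFSMagic(root string) (int64, error) { return magicXfs, nil }

type config struct{ quotaSize uint64 }

// newSnapshotter sketches the reviewer's suggestion: probe the filesystem
// only inside the quotaSize > 0 branch, so non-quota setups skip it entirely.
func newSnapshotter(root string, cfg config) (string, error) {
	if cfg.quotaSize > 0 {
		magic, err := getFSMagic(root)
		if err != nil {
			return "", err
		}
		if magic != magicXfs {
			return "", errors.New("backing filesystem does not support project quota")
		}
		return "snapshotter with quota", nil
	}
	return "snapshotter without quota", nil
}

func main() {
	s, _ := newSnapshotter("/var/lib/containerd", config{quotaSize: 15 << 30})
	fmt.Println(s)
	s, _ = newSnapshotter("/var/lib/containerd", config{})
	fmt.Println(s)
}
```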
snapshots/overlay/overlay.go
Outdated
| } else { | ||
| size = config.quotaSize | ||
| } | ||
| return quotaCtl.SetAllQuota(size, targets...) |
Setting the quota may need some logging. Probably some debug logs.
Yes, it is because the quota was set on the committed layer too, but that is not expected.
I got some new errors with the latest code, and the quota is not set as expected.
Force-pushed from f72d83c to ed16c4b (compare).
I tried the command and used bpftrace to debug the syscall. Finally, I used the below to translate it.
Any update?
Could you also update https://github.com/containerd/containerd/tree/main/docs/snapshotters to explain the usage?
At present there is no direct way to use quotas on overlayfs; this change only adds quota support for overlayfs internally.
As mentioned at #5859 (comment), perhaps in the future configuring quotas through config.toml, the CRI API, or annotations will be added, and this needs more discussion.
Therefore, I believe it is more appropriate to update the documentation once that configuration is supported.
Signed-off-by: Yang Yang <yang8518296@163.com>
| } | ||
| } | ||
| // WithLabels appends labels to a created snapshot |
| // WithLabels appends labels to a created snapshot | |
| // WithQuotaSize appends a label to this snapshot identifying its maximum quota size in bytes for the snapshot writable layer
|
|
||
| // WithLabels appends labels to a created snapshot | ||
| func WithQuotaSize(bytesize uint64) Opt { | ||
| return func(info *Info) error { |
| return nil | ||
| } | ||
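The hunk above adds the quota as a snapshot option. A simplified, self-contained sketch of how such a label-setting Opt works; the `Info`/`Opt` types and the label key below are pared-down stand-ins for the real snapshots package, not its exact API:

```go
package main

import (
	"fmt"
	"strconv"
)

// Minimal stand-ins for snapshots.Info and snapshots.Opt.
type Info struct{ Labels map[string]string }
type Opt func(*Info) error

// Assumed label key for illustration; the real constant lives in containerd.
const labelQuotaSize = "containerd.io/snapshot/quota.size"

// WithQuotaSize records the writable layer's maximum size in bytes as a
// snapshot label (sketch of the option this PR introduces).
func WithQuotaSize(bytesize uint64) Opt {
	return func(info *Info) error {
		if info.Labels == nil {
			info.Labels = map[string]string{}
		}
		info.Labels[labelQuotaSize] = strconv.FormatUint(bytesize, 10)
		return nil
	}
}

func main() {
	var info Info
	_ = WithQuotaSize(15 << 30)(&info) // 15 GiB
	fmt.Println(info.Labels[labelQuotaSize])
}
```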
| // WithEnableQuota define the enable quota |
| // WithEnableQuota define the enable quota | |
| // WithEnableQuota sets enable quota to true
| } | ||
| } | ||
| func QuotaSize(labels map[string]string) string { |
| func QuotaSize(labels map[string]string) string { | |
| // QuotaSize returns the maximum quota size in bytes as an unsigned-integer string, or the empty string if unset | |
| func QuotaSize(labels map[string]string) string { |
I don't think we should add more helpers here. This helper doesn't do much either.
| // MountOptions are options used for the overlay mount (not used on bind mounts) | ||
| MountOptions []string `toml:"mount_options"` | ||
| // EnableQuota is define the quota on the snapshot writable layer |
| // EnableQuota is define the quota on the snapshot writable layer | |
| // EnableQuota turns on the quota max size in bytes feature for the snapshot writable layer |
| case fs.MagicXfs: | ||
| quotaCtl, err := quota.NewControl(root) | ||
| if err != nil { | ||
| log.L.WithError(err).Warnf("could not initinal quota control") |
| log.L.WithError(err).Warnf("could not initinal quota control") | |
| log.L.WithError(err).Warnf("could not initialize quota control") |
| log.L.WithError(err).Warnf("could not initinal quota control") | ||
| return nil, fmt.Errorf("directory '%s' does not support set quota, make sure mount with 'pquota'", root) | ||
| } | ||
| return func(targets []string, size uint64) error { |
Noting we switched to a string, but we are still using the quota size here as a uint. Thus there is no way to distinguish setting "0" as the size from "" meaning unset/return nil...
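One way to resolve the "0" vs "" ambiguity raised above is to return an explicit presence flag alongside the parsed size. A sketch under assumed names (the label key and function signature are illustrative, not the PR's actual API):

```go
package main

import (
	"fmt"
	"strconv"
)

// Assumed label key for illustration.
const labelQuotaSize = "containerd.io/snapshot/quota.size"

// quotaSize parses the quota label, distinguishing "label absent or empty"
// (ok == false) from an explicit "0" (ok == true, size 0).
func quotaSize(labels map[string]string) (uint64, bool, error) {
	v, present := labels[labelQuotaSize]
	if !present || v == "" {
		return 0, false, nil // unset
	}
	n, err := strconv.ParseUint(v, 10, 64)
	if err != nil {
		return 0, false, fmt.Errorf("invalid quota size %q: %w", v, err)
	}
	return n, true, nil
}

func main() {
	n, ok, _ := quotaSize(map[string]string{labelQuotaSize: "0"})
	fmt.Println(n, ok) // explicit zero: 0 true
	n, ok, _ = quotaSize(nil)
	fmt.Println(n, ok) // unset: 0 false
}
```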
dmcgowan
left a comment
I understand this is only the first step, but it is defining an interface for how it will be used by other snapshotters and backing filesystems in the future. I am fine if we want to start by just defining the label, but I think we should have a more generalized solution or plan before we put it into the default snapshotter.
Is there a plan for how this would interact with CRI/k8s? Setting a default max in the CRI config seems more dangerous than helpful as these sort of limits should be defined by k8s. This should also be able to free up the costly Usage calculation when it is used, otherwise it is only a minor enhancement.
I would suggest not considering this for the 2.1 release, since it would only be experimental, and the limited backing-store support and possible client use don't warrant changes to the default snapshotter. I do envision that in 2.2 and beyond we can start thinking about more hybrid snapshotters, where a snapshotter may be using overlay as well as block devices (similar to what erofs is doing). That could provide a more generalized solution for quota, but possibly not as part of the default overlay snapshotter.
| LabelSnapshotUIDMapping = "containerd.io/snapshot/uidmapping" | ||
| // LabelSnapshotGIDMapping is the label used for GID mappings | ||
| LabelSnapshotGIDMapping = "containerd.io/snapshot/gidmapping" | ||
| // LabelSnapshotQuotaSize is the quota size. |
If we define the label here, we should clearly define the expected value
| // SupportQuota checks if overlay filesystem supports quota or not. | ||
| // | ||
| // This function returns quotaSetter or error if it fails to check the filesystem. | ||
| func SupportQuota(root string) (QuotaSetter, error) { |
| // | ||
| // Get project id of parent dir as minimal id to be used by driver | ||
| // | ||
| minProjectID, err := getProjectID(basePath) |
When containerd restarts, it cannot obtain existing paths that already have a prjid set on the system. In this case, the value in q.quotas[] is empty, and a newly started container will use a prjid already used by another container.
(dlv) p q
("*github.com/containerd/containerd/snapshots/quota.Control")(0xc000456800)
github.com/containerd/containerd/snapshots/quota.Control {
RWMutex: sync.RWMutex {
w: (sync.Mutex)(0xc000456800),
writerSem: 0,
readerSem: 0,
readerCount: ("sync/atomic.Int32")(0xc000456810),
readerWait: ("sync/atomic.Int32")(0xc000456814),},
backingFsBlockDev: "/home/containerd_rt/root/io.containerd.snapshotter.v1.overlayfs/backingFsBlockDev",
nextProjectID: 2,
quotas: map[string]uint32 [],}
(dlv)
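One possible direction for fixing the restart issue above, sketched below: on startup, walk the existing snapshot directories and re-read each directory's project ID to seed both the in-memory quotas map and the next-ID counter, so IDs are never reused. `getProjectID` here is a simulated stand-in for the real ioctl-based lookup (FS_IOC_FSGETXATTR); paths and IDs are made up:

```go
package main

import "fmt"

// Simulated on-disk state: directory -> project ID already set on the FS.
var diskProjectIDs = map[string]uint32{
	"/snapshots/1/fs": 2,
	"/snapshots/2/fs": 3,
}

// Stand-in for the quota package's getProjectID (ioctl in the real code).
func getProjectID(path string) (uint32, error) { return diskProjectIDs[path], nil }

// rebuildQuotaState re-seeds the quotas map and nextProjectID from disk so a
// restarted daemon never hands out a project ID that is already in use.
func rebuildQuotaState(dirs []string) (map[string]uint32, uint32, error) {
	quotas := make(map[string]uint32)
	var next uint32 = 2 // minimum project ID used by the driver
	for _, d := range dirs {
		id, err := getProjectID(d)
		if err != nil {
			return nil, 0, err
		}
		if id == 0 {
			continue // no project quota assigned to this directory
		}
		quotas[d] = id
		if id >= next {
			next = id + 1
		}
	}
	return quotas, next, nil
}

func main() {
	q, next, _ := rebuildQuotaState([]string{"/snapshots/1/fs", "/snapshots/2/fs"})
	fmt.Println(len(q), next)
}
```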
Compared to ephemeral storage, adding storage configuration to the CRI API is completely unnecessary, and there are many filesystem restrictions, making it seem useless.
Are we on the same page? "Goals - Universal snapshot support for quota: To implement a unified approach for managing quotas across all snapshot types. ..." along with Crosby's original recommendation, "this can be implemented with snapshot labels and then the snapshotters will have to be updated to support these things. However, we still need higher layers like CRI and kube to pass these labels", and this discussion thread: #759.

It doesn't feel like we should only care about quotas for overlayfs, and it does seem like quota requirements need to flow over the CRI pod/container/metrics APIs. I'm not following the idea that quotas would be completely unnecessary and seemingly useless for CRI clients.
Sorry for that; it's possible that I didn't fully express my thoughts. Integrating with CRI is obviously necessary, but that would require submitting a KEP (Kubernetes Enhancement Proposal) and engaging in discussions, and I anticipate that process will be quite time-consuming. Moreover, ephemeral storage generally doesn't depend on the snapshot type, making it more convenient. Only after the KEP has been merged would it be necessary to modify the current patch; it's likely that widespread agreement will only be achieved at that point.
It would be great if this feature could also be introduced to Docker with the containerd snapshotter enabled. Currently with
Ref: https://linux.die.net/man/8/xfs_quota; here a project ID is used to set the limit, like this.
Perhaps it's another way of implementing quota.
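To make the xfs_quota approach above concrete: setting a project quota typically means registering the directory in /etc/projects and then using xfs_quota's `project` and `limit` commands. A sketch with made-up paths and IDs; the xfs_quota invocations are shown commented out since they require root and an XFS mount with prjquota:

```shell
# Hypothetical values; not taken from the PR.
MNT=/var/lib/containerd   # assumed XFS mount with the prjquota mount option
DIR=$MNT/io.containerd.snapshotter.v1.overlayfs/snapshots/42/fs  # hypothetical upperdir
ID=42

# /etc/projects maps "projectID:directory"; /etc/projid maps "name:projectID".
line="$ID:$DIR"
echo "$line"

# The actual xfs_quota steps (need root and XFS):
# echo "$line" >> /etc/projects
# xfs_quota -x -c "project -s $ID" "$MNT"          # initialize the project tree
# xfs_quota -x -c "limit -p bhard=15g $ID" "$MNT"  # 15G hard block limit
```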
@yylt Thanks. This is working for vanilla Docker (for UpperDir), but with containerd there are snapshots, and these are not predictable. Any idea? This is what I have tried:
Snapshotters aim to implement a union filesystem with different backends. Currently, if it's overlayfs, you can usually see the following information on the system. Quotas can be set for the directories specified by
This information is available after the container is started, not before; thus, it's impossible to set quotas before the container starts.
Right, but it should be acceptable if you want a workaround; you can also subscribe to
@yylt Hello, could you please provide the configuration and usage methods again? I attempted to set a quota limit but was unsuccessful. The filesystem has prjquota enabled.
Sorry for the late reply. The current implementation cannot be used directly; it will be completed in a few steps. For now, it only allows overlayfs to support the quota syscall internally.
I do believe a quota solution (or other forms for writable layers, such as separate sparse files) is much cleaner to implement as part of a mount manager/plugin, so that overlayfs-based writable layers among various snapshotters can leverage it seamlessly.
I'm going to remove this from the 2.2 milestone. I'm not sure we want this PR as is; a few things to consider though.
If you want to split out the changes to define the quota-size label and the documentation around it, I think we could consider that for 2.2.
Thank you for providing this information. I've always had some concerns about whether to make changes.
Splitting out the changes should be related to [Mount Management](https://github.com//issues/11303), and should also be a separate pull request. Let's leave it at that for now.
support set quota on overlay snapshot
as discussed in issue #3329
Fixes #3329
Enable quota setting as follows:
config.toml like this
Check the containerd root mount info
Check that the container has had its quota set
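For illustration only, a config.toml along these lines might tie the pieces together; every key name below is inferred from the diff hunks earlier in the thread (`support_set_quota` in pkg/cri/config, EnableQuota/quotaSize in the overlay snapshotter config) and may not match what was actually merged:

```toml
# Sketch only: key names are assumptions inferred from the PR's diff hunks.
version = 2

[plugins."io.containerd.grpc.v1.cri"]
  # CRI-level switch (diff shows `support_set_quota` in pkg/cri/config).
  support_set_quota = true

[plugins."io.containerd.snapshotter.v1.overlayfs"]
  # Overlay snapshotter options (names assumed from EnableQuota/quotaSize).
  enable_quota = true
  quota_size = "15g"
```

The backing filesystem must additionally be mounted with the pquota/prjquota option, as noted in the review comments.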
Signed-off-by: Yang Yang <yang8518296@163.com>