Pass assume_storage_prezeroed option when formatting ext4 if possible #4948

Open
idryomov opened this issue Nov 8, 2024 · 1 comment
Labels
component/rbd Issues related to RBD good first issue Good for newcomers

idryomov commented Nov 8, 2024

Describe the feature you'd like to have

Instead of passing lazy_itable_init=1 and lazy_journal_init=1 to mkfs.ext4, Ceph CSI should pass assume_storage_prezeroed=1. It is stronger: it is a superset of lazy_itable_init=1 and lazy_journal_init=1, and it allows the filesystem to skip inode table zeroing completely instead of simply doing it lazily. As before with lazy_itable_init=1 and lazy_journal_init=1, this should be limited to dynamically provisioned volumes, the case where Ceph CSI can guarantee that the mkfs.ext4 invocation immediately follows the creation of the RBD image.
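To make the proposed selection logic concrete, here is a minimal sketch in Go. The function names `ext4ExtendedOptions` and `mkfsExt4Args` and the device path are hypothetical and not taken from the Ceph CSI code base. The sketch only shows how the `-E` value could be chosen, assuming the caller already knows whether the volume is dynamically provisioned and whether the installed mkfs.ext4 understands the new option.

```go
package main

import (
	"fmt"
	"strings"
)

// ext4ExtendedOptions returns the value for the -E flag of mkfs.ext4.
// Hypothetical helper for illustration; not the Ceph CSI implementation.
func ext4ExtendedOptions(dynamicallyProvisioned, supportsPrezeroed bool) string {
	if !dynamicallyProvisioned {
		// The issue limits these options to dynamically provisioned
		// volumes, so fall back to the mkfs.ext4 defaults otherwise.
		return ""
	}
	var opts []string
	if supportsPrezeroed {
		// Covers both lazy_* options, so they are not passed as well.
		opts = append(opts, "assume_storage_prezeroed=1")
	} else {
		opts = append(opts, "lazy_itable_init=1", "lazy_journal_init=1")
	}
	return strings.Join(opts, ",")
}

// mkfsExt4Args builds the argument list for a mkfs.ext4 invocation.
func mkfsExt4Args(device string, dynamicallyProvisioned, supportsPrezeroed bool) []string {
	var args []string
	if e := ext4ExtendedOptions(dynamicallyProvisioned, supportsPrezeroed); e != "" {
		args = append(args, "-E", e)
	}
	return append(args, device)
}

func main() {
	// "/dev/rbd0" is only an example device path.
	fmt.Println(mkfsExt4Args("/dev/rbd0", true, true))
	fmt.Println(mkfsExt4Args("/dev/rbd0", true, false))
	fmt.Println(mkfsExt4Args("/dev/rbd0", false, true))
}
```

Keeping the decision in a pure function makes it easy to unit test without running mkfs.ext4.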

The assume_storage_prezeroed option became available in e2fsprogs 1.47.0 last year, so support for it should probably be "discovered", similar to xfsSupportsReflink().
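One possible way to do the discovery is a version check, sketched below in Go. It assumes that the version banner of mke2fs (for example from `mke2fs -V`) contains a line of the form `mke2fs 1.47.0 (5-Feb-2023)`; that format, the function names and the sample strings are assumptions for illustration. Running the command and capturing its output is left out. A version comparison is only a heuristic, and probing the tool's own output for the option name would be an alternative.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Matches "mke2fs <major>.<minor>[.<patch>]" in the version banner.
// The banner format is an assumption, not a documented guarantee.
var mke2fsVersionRe = regexp.MustCompile(`mke2fs (\d+)\.(\d+)(?:\.(\d+))?`)

// parseMke2fsVersion extracts {major, minor, patch} from the banner.
// A missing patch component is treated as 0.
func parseMke2fsVersion(output string) ([3]int, bool) {
	m := mke2fsVersionRe.FindStringSubmatch(output)
	if m == nil {
		return [3]int{}, false
	}
	var v [3]int
	for i := 0; i < 3; i++ {
		if m[i+1] == "" {
			continue
		}
		n, err := strconv.Atoi(m[i+1])
		if err != nil {
			return [3]int{}, false
		}
		v[i] = n
	}
	return v, true
}

// ext4SupportsAssumeStoragePrezeroed reports whether the banner indicates
// e2fsprogs 1.47.0 or newer. Hypothetical helper name.
func ext4SupportsAssumeStoragePrezeroed(versionOutput string) bool {
	v, ok := parseMke2fsVersion(versionOutput)
	if !ok {
		// Unknown version: stay on the conservative lazy_* options.
		return false
	}
	minimum := [3]int{1, 47, 0}
	for i := range v {
		if v[i] != minimum[i] {
			return v[i] > minimum[i]
		}
	}
	return true
}

func main() {
	for _, banner := range []string{
		"mke2fs 1.47.0 (5-Feb-2023)",
		"mke2fs 1.46.5 (30-Dec-2021)",
		"not a version banner",
	} {
		fmt.Printf("%q -> %v\n", banner, ext4SupportsAssumeStoragePrezeroed(banner))
	}
}
```

Returning false for unparseable output means an unexpected banner never results in passing an option that mkfs.ext4 might reject.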

What is the value to the end user? (why is it a priority?)

Freshly created RBD volumes would consume less space. Quoting the assume_storage_prezeroed implementation patch:

    - Avoiding zeroing out the inode table and journal reduces the
      initial metadata space allocation from 0.48% to 0.01%.
    - Lazy inode table zeroing results in a further 1.45% of logical
      volume space getting allocated for inode tables, even if no file
      data is added to the filesystem. With assume_storage_prezeroed,
      the metadata allocation remains at 0.01%.
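To get a rough feel for what the quoted percentages would mean in absolute terms, the following Go sketch applies them to a few example volume sizes. It assumes that the figures from the patch are additive (0.48% initially plus a further 1.45% after lazy zeroing) and that they scale linearly with volume size; neither is confirmed here, and real numbers depend on the filesystem geometry and on the options already in use. The sizes are interpreted as GiB multiples, which is also an assumption.

```go
package main

import "fmt"

const gib = uint64(1) << 30

// Percentages quoted from the patch description in the issue.
const (
	initialPct   = 0.48 // initial metadata allocation without the option
	lazyExtraPct = 1.45 // further allocation once lazy zeroing completes
	prezeroedPct = 0.01 // allocation with assume_storage_prezeroed
)

// pctOfBytes returns pct percent of size, truncated to whole bytes.
func pctOfBytes(size uint64, pct float64) uint64 {
	return uint64(float64(size) * pct / 100)
}

func main() {
	sizes := []struct {
		name  string
		bytes uint64
	}{
		{"100G", 100 * gib},
		{"500G", 500 * gib},
		{"1T", 1024 * gib},
		{"5T", 5 * 1024 * gib},
	}
	for _, s := range sizes {
		// Estimate only: assumes the percentages add up and scale linearly.
		before := pctOfBytes(s.bytes, initialPct+lazyExtraPct)
		after := pctOfBytes(s.bytes, prezeroedPct)
		fmt.Printf("%-5s before ~%.2f GiB, after ~%.2f GiB\n",
			s.name,
			float64(before)/float64(gib),
			float64(after)/float64(gib))
	}
}
```

These are estimates for orientation only; measured usage on a real cluster is what should decide the question.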

How will we know we have a good solution? (acceptance criteria)

Run before-and-after tests that create and attach batches of, e.g., 100G, 500G, 1T and 5T RBD volumes, noting space usage as reported by ceph df. In the before case, the volumes need to stay attached for a while so that lazy inode table zeroing can complete.

@nixpanic nixpanic added component/rbd Issues related to RBD good first issue Good for newcomers labels Nov 8, 2024

Madhu-1 commented Nov 11, 2024

cc @ceph/ceph-csi-contributors
