Describe the feature you'd like to have
Instead of passing lazy_itable_init=1 and lazy_journal_init=1 to mkfs.ext4, Ceph CSI should pass assume_storage_prezeroed=1. It is stronger (a superset of lazy_itable_init=1 and lazy_journal_init=1) and allows the filesystem to skip inode table zeroing completely instead of simply doing it lazily. As before with lazy_itable_init=1 and lazy_journal_init=1, this should be limited to dynamically provisioned volumes -- a case where Ceph CSI can guarantee that the mkfs.ext4 invocation immediately follows the creation of the RBD image.
The assume_storage_prezeroed option became available in e2fsprogs 1.47.0 last year, so support for it should probably be "discovered" in a way similar to xfsSupportsReflink().
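One way the discovery could look is sketched below. This is only an illustration, not Ceph CSI code: the helper names ext4SupportsPrezeroed and ext4ExtendedOptions are hypothetical, and it assumes the check is done by parsing the version line that mkfs.ext4 -V prints (for example "mke2fs 1.47.0 (5-Feb-2023)") rather than by probing the option directly. A version comparison would miss distributions that backport the option, so a real implementation might prefer a different probe.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// mke2fsVersionRe matches the major and minor version in the first line
// of `mkfs.ext4 -V` output, e.g. "mke2fs 1.47.0 (5-Feb-2023)".
var mke2fsVersionRe = regexp.MustCompile(`mke2fs (\d+)\.(\d+)`)

// ext4SupportsPrezeroed is a hypothetical helper: it reports whether the
// given version output belongs to e2fsprogs 1.47.0 or newer, the release
// that introduced assume_storage_prezeroed.
func ext4SupportsPrezeroed(versionOutput string) bool {
	m := mke2fsVersionRe.FindStringSubmatch(versionOutput)
	if m == nil {
		// Unknown format: fall back to the older, lazy options.
		return false
	}
	major, _ := strconv.Atoi(m[1])
	minor, _ := strconv.Atoi(m[2])
	if major != 1 {
		return major > 1
	}
	return minor >= 47
}

// ext4ExtendedOptions is a hypothetical helper returning the value for
// mkfs.ext4 -E on a freshly created (dynamically provisioned) image.
func ext4ExtendedOptions(prezeroed bool) string {
	if prezeroed {
		return "assume_storage_prezeroed=1"
	}
	return "lazy_itable_init=1,lazy_journal_init=1"
}

func main() {
	// Sample version lines; in practice this would come from running
	// `mkfs.ext4 -V` once and caching the result.
	samples := []string{
		"mke2fs 1.46.5 (30-Dec-2021)",
		"mke2fs 1.47.0 (5-Feb-2023)",
	}
	for _, out := range samples {
		fmt.Printf("%q -> -E %s\n", out, ext4ExtendedOptions(ext4SupportsPrezeroed(out)))
	}
}
```

As with the existing lazy options, the result would only be applied to dynamically provisioned volumes; anything else would keep the default mkfs.ext4 behavior.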
What is the value to the end user? (why is it a priority?)
Freshly created RBD volumes would consume less space. Quoting the assume_storage_prezeroed implementation patch:
- Avoiding zeroing out the inode table and journal reduces the
initial metadata space allocation from 0.48% to 0.01%.
- Lazy inode table zeroing results in a further 1.45% of logical
volume space getting allocated for inode tables, even if no file
data is added to the filesystem. With assume_storage_prezeroed,
the metadata allocation remains at 0.01%.
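To put the quoted percentages in absolute terms, the following sketch applies them to the volume sizes mentioned in the acceptance criteria. It assumes the 0.48% and 1.45% figures add up for a volume that has finished lazy zeroing, that 1T and 5T mean 1024 GiB and 5120 GiB, and that the percentages measured in the patch carry over unchanged to RBD images formatted by Ceph CSI; none of that is confirmed here, so treat the results as rough estimates only. The allocatedGiB helper is made up for this example.

```go
package main

import "fmt"

// Percentages quoted from the assume_storage_prezeroed patch description.
const (
	initialZeroedPct = 0.48 // initial metadata allocation with zeroing
	lazyItablePct    = 1.45 // further allocation from lazy inode table zeroing
	prezeroedPct     = 0.01 // allocation with assume_storage_prezeroed
)

// allocatedGiB returns pct percent of sizeGiB.
func allocatedGiB(sizeGiB, pct float64) float64 {
	return sizeGiB * pct / 100
}

func main() {
	// Volume sizes from the acceptance criteria, in GiB (assumed binary units).
	sizes := []float64{100, 500, 1024, 5120}
	for _, size := range sizes {
		before := allocatedGiB(size, initialZeroedPct+lazyItablePct)
		after := allocatedGiB(size, prezeroedPct)
		fmt.Printf("%6.0f GiB volume: ~%.2f GiB before, ~%.2f GiB after\n",
			size, before, after)
	}
}
```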
How will we know we have a good solution? (acceptance criteria)
Run before-and-after tests that create and attach batches of, e.g., 100G, 500G, 1T and 5T RBD volumes, noting space usage as reported by ceph df. In the "before" case, the volumes need to stay attached for a while to allow lazy inode table zeroing to complete.