-
Notifications
You must be signed in to change notification settings - Fork 555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd: Use assume_storage_prezeroed when formatting #4996
base: devel
Are you sure you want to change the base?
Conversation
internal/rbd/nodeserver.go
Outdated
@@ -101,7 +101,7 @@ var ( | |||
xfsHasReflink = xfsReflinkUnset | |||
|
|||
mkfsDefaultArgs = map[string][]string{ | |||
"ext4": {"-m0", "-Enodiscard,lazy_itable_init=1,lazy_journal_init=1"}, | |||
"ext4": {"-m0", "-Enodiscard,assume_storage_prezeroed=1"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sure that you only do this when mkfs.ext4
supports the option, you'll need to add some form of detection, ideally with something like sync.Once
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assume_storage_prezeroed
option became available in e2fsprogs 1.47.0 last year, so it should probably be "discovered" similar toxfsSupportsReflink()
.
My suggestion to do it similar to xfsSupportsReflink()
isn't viable because mkfs.ext4
doesn't include extended options in its help output, but trying with assume_storage_prezeroed=1
and falling back to lazy_itable_init=1,lazy_journal_init=1
upon parsing out Bad option(s) specified: assume_storage_prezeroed
error should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I and Madhu had a discussion on this. Since this flag is part of e2fsprogs >= 1.47 and we have control over which version of it we bundle in the container, do we really need this check?
Sidenote: Detection is possible by parsing the semver of e2fsprogs. It is standard flag so it must be there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check for the version isn't always reliable in the face of downstream backports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as @idryomov mentioned we cannot depend on the version check, as the base image includes the required version, we could say what is supported by this version but that could be a problem if someone wants to build/test custom images. let's depend on the error check and fall back that works for all the cases.
60069c4
to
434ce4e
Compare
2699bc7
to
6b52f64
Compare
internal/rbd/nodeserver.go
Outdated
@@ -67,6 +67,10 @@ const ( | |||
xfsReflinkNoSupport | |||
xfsReflinkSupport | |||
|
|||
ext4PrezeroedUnset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
restart counting again, use ext4PrezeroedUnset = itoa
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go doesn't consider iota
twice in the same const block :(
internal/rbd/nodeserver.go
Outdated
@@ -748,6 +756,7 @@ func (ns *NodeServer) NodePublishVolume( | |||
return &csi.NodePublishVolumeResponse{}, nil | |||
} | |||
|
|||
//nolint:cyclop,gocyclo // function is required to have multiple conditional checks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is rather ugly. Maybe it is not required if you add a function like
args := ns.GetMKFSDefaultArgs(fsType)
a function like that can add the assume_storage_prezeroed=1
option if support is detected.
internal/rbd/nodeserver.go
Outdated
if existingFormat == "" && !staticVol && !readOnly && !isBlock { | ||
args := mkfsDefaultArgs[fsType] | ||
if fsType == "ext4" && ns.ext4SupportsPrezeroed() { | ||
args = []string{"-m0", "-Enodiscard,assume_storage_prezeroed=1"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this drops the default options from line 112, is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, lazy_itable_init=1,lazy_journal_init=1
are not required when using assume_storage_prezeroed=1
. Rest of the options (-m0, -Enodiscard
) from line 112 are included.
internal/rbd/nodeserver.go
Outdated
return ext4HasPrezeroedSupport == ext4PrezeroedSupport | ||
} | ||
|
||
tempImage := "/tmp/prezeroed.img" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a temporary filename, please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
internal/rbd/nodeserver.go
Outdated
|
||
tempImage := "/tmp/prezeroed.img" | ||
ctx := context.TODO() | ||
_, _, err := util.ExecCommand(ctx, "dd", "if=/dev/zero", "of="+tempImage, "bs=1M", "count=1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of executing a dd
command, open the file, seek to 1mb and write a singled zero byte. This makes a sparse file, which should be much quicker and can easily be done with plain Go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this insight :)
3d4e46b
to
ed5b89a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, just a few nits and potential improvements
internal/rbd/nodeserver.go
Outdated
} | ||
|
||
// createSpareFile makes `file` a sparse file of size `sizeMB`. | ||
func createSparseFile(file *os.File, sizeMB int64) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not specific to the nodeserver, move it to the internal/util package please, and add a unit-test for it
internal/rbd/nodeserver.go
Outdated
out, err := diskMounter.Exec.Command( | ||
"mkfs.ext4", | ||
"-n", | ||
"-Enodiscard,assume_storage_prezeroed=1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is nice to have the options mentioned as a const somewhere, they are here, but also in getMkfsArgs()
.
Maybe keep the default map, and use ext4.prezeroed
as fstype to get the options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is nice to have the options mentioned as a const somewhere, they are here
I don't have an opinion about it being expressed as a const, but I think it should be a single option. This function is supposed to be checking whether ext4 supports "prezeroed", so it should be just that -- pass only assume_storage_prezeroed=1
, nodiscard
isn't relevant here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the nodiscard
flag.
internal/rbd/nodeserver.go
Outdated
} | ||
// ext4HasPrezeroed is set by ext4SupportsPrezeroed(), use the function when | ||
// checking the support for assume_storage_prezeroed. | ||
ext4HasPrezeroedSupport = ext4PrezeroedUnset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand you copied the xfsHasReflink
idea. But I think it is cleaner if the NodeServer
struct has a field for these kind of detectable parameters. Have you thought about doing it that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored both the XFS and ext4 detections to use struct fields. PTAL.
// "assume_storage_prezeroed" option. It does this by creating a temporary | ||
// image file, attempting to format it with the ext4 filesystem using the | ||
// "assume_storage_prezeroed" option, and checking the output for errors. | ||
func (ns *NodeServer) ext4SupportsPrezeroed() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about if we check if this is supported or not in the driver.go(if we are stating it as nodeplugin) before starting the node plugin and store it in the Nodeserver struct like its suggested here https://github.com/ceph/ceph-csi/pull/4996/files#r1910308642?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the result for this is cached, let's check it when we need it instead of doing it preemptively? This way also has the added benefit of keeping driver.go
simple. WDYT?
ed5b89a
to
288c82e
Compare
Instead of passing lazy_itable_init=1 and lazy_journal_init=1 to mkfs.ext4, pass assume_storage_prezeroed=1 which is stronger and allows the filesystem to skip inode table zeroing completely instead of simply doing it lazily. The support for this flag is checked by trying to format a fake temporary image with mkfs.ext4 and checking its STDERR. Closes: ceph#4948 Signed-off-by: Niraj Yadav <[email protected]>
288c82e
to
d782438
Compare
} | ||
|
||
ctx := context.TODO() | ||
tempImgFile, err := os.CreateTemp(os.TempDir(), "prezeroed.img") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although unlikely, it is possible that this runs more than once at the same time. This is eventually called while attaching+mounting the RBD-image.
It is possible that the file is created and then:
- more than one thread passes here
- one thread finishes the function and deletes the file
- other threads continue to execute the
mkfs.ext4
command, with the file missing - the other threads will fail with an unexpected error
The prezeroed.img
filename should be unique, use os.CreateTemp()
. Or you can use an approach that builds on sync.Once
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a suffix. Each filename will be different.
t.Run(tt.name, func(t *testing.T) { | ||
t.Parallel() | ||
|
||
file, err := os.CreateTemp("", "test-sparse-") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
place this in the t.TempDir()
directory for automatic cleanup, no need to have to remove the file in a defer
.
Describe what this PR does
Instead of passing
lazy_itable_init=1
andlazy_journal_init=1
tomkfs.ext4
, passassume_storage_prezeroed=1
which isstronger and allows the filesystem to skip inode table zeroing
completely instead of simply doing it lazily.
The support for this flag is checked by trying to format a fake
temporary image with
mkfs.ext4
and checking itsSTDERR
.Closes: #4948