Support for additional compressors/decompressors #7978
rvrangel wants to merge 9 commits into vitessio:main from
Conversation
Signed-off-by: Renan Rangel <renan@slack-corp.com>
66c7b9c to 4becf1b
|
Sorry for the delay in responding. I somehow missed the notification for reviewing this PR.
To start with we only need one flag - let us call it |
|
Things have changed a little bit now (see detail in #8175), so the design will need to change to accommodate a choice of builtin decompressors also. |
|
@deepthi I saw that, I will update this PR to take it into account, thanks! |
Signed-off-by: Renan Rangel <renan@slack-corp.com>
|
hi @deepthi, I have updated the code with support for the two classes of compressors and decompressors:
|
This sounds good. Let us not add cgo right now, we want to avoid that if possible. |
|
@deepthi It should be ready for review now 👍, I would love to get your feedback on it |
go/vt/mysqlctl/compression.go
Outdated
| } | ||
| } | ||
|
|
||
| return "", fmt.Errorf("%w \"%s\"", errUnsupportedCompressionExtension, extension) |
Anywhere where you have \"%s\" in a format string, you can replace it with %q, which adds the quotes automatically and escapes the contents of the string so they always look proper inside the quotes. 👌
neat, I will change that!
|
Hiiii @rvrangel! Been looking at this PR. I'm afraid that your usage of From the
You can see that this API in the stdlib is very nuanced. I would recommend refactoring this implementation so that all Let me know if you get stuck! |
go/vt/mysqlctl/compression.go
Outdated
| } | ||
|
|
||
| func prepareExternalCompressionCmd(ctx context.Context, cmdStr string) (*exec.Cmd, error) { | ||
| cmdArgs := strings.Split(cmdStr, " ") |
This isn't perfectly cromulent; it will corrupt argument strings with quotes or escaped whitespace. I would suggest bringing https://pkg.go.dev/github.com/google/shlex as a (tiny) dependency to parse the commandline properly.
Sounds good, I will take a look at it and make the changes 👍
|
hi @vmg, thanks for the feedback. It is correct that we should not call |
|
I see now, I was a bit confused with the |
Signed-off-by: Renan Rangel <renan@slack-corp.com>
|
@vmg I updated the PR, let me know if that covers most of what you thought and if you have any additional feedback |
Signed-off-by: Renan Rangel <renan@slack-corp.com>
|
And thanks for the feedback @vmg! As a heads up, one limiting factor we encountered is the download speed for restores when using S3, due to the fact vitess/go/vt/mysqlctl/s3backupstorage/s3.go Lines 191 to 208 in 440d2e6 I have some code that implements multipart download as a buffer in memory; it still needs some polishing, but I will create another PR soon. We saw improvements in the 2-4x range when using other algorithms since |
ajm188
left a comment
just need the copyright notices, then good from my end.
| @@ -0,0 +1,281 @@ | |||
| package mysqlctl | |||
this file is missing the copyright notice
| @@ -0,0 +1,199 @@ | |||
| package mysqlctl | |||
Signed-off-by: Renan Rangel <renan@slack-corp.com>
Is there a reason the same support is not being added to the builtin engine? The compression options should be independent of the engine being used.
Also, more tests are needed. It should be fine to overload existing backup tests so that different tests use different options.
unit tests: https://github.com/vitessio/vitess/blob/main/go/vt/wrangler/testlib/backup_test.go
endtoend: https://github.com/vitessio/vitess/tree/main/go/test/endtoend/backup
go/vt/mysqlctl/xtrabackupengine.go
Outdated
| // switch which compressor/decompressor to use | ||
| builtinCompressor = flag.String("xtrabackup_builtin_compressor", "pgzip", "which builtin compressor engine to use") | ||
| builtinDecompressor = flag.String("xtrabackup_builtin_decompressor", "auto", "which builtin decompressor engine to use") | ||
| // use an external command to decompress the backups |
While adding support to builtinbackupengine, all these flags should be renamed so that they are not xtrabackup specific.
hi @deepthi thanks for the feedback. I will add support for it this week in the builtin engine and for the tests 👍
go/vt/mysqlctl/xtrabackupengine.go
Outdated
| xtrabackupStripeBlockSize = flag.Uint("xtrabackup_stripe_block_size", 102400, "Size in bytes of each block that gets sent to a given stripe before rotating to the next stripe") | ||
| // switch which compressor/decompressor to use | ||
| builtinCompressor = flag.String("xtrabackup_builtin_compressor", "pgzip", "which builtin compressor engine to use") | ||
| builtinDecompressor = flag.String("xtrabackup_builtin_decompressor", "auto", "which builtin decompressor engine to use") |
It will be nice to add some help text on what "auto" means.
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
|
@rvrangel when you get back to this, we will also need a companion website PR to document the new options. |
|
Superseded by #10558 which has been merged. |
Signed-off-by: Renan Rangel renan@slack-corp.com
Description
This change will allow us to use an external decompressor when available, as opposed to the internal pgzip currently used. We ran some tests using pigz -d -c and got a nice speed bump on decompression, around 30% or more faster when restoring from a backup in S3. While decompression is still single-threaded in pigz, it uses separate threads for reading, writing and checksum calculation, which results in faster performance.
I wanted to get initial feedback on this change, as we would also like to plug in our own compressor to use other compression algorithms (like lz4 or zstd) with an external binary, since this is a more flexible setup for users, but also work on the change to add built-in support for one of these compression algorithms.
Related Issue(s)
#7802
Checklist
Deployment Notes
This should not have any effect on current deployments. To take advantage of this, one needs to pass the correct flags to vttablet, and the command also needs to exist in the OS; otherwise it will fall back to the builtin decompressor.