Skip to content

Remove built-in decompression flag#10670

Merged
deepthi merged 20 commits intovitessio:mainfrom
planetscale:ExtCompressor
Aug 11, 2022
Merged

Remove built-in decompression flag#10670
deepthi merged 20 commits intovitessio:mainfrom
planetscale:ExtCompressor

Conversation

@rsajwani
Copy link
Copy Markdown
Contributor

@rsajwani rsajwani commented Jul 12, 2022

Signed-off-by: Rameez Sajwani rameezwazirali@hotmail.com

Description

This PR includes bunch of optimization and some changes around backward compatibility scenario.

1- Making code logic of deducing decompressor between Built-in and XtraBackup same.
2- We don't need decompressor flag given we are always going to write what is the built-in compression used in MANIFEST
3- Adding compression engine field in Xtrabackup MANIFEST
4- Addressing the scenario when MANIFEST file does not contain any value for compression engine. We should default to what we were using before which is "gzip". This can happen if the backup is done using vitess version <= v14 and we try to restore using main / v15
5- Introducing 'external' as one of possible value of compression engine. External specify if we are using custom flags or custom tool for compression and decompression.
6- Rename built-in compressor to more generic name 'compression-engine-name'
7- Change mysql version to 8.0.28 for upgrade-downgrade test

With this change here are the possible value of --compression-engine-name

  • pgzip
  • pargzip
  • lz4
  • zstd
  • external

where 'external' is set only when we are using compressor other than the ones mentioned above. If you want to use any of the built-in compressors, simply set one of the above values for --compression-engine-name. Value specified in --compression-engine-name is saved in MANIFEST, which is later read by restore routine to decide what engine to use for decompression. Default value for engine is 'pgzip'

Details

Xtrabackup engine was using file extension as a mean to determine the value of compressor during decompression, whereas built-in engine was using MANIFEST file to find out the value of compressor. This PR will make both engine use same logic which is to read it from MANIFEST file to determine the engine used during compression. As a result we found out that we don't need flags like Built-in decompressor. Furthermore we realized that Builtin-engine is used as a hint for decompressor to find out the compression engine name, so we changed its name to more generic 'compression-engine-name'. Here is the list of different scenarios (possible values) which client can provide to these flags.

Use Cases

  • 1 (Use pgzip as compressor and decompressor)
    compression-engine: pgzip
    External compression: “”
    External decompression: “”

  • 2 (Use some external command as compressor and decompressor)
    compression-engine: “external”
    External compression: externalTool -c -y
    External decompression: externalTool -d -y
    external-compressor-extension: .xyz

  • 3 (Use custom lz4 as compression and default built-in lz4 as decompressor)
    compression-engine: “lz4”
    External compression: lz4 -flag
    External decompression:

  • 4 (Use custom lz4 as compression and decompressor)
    compression-engine: “external”
    External compression: lz4 -flag
    External decompression: lz4 -flag

Migration Cases:

  • 5 (Changing from gzip to lz4 (built-in to built-in))

Original state:
compression-engine: “gzip”
External compression: “”
External decompression: “”

Final state:
compression-engine: “lz4”
External compression: “”
External decompression: “”

  • 6 (Changing from pgzip to pgzip custom (built-in to custom built-in))

Original state:
compression-engine: “pgzip”
External compression: “”
External decompression: “”

Final state:
compression-engine: “pgzip”
External compression: “pgzip -flag”
External decompression: “”

  • 7 (Changing from gzip to custom (built-in to custom))

Original state:
compression-engine: “pgzip”
External compression: “”
External decompression: “”

Final state:
compression-engine: “external”
External compression: “external -c -flag”
External decompression: “external -d -flag”

  • 8 (Changing from custom to lz4 (custom to built-in))

Original state:
compression-engine: “external”
External compression: “custom -c flag”
External decompression: “custom -d flag”

Intermediate state:
compression-engine: “lz4”
External compression: “”
External decompression: “external -d -flag”

finale state:
compression-engine: “lz4”
External compression: “”
External decompression: “”

  • 9 (Changing from custom-builtin to lz4 (custom built-in to built-in))

Original state:
compression-engine: “pgzip”
External compression: “pgzip -c”
External decompression: “”

final state:
compression-engine: “lz4”
External compression: “lz4 -c”
External decompression: “”

  • 10 (Changing from builtin to custom-built-in decompressor)

Original state:
compression-engine: “pgzip”
External compression: “”
External decompression: “”

Wont work:
compression-engine: “pgzip”
External compression: “”
External decompression: “pgzip -c”

final state:
compression-engine: “external”
External compression: “pgzip -c”
External decompression: “pgzip -c flag”

  • 11 (Changing from external to external)

Original state:
compression-engine: “external”
External compression: “tool -c”
External decompression: “tool -d”

final state:
compression-engine: “external”
External compression: “tool2 -c”
External decompression: “tool1 -d”

P.S: In this case #11 we will run into issue where some of the older backups won't get restore after final state unless tool1 and tool2 are compatible.

Related Issue(s)

#7802

Checklist

Deployment Notes

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
@vitess-bot
Copy link
Copy Markdown
Contributor

vitess-bot bot commented Jul 12, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a new flag is being introduced, review whether it is really needed. The flag names should be clear and intuitive (as far as possible), and the flag's help should be descriptive.
  • If a workflow is added or modified, each items in Jobs should be named in order to mark it as required. If the workflow should be required, the GitHub Admin should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should either include a link to an issue that describes the bug OR an actual description of the bug and how to reproduce, along with a description of the fix.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
rsajwani added 2 commits July 11, 2022 22:12
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
@rsajwani rsajwani marked this pull request as ready for review July 12, 2022 20:39
Copy link
Copy Markdown
Collaborator

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes look fine to me.
Do we have an update/downgrade test that verifies that we can restore a backup taken with an older version binary even if we provide new flags to the binary we are restoring with?

Also need to update release notes summary file, and related website PR vitessio/website#1091 (cc @rvrangel)

@deepthi deepthi added the release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) label Jul 12, 2022
rsajwani added 2 commits July 12, 2022 16:44
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
@rvrangel
Copy link
Copy Markdown
Contributor

The idea behind a separate --builtin-decompressor flag is that you might want more control over your backup (for example, you want to do it at a minimum CPU priority or maybe a specific CPU affinity setting by using nice/taskset), but you don't care as much about the restore, since it when the host is restoring, it is not in service yet.

In that case we would benefit of being able to set an external compressor but use a more convenient builtin decompressor. This change introduces a new external engine that is set on the MANIFEST for any externally compressed backups, which prevents one from being able to use just the --external-compressor and --external-compressor-extension flags without having to set --external-decompressor as well. If that is the case, do we even care to have a --external-compressor-extension at all? If we always need to set both external compressor/decompressor, than the only reason it would exist is to make xtrabackup backups display that on the filename, as it is not really used when trying to detect which kind of decompression engine to use later on.

I think it is a good idea to also add the compression engine to the xtrabackup MANIFEST and to make the code logic of deducing the decompressor the same, but removing --builtin-decompressor takes away part of the flexibility of this change. Just my 2 cents.

@rsajwani
Copy link
Copy Markdown
Contributor Author

The idea behind a separate --builtin-decompressor flag is that you might want more control over your backup (for example, you want to do it at a minimum CPU priority or maybe a specific CPU affinity setting by using nice/taskset), but you don't care as much about the restore, since it when the host is restoring, it is not in service yet.

In that case we would benefit of being able to set an external compressor but use a more convenient builtin decompressor. This change introduces a new external engine that is set on the MANIFEST for any externally compressed backups, which prevents one from being able to use just the --external-compressor and --external-compressor-extension flags without having to set --external-decompressor as well. If that is the case, do we even care to have a --external-compressor-extension at all? If we always need to set both external compressor/decompressor, than the only reason it would exist is to make xtrabackup backups display that on the filename, as it is not really used when trying to detect which kind of decompression engine to use later on.

I think it is a good idea to also add the compression engine to the xtrabackup MANIFEST and to make the code logic of deducing the decompressor the same, but removing --builtin-decompressor takes away part of the flexibility of this change. Just my 2 cents.

The extension point you mentioned above does make sense and probably we can get rid of it. The problem with previous approach was consider a scenario where you decide to move from lz4 to pgzip. In that case the transition will be two phase first (step) you only change value of compressor to pgzip and keep the decompressor to still lz4 so latest backup can be decompressed and then as soon as you get your first backup with new compressor pgzip then you have to now change your decompressor to pgzip (second step). The whole idea of reading from manifest is to make logic simple else we will run into a lot of edge cases.

@rvrangel
Copy link
Copy Markdown
Contributor

If you want to move engines, you can (and should) keep --builtin-decompressor auto. so in your example, you change the compressor from lz4 to pgzip and set the builtin-decompressor to auto (if it is not already like that). that means the tablet should be able to decompress both the old and new backups without a problem (the engine used on the backup is set in the manifest after all).

Perhaps if you want to simplify the logic, another option would be to replace --external-compressor-extension with --external-compressor-engine so a compatible format that is supported by the builtin decompressors is written to the MANIFEST as a hint that can be used for decompressing and we skip dealing we extensions altogether. What do you think?

@rsajwani
Copy link
Copy Markdown
Contributor Author

Perhaps if you want to simplify the logic, another option would be to replace --external-compressor-extension with --external-compressor-engine so a compatible format that is supported by the builtin decompressors is written to the MANIFEST as a hint that can be used for decompressing and we skip dealing we extensions altogether. What do you think?

Why do you need --external-compressor-engine separately, you can use *builtin-compressor to write to manifest just like you are doing it in Builtin-Engine.

rsajwani added 4 commits July 28, 2022 17:39
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
rsajwani added 3 commits July 31, 2022 19:02
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Copy link
Copy Markdown
Collaborator

@maxenglander maxenglander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was asked to take a look at this and offer perspective on how this will affect PlanetScale and PS customers. Just a couple questions for now. Thanks!

@deepthi deepthi removed the release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) label Aug 8, 2022
@deepthi
Copy link
Copy Markdown
Collaborator

deepthi commented Aug 8, 2022

Removed the needs details label after verifying that changes to summary file are included in the PR.

@deepthi deepthi added the NeedsWebsiteDocsUpdate What it says label Aug 8, 2022
@maxenglander
Copy link
Copy Markdown
Collaborator

P.S: In this case #11 we will run into issue where some of the older backups won't get restore after final state unless tool1 and tool2 are compatible.

It would be nice if there was a way to deal with this with a set of flags, e.g. --external-decompressor-transition that could be used as a fallback decompressor if --external-decompressor didn't work. In absence of that, I think it would be helpful to add documentation to set users' expectations about this.

Copy link
Copy Markdown
Collaborator

@maxenglander maxenglander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from some notes I left on usability and documentation above, I think this PR will add the flexibility we need to address the first three bullet points in this issue.

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
@rsajwani rsajwani removed the NeedsWebsiteDocsUpdate What it says label Aug 9, 2022
@deepthi
Copy link
Copy Markdown
Collaborator

deepthi commented Aug 9, 2022

P.S: In this case #11 we will run into issue where some of the older backups won't get restore after final state unless tool1 and tool2 are compatible.

It would be nice if there was a way to deal with this with a set of flags, e.g. --external-decompressor-transition that could be used as a fallback decompressor if --external-decompressor didn't work. In absence of that, I think it would be helpful to add documentation to set users' expectations about this.

+1 on the docs. We'll need to do a website PR and add the details there.

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Copy link
Copy Markdown
Collaborator

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @rsajwani!
Thank you @rvrangel for all your help

@deepthi deepthi merged commit 0eb5fcb into vitessio:main Aug 11, 2022
@deepthi deepthi deleted the ExtCompressor branch August 11, 2022 16:38
systay pushed a commit to planetscale/vitess that referenced this pull request Aug 19, 2022
…itessio#950)

* Backup/Restore: add support for external compressors and decompressors (vitessio#10558)

* change to support an external decompressor

Signed-off-by: Renan Rangel <renan@slack-corp.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* add external compressor support + builtin additional compressors

Signed-off-by: Renan Rangel <renan@slack-corp.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* wrap external compressor/decompressor

Signed-off-by: Renan Rangel <renan@slack-corp.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* go mod tidy + comments

Signed-off-by: Renan Rangel <renan@slack-corp.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* add copyright notices

Signed-off-by: Renan Rangel <renan@slack-corp.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* add support for builtin engine

Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Adding test case for buckup compression

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing unit test and run mod tidy

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Removing unwanted unit tests

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Increase timeout of backup tests

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* fixing linter errors

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Change test logic to accomodate running selective tests

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* removing lint warning

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* fixing test failure

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Removing un-necessary test

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing code review feeback

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Change builtinEngine to consider 'auto' decompressor

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* fixing Upgrade/Downgrade test

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fix type & add summary under release notes

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing typos in summary

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing flag name typos

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

Co-authored-by: Renan Rangel <rrangel@slack-corp.com>
Co-authored-by: Renan Rangel <renan@slack-corp.com>
Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* cherry-pick some how didn't took the right code for test

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Remove built-in decompression flag (vitessio#10670)

* Remove built-in decompression flag

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fix test failures

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fix Helpoutput test

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing unit test

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Adding summary

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* code cleaning and better summary

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Change builtinCompressor to more generic compression engine name

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing / Adding new test case

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fix summary & static code analysis

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Adding fake backup impl in test

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Adding time sleep in between test

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing summary and adding comments

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Feedback on summary

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Code review feedback

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing comment

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing default value in summary

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing test cases

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* More summary fixes

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

* Fixing TestHelpOutput

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>

Signed-off-by: Rameez Sajwani <rameezwazirali@hotmail.com>
Co-authored-by: Renan Rangel <rrangel@slack-corp.com>
Co-authored-by: Renan Rangel <renan@slack-corp.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants