
Fix error from xtrabackup with s3: fails to upload large files #5212

Closed

deepthi wants to merge 1 commit into vitessio:master from planetscale:ds-s3-partsize

Conversation

@deepthi
Collaborator

@deepthi deepthi commented Sep 23, 2019

Reported by Slack.

    upload id: ngnUHv2iPGYT5vuB11lJ1LLTzazXjRlYt1mVqfJnOG6aY_1aB_hCjNiKUx9H3HIgAHfaC3GWcoWCa6J1twO_uzYJG4w4dN45xUHk8IE.mRaBu_PS03WVWDsh3Mm5aTKq
    caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit: cannot copy output from xtrabackup command: MultipartUpload: upload multipart failed
    upload id: ngnUHv2iPGYT5vuB11lJ1LLTzazXjRlYt1mVqfJnOG6aY_1aB_hCjNiKUx9H3HIgAHfaC3GWcoWCa6J1twO_uzYJG4w4dN45xUHk8IE.mRaBu_PS03WVWDsh3Mm5aTKq
    caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit

We were using DefaultUploadPartSize (5 MB) when the file size is unknown, which, with the 10,000-part limit, caps the uploadable file size at ~50 GB. Instead, use a part size that allows us to upload the largest possible S3 object (5 TB).
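A minimal sketch of the calculation the fix describes. The helper name `calculatePartSize` is hypothetical; the constants mirror aws-sdk-go's `s3manager` defaults and S3's documented limits:

```go
package main

import "fmt"

// Defaults and limits matching aws-sdk-go's s3manager package and S3 itself.
const (
	defaultUploadPartSize = 5 * 1024 * 1024               // 5 MiB SDK default
	maxUploadParts        = 10000                         // S3 multipart part limit
	maxObjectSize         = 5 * 1024 * 1024 * 1024 * 1024 // 5 TiB, the largest S3 object
)

// calculatePartSize is a hypothetical helper: choose a part size large
// enough that fileSize fits within maxUploadParts. When fileSize is
// unknown (0), size the parts for the largest possible S3 object
// instead of falling back to the SDK default.
func calculatePartSize(fileSize int64) int64 {
	if fileSize == 0 {
		fileSize = maxObjectSize
	}
	partSize := fileSize / maxUploadParts
	if fileSize%maxUploadParts != 0 {
		partSize++ // round up so the final part fits within the limit
	}
	if partSize < defaultUploadPartSize {
		partSize = defaultUploadPartSize
	}
	return partSize
}

func main() {
	fmt.Println(calculatePartSize(0))                 // unknown size: ~550 MB parts
	fmt.Println(calculatePartSize(100 * 1024 * 1024)) // small file: 5 MiB default
}
```

With an unknown file size this yields parts of roughly 550 MB, which is where the "~500 MB" figure discussed below comes from.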

Signed-off-by: deepthi <deepthi@planetscale.com>

…wn size, use a part size that allows us to upload the largest possible file (5 TB)

Signed-off-by: deepthi <deepthi@planetscale.com>
@deepthi deepthi requested a review from sougou as a code owner September 23, 2019 18:26
@deepthi deepthi requested review from enisoc and rafael September 23, 2019 18:26
@rafael
Member

rafael commented Sep 23, 2019

Do you know what the implications of setting this to a large size are? Is it pretty innocuous, or is there some change in behavior that could bite us?

@deepthi
Collaborator Author

deepthi commented Sep 23, 2019

From the documentation:
The largest object that can be uploaded in a single PUT is 5 gigabytes. For objects larger than 100 megabytes, customers should consider using the Multipart Upload capability.

So at a part size of ~500 MB, we exceed the recommended size for a single operation, though the theoretical limit is 5 GB per part.
I don't know the answer to the question above, and information on the web about this is nonexistent.

@deepthi
Collaborator Author

deepthi commented Sep 23, 2019

There's another way to get around this: use -xtrabackup_stripes to break the backup up into several files.

@derekperkins
Member

HubSpot handled this for normal backups in #3844.

@enisoc
Member

enisoc commented Sep 24, 2019

Since we don't know what real-world effect this new value will have, can we make the xtrabackup upload part size a flag that continues to default to whatever it is now? That way we don't risk breaking anyone who isn't already broken.

Another option could be to have the xtrabackup engine pass in an estimated total size. We'd need to make sure we don't rely on the passed-in size to be exact, and if so, document that's the case for future implementers of the interface. The xtrabackup engine could perhaps send the total uncompressed size of all files that the built-in backup engine would have uploaded (whatever is returned by findFilesToBackup()), divided by the number of stripes, as an upper bound. That should at least be close enough to get reasonable part sizes.
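The estimated-size option above can be sketched as follows. The helper name `estimatePartSize` is hypothetical; it assumes only that the estimate is an upper bound on the total uncompressed size, as suggested:

```go
package main

import "fmt"

const (
	defaultUploadPartSize = 5 * 1024 * 1024 // 5 MiB, aws-sdk-go default
	maxUploadParts        = 10000           // S3 multipart part limit
)

// estimatePartSize is a hypothetical sketch of the idea above: take an
// upper-bound estimate of the total uncompressed backup size (e.g. the
// sum of the file sizes the builtin engine would upload), split it
// across the stripes, and derive a part size that keeps each stripe
// within maxUploadParts. The estimate need not be exact, only an upper
// bound, so the rounding here always errs toward larger parts.
func estimatePartSize(totalSizeEstimate, stripes int64) int64 {
	if stripes < 1 {
		stripes = 1
	}
	perStripe := (totalSizeEstimate + stripes - 1) / stripes  // ceil division
	partSize := (perStripe + maxUploadParts - 1) / maxUploadParts // ceil division
	if partSize < defaultUploadPartSize {
		partSize = defaultUploadPartSize
	}
	return partSize
}

func main() {
	// A hypothetical 2 TiB dataset split across 4 stripes: parts of
	// ~55 MB instead of the 5 MiB default, well under MaxUploadParts.
	fmt.Println(estimatePartSize(2<<40, 4))
}
```

Because the estimate only has to be an upper bound, overestimating merely makes the parts somewhat larger than strictly necessary, which is harmless.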

@rafael
Member

rafael commented Sep 24, 2019

Agree with @enisoc. FWIW, using xtrabackup_stripes seems to have fixed the problem for us. From my perspective, I think we can close this for now and point people to that flag if they run into this issue.

@enisoc
Member

enisoc commented Sep 24, 2019

From my perspective, I think we can close this for now and point people to that flag if they run into this issue.

From my side, I would still like to fix this if we can do it without risking harm to those who don't need it (either making it a flag, or trying to auto-detect the upper bound). We do use stripes as well, but we use a fixed number of stripes, so we'll still hit this at some point.

As a third, even more complex option that no one asked for, we could add to the xtrabackup engine the concept of a maximum file/object size. If a given stripe file reaches that size, we rotate to a new file for that stripe. Each stripe would then be a sequence of files, rather than just one file. In this way, we could ensure that the size of any one file/object never exceeds what can be uploaded with efficient multipart upload settings. This is different from just increasing the stripe count, because all stripes have to be read and written concurrently, and we don't want to force ridiculously high concurrency for very large shards.

I'd probably recommend against this third option due to the complexity, but just writing it down in the name of brainstorming.

@deepthi
Collaborator Author

deepthi commented Sep 24, 2019

Since we don't know what real-world effect this new value will have, can we make the xtrabackup upload part size a flag that continues to default to whatever it is now? That way we don't risk breaking anyone who isn't already broken.

Anyone who is using the builtin backup method won't be affected (with the tiny exception of the manifest file) because we pass in fileSize to AddFile and compute the part size using that.
The only time there is an impact is if we pass in 0 as fileSize. That is done by xtrabackup (because the file size is unknown) and also when we create the manifest files.

Another option could be to have the xtrabackup engine pass in an estimated total size. We'd need to make sure we don't rely on the passed-in size to be exact, and if so, document that's the case for future implementers of the interface. The xtrabackup engine could perhaps send the total uncompressed size of all files that the built-in backup engine would have uploaded (whatever is returned by findFilesToBackup()), divided by the number of stripes, as an upper bound. That should at least be close enough to get reasonable part sizes.

I like this idea. The problem with a command-line option is that people have to get it right, and they don't find out they've gotten it wrong until they hit a failure.

@deepthi
Collaborator Author

deepthi commented Oct 18, 2019

Do you know what are the implications of setting this to a large size? Is pretty innocuous or is there some change in behavior that could bite us?

One implication is the memory needed, especially when combined with xtrabackup_stripes. More reason to implement reasonable part sizes.
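A rough sketch of why the memory cost grows: the SDK's uploader buffers its in-flight parts in memory, and each stripe uploads independently. The worst-case figure below assumes aws-sdk-go's default upload concurrency of 5 and the ~550 MB part size a 5 TiB cap implies; both numbers are assumptions for illustration:

```go
package main

import "fmt"

const (
	partSize    = 549755814 // ~550 MB: part size sized for a 5 TiB object
	concurrency = 5         // aws-sdk-go s3manager default upload concurrency
)

func main() {
	// Hypothetical worst case: each stripe runs its own uploader, each
	// buffering `concurrency` in-flight parts at once.
	for _, stripes := range []int64{1, 4, 8} {
		buffered := int64(partSize) * concurrency * stripes
		fmt.Printf("%d stripes: ~%.1f GiB of part buffers\n", stripes, float64(buffered)/(1<<30))
	}
}
```

Even a single stripe needs a couple of GiB of buffers at this part size, and eight stripes pushes past 20 GiB, which is why computing a part size from a real size estimate beats simply defaulting to the maximum.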

@deepthi
Collaborator Author

deepthi commented Oct 24, 2019

Closing this in favor of #5351

@deepthi deepthi closed this Oct 24, 2019
@deepthi deepthi deleted the ds-s3-partsize branch October 24, 2019 20:41