This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

Backup to common cloud storage #89

Open · 7 of 9 tasks
tennix opened this issue Dec 8, 2019 · 12 comments
Labels: difficulty/3-hard (Hard issue), Priority/P1 (High priority issue. Must have an associated milestone)

Comments

@tennix (Member) commented Dec 8, 2019

BR support for common cloud storage

Overview

Integrate BR with common cloud object storage (S3, GCS, Azure Blob Storage, etc.).

Problem statement

Currently, BR supports local storage, where backup files are stored in a local directory on each node. The backup files then need to be collected together and copied to every TiKV node, which is difficult to use in practice. One workaround is to mount an NFS-like filesystem on every TiKV node and the BR node, but mounting NFS on every node is difficult to set up and error-prone.

Object storage is a better fit for this scenario, especially since it is quite common to back up to S3/GCS on public cloud.

TODO list

  • S3 Support (2400 points)
  • GCS Support (1800 points)
  • TiDB Operator integration (900 points)
  • Test (2100 points)

Mentors

Recommended skills

  • Go language
  • Rust language
  • Familiarity with S3/GCS/Azure Blob Storage
@yiwu-arbug (Contributor)

I want to join in on this task!

@SunRunAway (Contributor)

@francis0407 and I are also interested in this issue.

@gregwebs (Contributor) commented Dec 9, 2019

Another task that can be added to this is to ensure proper streaming to object storage without copying data in memory.

@tennix (Member, Author) commented Dec 10, 2019

> Another task that can be added to this is to ensure proper streaming to object storage without copying data in memory.

What do you mean by "without copying data in memory"?

@gregwebs (Contributor)

I guess if streaming is working well then perhaps it isn't necessary to state this. But we should avoid copying wherever possible. The current implementation uses read_to_end, which I presume copies the data every time it resizes its vector.
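
For illustration, a minimal sketch of the difference (the file name and the destination are hypothetical; a real uploader would stand in for the output file):

    use std::fs::File;
    use std::io::{self, Read};

    fn example() -> io::Result<()> {
        // read_to_end grows the Vec geometrically; each reallocation
        // copies every byte read so far, on top of holding the entire
        // file in memory at once.
        let mut file = File::open("backup.sst")?; // hypothetical file
        let mut buf = Vec::new();
        file.read_to_end(&mut buf)?;

        // Streaming alternative: io::copy pumps the data through a
        // small fixed-size buffer, so memory use stays constant
        // regardless of file size.
        let mut src = File::open("backup.sst")?;
        let mut dst = File::create("backup.sst.copy")?; // stand-in for an uploader
        io::copy(&mut src, &mut dst)?;
        Ok(())
    }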

@SunRunAway (Contributor)

> Another task that can be added to this is to ensure proper streaming to object storage without copying data in memory.

> What do you mean by "without copying data in memory"?

I think it means uploading a file from disk without buffering all of the file data in memory.

@gregwebs (Contributor)

> I think it means uploading a file from disk without buffering all of the file data in memory.

Even though it is an SST file, I think it starts out in memory?

@kennytm (Collaborator) commented Dec 10, 2019

The writer interface on TiKV is currently

    fn write(&self, name: &str, reader: &mut dyn Read) -> io::Result<()>;

so the cloud storage implementation decides how to read the SST file and upload it. For local storage we use Rust's built-in std::io::copy, which streams up to 8 KiB at a time.
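
For illustration, a streaming implementation of this interface for a local-directory backend could look roughly like the following sketch (LocalStorage and its base field are made up for the example):

    use std::fs::File;
    use std::io::{self, Read};
    use std::path::PathBuf;

    // Hypothetical local-directory backend, for illustration only.
    struct LocalStorage {
        base: PathBuf,
    }

    impl LocalStorage {
        fn write(&self, name: &str, reader: &mut dyn Read) -> io::Result<()> {
            let mut file = File::create(self.base.join(name))?;
            // io::copy streams through a small fixed-size buffer, so
            // the SST content is never fully buffered in memory.
            io::copy(reader, &mut file)?;
            file.sync_all()
        }
    }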

@yiwu-arbug (Contributor)

> I think it means uploading a file from disk without buffering all of the file data in memory.

> Even though it is an SST file, I think it starts out in memory?

Just checked, and the file content is in memory before being uploaded.

@gregwebs (Contributor)

@yiwu-arbug when the file is created in memory, can it be created in a streaming fashion?

@kennytm (Collaborator) commented Dec 13, 2019

@gregwebs The SST writer currently uses in-memory storage to speed up SST generation. This is the first copy.

https://github.com/tikv/tikv/blob/b147b38eeaae4129e7e3eb7c06e270b8e0682697/components/backup/src/writer.rs#L108-L117

We then write the content into a &mut Vec<u8> buffer. This is the second copy.

https://github.com/tikv/tikv/blob/b147b38eeaae4129e7e3eb7c06e270b8e0682697/components/engine_rocks/src/sst.rs#L209-L212

The buffer is turned into a rate-limited reader and passed into write(). In tikv/tikv#6209, for simplicity, read_to_end() is used to extract the entire content into a &mut Vec<u8> again. This is the third copy.

https://github.com/tikv/tikv/blob/b826af388050fc48de2ce36a7684d1309d00e83a/components/external_storage/src/s3.rs#L94

So in the current situation we have to deal with 3N bytes at a time. The second and third copies can be eliminated by streaming, which reduces the memory usage to N bytes (assuming buffer size ≪ N).

Typically N = region size = 96 MB, and the default concurrency is 4, so we're talking about ~1200 MB vs ~400 MB of memory here.
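
To make the idea concrete, here is a sketch of replacing the read_to_end() step with a fixed-size chunk loop (pump and the 1 MiB chunk size are illustrative, not the actual tikv/tikv#6209 code):

    use std::io::{self, Read, Write};

    const CHUNK: usize = 1024 * 1024; // 1 MiB; an arbitrary choice

    // Stream `reader` into `uploader` without ever holding more than
    // CHUNK bytes in memory, instead of read_to_end()-ing the whole SST.
    fn pump(reader: &mut dyn Read, uploader: &mut dyn Write) -> io::Result<u64> {
        let mut buf = vec![0u8; CHUNK];
        let mut total = 0u64;
        loop {
            let n = reader.read(&mut buf)?;
            if n == 0 {
                return Ok(total);
            }
            // Hand each chunk off as soon as it is read; nothing
            // beyond the CHUNK-sized buffer is retained.
            uploader.write_all(&buf[..n])?;
            total += n as u64;
        }
    }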


@yiwu-arbug That said, for tikv/tikv#6209, the advantage of streaming over read_to_end() isn't about the 800 MB of memory, but that the rate limit is really being respected.

Suppose we set the rate limit to 10 MB/s. With streaming, we will upload 1 MB every 0.1 s and attain the average speed of 10 MB/s uniformly. With read_to_end(), however, we will sleep for 9.6 s and then suddenly upload the entire 96 MB file at maximum speed. This doesn't sound like the desired behavior.
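
For illustration, a minimal sketch of a streaming copy that keeps the average speed at the limit (not TiKV's actual limiter; rate and the 1 MiB buffer are assumptions):

    use std::io::{self, Read, Write};
    use std::thread::sleep;
    use std::time::{Duration, Instant};

    // Copy `reader` into `writer` in 1 MiB chunks, sleeping as needed
    // so the average throughput never exceeds `rate` bytes per second.
    fn rate_limited_copy(
        reader: &mut dyn Read,
        writer: &mut dyn Write,
        rate: f64, // e.g. 10.0 * 1024.0 * 1024.0 for ~10 MB/s
    ) -> io::Result<()> {
        let mut buf = vec![0u8; 1024 * 1024];
        let start = Instant::now();
        let mut sent = 0u64;
        loop {
            let n = reader.read(&mut buf)?;
            if n == 0 {
                return Ok(());
            }
            writer.write_all(&buf[..n])?;
            sent += n as u64;
            // Sleep until elapsed time catches up with sent/rate, so
            // the transfer stays uniform instead of sleeping up front
            // and then bursting at maximum speed.
            let due = Duration::from_secs_f64(sent as f64 / rate);
            if let Some(wait) = due.checked_sub(start.elapsed()) {
                sleep(wait);
            }
        }
    }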

@gregwebs (Contributor)

Yes, please get rid of read_to_end! My understanding is that it resizes its buffer, which means copying memory around during the operation, slowing things down further.

From a quick look at the TiKV source code, following things back to the source, I see the SST file being generated by impl SstWriter for RocksSstWriter using read_to_end! I am not sure if this is the right code path. However, the point is that it does seem we should be able to stream data out of RocksDB straight into S3 without buffering all the data and copying.

As you said, besides making memory usage predictable, it also makes network usage predictable. Additionally, if more network bandwidth is available, it becomes possible to increase the concurrency, whereas if we are loading up memory and periodically maxing out network usage, the backup has to stay throttled. The backup operation can slow down the rest of the queries by delaying GC, so the sooner it finishes the better.

kennytm added the difficulty/3-hard (Hard issue) and Priority/P1 (High priority issue. Must have an associated milestone) labels on May 28, 2020
kennytm added this to the v4.0.1 milestone on May 28, 2020
3pointer removed this from the v4.0.2 milestone on Jun 19, 2020
DanielZhangQD removed their assignment on Feb 20, 2023