Ingesting a file causes high read pressure #3104

Closed
huachaohuang opened this issue Nov 1, 2017 · 16 comments

@huachaohuang
Contributor

In the current implementation, ingesting a file requires checking whether the file overlaps with each level. For each level, all files at that level are opened and prefetched (512K) to collect all the delete ranges at that level. Then each delete range at that level is checked to see whether it overlaps with the file being ingested.
For an instance with tens of thousands of files, ingesting a single non-overlapping file will open and prefetch all of those files, which causes tens of gigabytes of read traffic.
Do we really need to check all the delete ranges of all the files at a level to see whether the ingested file overlaps?
Is it possible to optimize it so that we just need to check some boundaries?
@ajkr
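
Below is a toy model of the scan described above, with hypothetical types (not the actual RocksDB code), to illustrate where the reads come from:

```cpp
#include <string>
#include <vector>

// Toy model of the per-level scan; hypothetical types, not RocksDB code.
// The point: every file at every level is opened and prefetched just to
// read its range tombstones.
struct RangeTombstone { std::string start, end; };
struct SstFile { std::vector<RangeTombstone> tombstones; };

// Stands in for opening a table reader; in RocksDB this triggers a ~512K
// prefetch per file, which is where the read pressure comes from.
const SstFile& OpenAndPrefetch(const SstFile& f) { return f; }

// Returns true if any tombstone at this level overlaps [lo, hi].
bool LevelOverlaps(const std::vector<SstFile>& level, const std::string& lo,
                   const std::string& hi) {
  for (const SstFile& f : level) {  // every file, even far-away ones
    const SstFile& opened = OpenAndPrefetch(f);
    for (const RangeTombstone& t : opened.tombstones) {
      if (t.start <= hi && t.end >= lo) return true;
    }
  }
  return false;
}
```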

@ajkr
Contributor

ajkr commented Nov 1, 2017

File ingestion has an optimization that prevented me from just checking file endpoints. That optimization allows files to be ingested into lower levels even if their endpoints overlap with files at higher levels, as long as the higher-level file does not have any keys in the ingested file's key-range. So even before range deletion, it would open the higher-level file to check whether this condition holds. I guess the problem is worse now.

Do you know where the tens of GBs are coming from? Is it only from opening the files and reading the range tombstone block? We can likely optimize it.
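
A minimal self-contained sketch of the distinction described here, using hypothetical types rather than RocksDB internals: endpoint overlap can be decided from in-memory metadata, while checking whether a higher-level file actually contains keys in the ingested range requires opening the file.

```cpp
#include <string>

// Hypothetical type; not RocksDB's internal metadata structure.
struct FileBounds {
  std::string smallest, largest;  // kept in the in-memory version metadata
};

// Cheap check: decided purely from metadata, no I/O.
bool EndpointsOverlap(const FileBounds& f, const std::string& lo,
                      const std::string& hi) {
  return f.smallest <= hi && f.largest >= lo;
}

// The optimization described above needs the stronger question "does this
// file contain any key in [lo, hi]?", which cannot be answered from the
// bounds alone; the file must be opened and seeked, which costs I/O.
```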

@ajkr ajkr self-assigned this Nov 1, 2017
@huachaohuang
Contributor Author

I captured a flame graph when the disk was reading at full speed here:
https://transfer.sh/ul9GV/kernel.svg

@huachaohuang
Contributor Author

Hi, any updates, or any ideas on how to optimize this? Is there anything we can help with?
@ajkr

@siddontang
Contributor

siddontang commented Nov 7, 2017

@ajkr

Can we cache all the range deletion blocks to avoid re-reading them from the SST files? Maybe, like pin-l0-filter-and-index-blocks, we can add a pin-range-del-blocks option.
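
As a sketch, this is roughly what the proposal would look like in the options. `pin_l0_filter_and_index_blocks_in_cache` is the existing `BlockBasedTableOptions` field; `pin_range_del_blocks` is hypothetical and does not exist in RocksDB:

```cpp
#include <rocksdb/options.h>
#include <rocksdb/table.h>

rocksdb::Options MakeOptions() {
  rocksdb::BlockBasedTableOptions table_opts;
  // Existing option: keeps L0 filter/index blocks pinned in the block cache.
  table_opts.pin_l0_filter_and_index_blocks_in_cache = true;
  // Proposed analogue for range deletion blocks (hypothetical; does not
  // exist in RocksDB):
  // table_opts.pin_range_del_blocks = true;
  rocksdb::Options options;
  options.table_factory.reset(rocksdb::NewBlockBasedTableFactory(table_opts));
  return options;
}
```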

@siddontang
Contributor

Maybe we can also use a map to track which files contain a range deletion block, to avoid checking all files.

@huachaohuang
Contributor Author

huachaohuang commented Nov 7, 2017

That optimization allows files to be ingested into lower levels even if their endpoints overlap with files at higher levels, as long as the higher-level file does not have any keys in the ingested file's key-range.

Can we just check the file endpoints if we don't mind disabling this optimization?

Or, since we haven't used delete range in production yet, can we add an option to disable delete range, so that we don't need to check the delete ranges during file ingestion?

huachaohuang added a commit to tikv/rocksdb that referenced this issue Nov 8, 2017
@zhangjinpeng87
Contributor

We only need to read delete range marks from the files which overlap with the SST file that is about to be ingested.
@ajkr Is there anything that can't be satisfied if we use this strategy?

@zhangjinpeng87
Contributor

We found some conflict when the compaction type is not universal: since we need to check overlap here, the previous checks are redundant.

@ajkr
Contributor

ajkr commented Nov 14, 2017

The previous check is needed in order to ingest files into the lowest possible level. For example,

  • L1 has file1 containing keys A, B, C, Z
  • ingested file has keys D, E, F
  • L2 has file2 containing keys A, B

What we do is:

  • at L1, IngestedFileOverlapWithLevel sets overlap_with_level=false
  • at L1, IngestedFileFitInLevel returns false so we don't update target_level
  • at L2, IngestedFileOverlapWithLevel sets overlap_with_level=false
  • at L2, IngestedFileFitInLevel returns true so we update target_level
  • then the file can be ingested in L2

Whether or not we should keep this behavior, I'm not sure. But I do think that the second commit in your PR would prevent the file from being ingested into L2.
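
The walk-through above can be paraphrased as code. The two function names come from the comment itself; the loop structure is a simplified sketch, not the actual implementation:

```cpp
// Stand-in declarations so the sketch compiles; these checks are named in
// the walk-through above.
bool IngestedFileOverlapWithLevel(int level);
bool IngestedFileFitInLevel(int level);

// Simplified sketch of the level-assignment logic; the real code lives in
// RocksDB's external file ingestion path.
int PickTargetLevel(int num_levels) {
  int target_level = 0;
  for (int level = 0; level < num_levels; ++level) {
    if (IngestedFileOverlapWithLevel(level)) {
      break;  // keys overlap at this level; the file can't go deeper
    }
    if (IngestedFileFitInLevel(level)) {
      target_level = level;  // L2 in the example; L1 fails this check
    }
    // When the fit check fails (at L1, keys A,B,C,Z bracket D..F) we keep
    // descending, since a deeper level (L2: A,B) may still fit.
  }
  return target_level;
}
```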

@ajkr
Contributor

ajkr commented Nov 14, 2017

It feels kind of surprising that we check more than one file at each level. cc @IslamAbdelRahman - can we just check the one that overlaps the ingested file? (If multiple files overlap, that implies there must be keys overlapping the ingested file's range, so there is no need to continue.)
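
A self-contained sketch of that idea, with hypothetical types: levels below L0 keep files sorted and non-overlapping, so the files whose endpoint ranges intersect the ingested range form a contiguous run that binary search can find, and a run longer than one file already proves a real key overlap.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct FileBounds { std::string smallest, largest; };  // hypothetical type

// Files at a sorted, non-overlapping level that intersect [lo, hi]. If the
// result holds more than one file, the boundary key between two of them
// lies inside [lo, hi], so there is a real key overlap and no file needs
// to be opened at all.
std::vector<FileBounds> OverlappingFiles(const std::vector<FileBounds>& level,
                                         const std::string& lo,
                                         const std::string& hi) {
  // First file whose largest key is >= lo.
  auto first = std::lower_bound(
      level.begin(), level.end(), lo,
      [](const FileBounds& f, const std::string& k) { return f.largest < k; });
  std::vector<FileBounds> out;
  for (auto it = first; it != level.end() && it->smallest <= hi; ++it) {
    out.push_back(*it);
  }
  return out;
}
```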

@mikhail-antonov
Contributor

I'll look at the code around it, but the first questions that come to my mind are:

  • if IngestExternalFile() is called with a long list of files, I expect from reading the above that we pay this price for every single file we ingest, and that it happens sequentially?
  • for people who don't use range deletes, does this I/O price still need to be paid? There's no flag to disable range delete checking during the bulk load, as an expert-user option ("don't check that range deletes conflict, as I know for sure there are no range deletions")?

@ajkr
Contributor

ajkr commented Nov 14, 2017

For the second point, I'd prefer to optimize it so range deletions don't add overhead when they're not used. I don't think there's anything fundamentally preventing it -- we already opened the file to check regular keys so we can easily see whether a range deletion meta-block exists without any extra I/O. We just need to find time to implement it.
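
One possible shape of that optimization, as a hypothetical sketch (`Table` and `HasRangeDeletionBlock` are stand-ins, not RocksDB APIs): once the table is already open for the regular-key check, its metaindex is in memory, so testing for a range-deletion meta-block costs no additional reads.

```cpp
// Hypothetical sketch; not RocksDB APIs.
struct Table {
  bool has_range_del_block;  // known once the metaindex is parsed on open
  bool HasRangeDeletionBlock() const { return has_range_del_block; }
};

bool NeedRangeDelOverlapCheck(const Table& table) {
  // The metaindex was already read when the table was opened for the
  // regular-key check, so this test costs no extra I/O.
  return table.HasRangeDeletionBlock();
}
```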

@ajkr
Contributor

ajkr commented Nov 17, 2017

Hi, want to try out #3179? In my testing it only reads range deletions from the relevant files.

@huachaohuang
Contributor Author

Awesome! We will test it out ASAP and give some feedback later.

@huachaohuang
Contributor Author

Hi, we have been testing #3179 for a while; so far so good, it's very stable now. Good job!

@ajkr
Contributor

ajkr commented Dec 4, 2017

Thanks @huachaohuang! Please let me know if you notice any more issues.

@ajkr ajkr closed this as completed Dec 4, 2017