disk utilization of MANIFEST file increases after the restart #3666

Closed
z0mb1ek opened this issue Feb 15, 2020 · 5 comments
Labels: area/docdb (YugabyteDB core features), community/request (Issues created by external users), priority/high (High Priority)

Comments


z0mb1ek commented Feb 15, 2020

Hi.
I created 100 tables and then restarted the cluster. After the restart, I see that disk usage of the tserver folder has increased:

root@dev:/mnt/disk0# du -hs yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243.intents/*
0 yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243.intents/000006.log
4.0K yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243.intents/CURRENT
4.0K yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243.intents/IDENTITY
0 yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243.intents/LOCK
4.0M yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243.intents/MANIFEST-000005
8.0K yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243.intents/OPTIONS-000010
8.0K yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243.intents/OPTIONS-000012

root@dev:/mnt/disk0# ls -lh yb-data/tserver/data/rocksdb/table-00004000000030008000000000004001/tablet-29483cb0f8224db1942c93b7b1506243/
total 4.1M
-rw-r--r-- 1 root root 0 Feb 15 15:38 000006.log
-rw-r--r-- 1 root root 16 Feb 15 15:38 CURRENT
-rw-r--r-- 1 root root 37 Feb 15 15:01 IDENTITY
-rw-r--r-- 1 root root 0 Feb 15 15:01 LOCK
-rw-r--r-- 1 root root 66 Feb 15 15:38 MANIFEST-000005
-rw-r--r-- 1 root root 4.3K Feb 15 15:38 OPTIONS-000010
-rw-r--r-- 1 root root 4.3K Feb 15 15:38 OPTIONS-000012

@kmuthukk reproduced the same issue.

I use a Docker cluster on Ubuntu 18.04 with an ext4 base disk and mounted volumes.

@yugabyte-ci yugabyte-ci added the community/request Issues created by external users label Feb 15, 2020

kmuthukk commented Feb 15, 2020

I was able to repro the same issue with yb-ctl and the 2.0.10 release.

Created a few tables first (no rows in them).

Initially, each tablet's size is pretty small:

$ du -hs ./*/*
28K     ./table-000030a9000030008000000000004000/tablet-559ef83ea95140fda198221028294d8e
28K     ./table-000030a9000030008000000000004000/tablet-559ef83ea95140fda198221028294d8e.intents
0       ./table-000030a9000030008000000000004000/tablet-559ef83ea95140fda198221028294d8e.snapshots
28K     ./table-000030a9000030008000000000004000/tablet-97a742f6ef1849249927ca35a26d5f37
28K     ./table-000030a9000030008000000000004000/tablet-97a742f6ef1849249927ca35a26d5f37.intents
0       ./table-000030a9000030008000000000004000/tablet-97a742f6ef1849249927ca35a26d5f37.snapshots
28K     ./table-000030a9000030008000000000004005/tablet-517ac10a7617490c8450befeee2fca11
28K     ./table-000030a9000030008000000000004005/tablet-517ac10a7617490c8450befeee2fca11.intents
0       ./table-000030a9000030008000000000004005/tablet-517ac10a7617490c8450befeee2fca11.snapshots
28K     ./table-000030a9000030008000000000004005/tablet-d1f7f413c4be4a7789c19a4450beff88
28K     ./table-000030a9000030008000000000004005/tablet-d1f7f413c4be4a7789c19a4450beff88.intents
0       ./table-000030a9000030008000000000004005/tablet-d1f7f413c4be4a7789c19a4450beff88.snapshots

Looking at one of the tablet directories, it showed the following:

Note: See that the "total" is 28K and each file is pretty small.

17:13 $ ls -lh
total 28K
-rw-r--r-- 1 centos centos    0 Feb 15 17:11 000003.log
-rw-r--r-- 1 centos centos   16 Feb 15 17:11 CURRENT
-rw-r--r-- 1 centos centos   37 Feb 15 17:11 IDENTITY
-rw-r--r-- 1 centos centos    0 Feb 15 17:11 LOCK
-rw-r--r-- 1 centos centos   20 Feb 15 17:11 MANIFEST-000001
-rw-r--r-- 1 centos centos 4.3K Feb 15 17:11 OPTIONS-000007
-rw-r--r-- 1 centos centos 4.3K Feb 15 17:11 OPTIONS-000009

Now, after a yb-ctl stop/start, notice that the file sizes are still small, but the "total" goes up to 4.1M, and du also reports higher usage.

$ ls -lh ./tablet-d1f7f413c4be4a7789c19a4450beff88
total 4.1M
-rw-r--r-- 1 centos centos    0 Feb 15 17:16 000006.log
-rw-r--r-- 1 centos centos   16 Feb 15 17:16 CURRENT
-rw-r--r-- 1 centos centos   37 Feb 15 17:11 IDENTITY
-rw-r--r-- 1 centos centos    0 Feb 15 17:11 LOCK
-rw-r--r-- 1 centos centos   66 Feb 15 17:16 MANIFEST-000005
-rw-r--r-- 1 centos centos 4.3K Feb 15 17:16 OPTIONS-000010
-rw-r--r-- 1 centos centos 4.3K Feb 15 17:16 OPTIONS-000012

@kmuthukk kmuthukk added the priority/high High Priority label Feb 15, 2020
@kmuthukk kmuthukk changed the title from "Size of the tablets increases after the restart" to "disk utilization of the tablets increases after the restart" Feb 15, 2020

kmuthukk commented Feb 15, 2020

This difference upon restart is specific to the MANIFEST* file.

See the difference in ls and du reported usage of the file (size of file vs. size on disk).

Size of file: 66 bytes

$ ls -l MANIFEST-000005
-rw-r--r-- 1 centos centos 66 Feb 15 18:29 MANIFEST-000005

and size on disk: 4MB

$ du -hs MANIFEST-000005
4.0M    MANIFEST-000005
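
The same two numbers that ls and du report can be read programmatically. The following is a minimal sketch, assuming Linux (where `st_blocks` is counted in 512-byte units); the helper name `apparent_and_allocated` and the 66-byte temporary file are illustrative only and not part of YugabyteDB.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes) for a file.

    st_size is what `ls -l` shows; st_blocks * 512 is roughly what
    `du` reports (st_blocks is in 512-byte units on Linux).
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    # Illustrative file only: 66 bytes, like the MANIFEST in this issue.
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"x" * 66)
        f.flush()
        size, allocated = apparent_and_allocated(f.name)
        print(f"apparent={size} bytes, allocated={allocated} bytes")
```

Pointing the helper at a MANIFEST-* file on an affected node should show a small apparent size next to a much larger allocated size, if the behavior matches what is described above.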

@kmuthukk kmuthukk changed the title from "disk utilization of the tablets increases after the restart" to "disk utilization of MANIFEST file increases after the restart" Feb 15, 2020

kmuthukk commented Mar 2, 2020

Running the yb-tserver restart under gdb with a breakpoint on the fallocate call reveals that this stack is responsible for the 4M fallocate for the MANIFEST files:

Breakpoint 1, fallocate (fd=107, mode=1, offset=offset@entry=0, len=len@entry=4194304) at ../sysdeps/unix/sysv/linux/wordsize-64/fallocate.c:28
28      ../sysdeps/unix/sysv/linux/wordsize-64/fallocate.c: No such file or directory.
(gdb) where
#0  fallocate (fd=107, mode=1, offset=offset@entry=0, len=len@entry=4194304) at ../sysdeps/unix/sysv/linux/wordsize-64/fallocate.c:28
#1  0x00007ffff19193e4 in rocksdb::PosixWritableFile::Allocate (this=0x1292480, offset=0, len=4194304)
    at ../../src/yb/rocksdb/util/io_posix.cc:475
#2  0x00007ffff192247f in PrepareWrite (len=7, offset=<optimized out>, this=0x1292480) at ../../src/yb/rocksdb/env.h:631
#3  rocksdb::WritableFileWriter::Append (this=0x16bcee0, data=...) at ../../src/yb/rocksdb/util/file_reader_writer.cc:83
#4  0x00007ffff184dbd8 in rocksdb::log::Writer::EmitPhysicalRecord (this=this@entry=0x1c81d00, t=<optimized out>,
    ptr=ptr@entry=0x18a59c0 "\n\032leveldb.BytewiseComparator", n=n@entry=28) at ../../src/yb/rocksdb/db/log_writer.cc:140
#5  0x00007ffff184ddd3 in rocksdb::log::Writer::AddRecord (this=this@entry=0x1c81d00, slice=...) at ../../src/yb/rocksdb/db/log_writer.cc:96
#6  0x00007ffff187c7b5 in rocksdb::(anonymous namespace)::AddEdit (edit=..., db_options=0xeed240, log=log@entry=0x1c81d00)
    at ../../src/yb/rocksdb/db/version_set.cc:3325
#7  0x00007ffff187fcd9 in rocksdb::VersionSet::WriteSnapshot (this=this@entry=0x13d8dc0, log=0x1c81d00, flushed_frontier_override=...)
    at ../../src/yb/rocksdb/db/version_set.cc:3353
#8  0x00007ffff1888ceb in rocksdb::VersionSet::LogAndApply (this=0x13d8dc0, column_family_data=column_family_data@entry=0x1da0800,
    mutable_cf_options=..., edit=edit@entry=0x129f690, mu=mu@entry=0xeed520, db_directory=0x0, new_descriptor_log=<optimized out>,
    new_cf_options=0x0) at ../../src/yb/rocksdb/db/version_set.cc:2276
#9  0x00007ffff17fd2b5 in rocksdb::DBImpl::RecoverLogFiles (this=this@entry=0xeed200, log_numbers=...,
    max_sequence=max_sequence@entry=0x7fffe07da560, read_only=read_only@entry=false) at ../../src/yb/rocksdb/db/db_impl.cc:1720
#10 0x00007ffff17fdeb0 in rocksdb::DBImpl::Recover (this=this@entry=0xeed200, column_families=..., read_only=read_only@entry=false,
    error_if_log_file_exist=error_if_log_file_exist@entry=false) at ../../src/yb/rocksdb/db/db_impl.cc:1383
#11 0x00007ffff1806555 in rocksdb::DB::Open (db_options=..., dbname=..., column_families=..., handles=handles@entry=0x7fffe07daa30,
    dbptr=dbptr@entry=0x7fffe07db3a0) at ../../src/yb/rocksdb/db/db_impl.cc:6164
#12 0x00007ffff18076dc in rocksdb::DB::Open (options=..., dbname=..., dbptr=dbptr@entry=0x7fffe07db3a0) at ../../src/yb/rocksdb/db/db_impl.cc:6090
#13 0x00007ffff50b2808 in yb::tablet::Tablet::OpenKeyValueTablet (this=this@entry=0x1286400) at ../../src/yb/tablet/tablet.cc:605
#14 0x00007ffff50b303c in yb::tablet::Tablet::Open (this=this@entry=0x1286400) at ../../src/yb/tablet/tablet.cc:445
#15 0x00007ffff50cec90 in yb::tablet::TabletBootstrap::OpenTablet (this=this@entry=0x7fffe07dc030) at ../../src/yb/tablet/tablet_bootstrap.cc:426
#16 0x00007ffff514073b in yb::tablet::enterprise::TabletBootstrap::OpenTablet (this=0x7fffe07dc030)
    at ../../ent/src/yb/tablet/tablet_bootstrap_ent.cc:65
#17 0x00007ffff50d466f in yb::tablet::TabletBootstrap::Bootstrap (this=this@entry=0x7fffe07dc030,
    rebuilt_tablet=rebuilt_tablet@entry=0x7fffe07dc300, rebuilt_log=rebuilt_log@entry=0x7fffe07dc2a0,
    consensus_info=consensus_info@entry=0x7fffe07dc4e0) at ../../src/yb/tablet/tablet_bootstrap.cc:349
#18 0x00007ffff50d8e87 in yb::tablet::BootstrapTablet (data=..., rebuilt_tablet=rebuilt_tablet@entry=0x7fffe07dc300,
    rebuilt_log=rebuilt_log@entry=0x7fffe07dc2a0, consensus_info=consensus_info@entry=0x7fffe07dc4e0)
    at ../../src/yb/tablet/tablet_bootstrap_if.cc:89
#19 0x00007ffff598783d in yb::tserver::TSTabletManager::OpenTablet (this=0x1054380, meta=..., deleter=...)
    at ../../src/yb/tserver/ts_tablet_manager.cc:1130
#20 0x00007fffeddedda4 in yb::ThreadPool::DispatchThread (this=0x11d1200, permanent=false) at ../../src/yb/util/threadpool.cc:608
#21 0x00007fffeddea54f in operator() (this=0xecb5b8)
    at /home/centos/yugabyte-2.0.10.0/linuxbrew-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:2267
#22 yb::Thread::SuperviseThread (arg=0xecb560) at ../../src/yb/util/thread.cc:739
#23 0x00007fffe88a9694 in start_thread (arg=0x7fffe07dd700) at pthread_create.c:333
#24 0x00007fffe7fe641d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
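
The `mode=1` in the trace corresponds to the Linux `FALLOC_FL_KEEP_SIZE` flag, which reserves blocks without changing the size that ls reports. That would explain a 66-byte file occupying 4M on disk. Below is a rough sketch of the effect, assuming 64-bit Linux with glibc and a filesystem that supports fallocate; the helper names and the `MANIFEST-demo` file are hypothetical, and the call goes through ctypes because Python's standard library does not expose this flag.

```python
import ctypes
import os
import tempfile

FALLOC_FL_KEEP_SIZE = 0x01  # Linux flag; matches mode=1 in the gdb trace


def preallocate_keep_size(fd, length):
    """Reserve `length` bytes for fd without changing its reported size.

    Returns False if fallocate is missing or the filesystem rejects it.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    fallocate = getattr(libc, "fallocate", None)
    if fallocate is None:
        return False
    # Assumes a 64-bit off_t, as on x86_64 Linux.
    fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                          ctypes.c_int64, ctypes.c_int64]
    fallocate.restype = ctypes.c_int
    return fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, length) == 0


def demo(directory):
    """Write 66 bytes, preallocate 4 MiB, return (size, allocated, ok)."""
    path = os.path.join(directory, "MANIFEST-demo")  # hypothetical name
    with open(path, "wb") as f:
        f.write(b"x" * 66)
        f.flush()
        ok = preallocate_keep_size(f.fileno(), 4 * 1024 * 1024)
        os.fsync(f.fileno())
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512, ok


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        size, allocated, ok = demo(d)
        print(f"fallocate ok={ok}, size={size}, allocated={allocated}")
```

When the call succeeds, the reported size stays at 66 bytes while the allocated figure typically grows to about the preallocated length; the exact number depends on the filesystem.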


kmuthukk commented Mar 2, 2020

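Looking a bit deeper into the stack above, the preallocation length comes from the manifest_preallocation_size option, which defaults to 4MB: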

db_options_->manifest_preallocation_size);

manifest_preallocation_size(4 * 1024 * 1024),

I think fixing this should be a simple matter of making the preallocation size for manifest files much smaller, something like 32KB (instead of 4MB).
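
To get a feel for how much the preallocation size matters across many tablets, here is a back-of-the-envelope sketch. It assumes two MANIFEST files per tablet (one in the regular directory and one in the .intents directory, as in the listings earlier in this thread) and that every MANIFEST holds its full preallocation. The tablet count of 200 is a made-up figure for illustration, not a measurement.

```python
KIB = 1024
MIB = 1024 * KIB


def manifest_overhead_bytes(num_tablets, prealloc_bytes, manifests_per_tablet=2):
    """Disk space held by preallocated MANIFEST files on one node.

    Assumes each MANIFEST keeps its full preallocation after restart.
    """
    return num_tablets * manifests_per_tablet * prealloc_bytes


if __name__ == "__main__":
    tablets = 200  # hypothetical tablet count on a single tserver
    for label, prealloc in (("4MB", 4 * MIB), ("64KB", 64 * KIB), ("32KB", 32 * KIB)):
        total = manifest_overhead_bytes(tablets, prealloc)
        print(f"{label:>5} preallocation: {total / MIB:.1f} MiB")
```

Under these assumptions the overhead scales linearly with the preallocation size, so moving from 4MB to a value in the tens of kilobytes reduces it by roughly two orders of magnitude.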

@bmatican bmatican added the area/docdb YugabyteDB core features label Mar 2, 2020
amitanandaiyer added a commit that referenced this issue Mar 5, 2020
Summary: Set default manifest preallocation to 64k instead of 4M

Test Plan: eyeball

Reviewers: kannan, bogdan

Reviewed By: bogdan

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D8086

kmuthukk commented Mar 9, 2020

Fixed in 50aaf8d
