GC: Data may remain after garbage collecting everything #54
As a simple PoC,
(compare with running the same commands without |
Opened issue dgraph-io/badger#442 to track the minimum value log file size allowed as a possible workaround for the user who wishes to minimize the used disk space after garbage collection. |
I might be mistaken, but I don't think that badger's GC is triggered yet during IPFS' GC: #4300. Or, at least it doesn't seem to be yet on 0.4.14-rc2. |
@leerspace Thanks for pointing me to that PR, which gave me more context on the matter. Badger's GC is now linked to IPFS GC through #4578, where the datastore's CollectGarbage is called after the IPFS-level GC.
The offline GC tool that is mentioned in that PR seems to have the same constraint, so I don't think it will provide a solution to this issue.
Badger GC PoC script:
set -x
set -e
export IPFS_PATH=~/.ipfs_badger_test
ipfs init --profile=badgerds
du -hsc $IPFS_PATH
# 84K total
ipfs add -r --pin=false --silent -Q ~/go/src/github.com/ipfs/go-ipfs
ipfs repo stat
# NumObjects: 5560
# RepoSize: 140914167
du -hsc $IPFS_PATH
# 135M total
ls -lah $IPFS_PATH/badgerds
# 134M mar 19 15:47 000000.vlog <- Single log file.
ipfs repo gc -q > /dev/null # Not really that quiet.
ipfs repo stat
# NumObjects: 13
# RepoSize: 143021466
du -hsc $IPFS_PATH
# 137M total
# Repeat again but setting the value log size to 100 MB.
rm -rv $IPFS_PATH # WARNING! Check the env.var to not delete the actual IPFS root.
ipfs init --profile=badgerds
# Set the 'vlogFileSize' configuration to 100 MB. (May need to install jq.)
cp $IPFS_PATH/config $IPFS_PATH/config.back
cat $IPFS_PATH/config.back | jq '.Datastore.Spec.child.vlogFileSize = "100000000"' > $IPFS_PATH/config
ipfs add -r --pin=false --silent -Q ~/go/src/github.com/ipfs/go-ipfs
ipfs repo stat
# NumObjects: 5560
# RepoSize: 140918856
ls -lah $IPFS_PATH/badgerds
# 96M mar 19 15:47 000000.vlog <- First log file cut short at 100 MB (will be GC'ed).
# 39M mar 19 15:47 000001.vlog <- Second log file (will be left behind in the GC process).
ipfs repo gc -q > /dev/null # Not really that quiet.
ipfs repo stat
# NumObjects: 13
# RepoSize: 143023447
ls -lah $IPFS_PATH/badgerds
# 40M mar 19 15:47 000001.vlog
du -hsc $IPFS_PATH
# 41M total
# Multiple successive calls to GC won't change this.
ipfs repo gc -q > /dev/null
ipfs repo gc -q > /dev/null
ipfs repo gc -q > /dev/null
ls -lah $IPFS_PATH/badgerds
# 40M mar 19 15:47 000001.vlog
du -hsc $IPFS_PATH
# 41M total |
So does badger do any GC at all, or does it just hold on to a bit of disk space? |
@lgierth Yes, it does GC, it's just that it waits for certain conditions before doing it (like having more than one value log file, and having those log files filled with deleted keys past a certain percentage threshold). So if those conditions aren't met when the user calls ipfs repo gc, no disk space is reclaimed. This is expected, as Badger's architecture is more complex than flatfs's. I think a simplified version of what I just explained should be clearly documented, so users can adjust their disk-freeing expectations to the Badger scenario (which will definitely be different from what they're accustomed to with the current flatfs datastore). |
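For reference, here is a minimal Go sketch of the badger call that sits underneath this behaviour (assuming badger v1.6+'s options API; the repo path is made up):

```go
package main

import (
	"log"

	badger "github.com/dgraph-io/badger"
)

func main() {
	// Hypothetical repo path; badger v1.6+ exposes DefaultOptions(path).
	opts := badger.DefaultOptions("/tmp/ipfs_badger_test/badgerds")

	db, err := badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// RunValueLogGC rewrites a value log file only when the sampled fraction
	// of stale data in it exceeds discardRatio, and it never touches the file
	// currently being written to. With a single .vlog file, or with files
	// below that threshold, it returns ErrNoRewrite and no space is reclaimed.
	err = db.RunValueLogGC(0.5)
	if err != nil && err != badger.ErrNoRewrite {
		log.Fatal(err)
	}
}
```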
So when will it free disk space then? I've just had these two new gateways run out of disk space and the daemon refuses to restart. So right now I have no choice other than wiping the datastore. It would be good if ipfs repo gc actually reclaimed disk space. |
Ah I see: https://github.com/dgraph-io/badger#garbage-collection
This does sound like we can do online garbage collection at a time of our choice. |
In that case Badger should already be performing GC when running ipfs repo gc. |
@Stebalien cleared this up for me on IRC:
I've put the gateways back on flatfs for the time being, I'll be on research retreat and then vacation for the next 4 weeks -- otherwise I'd be up for riding the edge :) |
GC is called on badger - see https://github.com/ipfs/go-ipfs/blob/master/pin/gc/gc.go#L115-L125 / https://github.com/ipfs/go-ds-badger/blob/master/datastore.go#L258 |
Got it. I hadn't realized that had been bubbled up yet. |
@lgierth Could you document here more about the server environment in which you had the GC problem that we discussed in IRC please? I think it may be an interesting test case to have under consideration (big database, almost no space left, and to reclaim it it was necessary to write delete entries before actually running Badger's GC). |
I'm not seeing any reclaimed free space after garbage collection using badger with v0.4.19, but maybe someone can tell whether this is "expected" behavior. For example:
$ ipfs init --profile=badgerds
# NumObjects: 28
# RepoSize: 49962
$ ipfs daemon &
$ ipfs get QmSnuWmxptJZdLJpKRarxBMS2Ju2oANVrgbr2xWbie9b2D #stopped just after 3.5 GB
# NumObjects: 16118
# RepoSize: 3798130246
$ ipfs repo gc
# NumObjects: 13
# RepoSize: 3798130246
$ ipfs shutdown
# NumObjects: 93
# RepoSize: 3806508960
$ ipfs repo gc # offline garbage collection still gets me no space reclaimed
# NumObjects: 13
# RepoSize: 3805143019 |
Yeah, that's a badger bug. Looks like dgraph-io/badger#718. |
Yeah, this is a lot more than 1GiB. GC doesn't appear to do much at all. |
I ran
ipfs repo gc was run right before the btrfs snapshot was taken. |
Related badger issue - dgraph-io/badger#767 |
@magik6k based on the resulting close of the issue you linked, it seems like badger does eventually reclaim the space. Were you able to notice any eventual space reclamation on your end? |
Triage: STILL BROKEN! Even after the periodic GC and repeated GC fixes. |
We had a similar experience (with badger v1 or v2). Forcing GC, or triggering it manually, didn't help. The only thing that worked was doing a backup and restore (offline). (~900 MB -> ~8 MB) |
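For anyone wanting to script that backup-and-restore workaround (offline, with the daemon stopped), here is a rough Go sketch; it assumes badger v2's Backup/Load signatures, all paths are hypothetical, and swapping the rebuilt directory in for the old one is left out:

```go
package main

import (
	"log"
	"os"

	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	// Dump the bloated datastore to a backup file (full backup: since = 0).
	src, err := badger.Open(badger.DefaultOptions("/path/to/badgerds"))
	if err != nil {
		log.Fatal(err)
	}
	out, err := os.Create("/path/to/badgerds.bak")
	if err != nil {
		log.Fatal(err)
	}
	if _, err := src.Backup(out, 0); err != nil {
		log.Fatal(err)
	}
	out.Close()
	src.Close()

	// Replay the backup into a fresh directory; the stale value-log space of
	// the old store does not come along, which is where the savings come from.
	dst, err := badger.Open(badger.DefaultOptions("/path/to/badgerds.new"))
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()
	in, err := os.Open("/path/to/badgerds.bak")
	if err != nil {
		log.Fatal(err)
	}
	defer in.Close()
	if err := dst.Load(in, 16); err != nil { // 16 = max pending writes
		log.Fatal(err)
	}
}
```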
That is very annoying. We're now doing repeated GC cycles until badger reports there is nothing to do and that's still not fixing it. |
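"Repeated GC cycles until badger reports there is nothing to do" boils down to a loop like this (a sketch, not go-ds-badger's actual implementation):

```go
package gcutil

import badger "github.com/dgraph-io/badger"

// CollectGarbage keeps rewriting value log files until badger signals,
// via ErrNoRewrite, that there is nothing left worth rewriting.
func CollectGarbage(db *badger.DB) error {
	for {
		err := db.RunValueLogGC(0.5)
		if err == badger.ErrNoRewrite {
			return nil // badger reports nothing (more) to do
		}
		if err != nil {
			return err
		}
	}
}
```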
Sharing this topic which might be interesting to keep track of. My pain-points are related to many vlogs, but the solution might indirectly improve some part of the problem. |
That's great to hear! However, it looks like we're going to need to switch to badger 2 to see those improvements. We're currently on badger 1 as we haven't tested badger 2 thoroughly. |
Another example (with more data):
Performance, even on this small machine, is great with badgerds, but there seem to be zero bytes cleaned up after I deleted data with the current settings. |
@RubenKelevra The same situation |
How about exposing those settings and making them accessible via the configuration file? Since configuration files aren't updated on existing setups, we could just edit the badgerds profile to include the changes for new setups. This way existing setups won't be affected at all.
Could you look into creating the changes? I'd really enjoy testing these out in my setup :) |
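For concreteness, these are the knobs that would need to be surfaced in the config/profile, shown here through go-ds-badger's Go options (a sketch: field names are taken from recent go-ds-badger releases, and the values are examples rather than recommendations):

```go
package main

import (
	"log"
	"time"

	badgerds "github.com/ipfs/go-ds-badger"
)

func main() {
	// Start from go-ds-badger's defaults and tune the GC-related knobs.
	opts := badgerds.DefaultOptions
	opts.GcDiscardRatio = 0.2               // rewrite a vlog once ~20% of it is stale
	opts.GcInterval = 15 * time.Minute      // how often the periodic GC kicks in
	opts.GcSleep = 10 * time.Second         // pause between rounds within one GC cycle
	opts.Options.ValueLogFileSize = 1 << 28 // 256 MiB vlog files instead of ~1 GiB

	ds, err := badgerds.NewDatastore("/tmp/badgerds-example", &opts)
	if err != nil {
		log.Fatal(err)
	}
	defer ds.Close()
}
```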
…age collection fixes ipfs#54 settings are by courtesy of @jsign, see [his post here](dgraph-io/badger#1297 (comment))
Hey guys, we've closed dgraph-io/badger#1228 with fixes in master. We'll do a new badger release soon. It would be very useful if you can test the master branch. |
From what I can tell, that's mostly concerned with expiring keys, right? |
No. It's for deleted and expired keys. There was a bug in compaction because of which we would keep deleted/expired keys. We no longer do that. |
That sounds promising :) |
We've had issues with deleted keys, move keys in badger for a long time (this issue was created 2 years ago). I'm glad we fixed it :) Please do try it out. We're always looking for feedback 🌠 |
@jarifibrahim , do you have some release estimate? I know |
We wanted to do a Badger release last week but some other things came up (we're a small team of engineers working on badger and dgraph). I'll try to get a new patch version for v1.x and v2.x before the end of this week (worst case, end of next week). Most of the tests for badger run as part of the CI. We have a bank test https://github.com/dgraph-io/badger/blob/master/badger/cmd/bank.go which checks for transactional issues and runs for (4h+4h) 8 hours at midnight every day. Which version of badger are you using? |
go-ipfs is using v1.6.1 and has been slowly adding support for v2. |
Any update? |
Not yet. The next step here is to copy the badger plugin (https://github.com/ipfs/go-ipfs/tree/master/plugin/plugins/badgerds) as "badger2ds", and make it use github.com/ipfs/go-ds-badger2. Want to submit a PR to go-ipfs? |
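On the datastore side that swap is mostly an import change, roughly like this (a sketch assuming go-ds-badger2 mirrors go-ds-badger's constructor and options; the real work is in duplicating the plugin and its config parsing):

```go
package main

import (
	"log"

	badger2ds "github.com/ipfs/go-ds-badger2"
)

func main() {
	opts := badger2ds.DefaultOptions // assumed to mirror go-ds-badger's DefaultOptions
	ds, err := badger2ds.NewDatastore("/tmp/badger2ds-example", &opts)
	if err != nil {
		log.Fatal(err)
	}
	defer ds.Close()
}
```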
Soooo ... how is this looking? Is it fixed? Can it be closed? |
Not in badger v1. This has been fixed in v3 (and maybe badger v2, but that has other issues). |
@Stebalien but this PR sounds like this bug should be fixed: dgraph-io/badger#1354 |
after the cleanup bug has been fixed: ipfs/go-ds-badger#54 (comment)
That was a v2 fix, not a v1 fix. |
:'( Such a pity - we saw huge performance gains on ipfs-search, but it is eating up storage at an alarming rate. Will have to switch back to flatfs, for now.
(After ~ 24 hours of crawling.) |
Note: after shutting down ipfs and calling |
Regarding a future transition to Badger: its garbage collection process bails out when there is only one value log file, and by default the size of a value log file is set to 1 GB. This translates to the scenario where a user adds data and then "deletes" it (e.g., unpinning and garbage collecting it), yet still has up to 1 GB of disk taken up by values marked as deleted but not reclaimed by Badger's GC.
This is not an error, and for big players 1 GB may be negligible, but it will catch a normal end user's attention (especially because currently, once everything is deleted, the disk space actually gets freed), and it should be taken into consideration (maybe in the form of clear documentation clarifying this, or catching the ErrNoRewrite error and printing a warning to the user when appropriate).
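A sketch of that last suggestion, catching ErrNoRewrite and surfacing a warning instead of silently doing nothing (assuming badger v1's API; the wording and default size mentioned are illustrative only):

```go
package gcutil

import (
	"log"

	badger "github.com/dgraph-io/badger"
)

// RunGCWithWarning triggers badger's value log GC once and, when badger
// declines to rewrite anything, tells the user why disk usage may not have
// gone down instead of failing silently.
func RunGCWithWarning(db *badger.DB, discardRatio float64) error {
	err := db.RunValueLogGC(discardRatio)
	if err == badger.ErrNoRewrite {
		log.Println("warning: badger rewrote no value log files; " +
			"up to one value log file (1 GB by default) of deleted data may remain on disk")
		return nil
	}
	return err
}
```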