ipfs repo gc very slow #6322

Closed
suutaku opened this issue May 12, 2019 · 6 comments

suutaku commented May 12, 2019

ipfs repo gc takes too long (very slow; it has already been running for 30+ minutes with no response so far).
Is this a bug?
My ipfs version:
ipfs version 0.4.21-dev
My repo info:

NumObjects: 194147
RepoSize:   38532082735
StorageMax: 50000000000
RepoPath:   /cot_service/.ipfs
Version:    fs-repo@7

goroutine:
www.cotnetwork.com/goroutine
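
A goroutine dump like the one linked above can typically be captured from the daemon's pprof endpoint on the API port. A minimal sketch in Go, assuming the default API address 127.0.0.1:5001 and the standard /debug/pprof/ handlers the daemon exposes:

// Capture a full goroutine dump from a running go-ipfs daemon.
// Assumes the default API address 127.0.0.1:5001 (adjust if yours differs).
package main

import (
    "io"
    "net/http"
    "os"
)

func main() {
    // debug=2 asks pprof for the full, human-readable stack dump.
    resp, err := http.Get("http://127.0.0.1:5001/debug/pprof/goroutine?debug=2")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // Write the dump to stdout so it can be redirected to a file.
    if _, err := io.Copy(os.Stdout, resp.Body); err != nil {
        panic(err)
    }
}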

suutaku changed the title from "CLOSE_WAIT after access via http api some problem with #6295" to "ipfs repo gc very slow" on May 12, 2019

magik6k commented May 12, 2019

What OS/Filesystem are you using?


suutaku commented May 12, 2019

@magik6k
OS:
Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Filesystem:

Filesystem     Type     1K-blocks     Used Available Use% Mounted on
udev           devtmpfs   4020568        0   4020568   0% /dev
tmpfs          tmpfs       816832     1124    815708   1% /run
/dev/sda2      ext4     306613176 90404756 200563648  32% /
tmpfs          tmpfs      4084160        0   4084160   0% /dev/shm
tmpfs          tmpfs         5120        0      5120   0% /run/lock
tmpfs          tmpfs      4084160        0   4084160   0% /sys/fs/cgroup
/dev/loop0     squashfs     91392    91392         0 100% /snap/core/6673
/dev/loop1     squashfs     93312    93312         0 100% /snap/core/6531
/dev/loop2     squashfs     91648    91648         0 100% /snap/core/6818
tmpfs          tmpfs       816832        0    816832   0% /run/user/1000

By the way, how long should it take under normal conditions?

Update

Finally, it finished:

time ipfs repo gc

real    41m46.876s
user    0m0.000s
sys     0m0.044s

and

ipfs repo stat
NumObjects: 194147
RepoSize:   38527092946
StorageMax: 50000000000
RepoPath:   /cot_service/.ipfs
Version:    fs-repo@7


suutaku commented May 13, 2019

The goroutine dump points to:
github.com/ipfs/go-ipfs-blockstore.(*gclocker).GCLock

Is this some kind of deadlock?
@Stebalien @magik6k

Update

After checking the goroutine dump, I found 4 goroutines executing the gc function. But unlocker := bs.GCLock() at go-ipfs/pin/gc/gc.go:45 just calls GCLock without first checking, via GCRequested(), whether a GC has already been requested. I think we should check this: if GCLock has already been requested, just return an error message?
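
For context, the locking pair involved here behaves roughly like the minimal sketch below (simplified names, not copied from go-ipfs-blockstore): GCLock blocks on a write lock until any in-flight GC releases it, which is where the extra gc goroutines in the dump are parked, while GCRequested only reports whether some caller is currently waiting for that lock.

// Minimal sketch of a GC locker in the spirit of go-ipfs-blockstore's gclocker.
// Field names and internals are illustrative; only GCLock/GCRequested match
// the real interface referenced above.
package blockstore

import (
    "sync"
    "sync/atomic"
)

// Unlocker releases a lock previously taken on the blockstore.
type Unlocker interface {
    Unlock()
}

// gclocker serializes GC against other users of the blockstore.
type gclocker struct {
    lk    sync.RWMutex // GC takes the write side; normal operations take the read side
    gcreq int32        // number of goroutines currently waiting in GCLock
}

// GCLock blocks until it acquires exclusive ownership of the blockstore.
// If another GC already holds the lock, the caller waits here.
func (bs *gclocker) GCLock() Unlocker {
    atomic.AddInt32(&bs.gcreq, 1)
    bs.lk.Lock()
    atomic.AddInt32(&bs.gcreq, -1)
    return &unlocker{&bs.lk}
}

// GCRequested reports whether some goroutine is currently waiting to start GC.
func (bs *gclocker) GCRequested() bool {
    return atomic.LoadInt32(&bs.gcreq) > 0
}

type unlocker struct{ mu *sync.RWMutex }

func (u *unlocker) Unlock() { u.mu.Unlock() }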


suutaku commented May 13, 2019

Add some code like this?

    .........
    ctx, cancel := context.WithCancel(ctx)

    output := make(chan Result, 128)
    elock := log.EventBegin(ctx, "GC.lockWait")
    var unlocker bstore.Unlocker
    if bs.GCRequested() {
        // Another GC has already requested the lock: report an error
        // instead of queueing up behind it.
        elock.Done()
        cancel()
        output <- Result{Error: errors.New("GC already in progress")}
        close(output)
        return output
    } else {
        unlocker = bs.GCLock()
    }

    elock.Done()
    elock = log.EventBegin(ctx, "GC.locked")
    emark := log.EventBegin(ctx, "GC.mark")

    bsrv := bserv.New(bs, offline.Exchange(bs))
    ds := dag.NewDAGService(bsrv)
    .......


obo20 commented May 13, 2019

I can also vouch that garbage collection still takes a very long time (15+ minutes) on current versions of IPFS (at least the last 3 releases).

The main issue here is still that the node is essentially locked down during GC; otherwise this wouldn't be a huge problem.

@Stebalien requested I create this issue a while back, but I'm not sure how much work has been put into this type of enhancement.

Stebalien commented

@obo20

None yet.


I believe the slowness here is that GC currently needs to read the entire datastore to traverse the graph. I'm closing this in favor of #4382. The current GC approach just doesn't scale.
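
For anyone landing here later: the cost described above comes from the mark-and-sweep shape of the current GC. It marks everything reachable from the GC roots (the pin set plus the MFS root) and then enumerates every key in the blockstore to sweep the rest, so the running time grows with the whole repo rather than with the amount of garbage. A minimal sketch of that shape, using simplified stand-in types rather than the real go-ipfs interfaces:

// Package gcsketch illustrates the mark-and-sweep shape of the current GC.
// The types below are simplified stand-ins, not the actual go-ipfs APIs.
package gcsketch

// CID stands in for a content identifier.
type CID string

// Blockstore is a stand-in for the subset of blockstore/DAG behaviour GC needs.
type Blockstore interface {
    AllKeys() []CID    // enumerate every block in the repo
    Links(c CID) []CID // child links of the DAG node stored under c
    Delete(c CID) error
}

// markReachable colors every block reachable from the given roots
// (in the real GC: the pin set plus the MFS root).
func markReachable(bs Blockstore, roots []CID) map[CID]bool {
    marked := make(map[CID]bool)
    stack := append([]CID(nil), roots...)
    for len(stack) > 0 {
        c := stack[len(stack)-1]
        stack = stack[:len(stack)-1]
        if marked[c] {
            continue
        }
        marked[c] = true
        stack = append(stack, bs.Links(c)...)
    }
    return marked
}

// Sweep enumerates *every* key in the blockstore and deletes the unmarked
// ones. This full scan is why GC time scales with repo size rather than with
// the amount of garbage collected.
func Sweep(bs Blockstore, roots []CID) (removed int) {
    marked := markReachable(bs, roots)
    for _, c := range bs.AllKeys() {
        if !marked[c] {
            if err := bs.Delete(c); err == nil {
                removed++
            }
        }
    }
    return removed
}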
