Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fsync directory after creating files #105

Closed
wants to merge 5 commits into from
Closed

Conversation

srh
Copy link

@srh srh commented Jul 17, 2017

Fixes #104.

I've assumed that removing files doesn't have the same urgency (I don't know Badger in and out, the assumption here is that starting the store doesn't infer anything from a file's mere existence).

Things might look slightly unfamiliar because it's based on the changes from the other two PR's I've made, #102 and #103.


This change is Reviewable

@manishrjain
Copy link
Contributor

Review status: 0 of 3 files reviewed at latest revision, 1 unresolved discussion.


kv.go, line 376 at r1 (raw file):

// Syncs two dirs.
func syncBothDirs(dir1, dir2 string) error {

Can't you just open directory with O_DSYNC? That way all creates, deletes etc. would automatically be synced; avoiding code complexity.


Comments from Reviewable

@srh
Copy link
Author

srh commented Jul 17, 2017

Review status: 0 of 3 files reviewed at latest revision, 1 unresolved discussion.


kv.go, line 376 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Can't you just open directory with O_DSYNC? That way all creates, deletes etc. would automatically be synced; avoiding code complexity.

I don't think that works. Being opened O_SYNC or O_DSYNC is part of the file descriptor, so you'd need a way to open the file using a directory file descriptor. You can't do write operations on a directory file descriptor, generally speaking. (You can only open a directory with O_RDONLY, in fact.) The system call openat(2) exists, but it's not exposed in Go and anyway, I doubt it has that semantic.


Comments from Reviewable

@manishrjain
Copy link
Contributor

Reviewed 3 of 3 files at r1.
Review status: all files reviewed at latest revision, 6 unresolved discussions.


kv.go, line 325 at r1 (raw file):

	// haven't specifically fsynced, are guaranteed to have their directory entry removal
	// persisted to disk.
	if err := syncBothDirs(s.opt.Dir, s.opt.ValueDir); err != nil {

Just call sync twice, and avoid the syncBothDirs function. This is not performance critical, and the function adds unnecessary code.


levels.go, line 315 at r1 (raw file):

	// sync the directory as early as possible for concurrency.)
	for x := 0; x < i; x++ {
		<-fileCreated

This isn't required. the <-che below is doing the same waiting. You can call syncDir after that.


levels.go, line 327 at r1 (raw file):

	}

	if syncErr != nil && firstErr == nil {

This can be rewritten:

if firstErr == nil {
firstErr = syncDir(s.kv.opt.Dir)
}

You anyways only want to call syncDir if firstErr was nil. (no need for syncErr var).


value.go, line 650 at r1 (raw file):

				return errors.Wrapf(err, "While creating new value log: %q", newlf.path)
			}
			if err := syncDir(l.dirPath); err != nil {

if l.opt.SyncWrites is false, then do we need to do this?


value.go, line 653 at r1 (raw file):

				return errors.Wrapf(err,
					"Could not sync directory entry of value log: %q",
					newlf.path)

can fit in last line.


Comments from Reviewable

Sam Hughes added 2 commits July 18, 2017 12:32
It seems not to help performance or memory usage substantially.  Maybe
it does not help at all.
- We now close fd inside of OpenTable upon error.

- At many callsites (where we call OpenTable many times in a loop) we
now cleanup resources properly.
@srh
Copy link
Author

srh commented Jul 18, 2017

Review status: 0 of 3 files reviewed at latest revision, 6 unresolved discussions.


kv.go, line 325 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Just call sync twice, and avoid the syncBothDirs function. This is not performance critical, and the function adds unnecessary code.

Done.


levels.go, line 315 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This isn't required. the <-che below is doing the same waiting. You can call syncDir after that.

Done.


levels.go, line 327 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This can be rewritten:

if firstErr == nil {
firstErr = syncDir(s.kv.opt.Dir)
}

You anyways only want to call syncDir if firstErr was nil. (no need for syncErr var).

Done.


value.go, line 650 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

if l.opt.SyncWrites is false, then do we need to do this?

I guess not. Changing that to call conditionally. But also, I'll add a syncDir() call to valueLog::sync().


value.go, line 653 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

can fit in last line.

So for 100 columns, we're basing that on there being 4 spaces per tab? I was doing 8. Doing 4 now.


Comments from Reviewable

@manishrjain
Copy link
Contributor

:lgtm:


Reviewed 3 of 3 files at r2.
Review status: all files reviewed at latest revision, 1 unresolved discussion.


value.go, line 653 at r1 (raw file):

Previously, srh (Sam Hughes) wrote…

So for 100 columns, we're basing that on there being 4 spaces per tab? I was doing 8. Doing 4 now.

I think gofmt takes care of all that. But, Go doesn't enforce any column size, so we set it to 100 for Dgraph.


Comments from Reviewable

We don't fsync the directory when removing files, because it's safe to
crash if you haven't removed a file.  (We do though, when cleanly
shutting down the KV-store.)
@srh srh closed this Jul 19, 2017
@srh srh deleted the sam/dirsync branch July 19, 2017 20:29
@srh srh mentioned this pull request Jul 21, 2017
spongedu pushed a commit to spongedu/badger that referenced this pull request Jan 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants