Make watermark not use timer #1366

Closed

wants to merge 10 commits into from

Conversation

srh commented Aug 28, 2017

This avoids the question of fixing issue #1345 -- it keeps watermark behavior the same.



The one about being consistent is fixed, the other replaced by non-TODO comments.
srh (Author) commented Aug 28, 2017

Review status: 0 of 13 files reviewed at latest revision, 4 unresolved discussions.


client/mutations.go, line 375 at r1 (raw file):

	if req.line != 0 && req.mark != nil {
		atomic.AddUint64(&d.rdfs, uint64(req.size()))
		req.mark.Done(req.line)

Wound up encapsulating the channel.
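
(For illustration, a minimal sketch of what encapsulating the channel can look like -- the type and method names here are illustrative, not necessarily the exact dgraph API. Callers invoke Begin/Done instead of sending on the channel themselves:)

```
package sketch

// mark is what flows over the internal channel: an index plus whether
// it is a begin or a done event.
type mark struct {
	index uint64
	done  bool
}

// WaterMark owns the channel; callers never touch it directly.
type WaterMark struct {
	markCh chan mark
}

// Begin records that work for index has started.
func (w *WaterMark) Begin(index uint64) { w.markCh <- mark{index: index, done: false} }

// Done records that work for index has finished.
func (w *WaterMark) Done(index uint64) { w.markCh <- mark{index: index, done: true} }
```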


cmd/dgraph/main_test.go, line 1374 at r1 (raw file):

	// we need watermarks for reindexing
	x.AssertTrue(!x.IsTestRun())

I removed the SetTestRun/IsTestRun stuff. I don't know exactly why it existed -- tests still work.


worker/groups.go, line 664 at r1 (raw file):

				return
			}
			n.syncAllMarks(ctx, lastIndex)

syncAllMarks never returned an error before.


worker/index.go, line 90 at r1 (raw file):

}

func (n *node) waitForAppliedMark(ctx context.Context, lastIndex uint64) {

The downside of centralizing WaitForMark is that we do lose the specific message.



manishrjain (Contributor)

Nice refactoring. Have some comments to simplify the waiter code.

Also, I think the reason for those test-run cases was that the query package was failing randomly. You might want to run the tests multiple times to ensure that the failures don't recur. We might have modified the tests in some way that fixed the issue, but it's hard to say.


Reviewed 12 of 13 files at r1, 1 of 1 files at r2.
Review status: all files reviewed at latest revision, 6 unresolved discussions.


worker/index.go, line 91 at r2 (raw file):

func (n *node) waitForAppliedMark(ctx context.Context, lastIndex uint64) {
	n.applied.WaitForMark(lastIndex)

Can remove this single-line function.


x/watermark.go, line 179 at r2 (raw file):

				// Some partly duplicated code with what's below.
				heap.Pop(&waiterIndices)
				toNotify := waiters[min]

This logic seems weird. We might have someone waiting for 101, but the heap only sees 10, 50 and 150. This map lookup would then not trigger; or so it seems.

A simpler solution might be to keep a heap of waiters and compare the waiter index against the min done index. Not sure if a heap allows duplication, but if it does, then we'd avoid having a map of lists of chans, which is another readability benefit.

P.S. Another thought was that you can attach a waiter to an index in the heap itself, thus making the wait done trigger even simpler. Just add the index as a done index when attaching waiter to the heap element.
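
(To make the heap-of-waiters idea concrete, a hedged sketch -- names are illustrative, and this assumes it all runs in the single mark-processing goroutine:)

```
package sketch

import "container/heap"

// waiter pairs the index being waited for with its channel. Duplicates
// (two waiters for the same index) are fine in a heap.
type waiter struct {
	index uint64
	ch    chan struct{}
}

type waiterHeap []waiter

func (h waiterHeap) Len() int            { return len(h) }
func (h waiterHeap) Less(i, j int) bool  { return h[i].index < h[j].index }
func (h waiterHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *waiterHeap) Push(x interface{}) { *h = append(*h, x.(waiter)) }
func (h *waiterHeap) Pop() interface{} {
	old := *h
	w := old[len(old)-1]
	*h = old[:len(old)-1]
	return w
}

// notify closes every waiter whose index is at or below doneUntil.
func notify(h *waiterHeap, doneUntil uint64) {
	for h.Len() > 0 && (*h)[0].index <= doneUntil {
		close(heap.Pop(h).(waiter).ch)
	}
}
```

Duplicates cost nothing here: two waiters for the same index simply occupy two heap slots, and both are closed when the watermark passes them.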



srh (Author) commented Aug 28, 2017

I'll run that test in a loop overnight.


Review status: 12 of 13 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


worker/index.go, line 91 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Can remove this single line function.

Done.


x/watermark.go, line 179 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This logic seems weird. We might have someone waiting for 101, but the heap only sees 10, 50 and 150. This map lookup would then not trigger; or so it seems.

A simpler solution might be to keep a heap of waiters and compare the waiter index against the min done index. Not sure if a heap allows duplication, but if it does, then we'd avoid having a map of list of chan; which is another benefit to readability.

P.S. Another thought was that you can attach a waiter to an index in the heap itself, thus making the wait done trigger even simpler. Just add the index as a done index when attaching waiter to the heap element.

If we have someone waiting for 101, then the heap will contain 101. waiterIndices is a different heap from indices.

We could keep a heap of waiters -- it can certainly have duplicates -- the main reason I didn't was that I just mirrored the existing implementation with pending/indices.

If we put waiters on the pending/indices heap, I think it would make the 10/50/150-while-somebody-waits-on-101 case much more difficult.



srh (Author) commented Aug 29, 2017

I have not seen test failures.


Review status: 12 of 13 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.



manishrjain (Contributor)

Review status: 12 of 13 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


x/watermark.go, line 179 at r2 (raw file):

If we put waiters on the pending/indices heap, I think it would make the 10/50/150-while-somebody-waits-on-101 case much more difficult.

Not really; in fact, it should be simpler. Say the heap has 10, 50 and 150, and it's waiting on 10 right now. Then we ask to wait for 101. We add 101 to the heap (as done) and attach the waiting channel to it. Once 10 is done, and then 50, the heap moves to 101, sees that it's already done, writes to the channel, and moves on to 150. This logic is simpler to think through, and hence to code.
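
(A rough self-contained sketch of this scheme, with illustrative names -- uint64Heap mirrors the index heap in the existing implementation, and elem carries both the pending count and any attached waiters:)

```
package sketch

import "container/heap"

// uint64Heap is a min-heap of indices, as in the existing watermark code.
type uint64Heap []uint64

func (h uint64Heap) Len() int            { return len(h) }
func (h uint64Heap) Less(i, j int) bool  { return h[i] < h[j] }
func (h uint64Heap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *uint64Heap) Push(x interface{}) { *h = append(*h, x.(uint64)) }
func (h *uint64Heap) Pop() interface{} {
	old := *h
	v := old[len(old)-1]
	*h = old[:len(old)-1]
	return v
}

// elem carries the pending count for an index plus any waiters attached
// to it; pending == 0 means the index is considered done. Every index in
// the heap has a corresponding entry in the elems map.
type elem struct {
	pending int
	waiters []chan struct{}
}

// waitFor attaches ch to index. If the heap has not seen index, it is
// pushed with pending == 0, i.e. treated as already done, as suggested.
func waitFor(indices *uint64Heap, elems map[uint64]*elem, index uint64, ch chan struct{}) {
	e, ok := elems[index]
	if !ok {
		e = &elem{}
		elems[index] = e
		heap.Push(indices, index)
	}
	e.waiters = append(e.waiters, ch)
}

// advance pops min indices whose work is done, closing their waiters.
func advance(indices *uint64Heap, elems map[uint64]*elem) (doneUntil uint64) {
	for indices.Len() > 0 {
		min := (*indices)[0]
		if elems[min].pending > 0 {
			break // the min index still has outstanding work
		}
		heap.Pop(indices)
		for _, ch := range elems[min].waiters {
			close(ch)
		}
		delete(elems, min)
		doneUntil = min
	}
	return doneUntil
}
```

Note that waitFor treats an unseen index as already done (pending == 0), which is exactly the edge case srh probes next.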



srh (Author) commented Aug 29, 2017

Review status: 12 of 13 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


x/watermark.go, line 179 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

If we put waiters on the pending/indices heap, I think it would make the 10/50/150-while-somebody-waits-on-101 case much more difficult.

Not really; in fact, it should be simpler. Say the heap has 10, 50 and 150, and it's waiting on 10 right now. Then we ask to wait for 101. We add 101 to the heap (as done) and attach the waiting channel to it. Once 10 is done, and then 50, the heap moves to 101, sees that it's already done, writes to the channel, and moves on to 150. This logic is simpler to think through, and hence to code.

What if, after we consider 101 to be done, 101 begins?



manishrjain (Contributor) commented Aug 29, 2017 via email

srh (Author) commented Aug 29, 2017

The watermark code has an assert that we don't create entries below doneUntil. It does not assert that a waiter for level L is only added after a begin mark for L has been made, never before.
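
(For reference, a minimal sketch of the first assert -- hypothetical names; the real check in x/watermark.go may be phrased differently:)

```
// assertAboveDoneUntil rejects begin marks at or below the watermark:
// the "don't create entries below doneUntil" invariant.
func assertAboveDoneUntil(index, doneUntil uint64) {
	if index <= doneUntil {
		panic("watermark: begin mark at or below doneUntil")
	}
}
```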


Review status: 12 of 13 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.



manishrjain (Contributor)

Review status: 12 of 13 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


x/watermark.go, line 125 at r3 (raw file):

	// pending maps raft proposal index to the number of pending mutations for this proposal.
	pending := make(map[uint64]int)
	waiters := make(map[uint64][]chan struct{})

Just use one chan, and close it to represent done for all waiters.
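
(A minimal sketch of that suggestion, assuming everything runs on the single mark-processing goroutine, so no locking is needed -- one channel per index, closed once to release all of its waiters:)

```
package sketch

// waiters maps an index to a single channel shared by all its waiters;
// closing the channel signals done to every waiter at once.
var waiters = make(map[uint64]chan struct{})

// waitCh returns the channel to block on for index, creating it once.
func waitCh(index uint64) chan struct{} {
	ch, ok := waiters[index]
	if !ok {
		ch = make(chan struct{})
		waiters[index] = ch
	}
	return ch
}

// notifyDone closes the channel, releasing all waiters for index.
func notifyDone(index uint64) {
	if ch, ok := waiters[index]; ok {
		close(ch)
		delete(waiters, index)
	}
}
```

Closing the channel wakes every goroutine blocked on <-ch at once, so a list of channels per index becomes unnecessary.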


x/watermark.go, line 208 at r3 (raw file):

	for mark := range w.markCh {
		if mark.waiter != nil {

Maybe better to do this:

if mark.index > 0 {
  ...
  continue
}
if len(mark.indices) > 0 {
  ...
  continue
}
if mark.waiter != nil {
   ...
  continue
}

Reads better. Also, mark.index is the most frequent code path, followed by indices, followed by waiter. This puts them in that order.



manishrjain (Contributor)

LGTM


Reviewed 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.



srh (Author) commented Aug 29, 2017

In master as of 5e1709e.

@srh srh closed this Aug 29, 2017
@srh srh removed the in progress label Aug 29, 2017
@srh srh deleted the sam/watermark branch August 29, 2017 23:19
jarifibrahim pushed a commit that referenced this pull request Jun 16, 2020
This commit brings the following changes from badger
```
c45d966 Fix assert in background compression and encryption. (#1366)
14386ac GC: Consider size of value while rewriting (#1357)
b2267c2 Restore: Account for value size as well (#1358)
b762832 Tests: Do not leave behind state goroutines (#1349)
056d859 Support disabling conflict detection (#1344)
fd89894 Compaction: Expired keys and delete markers are never purged (#1354)
543f353 Fix build on golang tip (#1355)
a7e239e StreamWriter: Close head writer (#1347)
da80eb9 Iterator: Always add key to txn.reads (#1328)
7e19cac Add immudb to the project list (#1341)
079f5ae DefaultOptions: Set KeepL0InMemory to false (#1345)
```
jarifibrahim pushed a commit that referenced this pull request Jun 17, 2020
This commit brings the following changes from badger
```
c45d966 Fix assert in background compression and encryption. (#1366)
14386ac GC: Consider size of value while rewriting (#1357)
b2267c2 Restore: Account for value size as well (#1358)
b762832 Tests: Do not leave behind state goroutines (#1349)
056d859 Support disabling conflict detection (#1344)
fd89894 Compaction: Expired keys and delete markers are never purged (#1354)
543f353 Fix build on golang tip (#1355)
a7e239e StreamWriter: Close head writer (#1347)
da80eb9 Iterator: Always add key to txn.reads (#1328)
7e19cac Add immudb to the project list (#1341)
079f5ae DefaultOptions: Set KeepL0InMemory to false (#1345)
```
jarifibrahim pushed a commit that referenced this pull request Jun 18, 2020
This change brings the following new commits from badger
```
dd332b0 Avoid panic in filltables() (#1365)
c45d966 Fix assert in background compression and encryption. (#1366)
```
jarifibrahim pushed a commit that referenced this pull request Jul 11, 2020
This commit brings the following new changes from badger.
It also disables conflict detection in badger to save memory.

```
0dfb8b4 Changelog for v20.07.0 (#1411)
03ba278 Add missing changelog for v2.0.3 (#1410)
6001230 Revert "Compress/Encrypt Blocks in the background (#1227)" (#1409)
800305e Revert "Buffer pool for decompression (#1308)" (#1408)
63d9309 Revert "fix: Fix race condition in block.incRef (#1337)" (#1407)
e0d058c Revert "add assert to check integer overflow for table size (#1402)" (#1406)
d981f47 return error if the vlog writes exceeds more that 4GB. (#1400)
7f4e4b5 add assert to check integer overflow for table size (#1402)
8e896a7 Add a contribution guide (#1379)
b79aeef Avoid panic on multiple closer.Signal calls (#1401)
717b89c Enable cross-compiled 32bit tests on TravisCI (#1392)
09dfa66 Update ristretto to commit f66de99 (#1391)
509de73 Update head while replaying value log (#1372)
e013bfd Rework DB.DropPrefix (#1381)
3042e37 pre allocate cache key for the block cache and the bloom filter cache (#1371)
675efcd Increase default valueThreshold from 32B to 1KB (#1346)
158d927 Remove second initialization of writech in Open (#1382)
d37ce36 Tests: Use t.Parallel in TestIteratePrefix tests  (#1377)
3f4761d Force KeepL0InMemory to be true when InMemory is true (#1375)
dd332b0 Avoid panic in filltables() (#1365)
c45d966 Fix assert in background compression and encryption. (#1366)
```