Skip to content

miner: exit loop when downloader Done or Failed#21653

Closed
meowsbits wants to merge 17 commits intoethereum:masterfrom
etclabscore:fix/miner-dl-stop-break-foundation
Closed

miner: exit loop when downloader Done or Failed#21653
meowsbits wants to merge 17 commits intoethereum:masterfrom
etclabscore:fix/miner-dl-stop-break-foundation

Conversation

@meowsbits
Copy link
Copy Markdown
Contributor

Following the logic of the comment at the method,
this fixes a regression introduced at 7cf56d6, which would allow external parties to DoS with blocks, preventing mining progress.

Rel

Following the logic of the comment at the method,
this fixes a regression introduced at 7cf56d6
, which would allow external parties to DoS with
blocks, preventing mining progress.

Signed-off-by: meows <b5c6@protonmail.com>
Signed-off-by: meows <b5c6@protonmail.com>
Signed-off-by: meows <b5c6@protonmail.com>
Comment thread miner/miner.go Outdated
Signed-off-by: meows <b5c6@protonmail.com>
This helper function would return an affirmation
on the first positive match on a desired bool.

This was imprecise; it return false positives
by not waiting initially for an 'updated' value.

This fix causes TestMiner_2 to fail, which is
expected.

Signed-off-by: meows <b5c6@protonmail.com>
This test demonstrated the imprecision of the test
helper function waitForMiningState. This function
has been fixed with 6d365c2, and this test test
may now be removed.

Signed-off-by: meows <b5c6@protonmail.com>
See comment for logic.

Signed-off-by: meows <b5c6@protonmail.com>
We expect that once the downloader emits a DoneEvent,
signaling a successful sync, that subsequent StartEvents
are not longer permitted to stop the miner.

This prevents a security vulnerability where forced syncs via
fake high blocks would stall mining operation.

Signed-off-by: meows <b5c6@protonmail.com>
- Break downloader event handling into event
separating Done and Failed events. We need to
treat these cases differently since a DoneEvent
should prevent the miner from being stopped on
subsequent downloader Start events.

- Use canStop state to handle the one-off
case when a downloader first succeeds.

Signed-off-by: meows <b5c6@protonmail.com>
Signed-off-by: meows <b5c6@protonmail.com>
@meowsbits
Copy link
Copy Markdown
Contributor Author

meowsbits commented Oct 5, 2020

@holiman PTAL. Please note that the tests were broken (waitForMiningState returned false positives) and I've fixed that, as well as adding a test for what I believe to be the proper logic:

  • Downloader events
    • StartEvent should cause mining to stop IFF the downloader has not yet emitted a successful DoneEvent (ie has only emitted FailedEvent events). This prevents the bespoke security vulnerability where the appearance of high blocks (probably fake ones) and their associated (presumably failing) sync operations would stall mining operation.
    • FailedEvent should cause mining to start if it is requested.
    • DoneEvent should cause mining to start, and should disable mining stopping on any subsequent StartEvent.

Signed-off-by: meows <b5c6@protonmail.com>
@meowsbits meowsbits force-pushed the fix/miner-dl-stop-break-foundation branch from 558efc3 to bfe2728 Compare October 5, 2020 14:29
This makes mining pause/start logic regarding downloader
events more explicit. Instead of eternally handling downloader
events after the first done event, the subscription is closed
when downloader events are no longer actionable.

Signed-off-by: meows <b5c6@protonmail.com>
Signed-off-by: meows <b5c6@protonmail.com>
Signed-off-by: meows <b5c6@protonmail.com>
@MariusVanDerWijden
Copy link
Copy Markdown
Member

Hey @meowsbits Thank you for fixing my shit code and sorry for all the troubles I caused.
I generally like your approach (and I think your logic is correct now) but starting another go-routine to handle the events introduces races on the variables canStart and miningRequested. See:

go test ./... --race
==================
WARNING: DATA RACE
Write at 0x00c000196938 by goroutine 32:
  github.com/ethereum/go-ethereum/miner.(*Miner).update.func2()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:108 +0x35b
  github.com/ethereum/go-ethereum/miner.(*Miner).update.func3()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:133 +0x69

Previous write at 0x00c000196938 by goroutine 31:
  github.com/ethereum/go-ethereum/miner.(*Miner).update()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:144 +0x593

Goroutine 32 (running) created at:
  github.com/ethereum/go-ethereum/miner.(*Miner).update()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:128 +0x2eb

Goroutine 31 (running) created at:
  github.com/ethereum/go-ethereum/miner.New()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:77 +0x2f3
  github.com/ethereum/go-ethereum/miner.createMiner()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner_test.go:245 +0x8e9
  github.com/ethereum/go-ethereum/miner.TestMiner()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner_test.go:83 +0x3f
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:992 +0x1eb
==================
--- FAIL: TestMiner (0.06s)
    testing.go:906: race detected during execution of test
==================
WARNING: DATA RACE
Read at 0x00c0001968f1 by goroutine 86:
  github.com/ethereum/go-ethereum/miner.(*Miner).update()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:140 +0x575

Previous write at 0x00c0001968f1 by goroutine 87:
  github.com/ethereum/go-ethereum/miner.(*Miner).update.func2()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:105 +0x332
  github.com/ethereum/go-ethereum/miner.(*Miner).update.func3()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:133 +0x69

Goroutine 86 (running) created at:
  github.com/ethereum/go-ethereum/miner.New()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:77 +0x2f3
  github.com/ethereum/go-ethereum/miner.createMiner()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner_test.go:245 +0x8e9
  github.com/ethereum/go-ethereum/miner.TestStartWhileDownload()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner_test.go:164 +0x50
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:992 +0x1eb

Goroutine 87 (running) created at:
  github.com/ethereum/go-ethereum/miner.(*Miner).update()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:128 +0x2eb
==================
--- FAIL: TestStartWhileDownload (0.07s)
    testing.go:906: race detected during execution of test
==================
WARNING: DATA RACE
Read at 0x0000012bf5c0 by goroutine 27:
  github.com/ethereum/go-ethereum/miner.(*worker).newWorkLoop()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/worker.go:380 +0xc77

Previous write at 0x0000012bf5c0 by goroutine 103:
  [failed to restore the stack]

Goroutine 27 (running) created at:
  github.com/ethereum/go-ethereum/miner.newWorker()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/worker.go:228 +0xa4d
  github.com/ethereum/go-ethereum/miner.New()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner.go:75 +0x122
  github.com/ethereum/go-ethereum/miner.createMiner()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner_test.go:245 +0x8e9
  github.com/ethereum/go-ethereum/miner.TestMiner()
      /home/matematik/go/src/github.com/ethereum/go-ethereum/miner/miner_test.go:83 +0x3f
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:992 +0x1eb

Goroutine 103 (finished) created at:
  testing.(*T).Run()
      /usr/lib/go/src/testing/testing.go:1043 +0x660
  testing.runTests.func1()
      /usr/lib/go/src/testing/testing.go:1285 +0xa6
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:992 +0x1eb
  testing.runTests()
      /usr/lib/go/src/testing/testing.go:1283 +0x527
  testing.(*M).Run()
      /usr/lib/go/src/testing/testing.go:1200 +0x2ff
  main.main()
      _testmain.go:76 +0x223
==================
--- FAIL: TestGenerateBlockAndImportEthash (2.10s)
    testing.go:906: race detected during execution of test
FAIL
FAIL    github.com/ethereum/go-ethereum/miner   10.414s
FAIL

The go routine handling the downloader events handling
vars in parallel with the parent routine, causing a
race condition.

This change, though ugly, remove the condition while
still allowing the downloader event subscription to be
closed when the miner has no further use for it (ie DoneEvent).
@MariusVanDerWijden
Copy link
Copy Markdown
Member

I created a fix that passes your test cases here: https://github.com/MariusVanDerWijden/go-ethereum/tree/miner_updateloop_fix
Only caveat there is, that I don't unsubscribe when I received the downloader event.

@MariusVanDerWijden
Copy link
Copy Markdown
Member

Ah just saw that you pushed something. I think our solutions are mostly identical now (you do have better naming though :D)

@meowsbits
Copy link
Copy Markdown
Contributor Author

meowsbits commented Oct 5, 2020

Nice catch @MariusVanDerWijden! 68946a9 fixes that, although it's certainly not the prettiest code, having two near-duplicate for/select loops. I also considered using atomics/mutexes to hold the contended variables, but given the apparent reasoning behind the initial PR (replacing atomics with channels) that seemed backward-working ;)

If you have an alternative or alternative preference, let me know!

@meowsbits
Copy link
Copy Markdown
Contributor Author

meowsbits commented Oct 5, 2020

Ah! Same place, same time! 😃

@meowsbits
Copy link
Copy Markdown
Contributor Author

meowsbits commented Oct 5, 2020

Just to note as well that although I had indeed introduced another race, there is a pre-existing race condition in the miner package on the master branch:

ubp52 ~/go/src/github.com/ethereum/go-ethereum foundation-master=                                                                                                                                                                                                             
> git rev-parse HEAD                                                                                                                                                                                                                                                            
9d1e2027a05ed099b9ef43021a6cb09839e94a3a
ubp52 ~/go/src/github.com/ethereum/go-ethereum foundation-master=
> go test -race ./miner/...
==================
WARNING: DATA RACE
Read at 0x000001389720 by goroutine 31:
  github.com/ethereum/go-ethereum/miner.(*worker).newWorkLoop()
      /home/ia/go/src/github.com/ethereum/go-ethereum/miner/worker.go:380 +0xcbd

Previous write at 0x000001389720 by goroutine 80:
  [failed to restore the stack]

Goroutine 31 (running) created at:
  github.com/ethereum/go-ethereum/miner.newWorker()
      /home/ia/go/src/github.com/ethereum/go-ethereum/miner/worker.go:228 +0xa90
  github.com/ethereum/go-ethereum/miner.New()
      /home/ia/go/src/github.com/ethereum/go-ethereum/miner/miner.go:75 +0x128
  github.com/ethereum/go-ethereum/miner.createMiner()
      /home/ia/go/src/github.com/ethereum/go-ethereum/miner/miner_test.go:182 +0x926
  github.com/ethereum/go-ethereum/miner.TestMiner()
      /home/ia/go/src/github.com/ethereum/go-ethereum/miner/miner_test.go:83 +0x44
  testing.tRunner()
      /home/ia/go1.15.linux-amd64/src/testing/testing.go:1108 +0x202

Goroutine 80 (finished) created at:
  testing.(*T).Run()
      /home/ia/go1.15.linux-amd64/src/testing/testing.go:1159 +0x796
  testing.runTests.func1()
      /home/ia/go1.15.linux-amd64/src/testing/testing.go:1430 +0xa6
  testing.tRunner()
      /home/ia/go1.15.linux-amd64/src/testing/testing.go:1108 +0x202
  testing.runTests()
      /home/ia/go1.15.linux-amd64/src/testing/testing.go:1428 +0x5aa
  testing.(*M).Run()
      /home/ia/go1.15.linux-amd64/src/testing/testing.go:1338 +0x4eb
  main.main()
      _testmain.go:71 +0x236
==================
--- FAIL: TestGenerateBlockAndImportEthash (2.08s)
    testing.go:1023: race detected during execution of test
FAIL
FAIL    github.com/ethereum/go-ethereum/miner   10.502s
FAIL

@MariusVanDerWijden
Copy link
Copy Markdown
Member

Yeah, the existing race seems to be on w.chainConfig but I cant figure out where it comes from. Will try to sleep on it

@holiman
Copy link
Copy Markdown
Contributor

holiman commented Oct 5, 2020

Instead of having two near-identical loops, something like this can be used:

diff --git a/miner/miner.go b/miner/miner.go
index c8bb7fbca9..98506a4f17 100644
--- a/miner/miner.go
+++ b/miner/miner.go
@@ -127,12 +127,16 @@ func (miner *Miner) update() {
 	// withDownloaderState loop handles channeled events while the miner cares about the downloader events.
 	// This loop will be broken once the downloader emits a DoneEvent, signaling a successful sync.
 	// The following unnamed loop should be equivalent to this loop, but without the downloader event handling.
-withDownloaderState:
+	var ignoreDownloader = false
 	for {
 		select {
 		case ev := <-downloaderEvents.Chan():
+			if ignoreDownloader {
+				continue
+			}
 			if ev == nil {
-				break withDownloaderState
+				ignoreDownloader = true
+				continue
 			}
 			handleDownloaderEvents(ev)
 		case addr := <-miner.startCh:
@@ -149,24 +153,6 @@ withDownloaderState:
 			return
 		}
 	}
-
-	// Same loop as above, minus downloader event handling.
-	for {
-		select {
-		case addr := <-miner.startCh:
-			if !isDownloading {
-				miner.SetEtherbase(addr)
-				miner.worker.start()
-			}
-			miningRequested = true
-		case <-miner.stopCh:
-			miningRequested = false
-			miner.worker.stop()
-		case <-miner.exitCh:
-			miner.worker.close()
-			return
-		}
-	}
 }
 
 func (miner *Miner) Start(coinbase common.Address) {

@meowsbits
Copy link
Copy Markdown
Contributor Author

@holiman Reading from a closed channel will return the zero value, so this will create a busy loop.

@MariusVanDerWijden
Copy link
Copy Markdown
Member

What about this:
It has no busy waiting, minimal code duplication, passes the tests and no races.
Branch: https://github.com/MariusVanDerWijden/go-ethereum/tree/miner_updateloop_fix

for {
		if canSync {
			select {
			case ev := <-events.Chan():
				if ev == nil {
					// Subscription closed, don't poll
					canSync = false
				}
				handleDownloaderEvent(ev)
			case addr := <-miner.startCh:
				if canStart {
					miner.SetEtherbase(addr)
					miner.worker.start()
				}
				shouldStart = true
			case <-miner.stopCh:
				shouldStart = false
				miner.worker.stop()
			case <-miner.exitCh:
				miner.worker.close()
				return
			}
		} else {
			select {
			case addr := <-miner.startCh:
				if canStart {
					miner.SetEtherbase(addr)
					miner.worker.start()
				}
				shouldStart = true
			case <-miner.stopCh:
				shouldStart = false
				miner.worker.stop()
			case <-miner.exitCh:
				miner.worker.close()
				return
			}
		}
	}

@MariusVanDerWijden
Copy link
Copy Markdown
Member

BTW: I fixed the third race (the one thats on master) here: #21664

@meowsbits
Copy link
Copy Markdown
Contributor Author

I can't say I think this is really any better than as-is. It merges two for loops into one, which indeed removes a couple lines of code, but doesn't really change the essential duplicity of the original, and comes at the expense of quite non-idiomatic code structure, and another function-scoped (and mysteriously named 😉 ) bool. We would also want to continue in the ev == nil condition.

What was the reasoning behind moving from atomics to channels in the first place? If the urge to refactor the duplicity from the code as-is is strong enough, and function-scoped atomics are OK, then I would recommend moving the miningRequested and isDownloading vars to int32 and using them with atomic.Load|Store combined with the go func() { for ev := range downloaderEvents.Chan() }() pattern.

With that said, I'm OK with the code as-is since I do think it's readable, tested, and documented, and I'm not a fanatic about DRYness in the first place. I also have the relative urgency of the need for a fix in mind.

If either of you do feel strongly one way or another, I'll be happy to follow your leads.

Copy link
Copy Markdown
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think I'd slightly prefer a version which doesn't use labels, but instead has the upper for-loop use some condition which is set to false, instead of break withDownloaderState. But that's just nitpicks

Copy link
Copy Markdown
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with it too

@holiman
Copy link
Copy Markdown
Contributor

holiman commented Oct 6, 2020

Would be good if either @karalabe or @fjl gave it a thumb up or down too

@holiman holiman added this to the 1.9.23 milestone Oct 7, 2020
@fjl fjl self-assigned this Oct 13, 2020
@holiman holiman removed this from the 1.9.23 milestone Oct 13, 2020
@karalabe
Copy link
Copy Markdown
Member

Merged in #21701

@karalabe karalabe closed this Oct 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants