Skip to content

consensus/ethash: better block submission pow verification#19899

Closed
de1acr0ix wants to merge 2 commits intoethereum:masterfrom
btccom:light-pow-check
Closed

consensus/ethash: better block submission pow verification#19899
de1acr0ix wants to merge 2 commits intoethereum:masterfrom
btccom:light-pow-check

Conversation

@de1acr0ix
Copy link
Copy Markdown
Contributor

@de1acr0ix de1acr0ix commented Jul 30, 2019

The block submission RPC in current geth takes some non-trivial time (>1s).

  • It uses the full DAG to verify PoW solution.
  • The full DAG is actually mmaped. The random access of DAG causes swap I/O operations.

This PR addresses the issue by making the following changes.

  • Uses light version of PoW verification for block submission.
  • Disable mmap for PoW verification cache/dataset if we are not storing them on disk.

In worst case it is using mmaped light cache to verify block submission, which is way smaller and the speed would be faster than full DAG verification. And it is still not bad comparing full DAG if no mmap is used. A modern machine can do a light verification within 2ms and it is trivial for block submission, which occurs only every 10 to 15 seconds.

@de1acr0ix de1acr0ix requested a review from karalabe as a code owner July 30, 2019 02:52
@karalabe
Copy link
Copy Markdown
Member

If you are mining, we assume that you have enough memory to allocate the entire full dag. The reason we're using the full one is because it is significantly faster (1-2 orders of magnitude) once it's loaded into memory. If your machine is swapping, it means you don't have enough memory for the DAG, so you'll have bigger problems down the line.

@karalabe
Copy link
Copy Markdown
Member

@ppratscher Did you notice any similar issues on your pool by any chance? AFAIK this was a welcome change for your pool.

@de1acr0ix
Copy link
Copy Markdown
Contributor Author

de1acr0ix commented Jul 30, 2019

If you are mining, we assume that you have enough memory to allocate the entire full dag. The reason we're using the full one is because it is significantly faster (1-2 orders of magnitude) once it's loaded into memory. If your machine is swapping, it means you don't have enough memory for the DAG, so you'll have bigger problems down the line.

Indeed it is significantly faster than light verification. But the problem is that mmap is OS controlled and mmap the whole file does not mean it will get loaded into process memory space immediately. We have some 32G RAM nodes and it still costs 1 to 2 seconds to submit a block. You need to at least use the first of my commits to ensure that the DAG is in memory. Right now using light verification costs less than 2ms on our nodes, and it is trivial for the whole block submission (around 50ms after applying the changes).

}
// If we don't store anything on disk, generate and return.
if dir == "" {
if dir == "" || limit <= 0 {
Copy link
Copy Markdown
Contributor

@CoreyLin CoreyLin Aug 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic of func (c *cache) generate(dir string, limit int, test bool) has already guaranteed that cache is generated before use. This method would firstly load the cache file and memory map it. If the cache file doesn't exist, it would create a new one and memory map it. So it seems that the mechanism here has no problem.

The parameter limit here indicates that how many cache files are allowed to exist on disk. Even if limit <= 0 , it doesn't directly indicate that no cache file exists. At least using limit <= 0 here is not rigorous logically.

Copy link
Copy Markdown
Contributor Author

@de1acr0ix de1acr0ix Aug 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention here is to reuse the limit flag so that there is a way to disable mmap, which is impossible for existing logic. Of course we can introduce a new flag but I believe implicitly disabling mmap when limit is not a positive number is a reasonable choice.

@fjl fjl changed the title consensus: better block submission pow verification consensus/ethash: better block submission pow verification Oct 28, 2019
@fjl
Copy link
Copy Markdown
Contributor

fjl commented Oct 28, 2019

We need to make a decision here. If using the DAG isn't faster in practice, we might want to switch back to using the cache again.

@de1acr0ix
Copy link
Copy Markdown
Contributor Author

We need to make a decision here. If using the DAG isn't faster in practice, we might want to switch back to using the cache again.

Technically speaking the major problem here is that mmap() does not guarantee that whole DAG is going to be in the memory.

@karalabe
Copy link
Copy Markdown
Member

karalabe commented Nov 5, 2019

Yes, that's a fair point. Though I think we would need to be a lot more explicit here so that it's not some black magic flag combo. I guess the valuable observation here is that mining with a DAG (or remote PoW verification) should force keeping the DAG in memory, otherwise it may be worse than a cache.

Perhaps adding a --ethash.dagnommap flag would make it clear and obvious that you're willing to spend your RAM on keeping huge DAGs in memory, and it should be obvious to miners/pools that you want this.

Would such a CLI change make sense to you?

@de1acr0ix
Copy link
Copy Markdown
Contributor Author

Yes, that's a fair point. Though I think we would need to be a lot more explicit here so that it's not some black magic flag combo. I guess the valuable observation here is that mining with a DAG (or remote PoW verification) should force keeping the DAG in memory, otherwise it may be worse than a cache.

Perhaps adding a --ethash.dagnommap flag would make it clear and obvious that you're willing to spend your RAM on keeping huge DAGs in memory, and it should be obvious to miners/pools that you want this.

Would such a CLI change make sense to you?

Sorry @karalabe I left blockchain industry and hadn't look into this PR for a while. Your proposal makes sense to me. And adding another parameter to select whether to use full DAG may also be considered. According to our profiling, even not using full DAG, PoW verification is not a major factor while submitting a block. I'll create a new PR during vacation as I am not under https://github.com/btccom any more and cannot update this PR now.

@de1acr0ix
Copy link
Copy Markdown
Contributor Author

@karalabe I created #20484 to add the command line options.

@de1acr0ix
Copy link
Copy Markdown
Contributor Author

#20484 is merged and we can close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants