
[DNM] Fix verifier cache keys by hashing full Output #3613

Closed
wants to merge 7 commits

Conversation

@trevyn (Contributor) commented Mar 22, 2021

Per #3606 and discussion with @antiochp, this includes:

Outstanding questions:

We get most of the benefit from the verifier cache internally in the mempool. We do a lot of tx verification in various mempool operations; for example, after aggregating txs together we verify the aggregated tx. Each verification leverages the rangeproof verifier cache, as we do not want to verify the rangeproofs repeatedly each time.

But we may want to consider disabling the cache when we actually verify a new block.
Blocks are consensus critical, transactions are not.
We would need to take care to handle edge cases where a potentially invalid tx entered the mempool.
This makes me hesitate a little.
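To illustrate the idea in the PR title, here is a minimal sketch of a cache keyed on a hash of the *full* Output (commitment plus rangeproof), so that a different proof over the same commitment misses the cache instead of being treated as already verified. This is not Grin's actual implementation; the `Output` fields and all names here are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for a full output: commitment plus rangeproof bytes.
#[derive(Hash)]
struct Output {
    commitment: [u8; 33],
    rangeproof: Vec<u8>,
}

// Cache keyed by a hash of the entire Output, so a cached entry for a valid
// rangeproof cannot be confused with a different proof on the same commitment.
struct VerifierCache {
    verified: HashSet<u64>,
}

impl VerifierCache {
    fn new() -> Self {
        VerifierCache { verified: HashSet::new() }
    }

    // Hash the whole Output (commitment AND rangeproof) to form the cache key.
    fn key(output: &Output) -> u64 {
        let mut h = DefaultHasher::new();
        output.hash(&mut h);
        h.finish()
    }

    // Return only the outputs not yet seen; the caller runs the expensive
    // rangeproof verification on these, then records them via mark_verified.
    fn filter_unverified<'a>(&self, outputs: &'a [Output]) -> Vec<&'a Output> {
        outputs
            .iter()
            .filter(|&o| !self.verified.contains(&Self::key(o)))
            .collect()
    }

    fn mark_verified(&mut self, output: &Output) {
        self.verified.insert(Self::key(output));
    }
}
```

Keying on the commitment alone would let an attacker pair a previously verified commitment with an invalid rangeproof and still hit the cache, which is the bug class being fixed here.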

@trevyn (Contributor, Author) commented Mar 23, 2021

Thanks @DavidBurkett ! All comments addressed.

@antiochp antiochp self-requested a review March 23, 2021 10:55
@antiochp (Member) commented Mar 23, 2021

It would be good to add a test here that exercises the full block processing pipeline resulting in an InvalidBlockProof.
Something along the lines of the test in nrd_validation_rules.rs

Ideally we can reproduce the inflation bug running the test with 5.0.1 codebase.
Then confirm the inflation bug is no longer present in 5.0.4.
And finally that reintroducing the verifier cache logic does not also reintroduce the inflation bug.

We will have to jump through some hoops to build an "invalid" block (presumably we need to swap out a valid rangeproof for an invalid one).
This is worth spending some time on, as we definitely have less test coverage (and less ability to add it) around "invalid blocks" like this.

tl;dr This was probably the most serious bug we have discovered on mainnet. It's worth the additional effort, and we should take the opportunity to improve our testing of the block processing pipeline, specifically around our ability to set up tests with invalid data.

@trevyn (Contributor, Author) commented Mar 24, 2021

> It would be good to add a test here that exercises the full block processing pipeline resulting in an InvalidBlockProof.
> Something along the lines of the test in nrd_validation_rules.rs

Got it. I looked at the relevant code and I think I'll be able to make progress on this. Will ping when ready for a review, or with any questions that arise. Thanks!

@antiochp (Member)

We should pull the test from #3621 into this PR and consolidate the work.
@snape479 @trevyn

@antiochp (Member) commented Mar 31, 2021

The more I look at the "verifier cache" the less I like our current implementation.
I'm leaning toward simply getting rid of it entirely.
There's a lot of cloning going on internally, as it was written with a poor understanding of iterators and allocations in Rust (I'm allowed to say this as I wrote the code...).

Note: the test coverage implemented by @trevyn is still very much useful and I am not proposing we discard this.

I'm going to take a pass at what the code would look like with the verifier cache removed entirely.
Not saying we should definitely remove it at this point - let's see what it would involve before making that decision.

If we subsequently want to reintroduce some level of caching we should do this at the txpool level and keep it isolated there, rather than threading this global "cache" state all over the codebase.
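A rough sketch of what that isolation might look like: the pool owns its cache privately, instead of a shared cache handle being threaded through the codebase. All names here are hypothetical, not Grin's actual API; in this design, block validation would bypass the pool entirely and verify everything from scratch.

```rust
use std::collections::HashSet;

// Hypothetical txpool that owns its rangeproof cache privately, so no
// other component can observe or depend on cached verification state.
struct TxPool {
    // Keys are hashes of full outputs whose rangeproofs were already verified.
    rangeproof_cache: HashSet<u64>,
}

impl TxPool {
    fn new() -> Self {
        TxPool { rangeproof_cache: HashSet::new() }
    }

    // Consult the private cache, "verify" only the unseen outputs, and
    // record them. Returns how many outputs needed fresh verification.
    fn verify_rangeproofs(&mut self, output_hashes: &[u64]) -> usize {
        let mut verified_now = 0;
        for h in output_hashes {
            // insert() returns true only for entries not already cached.
            if self.rangeproof_cache.insert(*h) {
                // The expensive rangeproof verification would happen here.
                verified_now += 1;
            }
        }
        verified_now
    }
}
```

Keeping the cache behind the pool's API means consensus-critical block validation never touches it, which addresses the earlier concern about trusting cached results in blocks.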

@trevyn (Contributor, Author) commented Mar 31, 2021

I definitely support removing the cache. Do we have a sense of the actual, measured performance impact?

@antiochp (Member)

> I definitely support removing the cache. Do we have a sense of the actual, measured performance impact?

For a handful of txs per block, like we have currently, and a txpool handling those few txs, the performance impact is going to be close to zero.
The verifier cache was originally put in place to handle future increased tx volume.

@antiochp (Member)

PR to remove the verifier cache - #3628

@trevyn (Contributor, Author) commented Apr 1, 2021

> PR to remove the verifier cache - #3628

👍🏼. Will sort out bringing the tests over once that is merged.

@trevyn (Contributor, Author) commented Apr 2, 2021

Closing in favor of #3628.

The only new test in this PR required direct access to the verifier cache, which no longer exists. Tests from #3621 should still work, I'll look at porting them over.

@trevyn closed this Apr 2, 2021
3 participants