Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor DBLayer to use in-memory Checkpoints #2942

Merged
merged 3 commits into from
Dec 8, 2021

Conversation

HeinrichApfelmus
Copy link
Contributor

Issue number

ADP-1169

Overview

Previous work in epic ADP-1043 introduced delta encodings, DBVars, and an embedding of the wallet state and its delta encodings into a database table. It's time to integrate these tools with the wallet code. To facilitate code review, the integration proceeds in a sequence of refactorings that do not change functionality and pass all unit tests.

In this step, we introduce a type Checkpoints a which stores all checkpoints of the wallet state in memory. The current ad-hoc cache of the latest checkpoints is replaced by a DBVar that stores these checkpoints instead.

Details

  • Keep the Checkpoints type stupid and simple for now. It will be replaced later.
  • This pull request will (temporarily) increase the memory requirements for the wallet, as the collection of checkpoints is kept in-memory. However, memory usage will go back to normal (and better) once delta encodings are integrated.

Comments

Merge PR #2841 before this one, because this pull request is based on the branch of the former.

@HeinrichApfelmus HeinrichApfelmus self-assigned this Oct 1, 2021
@HeinrichApfelmus HeinrichApfelmus marked this pull request as draft October 1, 2021 15:39
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/dbvar branch from 591dfa9 to a08ab95 Compare October 22, 2021 12:00
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/integration-1169 branch 3 times, most recently from 166705d to 7a063f5 Compare October 22, 2021 14:35
@HeinrichApfelmus HeinrichApfelmus marked this pull request as ready for review October 22, 2021 16:00
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/integration-1169 branch from 7a063f5 to 07a3cf8 Compare October 25, 2021 16:17
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/dbvar branch 3 times, most recently from f6bec2c to fa3161d Compare November 1, 2021 13:04
Base automatically changed from HeinrichApfelmus/ADP-1043/dbvar to master November 1, 2021 17:57
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/integration-1169 branch 3 times, most recently from 98e45d3 to 0d71e67 Compare November 3, 2021 13:32
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/integration-1169 branch 2 times, most recently from 4ea79b3 to 7e42a1c Compare December 1, 2021 12:52
* Keep the `Checkpoints` type stupid and simple for now. It will be replaced later.
* This replaces the cache of the latest checkpoint with a cache of all checkpoints. This will increase memory requirements for now, but this will go back to normal (and better) once delta encodings are integrated.
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/integration-1169 branch 2 times, most recently from 7beb6bf to af8b05f Compare December 1, 2021 14:12
Copy link
Contributor

@paweljakubas paweljakubas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run through logic and it seems to proceed as I would expect. If all tests pass it is giving crucial confidence that replacing SlotNo with SlotPoint and BlockHeader with ChainPoint is not doing any harm. What about db migrations? No way they are going to be affected? Maybe it would be good if @piotr-iohk plays with migrations using this branch.

I will give another look, even more detailed and style related, tomorrow (until now I was mainly occupied with how it works and why)

@HeinrichApfelmus
Copy link
Contributor Author

I run through logic and it seems to proceed as I would expect. If all tests pass it is giving crucial confidence that replacing SlotNo with SlotPoint and BlockHeader with ChainPoint is not doing any harm. What about db migrations? No way they are going to be affected? Maybe it would be good if @piotr-iohk plays with migrations using this branch.

I will give another look, even more detailed and style related, tomorrow (until now I was mainly occupied with how it works and why)

Thanks Pawel! 😊

The DB migrations should not be affected — this PR does not change the database format in any way. The database is actually a little buggy in that it stores a checkpoint for the first slot after genesis, At (SlotNo 0), but no checkpoint for genesis, Origin. This does not affect testnet or mainnet, but it has affected local clusters that start a new network, and where there is a rollback to genesis.

I don't think that it makes to fix this in the database file format right now, because a future PR will radically change the format anyway, adding incremental updates. However, I do think that it makes sense to at least use the correct representation for the in-memory type. Unit tests will pass as long as they never try to roll back to Origin rather than to At (SlotNo 0). In fact, before the change in types, the unit tests were not actually able to rollback further than At (SlotNo 0).

This commit partially addresses the issue where the DB layer historically did not distinguish between the genesis point and the block with SlotNo 0, which comes directly after genesis.

* The types of `rollbackTo` and `listCheckpoints` can now distinguish these points.
* For reasons of correctness, we need to use `Slot` in the `Checkpoints` type.
* However, the DB file format still uses slot numbers only. But as we want to revamp the format anyway, the plan is to keep it as it is for now, and remove the issue with the revamp.
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/integration-1169 branch 4 times, most recently from 32b569c to 237d8ec Compare December 6, 2021 11:47
The genesis block has no parent block, and this fact is best represented by `Nothing` instead of a magic value `hashOfNoParent`.

The magic value is still used for serialization and deserialization in the database.
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-1043/integration-1169 branch from 237d8ec to 46241d7 Compare December 6, 2021 15:18
Copy link
Contributor

@paweljakubas paweljakubas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@rvl rvl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great 👍

-> SlotNo
-> ExceptT ErrNoSuchWallet stm BlockHeader
-- ^ Drops all checkpoints and transaction data after the given slot.
-> Slot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we don't require the database to be rolled back further than a few thousand blocks, a value of Origin should never be passed for this argument of rollbackTo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this will not happen in practice on testnet and mainnet. However, the ouroboros-network library does represent points on the blockchain with an explicit origin, and a rollback to genesis can happen in principle (and does happen on local test clusters). By using a representation that is more closely aligned with ouroboros-network (i.e. Slot instead of SlotNo), the code actually becomes simpler, as we can remove conversion functions such as

    pseudoPointSlot :: ChainPoint -> SlotNo
    pseudoPointSlot ChainPointAtGenesis = W.SlotNo 0
    pseudoPointSlot (ChainPoint slot _) = slot

Comment on lines +2068 to +2069
Historical hack. The DB layer can't represent 'Origin' in the database,
instead we have mapped it to 'SlotNo 0', which is wrong.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't matter; we don't need rollback to genesis.

@HeinrichApfelmus
Copy link
Contributor Author

bors merge

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Dec 8, 2021

Build succeeded:

@iohk-bors iohk-bors bot merged commit 7fde05b into master Dec 8, 2021
@iohk-bors iohk-bors bot deleted the HeinrichApfelmus/ADP-1043/integration-1169 branch December 8, 2021 12:57
WilliamKingNoel-Bot pushed a commit that referenced this pull request Dec 8, 2021
iohk-bors bot added a commit that referenced this pull request Dec 10, 2021
3046: Reorganize modules in Cardano.Wallet.DB.* r=HeinrichApfelmus a=HeinrichApfelmus

### Issue number

ADP-1288

### Overview

Previous work in epic ADP-1043 introduced delta encodings, DBVars, and an embedding of the wallet state and its delta encodings into a database table. It's time to integrate these tools with the wallet code. To facilitate code review, the integration proceeds in a sequence of refactorings that do not change functionality and pass all unit tests.

In this step, we reorganize the modules in the `Cardano.Wallet.DB.*` hierarchy; the aim is to be able to quickly swap out the implementation of the `mkStoreWalletsCheckpoints` function by changing one import. We want to keep both the previous implementation and a subsequent new implementation around for some time.

### Details

* The migration function `migrateManually` is moved to a separate module `Cardano.Wallet.DB.Sqlite.Migration`. This has the beneficial side effect that this migration function cannot be accidentally be affected by refactoring in modules that are not imported by this module.
* The creation of the checkpoints store, `mkStoreWalletsCheckpoints`, is moved to a separate module. We can swap out its implementation by importing a different module in the future.

### Comments

Merge PR #2942 before this one, because this pull request is based on the branch of the former.


Co-authored-by: Heinrich Apfelmus <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants