Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow keys to be unpersisted #1705

Closed

Conversation

tnull
Copy link
Contributor

@tnull tnull commented Sep 8, 2022

Previously the KVStorePersister only allowed to persist a Writeables under a given key. However, we may want to be able to remove this set key after the fact. Here we therefore extend the trait with a unpersist method and implement it for FilesystemPersister.

I also added a commit rewording the FilesystemPersister docs to reflect the broader use case it fulfils by now.

Previously the `KVStorePersister` only allowed to `persist` a
`Writeables` under a given key. However, we may want to be able to
remove this set key after the fact.

Here we therefore extend the trait with a `unpersist` method and implement it
for `FilesystemPersister`.
@tnull
Copy link
Contributor Author

tnull commented Sep 8, 2022

Let me know if I should incorporate anything discussed in #1417/#1426/#1427 in this PR, e.g., a list() method.

@tnull tnull force-pushed the 2022-09-kvstorepersister-unpersist branch from 9bd88ca to f0373ee Compare September 8, 2022 13:50
Since `FilesystemPersister` is used for different types of `Writeables`,
not only channel data, we can now reword the docs to reflect its more
general use case.
@tnull tnull force-pushed the 2022-09-kvstorepersister-unpersist branch from f0373ee to 511434f Compare September 8, 2022 13:52

/// Unpersist (i.e., remove) the writeable previously persisted under the provided key.
/// Returns `true` if the key was present, and `false` otherwise.
fn unpersist(&self, key: &str) -> io::Result<bool>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add motivation for adding this and where it will or can be used?

  • Its one thing if its for space optimization (delete items no longer being used)
  • And another if we are going to be relying on this function for immediate delete of items.
    Depending on use-case, Ideally i would like it to be just former, to support some of the systems which support async delete. (And in this case we might not care much if it doesn't succeed immediately or is not strongly consistent.)

Copy link
Contributor Author

@tnull tnull Sep 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean add to motivation in comments, in the commit message, or here on GitHub?

I generally think that it would be nice to have list and unpersist functionalities, if the persister is going to be used for a wider array of objects. Moreover, the sample node and LdkLite currently just bypass the FilesystemPersister for initalization (i.e., they read the directory contents via std::fs methods), which is no option if we'd ever want to have another persister backend.

However, my concrete motivation was the initial draft of a PaymentInfoStore for LdkLite (lightningdevkit/ldk-node#13) which uses the persister, where I however didn't want to store one huge blob of data under a single key. I therefore now store each payment info separately using the payment hash as a key. It would however be nice to be able to prune that store eventually, hence the need for an unpersist method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant in PR description.(for future reference as well)

Got it, from your description our intention is space optimization. (i.e. pruning)
It would be best if we add to documentation of this api that this will be used for pruning. and hence we can utilize async deletes or eventually consistent deletes.
This also means:

  1. Logic correctness shouldn't depend on immediate delete.
  2. Even if we miss on deleting an item, worst that will happen is some wasted storage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Logic correctness shouldn't depend on immediate delete.
  2. Even if we miss on deleting an item, worst that will happen is some wasted storage.

I tend to disagree. Yes, the current motivation for this is pruning some data which is fine to be delayed, which however might not be the case for other future use cases. It would also pretty annoying to just silently fail to unpersist for some reason. But, more importantly, I think it would be very confusing to have an interface in which different but very similarly named methods behave differently or provide very different consistency guarantees. I'd therefore argue that while data removal is almost always not as critical as persistence, we should try to align the guarantees two methods as far as possible?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way is to have a flag, which allows for both. (if you think there will be future requirement for it and feel strongly about it) (imo we should keep it for pruning purpose to start with or by Default)
Not all systems provide same guarantees while deleting and we should think beyond filesystem-persistence.

And in LN and LDK usecase, persist and delete can live with different consistency requirements. In all case we need persist to be strongly consistent, whereas its not the same for delete.

The concern for "methods behave differently" is valid but that doesn't mean we should make api or consistency decisions based on that.
In which case do we expect immediate delete?
We already have big overhaul ongoing for async persist, and if we can be conservative while adding delete functionality then why not ?

Copy link
Collaborator

@TheBlueMatt TheBlueMatt Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, my concrete motivation was the initial draft of a PaymentInfoStore for LdkLite (lightningdevkit/ldk-node#13) which uses the persister, where I however didn't want to store one huge blob of data under a single key. I therefore now store each payment info separately using the payment hash as a key. It would however be nice to be able to prune that store eventually, hence the need for an unpersist method.

I don't think a separate key per payment is the right approach here, either? That is a lot of keys, and with very small values. Generally many filesystems are really bad at folders with lots of keys, as many operations expect to load the full set of keys in a folder at a time, so I don't think we can do that in general.

I agree that the motivation here needs to be scoped a bit better. Concretely, we want to move towards removing ChannelMonitors in the future, but I'm not actually sure that we really ever want to fully remove a monitor - we just want to "archive" them. I worry if we have a bug we may need a monitor to recover and claim some funds in the future, so would prefer we just archive and not delete.

Copy link
Contributor Author

@tnull tnull Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a separate key per payment is the right approach here, either? That is a lot of keys, and with very small values. Generally many filesystems are really bad at folders with lots of keys, as many operations expect to load the full set of keys in a folder at a time, so I don't think we can do that in general.

Yes, but I think the current PaymentInfoStore draft should really be a starting point, since we want a database backend for these kinds of data anyways. So regarding that concrete use case I'd argue we still want to add unpersist to the trait, but implement a KVStorePersister with a different backend in LDKLite (probably using the sled database at least initially, since it's already in use for BDK persistence).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but then presumably the unpersist function doesn't belong in the upstream trait but rather in your local trait for database logic, as its not reasonable to expect all implementors to be used for that functionality?

Copy link
Contributor Author

@tnull tnull Sep 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, but I don't see why we then have an upstream trait at all, if it doesn't capture functionalities that ~all implementations would need? Even more so than for unpersist this is true for list and get functionalities, which literally every user/implementation will need. Take for example ldk-sample which of course also needs these functionalities for initialization, but solves the problem by simply bypassing the trait and even FilesystemPersister by just going through std::fs and reading the data stored at the specific hardcoded locations. However, this of course doesn't work for all backend implementations, so while the trait is supposed to enable different backends, it currently just works if we assume storing data in files.

In particular if we want to make the persister configurable for the user, e.g., in LDKLite, the KVStorePersister trait needs to offer a way to list and retrieve the data stored for a specific key. Of course this could be solved by yet another augmenting downstream trait specific to LDKLite, but this is not the idea of having a general KVStorePersister trait?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, but I don't see why we then have an upstream trait at all, if it doesn't capture functionalities that ~all implementations would need?

The upstream trait should capture the things that upstream needs. If users want to add methods, they should totally do that, but its not of upstream's concern?

Even more so than for unpersist this is true for list and get functionalities, which literally every user/implementation will need.

Yes, I think we should list and get to the upstream trait, and then use that to implement initialization methods upstream, as utilities for the user. Thus using those methods :)

@tnull
Copy link
Contributor Author

tnull commented Sep 15, 2022

Talked about this offline, closing this for now, will open a follow-up for get/list and implement a Unpersist trait downstream.

@tnull tnull closed this Sep 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants