Conversation
Force-pushed from 9c090b1 to f35b72e.
yyforyongyu left a comment:
Really like the uniform StateMachine 🤩 I think the blockbeat from #7951 can even fit in this picture: there exists a set of universal events, such as a new block event, and we force every state machine to process it. My main question is whether we could stop generalizing at ProcessEvent, and leave the implementations of executeDaemonEvent to specific subsystems. This naturally leads to the question of whether we need this DaemonAdapters interface at all, as it doesn't seem to be common functionality shared by all subsystems.
Still need to think through, but a few ideas:
- we could make `StateMachine` an interface, and maybe add something like `BaseMachine` that has the minimal methods such as `driveMachine`.
- I like that `State` is an interface, which makes writing the tests much easier. It's just that the name is a bit confusing I guess, as it's sort of like a processor, and each state has its own processor.
My understanding of the design is: an event-driven machine that's pipelined with state processors. The machine doesn't care about the specifics of the event; instead, it's the state processor's responsibility to handle the event and instruct a new state. I think we could stop here without distinguishing internal vs external events, and apply it to a few subsystems to see its effect.
protofsm/state_machine.go (Outdated)
```go
// executeDaemonEvent executes a daemon event, which is a special type of event
// that can be emitted as part of the state transition function of the state
// machine. An error is returned if the type of event is unknown.
func (s *StateMachine[Event, Env]) executeDaemonEvent(event DaemonEvent) error {
```
feels like it's leaking the implementation details from other subsystems
I think you'll have a better idea of the interaction once the new co-op close stuff is up, but the general idea is that:
- All the state machine state transitions are pure functions
- They emit events for the executor (prob should rename this struct slightly) to apply themselves
- Something needs to be aware of the boundary between the pure state machine, and the daemon execution env it runs in
- This thing handles that role of knowing all the global I/O or daemon actions to execute itself, and potentially emit an event back into the state machine (post execution hook)
Otherwise, what do you think should be handling the I/O between the daemon and the state machine?
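The boundary described above can be sketched in Go. This is a hypothetical, simplified stand-in (the `Tx`, `Msg`, and method names here are invented, not protofsm's real signatures): a single adapter interface owns all daemon I/O, and a mock implementation records calls instead of performing them, which is what lets the state transitions stay pure and testable.

```go
package main

import "fmt"

// Hypothetical stand-ins for the daemon's types; not lnd's real types.
type Tx struct{ ID string }
type Msg struct{ Body string }

// DaemonAdapters is the boundary under discussion: the pure state
// machine emits daemon events, and a single implementation of this
// interface performs the actual I/O against the running daemon.
type DaemonAdapters interface {
	BroadcastTransaction(tx Tx) error
	SendMessage(peer string, m Msg) error
}

// mockAdapters records calls instead of doing I/O, so state transitions
// can be tested in a pure environment.
type mockAdapters struct {
	broadcasts []Tx
	sent       []Msg
}

func (m *mockAdapters) BroadcastTransaction(tx Tx) error {
	m.broadcasts = append(m.broadcasts, tx)
	return nil
}

func (m *mockAdapters) SendMessage(peer string, msg Msg) error {
	m.sent = append(m.sent, msg)
	return nil
}

func main() {
	var a DaemonAdapters = &mockAdapters{}
	a.BroadcastTransaction(Tx{ID: "txid"})
	a.SendMessage("peer1", Msg{Body: "shutdown"})
	fmt.Println("I/O executed behind the adapter boundary")
}
```

In production the same interface would be backed by a concrete lnd-wired implementation, as the linked commit suggests.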
Maybe it could be hidden behind currentState.ProcessEvent? Since it generates the transition, it might as well process it based on the new transition, like broadcasting or sending a message.
I think putting more things behind ProcessEvent would negatively impact testability. With this construction we can test the state transitions themselves in a pure environment and then wire up the execution of the generated events separately.
> Maybe it could be hidden behind currentState.ProcessEvent? Since it generates the transition, it might as well process it based on the new transition, like broadcasting or sending a message.
So the idea is that the actual state transitions never need to concern themselves with any of these details. They just emit the event, then wait for whatever new event is sent in. There's no leakage of implementation details at this StateMachine level, as we'll pass in a concrete implementation based on lnd later. Here's an idea of what that looks like: ce75ef8.
This is to be considered universal, just like the POSIX interface we all know and love today. In this case, our processes are these FSMs, and the syscalls are the ways to interact with the chain or daemon.
Why do you think this should be an interface? The goal here is to provide a generic implementation that can drive any FSM, which is defined from that starting/initial state, and all the state transition functions. If you look at the test, it takes that mock state machine, and is able to drive that with the shared semantics of: terminal states, clean up functions, pure state transitions that emit any side effects as events, etc.
The goal of those was to implement all the side effects we'd ever need in a single place. The daemon events added were just the ones I needed to implement the new co-op close state machine nearly from scratch. I think if we look at all the state machines we've written in the codebase, there are maybe ~10 daemon-level adapters that are used continuously. One that's missing right now is requesting to be notified of something confirming.
ProofOfKeags left a comment:
I did a no-nit high level review here. My biggest squint was around the SendWhen impure pseudo-predicate. Not gonna lie, I don't like it. However, I suspect that the reason you went this route is that making it pure would require hooks into the state changes of surrounding subsystems in ways that would require significant changes to the overall LND codebase before this could be inserted.
That said, there still may be no way around it. The main concern here is that the polling approach may miss the opportunities it needs to send the message out. The example here is OnCommit/OnFlush where we poll and still owe a commitment so we can't send, but then we do the commit and immediately follow up with another state change, thereby re-falsifying the SendWhen predicate before the next poll cycle.
In the case of shutdown and the coop close negotiations, this technically violates the spec. Idk what the practical consequences of that would be (they may be benign), but unless we can synchronize directly into the channel update lifecycle, we can't really be spec compliant.
On the other hand, you could make the argument that it isn't the state machine's responsibility to understand when a message should be synchronized into the message stream at all. Its job is simply to generate the response, and the caller would queue it for sending at the next possible opportunity. This is the approach I took with the coop close v1: the ChanCloser is completely unaware of how the messages are dispatched; it just knows what to send, not when or how.
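The polling hazard described above can be made concrete with a small sketch. All names here (`SendPredicate`, `queuedMsg`, `pollOnce`) are invented for illustration, not protofsm's actual API: a queued message carries an impure predicate, and the executor polls it, releasing the message only when the predicate reports true. If the predicate flips true and back to false between polls, the send window is missed, which is exactly the reviewer's concern.

```go
package main

import "fmt"

// SendPredicate is the impure "send when" check under discussion.
type SendPredicate func() bool

type queuedMsg struct {
	body     string
	sendWhen SendPredicate // nil means send immediately
}

// pollOnce walks the queue and sends whatever is currently eligible,
// returning the messages that must wait for a later poll cycle.
func pollOnce(queue []queuedMsg, send func(string)) []queuedMsg {
	var remaining []queuedMsg
	for _, m := range queue {
		if m.sendWhen == nil || m.sendWhen() {
			send(m.body)
			continue
		}
		remaining = append(remaining, m)
	}
	return remaining
}

func main() {
	flushed := false
	queue := []queuedMsg{{
		body:     "shutdown",
		sendWhen: func() bool { return flushed },
	}}

	var sent []string
	queue = pollOnce(queue, func(b string) { sent = append(sent, b) })
	fmt.Println("sent after first poll:", len(sent))

	// The channel flushes between polls; only the next poll notices.
	flushed = true
	queue = pollOnce(queue, func(b string) { sent = append(sent, b) })
	fmt.Println("sent after flush:", len(sent))
}
```

If the channel were to un-flush again before the second poll ran, the message would never go out, illustrating why synchronizing directly into the channel update lifecycle may be needed for spec compliance.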
Force-pushed from f35b72e to 1d1c138.
PTAL.
Force-pushed from e0265c1 to 057c481.
ProofOfKeags left a comment:
Main thing is I think we want to make state machines not able to "throw a disable" to another channel.
Pushed up a new set of commits with some bug fixes and some additional functionality that came in handy when starting to hook up the new RBF coop close state machine to the peer struct.
@yyforyongyu: review reminder
Crypt-iQ left a comment:
LGTM once CI addressed
ProofOfKeags left a comment:
I think this is good to go broadly speaking. There's some cleanup that needs to happen for CI and some nice simplifications using SpewLogClosure but we can get this approved today.
Force-pushed from e8c30d5 to 66815e5.
In this PR, we create a new package, `protofsm`, which is intended to
abstract away something we've done dozens of times in the daemon:
create a new event-driven protocol FSM. One example of this is the co-op
close state machine, and also the channel state machine itself.
This package picks out the common themes of:
* clear states and transitions between them
* calling out to special daemon adapters for I/O such as transaction
broadcast or sending a message to a peer
* cleaning up after state machine execution
* notifying relevant callers of updates to the state machine
The goal of this PR is that devs can now implement a state machine
based off of this primary interface:
```go
// State defines an abstract state, namely its state transition function,
// which takes as input an event and an environment, and returns a state
// transition (next state, and set of events to emit). A state can also
// be terminal or not; a terminal state causes state execution to halt.
type State[Event any, Env Environment] interface {
	// ProcessEvent takes an event and an environment, and returns a new
	// state transition. This will be iteratively called until either a
	// terminal state is reached, or no further internal events are
	// emitted.
	ProcessEvent(event Event, env Env) (*StateTransition[Event, Env], error)

	// IsTerminal returns true if this state is terminal, and false
	// otherwise.
	IsTerminal() bool
}
```
Their focus is _only_ on each state transition, rather than all
the boilerplate involved (processing new events, advancing to
completion, doing I/O, etc.).
Instead, they just make their states, then create the state machine
given the starting state and env. The only other custom component needed
is something capable of mapping wire messages or other events from the
"outside world" into the domain of the state machine.
The set of types is based on a pseudo sum type system: you declare an
interface, make the sole method private, then create other types that
implement that interface. This restricts what call sites can pass in
(values must satisfy that interface), and with some tooling, exhaustive
matching can also be enforced via a linter.
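As a rough Go sketch of that pattern (the event names here are invented for illustration, not taken from protofsm): the private marker method seals the interface so only types declared in the same package can be events, and a type switch over the interface can then be checked for exhaustiveness by a linter.

```go
package main

import "fmt"

// ProtocolEvent is a sealed pseudo sum type: the sole method is
// private, so only types in this package can satisfy it.
type ProtocolEvent interface {
	protocolEvent() // private marker method seals the type set
}

// Two hypothetical variants of the sum type.
type SendShutdown struct{}
type ShutdownReceived struct{ Peer string }

func (SendShutdown) protocolEvent()     {}
func (ShutdownReceived) protocolEvent() {}

// describe matches on the concrete variant; with exhaustive-match
// linting, the default branch becomes dead code.
func describe(ev ProtocolEvent) string {
	switch e := ev.(type) {
	case SendShutdown:
		return "sending shutdown"
	case ShutdownReceived:
		return "shutdown from " + e.Peer
	default:
		return fmt.Sprintf("unknown event %T", e)
	}
}

func main() {
	fmt.Println(describe(ShutdownReceived{Peer: "alice"}))
}
```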
The best way to get the hang of the pattern proposed here is to check out
the tests. They make a mock state machine, and then use the new executor
to drive it to completion. You'll also get a view of how the code will
actually look, with the focus being on the: input event, current state,
and output transition (can also emit events to drive itself forward).
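To make the interface above concrete, here's a toy, self-contained filling-in of it. The `Environment` stub, event, and state names are invented for illustration, and `StateTransition` is simplified to carry just the next state: a non-terminal `waiting` state transitions to a terminal `done` state on any event.

```go
package main

import "fmt"

// Environment is a stub of the environment constraint.
type Environment interface{ Name() string }

// StateTransition is simplified here to just the next state; the real
// type would also carry the set of events to emit.
type StateTransition[Event any, Env Environment] struct {
	NextState State[Event, Env]
}

// State mirrors the interface from the description above.
type State[Event any, Env Environment] interface {
	ProcessEvent(event Event, env Env) (*StateTransition[Event, Env], error)
	IsTerminal() bool
}

type env struct{}

func (env) Name() string { return "toy" }

type pingEvent struct{}

// waiting is a non-terminal state that moves to done on any event.
type waiting struct{}

// done is the terminal state.
type done struct{}

func (waiting) ProcessEvent(_ pingEvent, _ env) (*StateTransition[pingEvent, env], error) {
	return &StateTransition[pingEvent, env]{NextState: done{}}, nil
}
func (waiting) IsTerminal() bool { return false }

func (done) ProcessEvent(_ pingEvent, _ env) (*StateTransition[pingEvent, env], error) {
	return &StateTransition[pingEvent, env]{NextState: done{}}, nil
}
func (done) IsTerminal() bool { return true }

func main() {
	var s State[pingEvent, env] = waiting{}
	trans, _ := s.ProcessEvent(pingEvent{}, env{})
	fmt.Println("terminal after one event:", trans.NextState.IsTerminal())
}
```

An executor would simply loop: call `ProcessEvent`, follow `NextState`, and stop once `IsTerminal` reports true or no internal events remain.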
In this commit, we add an optional daemon event that can be specified to dispatch during init. This is useful for instances where before we start, we want to make sure we have a registered spend/conf notification before normal operation starts. We also add new unit tests to cover this, and the prior spend/conf event additions.
In this commit, we add the ability for the state machine to consume wire messages. This'll allow the creation of a new generic message router that takes the place of the current peer `readHandler` in an upcoming commit.
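The wire-message mapping can be sketched as follows. The names (`wireMsg`, `fsmEvent`, `msgMapper`) are invented here: a per-machine mapper reports whether a wire message belongs to its machine, converting it into the machine's event domain when it does, so a router can try each machine in turn.

```go
package main

import "fmt"

// wireMsg is a stand-in for a wire message from the outside world.
type wireMsg struct{ msgType string }

// fsmEvent is a stand-in for a machine-specific event type.
type fsmEvent struct{ kind string }

// msgMapper returns (event, true) when the wire message belongs to
// this state machine, and (zero value, false) otherwise so a router
// can move on to the next machine.
func msgMapper(m wireMsg) (fsmEvent, bool) {
	switch m.msgType {
	case "shutdown":
		return fsmEvent{kind: "shutdown_received"}, true
	default:
		return fsmEvent{}, false
	}
}

func main() {
	if ev, ok := msgMapper(wireMsg{msgType: "shutdown"}); ok {
		fmt.Println("routed event:", ev.kind)
	}
}
```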
This'll be used later to uniquely identify state machines for routing/dispatch purposes.
We'll use this to be able to signal to a caller that a critical error occurred during the state transition.
Adding this makes a state machine easier to unit test, as the caller can specify a custom polling interval.
In this commit, we add the SpendMapper which allows callers to create custom spent events. Before this commit, the caller would be able to have an event sent to them in the case a spend happens, but that event wouldn't have any of the relevant spend details. With this new addition, the caller can specify how to take a generic spend event, and transform it into the state machine specific spend event.
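The SpendMapper idea can be illustrated with a short sketch (the type names and fields here are guesses, not protofsm's actual signatures): the caller supplies a function that converts the generic spend details into their machine's own event type, so the delivered event carries the relevant spend information.

```go
package main

import "fmt"

// SpendDetail is a stand-in for the generic spend notification details.
type SpendDetail struct {
	SpenderTxID string
	InputIndex  uint32
}

// SpendMapper converts a generic spend into a machine-specific event.
type SpendMapper[Event any] func(*SpendDetail) Event

// closeTxConfirmed is a hypothetical machine-specific spend event that
// keeps the details the machine cares about.
type closeTxConfirmed struct{ txid string }

func main() {
	var mapper SpendMapper[closeTxConfirmed] = func(s *SpendDetail) closeTxConfirmed {
		return closeTxConfirmed{txid: s.SpenderTxID}
	}

	ev := mapper(&SpendDetail{SpenderTxID: "deadbeef", InputIndex: 0})
	fmt.Println("mapped spend event for tx:", ev.txid)
}
```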
In this commit, we update the execution logic to allow multiple internal events to be emitted. This is useful for handling potential out-of-order state transitions, as they can be cached, then emitted once the relevant pre-conditions have been met.
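The caching behavior can be sketched like this (a simplified illustration with invented names, not the executor's real code): events that arrive before their pre-condition are stashed, and once the unblocking event arrives, the transition emits all of them as internal events in one batch.

```go
package main

import "fmt"

type event struct{ name string }

// transition carries zero or more internal events to feed back into
// the machine, mirroring the multi-event change described above.
type transition struct {
	newEvents []event
}

// process caches events that arrive early and releases them all once
// the "ready" event satisfies the pre-condition.
func process(pending *[]event, ev event) transition {
	if ev.name == "ready" {
		out := append([]event{}, *pending...)
		*pending = nil
		return transition{newEvents: out}
	}
	// Pre-condition not met yet: stash the event for later.
	*pending = append(*pending, ev)
	return transition{}
}

func main() {
	var pending []event
	process(&pending, event{name: "early-1"})
	process(&pending, event{name: "early-2"})
	trans := process(&pending, event{name: "ready"})
	fmt.Println("released events:", len(trans.newEvents))
}
```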
This fixes an issue that can occur when we have concurrent calls to `Stop` while the state machine is driving forward.