Aggregate rehydration (retrieving entire stream history) #280
Here's the basic interface I'm trying to implement using Liftbridge:

```go
// EventStore persists streams of aggregate events.
type EventStore interface {
	GetStream(ctx context.Context, id string) (Stream, error)
}

// Stream contains all events published by an aggregate instance.
type Stream interface {
	Events() []proto.Message
	Publish(ctx context.Context, event proto.Message) error
}
```

The obvious problem comes when streams get too big, but that should be solvable using snapshots and then only fetching events since the latest snapshot's offset.
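To illustrate that snapshot idea, here's a minimal sketch. It assumes a hypothetical `SnapshotStore` plus an offset-aware read method (`EventsSince`) that the `Stream` interface above doesn't yet have; `Aggregate` is likewise an invented name:

```go
// Hypothetical types for the sketch; none of these are part of the
// interfaces proposed above.
type Aggregate interface {
	Apply(event proto.Message) // fold one event into the aggregate's state
}

// Snapshot pairs aggregate state with the offset of the last event applied.
type Snapshot struct {
	Offset int64
	State  Aggregate
}

// SnapshotStore returns the most recent snapshot for an aggregate instance.
type SnapshotStore interface {
	Latest(ctx context.Context, id string) (*Snapshot, error)
}

// OffsetStream extends Stream with a read starting after a given offset.
type OffsetStream interface {
	Stream
	EventsSince(ctx context.Context, offset int64) ([]proto.Message, error)
}

// rehydrateFromSnapshot rebuilds state from the latest snapshot, replaying
// only the events recorded after the snapshot's offset.
func rehydrateFromSnapshot(ctx context.Context, snaps SnapshotStore, stream OffsetStream, id string) (Aggregate, error) {
	snap, err := snaps.Latest(ctx, id)
	if err != nil {
		return nil, err
	}
	events, err := stream.EventsSince(ctx, snap.Offset)
	if err != nil {
		return nil, err
	}
	state := snap.State
	for _, ev := range events {
		state.Apply(ev)
	}
	return state, nil
}
```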
If I understand your point correctly, it is a dedicated RPC to subscribe until a specific offset you want? E.g.

```go
Subscribe(req *client.SubscribeRequest, out client.API_SubscribeServer) error
```

And for the request:

```proto
// SubscribeRequest is sent to subscribe to a stream partition.
message SubscribeRequest {
    int64 endOffset = 4 [jstype=JS_STRING]; // Offset to end consuming from
}
```

And what would the value of the `endOffset` be? Because by saying "all" of a stream, we mean to the end of the stream? Do I get your point correctly?
Yeah, ideally I'd be able to set a "start" and an "end" offset, where for each of those values I might just want to say "the earliest" and "the latest" rather than specifying an actual offset. This may just be possible using the existing Subscribe rpc by adding some new SubscribeRequest fields:

```proto
// SubscribeRequest is sent to subscribe to a stream partition.
message SubscribeRequest {
    // existing fields 1 - 7...
    StopPosition stopPosition = 8;               // Where to stop consuming
    int64 stopOffset = 9 [jstype=JS_STRING];     // Offset to stop consuming at
    int64 stopTimestamp = 10 [jstype=JS_STRING]; // Timestamp to stop consuming at
}

// StopPosition determines the stop-position type on a subscription.
enum StopPosition {
    OFFSET = 0;    // Stop at a specified offset
    LATEST = 1;    // Stop at the newest message
    TIMESTAMP = 2; // Stop at a specified timestamp
}
```

Then in my code I could do something like this:

```go
var messages []*lift.Message
wg := sync.WaitGroup{}
wg.Add(1)
client.Subscribe(ctx, "myStream", func(msg *lift.Message, err error) {
	if err != nil {
		if err == lift.ErrStopPositionReached {
			wg.Done()
			return
		}
		return
	}
	messages = append(messages, msg)
}, lift.StartAtEarliestReceived(), lift.StopAtLatestReceived())

// wait until end position reached
wg.Wait()

process(messages)
```
How should the `LATEST` stop position be determined? I'm no expert in Event Sourcing, but what you describe looks more like a "batch" than a stream? Are you doing that just in case of failure recovery, or on first-time bootstrap of the system? Would it make sense to publish one aggregated message upon every received event, even in the replay (rehydration)?
This would be the latest offset of the partition at the moment the subscribe request is received, i.e. the same value as if you were to have called FetchPartitionMetadata at that instant.

Just so we're on the same page with Event Sourcing: how I'm imagining this working in Liftbridge is that each aggregate instance within an application would have its own Liftbridge stream that it publishes events to. When a command is received for a particular aggregate instance, the application must first "rehydrate" that aggregate before executing the command. Rehydration involves fetching all events from the instance's stream and replaying them in-order until the current state is reached. Once the aggregate is rehydrated to the current state, the newly received command can be executed, and the subsequent event published back onto the instance's stream. With the command executed and event published, the aggregate instance can be discarded from memory. This would result in a great number of small streams that are published to infrequently.
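A minimal sketch of that command flow, reusing the EventStore and Stream interfaces from the first comment; `Command`, `Aggregate` (with its `Apply` and `Handle` methods), and `newAggregate` are hypothetical names for illustration:

```go
// Hypothetical command and aggregate types for the sketch.
type Command interface{}

type Aggregate interface {
	Apply(event proto.Message)                 // fold a past event into state
	Handle(cmd Command) (proto.Message, error) // execute a command, emitting a new event
}

func handleCommand(ctx context.Context, store EventStore, newAggregate func(id string) Aggregate, id string, cmd Command) error {
	stream, err := store.GetStream(ctx, id)
	if err != nil {
		return err
	}
	// Rehydrate: replay every past event, in order, to rebuild current state.
	agg := newAggregate(id)
	for _, ev := range stream.Events() {
		agg.Apply(ev)
	}
	// Execute the command against the current state, producing a new event.
	event, err := agg.Handle(cmd)
	if err != nil {
		return err
	}
	// Persist the new event; the in-memory aggregate can then be discarded.
	return stream.Publish(ctx, event)
}
```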
I think this can be done using a "stop offset" on subscribe as you indicate. In fact, this is something I've considered in the past to help simplify consumer logic but have just never got to it. I like the idea and symmetry of having a stop position to mirror the existing start position.
If you're both okay with the proposed proto above, then I can open a draft PR to liftbridge-api and then look at implementing this in the server and clients.
I'm good with the proposed proto changes. The only piece I'm not sure about is the behavior of `LATEST`: whether it should mean the partition's high watermark or the newest offset in the log.
After thinking more about it, I'm hesitant to expose both as options to the user mainly because it adds a lot of cognitive load to the API, but I'm not totally sure what the "right" choice is. Consuming up to the high watermark would guarantee the consumer does not wait for messages but it might miss yet-to-be-committed messages. Consuming up to the latest offset might result in the consumer waiting for messages (which could be prolonged in pathological cases) but would ensure the consumer reads all messages in the log at the time of subscribe. That said, because new messages may be getting added to the log continually, I'm not sure if this matters. However, if the idea is to use #54 to publish new messages, then the latter option might be the correct choice after all. Does that make sense @alexrudd?
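To make the two candidate semantics concrete, here's a small illustration; `HighWatermark` and `NewestOffset` are assumed accessor names on the partition metadata, not confirmed API:

```go
// Suppose the partition log holds offsets 0-9, but replication has only
// completed up to offset 7 at the moment of subscribing.
meta, err := client.FetchPartitionMetadata(ctx, "myStream", 0)
if err != nil {
	return err
}
committed := meta.HighWatermark() // 7: last committed offset (assumed accessor)
newest := meta.NewestOffset()     // 9: last offset written to the log (assumed accessor)

// LATEST == committed (7): the subscriber never waits, but offsets 8-9 are
// missed even though they may commit moments later.
// LATEST == newest (9): the subscriber reads everything in the log at
// subscribe time, but must wait for 8 and 9 to commit before finishing.
```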
Yeah, I understand your concerns. I'd say that if the stream is being used as the source of truth for what happens in a system (as is the case with event sourcing), then an event hasn't truly happened until it's been committed to the log. In which case I'd lean towards the high watermark, though actually I think it would be fine either way.
I guess that the first logical step would be to add support for the stop offset. I agree with the above opinion; exposing both the high watermark and the newest offset as user-facing options would add cognitive load to the API.
@alexrudd I'm not sure I understand what you're asking here. Are you referring to message acks?
I think I'm leaning towards this for the same reason @LaPetiteSouris said above.
Have opened a small PR for this: #282
I probably just need to read the docs, but my assumption is that a publish to Liftbridge results in both a publish to NATS and a commit to the Liftbridge log. My question was whether the publish to NATS happens before or after the commit to the log.
Liftbridge streams actually consume from NATS subjects. The flow is basically this:

1. The client publishes to Liftbridge, which publishes the message to the stream's NATS subject.
2. The Liftbridge stream, subscribed to that subject, receives the message and appends it to its log.
3. The message is replicated to followers and committed.

Alternatively, you can also publish directly to the NATS subject (this allows Liftbridge to record any plain old NATS subject):

1. The client publishes straight to the NATS subject.
2. The Liftbridge stream attached to that subject receives the message and appends it to its log as above.

Messages are committed once replication completes. Acks are sent via NATS and proxied by Liftbridge if using the publish APIs. Will take a look at the PR in a bit.
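As a sketch, the two publish paths might look like this with the Go clients (assuming go-liftbridge v2 and nats.go, and default local ports):

```go
client, err := lift.Connect([]string{"localhost:9292"})
if err != nil {
	panic(err)
}
defer client.Close()

// Path 1: publish through the Liftbridge API. The message is published to
// the stream's NATS subject, consumed into the stream's log, and an ack is
// returned once it is committed (per the ack policy).
if _, err := client.Publish(context.Background(), "myStream", []byte("event"),
	lift.AckPolicyAll()); err != nil {
	panic(err)
}

// Path 2: publish directly to the NATS subject. The Liftbridge stream
// attached to that subject still records the message, but no Liftbridge
// ack is proxied back.
nc, err := nats.Connect(nats.DefaultURL)
if err != nil {
	panic(err)
}
defer nc.Close()
if err := nc.Publish("mySubject", []byte("event")); err != nil {
	panic(err)
}
```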
Was just thinking about this with regard to optimistic concurrency (#54). I guess this means that messages would be published to NATS regardless of whether they pass an optimistic concurrency check?
It is still under discussion, but very likely that would be the case; as of now it seems we don't have many choices. For reference, see liftbridge-io/liftbridge-api#46.
That is right. The optimistic concurrency check would apply to the Liftbridge log, not the NATS subject. If you wanted that check to apply to a NATS subject, you could put a Liftbridge stream in front of the NATS subject, so that messages are checked and committed to the stream before reaching the subject.

I suppose an interesting point for discussion is whether it would be useful to have the ability for Liftbridge to "forward" committed stream messages to a separate NATS subject. I have not put any thought into that though.
This is an interesting idea. I'll keep an eye on the OCC work as that's pretty exciting. Now that #286 is merged, I'm happy for this to be closed.
Hey,

I'm experimenting with event sourcing and using Liftbridge as an event store. For this I need to be able to retrieve all events for a particular aggregate instance, and then replay those events to get the most recent aggregate state. Once the aggregate is "rehydrated", it's ready to run a command, and the resulting event can be published back to the stream.

This is the same use case as mentioned here: #54 (comment)

I've gotten this working thanks to the recent fetch partition metadata rpc, but it seems more difficult than it ought to be. The calls I'm making to fetch all events that may exist for a stream (which could be zero) are:

1. CreateStream - ensure the stream actually exists, continuing on ErrStreamExists
2. FetchPartitionMetadata - so I have access to NewestOffset and can then tell when I've reached the end of the stream
3. Subscribe - reading messages from the stream until I see a message with an offset that matches NewestOffset

(This doesn't account for events that might arrive between the FetchPartitionMetadata call and the Subscribe call, but that would be solved by the proposed optimistic concurrency support.)

I'm opening this issue to check if what I'm doing here makes sense, and to see if there's support for adding a dedicated RPC to synchronously retrieve all or a chunk of a stream?
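For reference, here's roughly what that three-call sequence looks like with the Go client. The PartitionInfo accessor name `NewestOffset` is taken from this thread; the exact signatures and the empty-partition sentinel value are assumptions worth checking against the client docs:

```go
func fetchAllEvents(ctx context.Context, client lift.Client, stream string) ([]*lift.Message, error) {
	// 1. Ensure the stream exists, continuing on ErrStreamExists.
	if err := client.CreateStream(ctx, stream, stream); err != nil && err != lift.ErrStreamExists {
		return nil, err
	}

	// 2. Fetch partition metadata so we know the newest offset (assumed to
	// be -1 for an empty partition).
	meta, err := client.FetchPartitionMetadata(ctx, stream, 0)
	if err != nil {
		return nil, err
	}
	newest := meta.NewestOffset()
	if newest < 0 {
		return nil, nil // empty stream, nothing to replay
	}

	// 3. Subscribe from the beginning and read until the newest offset.
	var messages []*lift.Message
	done := make(chan error, 1)
	sctx, cancel := context.WithCancel(ctx)
	defer cancel()
	err = client.Subscribe(sctx, stream, func(msg *lift.Message, err error) {
		if err != nil {
			done <- err
			return
		}
		messages = append(messages, msg)
		if msg.Offset() >= newest {
			done <- nil // reached the end of the stream as of subscribe time
		}
	}, lift.StartAtEarliestReceived())
	if err != nil {
		return nil, err
	}
	if err := <-done; err != nil {
		return nil, err
	}
	return messages, nil
}
```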