Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkpoint room state #232

Closed
kegsay opened this issue Jul 27, 2023 · 3 comments · Fixed by #363
Closed

Checkpoint room state #232

kegsay opened this issue Jul 27, 2023 · 3 comments · Fixed by #363
Labels
bug Something isn't working poller

Comments

@kegsay
Copy link
Member

kegsay commented Jul 27, 2023

Users can join and leave rooms. If all users who are using the proxy leave a room, then Stuff happens and someone rejoins, the poller will be resent the entire room state from fresh. We currently do not use this correctly, and instead treat it as a "gappy sync", meaning we take what we previously had as the room state, and "roll forward" using what we are told in the state block.

We shouldn't do that, as it means that we can't handle state resets correctly (i.e we never remove state by rolling forwards), and if we calculate the wrong state in the past (not uncommon), we cannot self-heal by using the latest authoritative answer from upstream.

To do this is reasonably easy. All we need to do is, in Initialise:

  • is there an m.room.create even in the state block?
    • No: gappy sync, roll forward.
    • Yes: this sync contains everything, do a complete snapshot "rolling forwards" from the empty set [].
@DMRobertson
Copy link
Contributor

DMRobertson commented Sep 20, 2023

All we need to do is

Plus: after making the new snapshot, emit a cache invalidation payload (#294).

@DMRobertson
Copy link
Contributor

I think what we need Initialise to do is then:

	// 0. If the state block is empty, bail.
	//
	// 1. Capture the current snapshot ID, possibly zero.
	//
	// 2. Determine the kind of state block. Is it:
	//    a. A complete snapshot of the room, containing m.room.create?
	//    b. B "gappy sync" state delta, not containing m.room.create?
	//
	// 3. Does this alter the current state of the room from the proxy's POV?
	//    a. Complete snapshot: state changes iff current snapshot's contents != state block.
	//       (What if this comes from a slow poller? Ahh, it'll catch up with later events and self-heal via Accumulate.)
	//    b. Gappy sync: state changes iff some state block event is not in the current snapshot.
	//    If the state hasn't altered, bail.
	//
	// 4. Create new snapshot.
	//    a. Complete snapshot: trust the state block verbatim.
	//    b. Gappy sync: build a map from (type, state_key) to nids for the current snapshot. Overwrite its entries
	//       with the new things in `state`. (There must be similar logic already in Accumulate for this?)
	//    Store the snapshot. Mark the room's current state as being this snapshot.
	//
	// 5. If the starting snapshot ID was not zero, emit a cache invalidation payload.

@DMRobertson
Copy link
Contributor

DMRobertson commented Sep 20, 2023

After discussion:

	// 0. If the state block is empty, bail.
	//
	// 1. Capture the current snapshot ID, possibly zero. If it is zero, ensure that the 
    //     state block contains a `create event`.
    // 
    // 2. Fetch the current state of the room, as a map from (type, state_key) to event.
    //      If there is no existing state snapshot, this map is the empty map.
	//
	// 3. Does the state block alter the current state of the room from the proxy's POV?
	//     This is the case iff some event in the state block is unknown to the proxy.
    //     (We choose to ignore the possibility that a poller is slow and gives us e.g. a subset of current room state.)
	//     If the state hasn't altered, bail.
	//
	// 4. Create new snapshot.
	//     Update the map from (2) with the events in  in `state`.  (There must be similar logic already in Accumulate for this?)
	//     Store the snapshot. Mark the room's current state as being this snapshot.
	//
	// 5. If the starting snapshot ID was not zero, emit a cache invalidation payload.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working poller
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants