Don't load events when there's a gap between known events #300

Merged
merged 22 commits into from
Sep 20, 2023

Conversation

DMRobertson
Contributor

@DMRobertson DMRobertson commented Sep 13, 2023

Fixes #283
Part of #294.
Is going to conflict with #296.

Recommend commitwise review. I was going to write an E2E test, but it's fiddly to get a gappy sync deterministically, so I opted for an integration test.

state/event_table.go (outdated, resolved)
@DMRobertson DMRobertson force-pushed the dmr/invalidate-timelines branch 4 times, most recently from 1a654bf to f6783af September 13, 2023 18:16
@DMRobertson DMRobertson changed the title WIP: don't load events when there's a gap between known events WIP: Don't load events when there's a gap between known events Sep 13, 2023
@DMRobertson DMRobertson marked this pull request as ready for review September 13, 2023 18:24
Member

@kegsay kegsay left a comment

Overall methodology looks great.

state/event_table.go (outdated, resolved)
internal/types.go (outdated, resolved)
// filled in by other pollers.
//
// A: E1 persisted, E2 omitted and unknown, E3 unknown (missing previous)
// B: E1 persisted, E2 omitted and known, E3 unknown (not missing previous)
Member

Terminology is too confusing. Hard to grok what:

  • omitted
  • unknown
  • missing previous

are all supposed to mean. The proxy has either seen the event or not, why do we have 3^2 states?

Contributor Author

There are two edge cases I have in mind here. Starting with the simpler of the two:

Case B. Suppose Alice's poller sees E1, E2, and E3 all in the same timeline. All is well; nothing is MissingPrevious. Then suppose Bob's poller sees E1 in a first sync, followed by a limited (gappy) sync beginning with E3. In this situation:

  • The Accumulate logic doesn't know that E2 directly precedes E3. (For all it knows, there could be an E2.5 between them.) Therefore it has to consider E3 as MissingPrevious.
  • However, the proxy already knows that E2 comes before E3. Therefore E3 should not be marked as MissingPrevious in the database. (This should be fine because the event upsert logic says ON CONFLICT DO NOTHING.)

Case A. Suppose that Alice's poller sees E1 and E2 in the same timeline. Neither is MissingPrevious. Suppose Bob's poller sees the same two responses as before: E1 in a first sync, followed by a limited (gappy) sync beginning with E3. This time:

  • The Accumulate logic still doesn't know that E2 directly precedes E3, and has to consider E3 as MissingPrevious.
  • Nor does the proxy already know that E2 comes before E3. Therefore E3 should be marked as MissingPrevious in the database.

A follow-up: if some future poller (for Chris, say) saw E2 and E3 adjacent in a timeline, it could upsert the row for E3 to mark it as not MissingPrevious. However, I'm not sure that's worth the effort.
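
To make the two cases concrete, here is a minimal sketch, not the proxy's actual code. It assumes an in-memory map standing in for the events table and an insert that mimics ON CONFLICT DO NOTHING. The names storedEvent, eventStore, insertIfAbsent and accumulate are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// storedEvent is a hypothetical stand-in for one row of the events table.
type storedEvent struct {
	ID              string
	MissingPrevious bool
}

// eventStore is a hypothetical in-memory stand-in for the events table,
// keyed by event ID.
type eventStore struct {
	rows map[string]storedEvent
}

func newEventStore() *eventStore {
	return &eventStore{rows: make(map[string]storedEvent)}
}

// insertIfAbsent mimics INSERT ... ON CONFLICT DO NOTHING: a row that
// already exists is left untouched, including its MissingPrevious flag.
func (s *eventStore) insertIfAbsent(ev storedEvent) {
	if _, exists := s.rows[ev.ID]; exists {
		return
	}
	s.rows[ev.ID] = ev
}

// accumulate persists one sync response's timeline. When the response was
// limited (gappy), the first event is treated as MissingPrevious, because
// nothing in the response says what directly precedes it.
func (s *eventStore) accumulate(timeline []string, limited bool) {
	for i, id := range timeline {
		s.insertIfAbsent(storedEvent{ID: id, MissingPrevious: limited && i == 0})
	}
}

func main() {
	// Case B: Alice's poller already stored E1, E2, E3 from one timeline.
	b := newEventStore()
	b.accumulate([]string{"E1", "E2", "E3"}, false) // Alice
	b.accumulate([]string{"E1"}, false)             // Bob, first sync
	b.accumulate([]string{"E3"}, true)              // Bob, gappy sync
	fmt.Println("case B, E3 MissingPrevious:", b.rows["E3"].MissingPrevious)

	// Case A: Alice's poller only stored E1 and E2.
	a := newEventStore()
	a.accumulate([]string{"E1", "E2"}, false) // Alice
	a.accumulate([]string{"E1"}, false)       // Bob, first sync
	a.accumulate([]string{"E3"}, true)        // Bob, gappy sync
	fmt.Println("case A, E3 MissingPrevious:", a.rows["E3"].MissingPrevious)
}
```

In this sketch the only difference between the cases is whether a row for E3 already exists when Bob's gappy sync arrives. If it does, the do-nothing insert keeps the existing flag; if it doesn't, the new row is stored as MissingPrevious.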

Contributor Author

a5c02b2 tries to explain this better.

state/accumulator_test.go (resolved)
state/accumulator.go (outdated, resolved)
state/event_table.go (resolved)
state/event_table_test.go (outdated, resolved)
state/event_table_test.go (resolved)
state/event_table_test.go (outdated, resolved)
@DMRobertson DMRobertson changed the title WIP: Don't load events when there's a gap between known events Don't load events when there's a gap between known events Sep 19, 2023
@DMRobertson
Contributor Author

DMRobertson commented Sep 20, 2023

Upgrade test is failing. After the upgrade:

Sync v3 [0.99.10] (262f8cb)
Debug=true LogLevel= MaxConns=0
panic: pq: column "missing_previous" of relation "syncv3_events" does not exist

goroutine 1 [running]:
github.com/jmoiron/sqlx.MustExec(...)
	/home/runner/go/pkg/mod/github.com/jmoiron/[email protected]/sqlx.go:722
github.com/jmoiron/sqlx.(*DB).MustExec(0x30?, {0xdbf993?, 0xc0003c4c60?}, {0x0?, 0xc0002ef900?, 0x40fd47?})
	/home/runner/go/pkg/mod/github.com/jmoiron/[email protected]/sqlx.go:366 +0x46
github.com/matrix-org/sliding-sync/state.NewEventTable(...)
	/home/runner/work/sliding-sync/sliding-sync/state/event_table.go:116
github.com/matrix-org/sliding-sync/state.NewStorageWithDB(0xc0003c4c90, 0x0)
	/home/runner/work/sliding-sync/sliding-sync/state/storage.go:85 +0x70
github.com/matrix-org/sliding-sync.Setup({0xc00004248e?, 0xd7fc8b?}, {0xc00004806a, 0x4c}, {0xc00003c00e, 0xa}, {0x0, 0x0, 0x0, 0x3b9aca00, ...})
	/home/runner/work/sliding-sync/sliding-sync/v3.go:106 +0x189
main.main()
	/home/runner/work/sliding-sync/sliding-sync/cmd/syncv3/main.go:213 +0x175b

which is

func NewEventTable(db *sqlx.DB) *EventTable {
	// make sure tables are made
	db.MustExec(`
	CREATE SEQUENCE IF NOT EXISTS syncv3_event_nids_seq;
	CREATE TABLE IF NOT EXISTS syncv3_events (
		event_nid BIGINT PRIMARY KEY NOT NULL DEFAULT NEXTVAL('syncv3_event_nids_seq'),
		event_id TEXT NOT NULL UNIQUE,
		before_state_snapshot_id BIGINT NOT NULL DEFAULT 0,
		-- which nid gets replaced in the snapshot with event_nid
		event_replaces_nid BIGINT NOT NULL DEFAULT 0,
		room_id TEXT NOT NULL,
		event_type TEXT NOT NULL,
		state_key TEXT NOT NULL,
		prev_batch TEXT,
		membership TEXT,
		is_state BOOLEAN NOT NULL, -- is this event part of the v2 state response?
		event BYTEA NOT NULL,
		missing_previous BOOLEAN NOT NULL DEFAULT FALSE
	);

That's a surprise to me because I thought CREATE TABLE IF NOT EXISTS would be a no-op if the table already exists!

@DMRobertson
Contributor Author

That's a surprise to me because I thought CREATE TABLE IF NOT EXISTS would be a no-op if the table already exists!

Oh this will be the COMMENT ON..., won't it?
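
The panic fits that guess: CREATE TABLE IF NOT EXISTS leaves an existing table untouched, so the new column is never added on upgrade, and any later statement naming the column fails. One common way to handle this in PostgreSQL is to add the column explicitly before referring to it. The sketch below only assembles the statements in that order and is not necessarily how this PR resolved the failure. The function name upgradeStatements and the comment text are hypothetical.

```go
package main

import "fmt"

// addMissingPrevious adds the column to tables created before the upgrade.
// ADD COLUMN IF NOT EXISTS is a no-op when the column is already there.
const addMissingPrevious = `ALTER TABLE syncv3_events
	ADD COLUMN IF NOT EXISTS missing_previous BOOLEAN NOT NULL DEFAULT FALSE;`

// commentMissingPrevious names the column, so it can only run once the
// column exists. The comment text here is a placeholder.
const commentMissingPrevious = `COMMENT ON COLUMN syncv3_events.missing_previous
	IS 'placeholder description';`

// upgradeStatements (hypothetical) returns the statements in the order they
// would need to run against a database created by an older version.
func upgradeStatements() []string {
	return []string{addMissingPrevious, commentMissingPrevious}
}

func main() {
	for i, stmt := range upgradeStatements() {
		fmt.Printf("-- step %d\n%s\n", i+1, stmt)
	}
}
```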

Member

@kegsay kegsay left a comment

Let's give it a go.

sync2/client.go Outdated
@@ -4,7 +4,7 @@ import (
"context"
"encoding/json"
"fmt"
"github.com/matrix-org/sliding-sync/internal"
"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
Member

Why has this appeared?

Contributor Author

Some kind of merge faffing with #301 I think?

state/event_table.go (outdated, resolved)
state/event_table.go (outdated, resolved)
},
}

for _, tc := range testcases {
// We're using the notation (X, Y] for a half-open interval excluding X but including Y.
Member

I personally prefer rust's notation of X..=Y

Contributor Author

X..=Y includes X, but I want to exclude X here.

Contributor Author

It could be worse---the French style of this notation is ]X, Y] (!)
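
For readers less used to the notation, the difference between the two intervals discussed in this thread can be written as a small sketch. The helper names inHalfOpen and inClosed are hypothetical and are not taken from the test file.

```go
package main

import "fmt"

// inHalfOpen reports whether n lies in (x, y]: x is excluded, y is included.
func inHalfOpen(x, y, n int64) bool {
	return x < n && n <= y
}

// inClosed reports whether n lies in [x, y], which is what Rust's x..=y
// denotes: both endpoints are included.
func inClosed(x, y, n int64) bool {
	return x <= n && n <= y
}

func main() {
	const x, y = 3, 7
	for _, n := range []int64{3, 4, 7, 8} {
		fmt.Printf("n=%d  (x, y]: %-5v  x..=y: %v\n", n, inHalfOpen(x, y, n), inClosed(x, y, n))
	}
}
```

The two predicates disagree only at n = x, which is exactly the endpoint the test comment wants to exclude.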

tests-integration/poller_test.go (outdated, resolved)
tests-integration/poller_test.go (outdated, resolved)
{ID: "G", MissingPrevious: true},
{ID: "H", MissingPrevious: true},
{ID: "G-gap", MissingPrevious: true},
{ID: "H-gap", MissingPrevious: true},
Member

I think I'm just having a lot of mental block here because I see A B C D and instinctively think there is no gap. When you then put H-gap I think "uhh before or after"? Can we +2 letters to hint at the gap maybe? E.g A B C F-gap?

Contributor Author

Hmm. I think I'll rename these to e.g. chunkX-eventY, so that gaps only appear when we change chunks.
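
As a rough illustration of that naming scheme, a helper along these lines could generate the IDs. This is a sketch only: chunkEventID and chunkEventIDs are hypothetical names, and the actual test data in the PR may be written out by hand.

```go
package main

import "fmt"

// chunkEventID (hypothetical) builds an ID of the form chunkX-eventY, so a
// reader can tell which contiguous chunk an event belongs to.
func chunkEventID(chunk, event int) string {
	return fmt.Sprintf("chunk%d-event%d", chunk, event)
}

// chunkEventIDs (hypothetical) lists the IDs for numChunks chunks of
// eventsPerChunk events each, numbering both from 1.
func chunkEventIDs(numChunks, eventsPerChunk int) []string {
	ids := make([]string, 0, numChunks*eventsPerChunk)
	for c := 1; c <= numChunks; c++ {
		for e := 1; e <= eventsPerChunk; e++ {
			ids = append(ids, chunkEventID(c, e))
		}
	}
	return ids
}

func main() {
	// Any gap falls between the last event of one chunk and the first
	// event of the next, never inside a chunk.
	for _, id := range chunkEventIDs(2, 3) {
		fmt.Println(id)
	}
}
```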

state/event_table_test.go (outdated, resolved)
state/event_table_test.go (outdated, resolved)
@DMRobertson
Contributor Author

Only EW failed, but it succeeded on the prior commit. The only change I made was in unit tests, so I assume this is some Cypressy flake.

@DMRobertson DMRobertson merged commit e75a462 into main Sep 20, 2023
6 of 7 checks passed
@DMRobertson DMRobertson deleted the dmr/invalidate-timelines branch September 20, 2023 13:29
@DMRobertson DMRobertson mentioned this pull request Sep 13, 2023
Development

Successfully merging this pull request may close these issues.

Token expirations cause incorrect timelines (exacerbated by OIDC refreshing tokens)
2 participants