
Dynamic federation remotes #3235

Closed
fisx wants to merge 8 commits into develop from dynamic-federator-remotes

Conversation

@fisx (Contributor) commented Apr 19, 2023

https://wearezeta.atlassian.net/browse/FS-1115

Store federation remotes in cassandra in brig, in the same form as the remote search policies in the config file.

The idea is that this will become the source of truth for all services as to which nodes we federate with, and that it can be dynamically updated (via internal endpoints for now, and possibly by events like "node down for too long" in the future).

It's not ideal that we store this in brig, but federator doesn't have a cassandra instance.

[not on this PR yet] We're also working on a TVar maintained by all federators and other interested services (including brig) that keeps a cache of the cassandra table, and a RabbitMQ broadcast queue that propagates updates to the brig pod siblings that haven't received them from the operator, and to the other services. The federator config will be simplified to no longer contain a list of federating nodes; federators will also get them from brig.

Migration: my current idea is to always consider the union of brig's config file and the cassandra table the global truth. That way the operator can safely remove the list from the config file after the cassandra table has been updated (next deployment or so). The downside is that removing edges that are in the config file will silently fail (we could make it fail loudly, of course).
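A minimal sketch of those union semantics (`Domain` and `effectiveRemotes` are illustrative names here, not the actual wire-server types):

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Hypothetical stand-in for wire-server's Domain type.
newtype Domain = Domain String
  deriving (Eq, Ord, Show)

-- The proposed migration semantics: the effective set of federation
-- remotes is the union of the config-file list and the cassandra table.
-- A consequence (as noted above) is that deleting a remote only takes
-- effect once it is absent from *both* sources.
effectiveRemotes :: [Domain] -> [Domain] -> Set Domain
effectiveRemotes fromConfig fromCassandra =
  Set.fromList fromConfig `Set.union` Set.fromList fromCassandra
```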

Checklist

  • Add a new entry in an appropriate subdirectory of changelog.d
  • Read and follow the PR guidelines

@zebot zebot added the ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist label Apr 19, 2023
@fisx fisx changed the title Dynamic federator remotes Dynamic federation remotes Apr 20, 2023
@fisx fisx force-pushed the dynamic-federator-remotes branch from 3bc7926 to f8d8b58 on April 20, 2023 10:06

```haskell
federationRemotesAPI :: ServerT BrigIRoutes.FederationRemotesAPI (Handler r)
federationRemotesAPI =
  Named @"get-federation-remotes" (lift $ FederationDomainConfigs <$> wrapClient Data.getFederationRemotes) -- TODO: get this from TVar! also merge in config file!
```
Member:

Don't see the code for this, but perhaps this would require a full-table scan, which Cassandra doesn't like. If we have to list all the domains, perhaps we should store them differently.

Contributor Author:

A full table scan is bad even if the table is <100 entries? Then I'm not sure what the alternative is...

Member:

It can be done, but needs some clever, not obvious things like this: https://nblair.github.io/2017/02/16/cassandra-full-table-scan/

Member:

An alternative is to store the list in 1 row with id = 1. I think we already do this somewhere.
Then have a separate table to remember all the configuration for each endpoint. This also has the benefit of not forgetting preferences for a remote even if you stop federating with them.

Contributor:

If we use RabbitMQ with MQTT, we can use retain flags to hold onto the last value sent for domain updates. When clients connect, they will receive this message and can immediately update themselves to the latest state.

Contributor Author:

https://nblair.github.io/2017/02/16/cassandra-full-table-scan/ says it's fine to pull 10k rows in one query. so wouldn't c553c0e be a good solution?

@akshaymankar (Member)

> Migration: my current idea is to always consider the union of brig's config file and the cassandra table the global truth. That way the operator can safely remove the list from the config file after the cassandra table has been updated (next deployment or so). The downside is that removing edges that are in the config file will silently fail (we could make it fail loudly, of course).

Perhaps we don't need to worry about this, as federation is not very seriously used by anyone, so causing a bit of breakage while someone makes an API call to load the list into the DB is acceptable? If so, we can get rid of this whole thing.

@akshaymankar (Member)

> [not on this PR yet] We're also working on a TVar maintained by all federators and other interested services (including brig) that keeps a cache of the cassandra table, and a RabbitMQ broadcast queue that propagates updates to the brig pod siblings that haven't received them from the operator, and to the other services. The federator config will be simplified to no longer contain a list of federating nodes; federators will also get them from brig.

I am assuming we're looking at storing this in a TVar and using RabbitMQ to propagate changes in order to reduce the latency of checking whether or not a domain is allowed to be federated with. Do you have a ballpark estimate of how much we're saving by doing this? To me this sounds like a lot of upfront optimization, or perhaps I am missing something.

@fisx (Contributor Author) commented Apr 20, 2023

> [not on this PR yet] We're also working on a TVar maintained by all federators and other interested services (including brig) that keeps a cache of the cassandra table, and a RabbitMQ broadcast queue that propagates updates to the brig pod siblings that haven't received them from the operator, and to the other services. The federator config will be simplified to no longer contain a list of federating nodes; federators will also get them from brig.

> I am assuming we're looking at storing this in a TVar and using RabbitMQ to propagate changes in order to reduce the latency of checking whether or not a domain is allowed to be federated with. Do you have a ballpark estimate of how much we're saving by doing this? To me this sounds like a lot of upfront optimization, or perhaps I am missing something.

We could make everybody poll once every second, but I have a feeling that using a queue isn't much harder, just a lot faster.

Come to think of it, we probably need to make sure that we don't have an (unlikely) race condition where pod A writes to cassandra, then pod B writes to cassandra, then pod B notifies everybody of the update, and then pod A does. Hmm. Would that be resolved by polling? I guess so.
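One way to defuse that race (a sketch under assumed names, not the actual wire-server types): stamp each cassandra write with a monotonically increasing version, carry it in the broadcast, and have receivers keep whichever cache is newer, so a delayed broadcast from an earlier write cannot clobber a later one.

```haskell
-- Hypothetical cache entry for the federation remotes.
data Cache = Cache
  { cacheVersion :: Int      -- monotonically increasing, bumped per write
  , cacheRemotes :: [String] -- the federation remotes as of that version
  }

-- Pure resolution step; in brig/federator it would run inside
-- `atomically . modifyTVar'` on the shared TVar when a broadcast arrives.
newerOf :: Cache -> Cache -> Cache
newerOf current update
  | cacheVersion update > cacheVersion current = update
  | otherwise = current
```

Polling sidesteps the problem differently: each poll reads the full current table, so a stale broadcast order can't leave the cache permanently wrong.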

@akshaymankar (Member)

> We could make everybody poll once every second, but I have a feeling that using a queue isn't much harder, just a lot faster.

I meant what is the point of the TVar, everybody who needs to know can just make an internal call to brig? How slow is that and is it too slow? As in do you have an SLO in mind?

@lepsa (Contributor) commented Apr 21, 2023

From my talks with Matthias, using a message queue simplifies propagating the domain update notice to all services that need it. Since the various services can only talk within their node, coordination is hard. Having a system outside of these VMs, like the databases, that can fan out messages would be a great help. The K8s pods wouldn't have to know how many other pods are running, or where they are running; all updates for federation domains can be passed through an MQ broker that handles the fan-out for us.

@fisx (Contributor Author) commented Apr 21, 2023

> From my talks with Matthias, using a message queue simplifies propagating the domain update notice to all services that need it. Since the various services can only talk within their node, coordination is hard. Having a system outside of these VMs, like the databases, that can fan out messages would be a great help. The K8s pods wouldn't have to know how many other pods are running, or where they are running; all updates for federation domains can be passed through an MQ broker that handles the fan-out for us.

I think @akshaymankar you are suggesting that we do a REST API request to brig plus a table lookup every time the current code does an MVar lookup in federator or brig. Yes, that could be done, but wouldn't that add a few (10s? 100s of?) milliseconds to every remote request or notification? I'm not aware that we have specified acceptable lag limits, but I feel it's easier to get this right from the beginning; doing lots of API requests isn't that much easier.

Instead of using a rabbit queue, we could also make every pod poll the table via REST once a second or so, caching the result in a TVar. This is certainly slower during changes of the connection graph, but that's a rare operation and we need to allow for race conditions there anyway.
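The polling variant could be sketched roughly like this (`fetchRemotes` stands in for the actual internal brig call, and the domain list is simplified to `[String]`):

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.STM
import Control.Monad (forever)

-- Refresh the shared TVar cache from brig once.
refreshOnce :: IO [String] -> TVar [String] -> IO ()
refreshOnce fetchRemotes cache = do
  remotes <- fetchRemotes
  atomically (writeTVar cache remotes)

-- Poll roughly once a second; lookups elsewhere in the pod read the
-- TVar and never pay the REST round-trip.
pollLoop :: IO [String] -> TVar [String] -> IO ()
pollLoop fetchRemotes cache = forever $ do
  refreshOnce fetchRemotes cache
  threadDelay 1000000 -- one second
```

The loop would run in a background thread per pod; the cost of a stale cache is bounded by the poll interval.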

I still like the fastest option best. @akshaymankar do you really think it'll be expensive to implement? Maybe discuss on the phone Monday?

@akshaymankar (Member) commented Apr 24, 2023

> I still like the fastest option best. @akshaymankar do you really think it'll be expensive to implement? Maybe discuss on the phone Monday?

From the discussion in JIRA I guess y'all already have a decision; let me know if you still want to have a call.

@fisx (Contributor Author) commented May 1, 2023

Closing in favor of #3260 (which contains this branch in full).

@fisx fisx closed this May 1, 2023