[FS-1940] Start sending backend notifications through rabbitMQ and consuming them #3276

Merged
akshaymankar merged 54 commits into develop from backend-notification-rises on May 24, 2023

Conversation

@elland
Contributor

@elland elland commented May 9, 2023

https://wearezeta.atlassian.net/browse/FS-1940

Checklist

  • Add a new entry in an appropriate subdirectory of changelog.d
  • Read and follow the PR guidelines

@zebot zebot added the ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist label May 9, 2023
@elland elland changed the title [FS-1207] Start sending backend notifications through rabbitMQ and consuming them [FS-1930] Start sending backend notifications through rabbitMQ and consuming them May 9, 2023
@elland elland changed the title [FS-1930] Start sending backend notifications through rabbitMQ and consuming them [FS-19340] Start sending backend notifications through rabbitMQ and consuming them May 9, 2023
@elland elland changed the title [FS-19340] Start sending backend notifications through rabbitMQ and consuming them [FS-1940] Start sending backend notifications through rabbitMQ and consuming them May 9, 2023
@elland elland force-pushed the backend-notification-rises branch from 14c7228 to eb93c00 Compare May 9, 2023 09:35
elland added 2 commits May 9, 2023 14:24
Send onUserDeleteConnections notification using rabbitmq
@elland elland force-pushed the backend-notification-rises branch from eb93c00 to 8fa3b8d Compare May 9, 2023 12:29
@CLAassistant

CLAassistant commented May 10, 2023

CLA assistant check
All committers have signed the CLA.

@elland elland force-pushed the backend-notification-rises branch from 9641a09 to a0ec535 Compare May 10, 2023 08:41
@elland elland force-pushed the backend-notification-rises branch 2 times, most recently from 57eea77 to 85680e2 Compare May 11, 2023 13:27
@elland elland force-pushed the backend-notification-rises branch from 85680e2 to 9ca845e Compare May 11, 2023 13:53
@elland elland force-pushed the backend-notification-rises branch from a219655 to 7a1ea7e Compare May 15, 2023 08:53
@elland elland force-pushed the backend-notification-rises branch from dec03a3 to 93179f9 Compare May 15, 2023 11:23
@akshaymankar akshaymankar force-pushed the backend-notification-rises branch from f755d44 to d668582 Compare May 23, 2023 08:17
@akshaymankar akshaymankar marked this pull request as ready for review May 23, 2023 14:18
@akshaymankar akshaymankar requested a review from fisx May 23, 2023 14:18
@akshaymankar akshaymankar force-pushed the backend-notification-rises branch from 484fb92 to ef32b84 Compare May 23, 2023 14:54
@smatting smatting self-requested a review May 24, 2023 07:51

The above are the default values (except for secrets, which have no
defaults); if they work, they do not need to be configured.
`background-worker.config.remoteDomains` should contain all the remote domains
Contributor

this will change again with #3260, but let's worry about that there.

When federation is enabled, wire-server will require running RabbitMQ. The helm
chart in `rabbitmq` can be used to install RabbitMQ. Please refer to the
documentation at https://docs.wire.com to install RabbitMQ in Kubernetes. These
new configurations are required:
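The YAML snippet under discussion is not reproduced in this excerpt. Purely as an illustrative sketch of what such configuration tends to look like (the key names and values below are assumptions, not taken from the actual charts; consult the charts' values files for the real structure — only `background-worker.config.remoteDomains` is named in this PR):

```yaml
# Hypothetical example only; verify key names against the wire-server charts.
background-worker:
  config:
    rabbitmq:
      host: rabbitmq
      port: 5672
    remoteDomains:
      - b.example.com
  secrets:
    rabbitmq:
      username: wire
      password: verysecret
```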
Contributor

move the yaml snippet to docs.wire.com? or if it's already there: link to it rather than redundantly copy it? on the other hand, that would complicate the situation in which we change docs.wire.com in a later release, and somebody upgrades to this release afterwards.

Member

The way docs.wire.com tells people about these options is by telling them to copy some example config from wire-server-deploy. I think that is quite useless for people doing upgrades, so I added this YAML here. As you say, it is also useful for someone doing upgrades one version at a time, since something could have changed in docs.wire.com by a later release.

rollingUpdate:
  maxUnavailable: 0
  maxSurge: {{ .Values.replicaCount }}
# Ensures only one version of the background worker is running at any given
Contributor

Suggested change
# Ensures only one version of the background worker is running at any given
# Ensures only one instance (or k8s pod) of the background worker is running at any given

is that what you mean?

Member

No, many K8s pods can run at once. This way of deploying just deletes the previous version before deploying the new one; it basically does the opposite of a rolling deployment.

limits:
  memory: "512Mi"
# TODO(elland): Create issue for a metrics endpoint
# FUTUREWORK: Implement metrics
Contributor

we can do this in a separate PR, but we should make sure we won't forget and have a ticket:

https://wearezeta.atlassian.net/browse/FS-2020

btw, which ticket is this PR about?

Member

It is this: https://wearezeta.atlassian.net/browse/FS-1940
It was in the title, also added it to description now.

Member

Thanks for creating the issue. I had asked Marco Conti to create it, as I wasn't sure how to get it into the right JIRA places. The one you created doesn't seem to be in the right epic. Maybe someone will notice and put it there? Is this our process?

Member

I added a parent of relation, maybe that's all that was needed 🤞


build-depends:
    aeson
  , amqp
Contributor

this package is starting to bundle a lot of unrelated dependencies together, and inheriting them to a lot of places. but i don't think we should worry about this yet.

@@ -0,0 +1,98 @@
{-# LANGUAGE NumericUnderscores #-}
Contributor

can we put this into the cabal files in the future? i don't see any harm in it being the default.

  deriving (Arbitrary) via (GenericUniform Component)
  deriving (ToJSON, FromJSON) via (Schema Component)

instance ToSchema Component where
Contributor

FedAwareService? (anything but component. it's neither the term we usually use, nor is it very expressive.)

Member

It would be so much easier once we merge these services 😄

Comment on lines -160 to -161
test p "delete with connected remote users" $ testDeleteWithRemotes opts b,
test p "delete with connected remote users and failed remote notifcations" $ testDeleteWithRemotesAndFailedNotifications opts b c,
Contributor

you've removed these, but not added them again to the new integration test suite? is this not useful any more?

Member

The first test is already covered by end-to-end tests; it is very hard to test in the integration tests.

The second case is not valid anymore: we can no longer fail because a remote backend is down. We can fail because RabbitMQ is down, but that is just a 500, and I didn't think we should test 500 scenarios.

'AWS_SECRET_ACCESS_KEY': "dummysecret",
'RABBITMQ_USERNAME': 'guest',
'RABBITMQ_PASSWORD': 'alpaca-grapefruit'
'RABBITMQ_PASSWORD': 'guest'
Contributor

i suggested to change this to something less obvious earlier. it's an easy way to counter quite a few potential security issues. why did you change it back?

Member

There was a botched merge; perhaps this happened due to that. I will add it back.

Comment on lines +70 to +71
-- this message blocks the whole queue. Perhaps there is a better way to
-- deal with this.
Contributor

Maybe blocking the queue is safer than losing data until the FUTUREWORK is completed? It would only be blocked until a newer worker is able to ack the new message, right?

Member

In case of reverting a deployment, it would stay blocked forever.
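To make the trade-off concrete, here is a toy stdlib-Python model (not the wire-server Haskell code) of a consumer hitting an undecodable message: requeueing it blocks everything behind it, while rejecting without requeue keeps the queue moving at the cost of that message.

```python
# Toy sketch of head-of-line blocking vs data loss on an undecodable message.
from collections import deque
import json

def drain(messages, requeue_on_error, max_steps=20):
    q = deque(messages)
    processed, dropped = [], []
    steps = 0
    while q and steps < max_steps:
        steps += 1
        raw = q.popleft()
        try:
            processed.append(json.loads(raw))
        except json.JSONDecodeError:
            if requeue_on_error:
                q.appendleft(raw)    # blocks the whole queue behind it
            else:
                dropped.append(raw)  # data loss, but the queue keeps moving
    return processed, dropped, bool(q)  # bool(q): queue still blocked?

msgs = ['{"n": 1}', 'not-json', '{"n": 2}']
# Requeueing the bad message blocks everything behind it.
assert drain(msgs, requeue_on_error=True) == ([{"n": 1}], [], True)
# Dropping it keeps the queue flowing at the cost of that one message.
assert drain(msgs, requeue_on_error=False) == ([{"n": 1}, {"n": 2}], ["not-json"], False)
```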

Comment on lines +82 to +83
-- FUTUREWORK: Reconsider using 1 channel for many consumers. It shouldn't matter
-- for a handful of remote domains.
Contributor

Wouldn't that result in a blocked (or very slow) channel in case a domain is unreachable?

Member

The channel itself doesn't block when a domain is unreachable. It only blocks the particular consumer that is supposed to be pushing notifications to the unreachable domain; other consumers keep going.

I suspect that this will hit some limit on the number of consumers, or just get slow, once we have hundreds of consumers running. But perhaps we can solve that problem when we get there.
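This isolation property can be illustrated with a toy stdlib-Python model (again, not the wire-server Haskell code; domain names are made up): each remote domain gets its own consumer, so a consumer stuck retrying an unreachable domain does not stop other domains' queues from draining.

```python
# Toy model of one consumer per remote domain: a consumer stuck on an
# unreachable domain does not block consumers for other domains.
import queue
import threading
import time

def run_consumer(domain, q, delivered, reachable):
    while True:
        try:
            msg = q.get(timeout=0.2)
        except queue.Empty:
            return  # queue drained
        if not reachable:
            # Simulate an unreachable domain: retry the same message forever.
            q.put(msg)
            time.sleep(0.05)
        else:
            delivered.append((domain, msg))

queues = {"a.example.com": queue.Queue(), "b.example.com": queue.Queue()}
for q in queues.values():
    for i in range(3):
        q.put(f"notif-{i}")

delivered = []
threads = [
    threading.Thread(target=run_consumer,
                     args=("a.example.com", queues["a.example.com"], delivered, True),
                     daemon=True),
    threading.Thread(target=run_consumer,
                     args=("b.example.com", queues["b.example.com"], delivered, False),
                     daemon=True),
]
for t in threads:
    t.start()
time.sleep(1.0)

# a.example.com drained all its notifications even though b.example.com is stuck.
assert [m for (d, m) in delivered if d == "a.example.com"] == ["notif-0", "notif-1", "notif-2"]
```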

let returnSuccessSecondTime _ =
      atomicModifyIORef isFirstReqRef $ \isFirstReq ->
        if isFirstReq
          then (False, ("text/html", "<marquee>down for maintenance</marquee>"))
Contributor

;)

Comment on lines +35 to +36
defaults); if they work, they do not need to be configured.
`background-worker.config.remoteDomains` should contain all the remote domains
Contributor

We should help customers set all the federation-domain-related configuration. We already have:

  • federator.config.optSettings.federationStrategy.allowedDomains
  • setFederationDomain in various places
  • brig.config.optSettings.setFederationDomainConfigs
  • galley.config.settings.featureFlags.classifiedDomains

Member

I believe #3260 is going to solve this.
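For orientation only: the four settings listed in the review comment above are spread across several charts. An illustrative combined fragment follows (the nesting and values are assumptions and the domains are made up; verify the exact structure against each chart's values file):

```yaml
# Illustrative only; check the real schema in the wire-server charts.
federator:
  config:
    optSettings:
      federationStrategy:
        allowedDomains:
          - b.example.com
brig:
  config:
    optSettings:
      setFederationDomain: a.example.com
      setFederationDomainConfigs:
        - domain: b.example.com
          search_policy: full_search
galley:
  config:
    settings:
      featureFlags:
        classifiedDomains:
          status: enabled
          config:
            domains:
              - b.example.com
```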

@akshaymankar akshaymankar force-pushed the backend-notification-rises branch from c85e708 to 41d4c81 Compare May 24, 2023 11:54
@akshaymankar akshaymankar merged commit 7b37e4e into develop May 24, 2023
@akshaymankar akshaymankar deleted the backend-notification-rises branch May 24, 2023 12:41
akshaymankar added a commit that referenced this pull request May 31, 2023
akshaymankar added a commit that referenced this pull request Jun 5, 2023
…gh RabbitMQ (#3333)

* galley: Send on-user-deleted-conversations backend notification through RabbitMQ

* charts/galley: Support configuring rabbitmq

* brig: Remove unnecessary annotation for on-user-deleted-connections

It was removed in #3276

* brig: validate options like galley does
7 participants