Rocket.Chat on mongo replica set and read/write preference consequences #1867
Comments
I think this is the first time I've heard of someone setting up like this. Are you seeing any performance hits / gains? What kind of load? @RocketChat/core
It depends on what you compare it to. Our old setup was a standalone mongo instance on an Azure VM with 2x AMD cores / 7GB RAM. The new setup has the above mongo configuration on 2x load-balanced Azure VMs with 1x Xeon core + 3.5GB RAM each. We've also set up RAID-0 with 2x cloud disks for the data (previously it was on the OS disk) and done several other OS tweaks such as swap on an SSD drive. Between the two we have seen a 3-4x improvement in the number of active concurrent users. What is the typical kind of setup you are implicitly referring to?
@georgiosd I was referring specifically to the ReplicaSet settings. Most don't go so far as to detail those settings, so this is the first time I've seen those settings specifically. I personally have not done much with mongo replica sets, so others would be more qualified to answer this. :) But it sounds like a solid setup.
Thanks for the tip! We are planning some more performance enhancements that should improve the number of active concurrent users another 10-fold.
Oh ok :) It seems to me like the minimum configuration for having any reliability if a VM fails for any reason. What we have noticed is that the "user is typing" feature is limited to the users connected to the particular node. Meaning, if I am connected to node B, I will only see "x is typing" for users also connected to node B. I presume that this is because of the read preference being set to `nearest`.
That is strange.. that should not be the case. All nodes broadcast the "user is typing" to the other running nodes via DDP... if that is not working, we have a bug.
If it does not depend on the database, then yes, I would agree :)
Are you saying that you've got a mongodb instance running on each node?
Yes, and all mongodb instances are connected to the same replica set (2x nodes + arbiter).
The round-robin load balancer is in front of the RC/meteor instances (port 3000)?
@georgiosd are you running `meteor` in production? The `meteor run` command is meant for development only.
Also - we might not have tested the `nearest` read preference setting.
Might have rushed my responses a little. The production instances are executed with `node` directly.
The package is built with `meteor build`.
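For anyone following along, a minimal sketch of that build-and-run flow (the paths, host IPs, and `rs0` name are illustrative, not taken from this thread):

```bash
# Build a production bundle from the app checkout (output dir is illustrative)
meteor build --directory /tmp/rc-build

# Install the server's node dependencies inside the bundle
(cd /tmp/rc-build/bundle/programs/server && npm install)

# Run the bundle with plain node, not `meteor run` (env values are placeholders)
cd /tmp/rc-build/bundle
PORT=8080 \
ROOT_URL=https://chat.example.com \
MONGO_URL='mongodb://10.0.0.4:27017,10.0.0.5:27017/rocketchat?replicaSet=rs0' \
node main.js
```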
Ah. OK 😃 👍 Is the load balancer in front of the 2 x RC instances?
Yes, so both nodes listen on 8080 for requests; there's an nginx reverse proxy on each (it redirects HTTP requests to HTTPS and proxies HTTPS requests to localhost:8080). The load balancer distributes the load on port 443 across both nodes based on ClientIP (sticky).
And the RC instances loop back again, via IP, round-robin to the two mongo instances co-located on the same 2 x VMs?
Not quite, if I understood the question correctly. This is the mongo config that I've found for replica-set meteor environments:
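(The snippet itself didn't survive in this copy of the thread; the usual shape for Meteor, with placeholder hosts and matching the `nearest`/`majority` preferences discussed below, is:)

```bash
# Replica set members listed explicitly, by virtual network IP (placeholders)
export MONGO_URL='mongodb://10.0.0.4:27017,10.0.0.5:27017/rocketchat?replicaSet=rs0&readPreference=nearest&w=majority'
# Meteor's oplog tailing reads the `local` database on the same replica set
export MONGO_OPLOG_URL='mongodb://10.0.0.4:27017,10.0.0.5:27017/local?replicaSet=rs0'
```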
Basically, mongo is accessed directly by virtual network IP; it doesn't go through a load balancer. Makes sense?
Yep 😄 Very interesting. Thanks for sharing! We find that the number of active users handled scales with the number of mid-tier RC instances. On demo.rocket.chat we're running 4 x RC instances. Certainly the repl set + arbiter config provides additional uptime/availability benefits.
That makes sense. Is that one instance per core? That's the recommendation I've found in my research for node/meteor. Given that, at least on Azure, a 1-core VM is half the price of a 2-core VM and a quarter of the price of a 4-core VM, I thought it better to increase uptime and keep cost the same. If we need more juice in the near term, I will turn the arbiter VM into a full node, so we should gain another 1-2x. The other problem with the kind of setup you're describing, at least on Azure, is that the load balancer has a fixed destination port, so if you have multiple VMs with multiple instances each, you'd have to have a load balancer at the virtual network level and another at the VM level to distribute the load between instances, and that's when things start getting iffy. Too bad node can't just spawn workers.
Yes. One instance per core. Server-side node.js is single-threaded 😄 Thanks for sharing. Yes, a mix of horizontally scaled + vertically scaled nodes typically results in super brittle configs.
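A sketch of what "one instance per core" means in practice, assuming a 2-core VM and a bundle built as above (ports and paths are placeholders):

```bash
# One Rocket.Chat process per core, each on its own port; the reverse
# proxy / load balancer spreads client connections across them.
# (MONGO_URL / MONGO_OPLOG_URL exported beforehand, as earlier in the thread.)
cd /opt/rocketchat/bundle
PORT=8080 ROOT_URL=https://chat.example.com node main.js &
PORT=8081 ROOT_URL=https://chat.example.com node main.js &
wait
```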
I am happy to "donate" my Azure resource scripts for this setup and some basic bash scripts I've made to set things up, if they'd help you guys at all. Are you planning on having automated testing that will include environment testing (beyond unit/integration tests)?
@georgiosd -- That will simply be A-W-E-S-O-M-E! Thanks in advance. Please create a page here on our wiki: https://github.com/RocketChat/Rocket.Chat/wiki Any format is fine, and our documentation specialist team member will tidy it up and integrate it into our soon-to-be-available documentation website. Testing - yes, including distributed load testing - but only in the long-term plans. Is that what you mean by environment testing? Thanks again.
Ok, sure. However, because it's multiple files, I'd recommend something in the form of a pull request? Environment testing: yes, distributed load testing would be a part of it. With software like RC it's often useful to deploy different configurations and test features against them. Is unit/integration testing (local) part of your plans? If so, what kind of timeframe are we looking at?
Great idea. Please submit a PR here, with an Azure subdirectory: https://github.com/RocketChat/Deploy.to.Cloud Re: environment testing. If you mean different, automated networking/clustering topologies, I think we might be a generation away yet in terms of capabilities and resources available. k8s and docker swarm hold some promise, albeit still semi-simulation. It is in distributed load testing where Rocket.Chat can possibly deliver some breakthroughs, as the command and payload switching fabric. Unit and integration testing are not only in our plans, but some are already in our existing source code. It is no secret that testing is the Achilles heel of Meteor-based reactive systems in general; we're ready for MDG's new testing integration when it becomes available in 2016.
Ok, will do! Give me a few days, currently swamped. Re unit testing: are you referring to the spotify tests? They're the only tests I can see in the repo - am I missing anything? What's the "MDG testing integration" you're referring to? Not sure if I've come across it.

We are actually having some problems with the current config too. I can't be sure what exactly is going on, but I'd guess it's something to do with the Azure load balancer and the health probe. The load balancer points to an nginx process on each node (necessary to offload SSL), which reverse proxies to the node instance. The health probe points to the node instances, however, because node could be down while nginx will still respond (with a 502). So in this setup, when something goes wrong and I have to, say, restart nginx, all hell breaks loose. The site becomes unresponsive, but without errors. Go figure. I wanted to avoid using a paid load balancer that offloads SSL, at which point you can point it directly to the node instances, but it seems like the only way to go. Any ideas welcome.
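One way to see the probe mismatch from the shell (this assumes Rocket.Chat's unauthenticated `/api/info` endpoint and the ports above):

```bash
# Probing the node instance directly fails cleanly when the app is down
curl -sf http://localhost:8080/api/info >/dev/null && echo "app up" || echo "app down"

# Probing through nginx still gets an answer (a 502) when the app is down,
# so a probe that only checks nginx can report a dead backend as healthy
curl -sk -o /dev/null -w '%{http_code}\n' https://localhost/api/info
```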
Hi, I'm setting the MONGO_URL in the forever service
@vikas0121 It may help if you can explain what error you are receiving. Are you pointing to the correct database name and replicaSet name? This is taken from our environment:
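(Their actual values weren't preserved in this copy of the thread; as a sketch, with placeholder hosts and names, a forever-based start would look like:)

```bash
# Database name and replicaSet parameter must match the deployment exactly
MONGO_URL='mongodb://mongo-1:27017,mongo-2:27017/rocketchat?replicaSet=rs01' \
MONGO_OPLOG_URL='mongodb://mongo-1:27017,mongo-2:27017/local?replicaSet=rs01' \
ROOT_URL=https://chat.example.com PORT=3000 \
forever start bundle/main.js
```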
@richardwlu thanks for the reply.
Just saw this again. Might be worth following: #8064. Basically, there are some issues in a few cases when reading from a secondary.
Hey all,
We have recently deployed Rocket.Chat on a 2-node + arbiter mongo setup. Both nodes run a single meteor instance and sit behind a load balancer; the arbiter is a smaller VM that is there for voting purposes only.
I have currently set the RS read preference to `nearest` and write preference to `majority`, thinking that it's optimal in terms of guaranteeing writes, and that it helps balance the database load on reads with some presumably minor latency effect.

Any thoughts on this setup, for or against?
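For context, a minimal sketch of initiating that 2-node + arbiter topology from the mongo shell (host names and the set name are placeholders):

```bash
# Run once against the first data node; the arbiter votes but holds no data
mongo --host mongo-1 --eval '
  rs.initiate({
    _id: "rs0",
    members: [
      { _id: 0, host: "mongo-1:27017" },
      { _id: 1, host: "mongo-2:27017" },
      { _id: 2, host: "arbiter-1:27017", arbiterOnly: true }
    ]
  })
'
```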
Thanks
Georgios