Yeah, we went into it a bit in the "What We Learned" section, but that was most ...

atombender · on April 13, 2017

Surprised you're using RabbitMQ. It's one of those things which work great until they don't (clustering is particularly bad), and then you have almost zero insight into the issue, and have to resort to the Pivotal mailing list.

Have you looked at NATS at all? We're using it as a message bus for one app and it's been fantastic. It is, however, an in-memory queue, and the current version cannot replace Rabbit for queues that require durability.

fernandotakai · on April 13, 2017

i've been using rabbitmq heavily (as in, the whole infrastructure is based on two rabbitmq servers) for a long time and i've never seen it fail.

tbh, i never used clustering (because it's one of the shittiest clustering implementations i've ever seen) but we do use two servers (publishers connect to one randomly and consumers connect to both) and it seems to handle millions of messages without any issues.
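A toy sketch of that two-server layout, in case it helps picture it. This is not RabbitMQ client code: the broker names are hypothetical and plain in-memory deques stand in for the real queues. It only illustrates the idea that each publisher picks one server at random while a consumer drains both, so no message is missed regardless of where it landed.

```python
import random
from collections import deque

# Hypothetical broker names; each deque stands in for a queue on that server.
brokers = {"rabbit-a": deque(), "rabbit-b": deque()}


def publish(message, rng):
    """Publisher side: pick one broker at random and enqueue there."""
    name = rng.choice(sorted(brokers))
    brokers[name].append(message)
    return name


def consume_all():
    """Consumer side: connect to every broker and drain each queue."""
    received = []
    for queue in brokers.values():
        while queue:
            received.append(queue.popleft())
    return received


rng = random.Random(42)
used = [publish(f"msg-{i}", rng) for i in range(100)]
received = consume_all()
```

Because the consumer reads from both servers, ordering across servers is not preserved, only ordering within each one.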

of all servers i've ever used, rabbitmq is by far the most stable (together with ejabberd).

atombender · on April 13, 2017

RabbitMQ is decent if you don't use clustering (which, I agree, is shitty). I have some quibbles with the non-clustered parts, but nothing big.

Right now, the main annoyance is that it's impossible, as far as I understand, to limit its memory usage. You can set a "VM high watermark" and some other things, but beyond that, it will — much like, say, Elasticsearch — use a large amount of mysterious memory that you have no control over. You can't just say "use 1GB and nothing more", which is problematic on Kubernetes where you want to pack things a bit tightly. This happens even if all the queues are marked as durable.

fernandotakai · on April 13, 2017

yeah we have machines dedicated to rabbitmq because it's basically memory hungry. but i like it that way because it's only going to crash if the machine crashes.

d23 · on April 14, 2017

Hmm, I'll take a look at that. For websockets, non-durability seems like a fine tradeoff, so it sounds interesting. Thanks!

atombender · on April 14, 2017

Note that NATS is currently pub/sub, which is an "if a tree falls in the forest" situation. Messages don't go anywhere if nobody is subscribing.

So it's awesome for realtime firehose-type use cases where a websocket client connects, receives messages (every client gets all the messages, although NATS also supports load-balanced fanout) for a while, then eventually disconnects.
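To make those semantics concrete, here is a minimal in-memory sketch. It is not the NATS client API; `TinyBus` and its methods are made up for illustration. It assumes at-most-once delivery: a publish with no subscribers is simply dropped, plain subscribers each get every message, and subscribers sharing a group name get each message delivered to only one member (the load-balanced case).

```python
import random
from collections import defaultdict


class TinyBus:
    """Toy pub/sub bus with fire-and-forget delivery (hypothetical, not NATS)."""

    def __init__(self, seed=0):
        self._plain = defaultdict(list)  # subject -> [callback]
        self._groups = defaultdict(lambda: defaultdict(list))  # subject -> group -> [callback]
        self._rng = random.Random(seed)

    def subscribe(self, subject, callback, group=None):
        if group is None:
            self._plain[subject].append(callback)
        else:
            self._groups[subject][group].append(callback)

    def publish(self, subject, message):
        """Deliver to current subscribers only; return the number of deliveries."""
        delivered = 0
        # Fanout: every plain subscriber receives the message.
        for callback in self._plain.get(subject, []):
            callback(message)
            delivered += 1
        # Load balancing: one randomly chosen member per group receives it.
        for members in self._groups.get(subject, {}).values():
            self._rng.choice(members)(message)
            delivered += 1
        return delivered


bus = TinyBus()
lost = bus.publish("events", "nobody hears this")  # no subscribers yet, so it is dropped

a, b, w1, w2 = [], [], [], []
bus.subscribe("events", a.append)
bus.subscribe("events", b.append)
bus.subscribe("events", w1.append, group="workers")
bus.subscribe("events", w2.append, group="workers")
for i in range(4):
    bus.publish("events", i)
```

After the loop, `a` and `b` each hold all four messages, while `w1` and `w2` split them between themselves. The first message is gone for good, since nothing stores it for late subscribers.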

NATS is ridiculously fast [1], too.

There's an add-on currently in beta, NATS Streaming [2], which has durability, acking/redelivery and replay, so it covers most of what you get from both RabbitMQ and Kafka. It looks very promising.

[1] http://bravenewgeek.com/tag/nats/

[2] https://nats.io/documentation/streaming/nats-streaming-intro...

manigandham · on April 15, 2017

NATS is only a pub/sub system. NATS Streaming uses an embedded NATS server while building queuing and persistence on top. It works well.

atombender · on April 15, 2017

https://news.ycombinator.com/item?id=14111856

manigandham · on April 15, 2017

RabbitMQ is absolute crap. Surprised anyone uses it in production.

If you already have a good Redis infrastructure then you can just use the pub/sub features built into it for your websockets communication.

eatitraw · on April 13, 2017

Ah okay then!

BTW, thanks for the great post, it was a very interesting read.

not_kurt_godel · on April 14, 2017

> RabbitMQ

You're already on AWS, why not use SQS instead?