Building a High Availability RabbitMQ Cluster

Krzysztof Słomka
Mar 16, 2023

RabbitMQ is a powerful, widely used open-source message broker that enables asynchronous communication between distributed applications. I introduced it in an earlier post.

In a production environment, ensuring high availability (HA) is crucial for maintaining reliable and fault-tolerant messaging systems. In this post, we will discuss the process of setting up an HA RabbitMQ cluster, covering the necessary steps and considerations.

High Availability Overview

In RabbitMQ, high availability for classic queues is achieved by replicating queue contents across multiple nodes in a cluster. This replication, called queue mirroring, means that if the node hosting a queue's primary replica fails, a synchronized mirror on another node is promoted and continues serving the queue. Mirroring is controlled by a policy that defines which queues are mirrored and how many replicas to maintain. Note that classic queue mirroring is deprecated and was removed in RabbitMQ 4.0 in favor of quorum queues, so the mirroring policy described here applies to RabbitMQ 3.x clusters.

Cluster Setup

You can find a working example here: https://github.com/kisztof/rabbitmq-ha-demo

1. Install RabbitMQ on each node

First, install RabbitMQ on each server (node) that will be part of the cluster, following the official RabbitMQ installation guide for your operating system (“Installing RabbitMQ”). Keep the RabbitMQ and Erlang versions consistent across nodes.

2. Configure RabbitMQ on each node

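Next, configure RabbitMQ on each node by editing rabbitmq.conf (the ini-style format used by newer releases) or rabbitmq.config (the older Erlang-term format), depending on the RabbitMQ version. By default the 'guest' user can connect only from localhost. To allow remote clients, either set loopback_users to an empty list ([] in rabbitmq.config, loopback_users = none in rabbitmq.conf) or, preferably, leave 'guest' restricted and create a dedicated user. For clustering to work, every node must also share the same Erlang cookie and be able to resolve the other nodes' hostnames.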
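If you would rather not open up 'guest', a dedicated account can be created with rabbitmqctl. The sketch below is one way to do it. The function name create_admin_user and the user name and password passed to it are placeholders of mine, not something taken from the demo repository.

```shell
# Sketch: create a dedicated administrator instead of exposing "guest".
# Arguments: $1 = user name, $2 = password (both hypothetical placeholders).
create_admin_user() {
    user="$1"
    password="$2"
    rabbitmqctl add_user "$user" "$password" &&
    rabbitmqctl set_user_tags "$user" administrator &&
    # Grant configure, write and read permissions on the default vhost "/".
    rabbitmqctl set_permissions -p / "$user" ".*" ".*" ".*"
}

# Example call (not executed here):
#   create_admin_user app_admin 'a-long-random-password'
```

Users are stored cluster-wide, so running this once on the seed node should be enough. A narrower tag and tighter permission patterns are worth considering for application accounts.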

3. Set up the RabbitMQ cluster

Now, create the RabbitMQ cluster by joining each node to the first node (referred to as the “seed node”). Run the following commands on each node except the seed node:

rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@seed_node_hostname
rabbitmqctl start_app

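Replace seed_node_hostname with the actual hostname of the seed node, keeping the rabbit@ prefix: the argument is the full node name, not just the host. The hostname must be resolvable from the joining node.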
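The three commands above can be wrapped in a small function that stops at the first failure and then prints the cluster membership, so you can confirm the join worked. This is only a sketch: join_cluster_node is a name I made up, and rabbit@seed_node_hostname remains a placeholder.

```shell
# Sketch: join the local node to the seed node, then show cluster membership.
# Argument: $1 = full seed node name, e.g. rabbit@seed_node_hostname.
join_cluster_node() {
    seed_node="$1"
    if [ -z "$seed_node" ]; then
        echo "usage: join_cluster_node rabbit@<seed-hostname>" >&2
        return 1
    fi
    # Each step runs only if the previous one succeeded. If join_cluster
    # fails, the app stays stopped so the problem can be inspected.
    rabbitmqctl stop_app &&
    rabbitmqctl join_cluster "$seed_node" &&
    rabbitmqctl start_app &&
    rabbitmqctl cluster_status
}

# Example call (not executed here):
#   join_cluster_node rabbit@seed_node_hostname
```

Keep in mind that a node is reset before it joins a cluster, so any queues or messages it held on its own are lost. Run this only on nodes whose local data you do not need.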

4. Create a high-availability policy

Create a RabbitMQ policy to enable queue mirroring for the desired queues. Use the following command:

rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all","ha-sync-mode":"automatic"}'

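This command creates a policy named “ha-all” that applies to every queue (the regular expression “.*” matches any name) and mirrors each one to all nodes in the cluster. With the “automatic” synchronization mode, a newly added replica copies the queue's existing messages from the primary as soon as it joins, instead of waiting for a manual sync. The queue is unresponsive while that copy runs, which can be noticeable for large queues.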
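Mirroring to every node is the most conservative choice, and it gets more expensive as the cluster grows. As an alternative, the sketch below sets a policy that keeps a fixed number of replicas using the “exactly” mode. The function names, the policy name ha-two and the ^ha\. pattern are illustrative assumptions, not part of the original setup.

```shell
# Sketch: build a mirroring policy definition that keeps a fixed number of
# replicas (primary included) and synchronizes new mirrors automatically.
ha_policy_definition() {
    replicas="$1"
    printf '{"ha-mode":"exactly","ha-params":%d,"ha-sync-mode":"automatic"}' "$replicas"
}

# Arguments: $1 = policy name, $2 = queue name regex, $3 = replica count.
set_ha_policy() {
    name="$1"
    pattern="$2"
    replicas="$3"
    # --apply-to queues limits the policy to queues (not exchanges).
    rabbitmqctl set_policy --apply-to queues "$name" "$pattern" \
        "$(ha_policy_definition "$replicas")"
}

# Example call (not executed here): two replicas for queues named "ha.*"
#   set_ha_policy ha-two '^ha\.' 2
```

Afterwards, rabbitmqctl list_policies shows which policies are in effect.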

Load Balancing and Failover

To distribute client connections evenly across the cluster and handle failover scenarios, consider using a load balancer. Popular load-balancing solutions include HAProxy, NGINX, and AWS Elastic Load Balancing (ELB).

Configure the load balancer to distribute client connections to the RabbitMQ nodes using a round-robin or least-connections algorithm. Additionally, set up health checks to monitor the status of each RabbitMQ node and remove unresponsive nodes from the pool.
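As a rough illustration with HAProxy, the function below prints a listen section that balances AMQP traffic round-robin in TCP mode and health-checks each node. The function name, the node hostnames passed to it, and the timeout and check intervals are all assumptions of mine, so treat the values as starting points rather than recommendations.

```shell
# Sketch: print an HAProxy "listen" section that balances AMQP (TCP 5672)
# across the node hostnames given as arguments.
render_haproxy_rabbitmq() {
    cat <<'EOF'
listen rabbitmq
    bind *:5672
    mode tcp
    balance roundrobin
    timeout connect 5s
    timeout client 3h
    timeout server 3h
EOF
    for node in "$@"; do
        # "check" enables a TCP health check every 5s; a node is removed
        # after 3 consecutive failures and re-added after 2 successes.
        printf '    server %s %s:5672 check inter 5s rise 2 fall 3\n' "$node" "$node"
    done
}

# Example call (not executed here):
#   render_haproxy_rabbitmq rabbit1 rabbit2 rabbit3 >> /etc/haproxy/haproxy.cfg
```

AMQP connections are long-lived, which is why the client and server timeouts are generous here. Swapping roundrobin for leastconn gives the least-connections behaviour mentioned above.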

Monitoring and Maintenance

It’s essential to monitor the RabbitMQ cluster and its nodes to ensure optimal performance and availability. Use the RabbitMQ management plugin to access a web-based interface for monitoring and managing the cluster. The management plugin provides metrics, statistics, and visualization tools that help diagnose issues, optimize resource usage, and maintain a healthy cluster.
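For reference, the management plugin is switched on with rabbitmq-plugins, and basic node health can also be checked from the command line with rabbitmq-diagnostics. The sketch below wraps both. The function names are mine, and the choice of three checks is just one reasonable minimum.

```shell
# Sketch: enable the management plugin on the local node.
# The web UI listens on port 15672 by default.
enable_management_ui() {
    rabbitmq-plugins enable rabbitmq_management
}

# Sketch: return 0 only if the local node responds, is fully running,
# and has no resource alarms (memory or disk) in effect.
node_is_healthy() {
    rabbitmq-diagnostics -q ping &&
    rabbitmq-diagnostics -q check_running &&
    rabbitmq-diagnostics -q check_local_alarms
}

# Example use (not executed here):
#   node_is_healthy || echo "node needs attention" >&2
```

A check like node_is_healthy could be run from cron or from whatever monitoring agent you already use, alongside the web interface.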

Perform regular maintenance on your RabbitMQ cluster, such as upgrading to the latest RabbitMQ version, applying security patches, and optimizing configurations. Back up the RabbitMQ data and configurations to ensure a smooth recovery in case of failures or data loss.
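One simple piece of the backup story is exporting definitions (users, vhosts, queues, exchanges, bindings and policies) to a JSON file. The sketch below does that with a timestamped file name. The function name and the backup directory are hypothetical, and definitions do not include the messages themselves.

```shell
# Sketch: export cluster definitions to a timestamped JSON file and print
# the path of the file that was written.
# Argument: $1 = directory to store backups in (hypothetical location).
backup_definitions() {
    backup_dir="$1"
    stamp=$(date +%Y%m%d-%H%M%S)
    target="$backup_dir/rabbitmq-definitions-$stamp.json"
    mkdir -p "$backup_dir" &&
    rabbitmqctl export_definitions "$target" &&
    echo "$target"
}

# Example call (not executed here):
#   backup_definitions /var/backups/rabbitmq
```

The exported file can later be loaded with rabbitmqctl import_definitions. Message data needs a separate strategy, such as backing up the node's data directory while the node is stopped.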

Conclusion

Setting up a high-availability RabbitMQ cluster is essential for building reliable and fault-tolerant messaging systems in production environments. By following the steps outlined in this article, you can create an HA RabbitMQ cluster that ensures message processing continuity in the event of node failures or downtime. Additionally, using a load balancer to distribute client connections and handle failover scenarios further enhances the resiliency of your messaging system.

Remember to monitor and maintain your RabbitMQ cluster using the management plugin, which provides valuable insights into the performance and health of the system. Regular maintenance, including software updates, security patches, and configuration optimizations, is crucial for keeping your cluster running smoothly.

Written by Krzysztof Słomka

My name is Krzysztof, I'm a software architect and developer, with experience of leading teams and delivering large scalable projects for over 13 years...