RabbitMQ is open-source message-broker software.

Enter Docker container

If you run RabbitMQ as a Docker container, first open a shell inside the container

docker ps
CONTAINER ID IMAGE                        NAMES
12345        rabbitmq:3.7.20-management   rabbitmq-0

docker exec -ti rabbitmq-0 bash
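
For one-off commands it may be simpler to skip the interactive shell. A minimal sketch, assuming the container name rabbitmq-0 from the listing above; the helper name rmq and the RABBITMQ_CONTAINER variable are made up for illustration:

```shell
# Hypothetical helper: run a command inside the RabbitMQ container
# without opening an interactive shell first.
# RABBITMQ_CONTAINER is an assumed variable name; it defaults to the
# container name shown by "docker ps" above.
rmq() {
  docker exec "${RABBITMQ_CONTAINER:-rabbitmq-0}" "$@"
}

# Usage (needs a running container):
#   rmq rabbitmqctl cluster_status
```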

Get the cluster status

rabbitmqctl cluster_status
rabbitmq-diagnostics cluster_status

Network partitions

RabbitMQ is very sensitive to network interruptions between the nodes of a cluster, especially when a group of nodes that can still reach each other does not hold the majority of the cluster. With 2 nodes this is always the case after a split, since each side sees only 1 of 2 nodes, which is why you should not run a cluster of 2 nodes. Even with more nodes it can happen that no node can talk to any other node, so none of them has the majority. In this case you have to rejoin them manually and you will lose events.
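
The majority rule can be checked with a little arithmetic. This is only an illustrative sketch; the function names majority and has_majority are invented here and are not part of RabbitMQ:

```shell
# Smallest number of nodes that forms a strict majority of a cluster
# with $1 nodes.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# Succeeds if a partition side with $2 reachable nodes (counting itself)
# holds a majority in a cluster of $1 nodes.
has_majority() {
  [ "$2" -ge "$(majority "$1")" ]
}

# 2-node cluster split 1/1: neither side has a majority.
has_majority 2 1 || echo "2 nodes, 1 reachable: no majority"
# 3-node cluster split 2/1: the side with 2 nodes keeps the majority.
has_majority 3 2 && echo "3 nodes, 2 reachable: majority"
```

This is why odd cluster sizes such as 3 or 5 are the usual choice: any split into two groups leaves exactly one side with a majority.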

# autoheal: will lose events on rejoin, but no manual steps are required
# pause_minority: pauses the nodes on any side of a partition that does not have the majority of all nodes; dangerous, might stop everything
# pause_if_all_down: you provide reference nodes; if none of them is reachable, the node pauses itself. Also needs cluster_partition_handling.pause_if_all_down.recover
# ignore: keeps on running, but a manual rejoin is required
cluster_partition_handling = autoheal
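
For pause_if_all_down the configuration needs the list of reference nodes and a recover setting (ignore or autoheal). The sketch below prints such a fragment; the function name is made up, and the node name my-rabbit-server-01 is an assumption added next to the my-rabbit-server-02 host used elsewhere in these notes:

```shell
# Print a rabbitmq.conf fragment for pause_if_all_down.
# Arguments: the reference nodes to list.
pause_if_all_down_conf() {
  echo "cluster_partition_handling = pause_if_all_down"
  # Strategy used if a partition still occurs while reference nodes are
  # reachable from both sides: ignore or autoheal.
  echo "cluster_partition_handling.pause_if_all_down.recover = ignore"
  i=1
  for node in "$@"; do
    echo "cluster_partition_handling.pause_if_all_down.nodes.$i = $node"
    i=$((i + 1))
  done
}

# Node names are examples only; replace them with your own.
pause_if_all_down_conf \
  rabbit@my-rabbit-server-01.example.com \
  rabbit@my-rabbit-server-02.example.com
```

The printed lines would replace the single cluster_partition_handling line in rabbitmq.conf.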


Make a detached node join the cluster again

rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl join_cluster rabbit@my-rabbit-server-02.example.com
rabbitmqctl start_app
rabbitmqctl cluster_status
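
The steps above can be wrapped so that a failing step stops the sequence. A minimal sketch: the function name rejoin_cluster and the RABBITMQCTL override are assumptions for illustration, the rabbitmqctl subcommands are the ones listed above:

```shell
# Rejoin a detached node to the cluster that $1 belongs to.
# Run this on the detached node. Note that "reset" wipes the local
# node's data before it joins again.
# RABBITMQCTL is an assumed override, handy for a dry run with "echo".
rejoin_cluster() {
  target="$1"
  ctl="${RABBITMQCTL:-rabbitmqctl}"
  if [ -z "$target" ]; then
    echo "usage: rejoin_cluster rabbit@host" >&2
    return 2
  fi
  # Each step only runs if the previous one succeeded.
  "$ctl" stop_app &&
    "$ctl" reset &&
    "$ctl" join_cluster "$target" &&
    "$ctl" start_app &&
    "$ctl" cluster_status
}

# Dry run that only prints the subcommands:
#   RABBITMQCTL=echo rejoin_cluster rabbit@my-rabbit-server-02.example.com
```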


Enable the Prometheus /metrics endpoint to monitor what RabbitMQ is doing

rabbitmq-plugins enable rabbitmq_prometheus

Configuration options

prometheus.return_per_object_metrics = false
prometheus.path = /metrics
prometheus.tcp.port = 15692
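
To confirm the endpoint answers, something like the following may help. It assumes curl is installed and that the port is reachable from where you run it (with Docker the port has to be published); the function names and the RABBIT_* variables are invented for this sketch:

```shell
# Build the scrape URL from the settings above.
# RABBIT_HOST, RABBIT_PROM_PORT and RABBIT_PROM_PATH are assumed names.
rabbit_metrics_url() {
  printf 'http://%s:%s%s\n' \
    "${RABBIT_HOST:-localhost}" \
    "${RABBIT_PROM_PORT:-15692}" \
    "${RABBIT_PROM_PATH:-/metrics}"
}

# Fetch the endpoint and show the first 20 lines.
# Returns non-zero if curl cannot fetch the URL.
check_rabbit_metrics() {
  body="$(curl -fsS "$(rabbit_metrics_url)")" || return 1
  printf '%s\n' "$body" | head -n 20
}

# Usage (needs a running broker with the plugin enabled):
#   check_rabbit_metrics
```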