How Message Queues Work

Understanding Publishing, Storage, Consumption, and Fault Tolerance in Message Queues

Jan 24, 2025

Message queues enable asynchronous communication between services in modern distributed systems. They decouple producers (services that send messages) from consumers (services that process messages), enhancing scalability, resilience, and overall system performance.

Publishing Messages

When a producer application wants to send data to another service, it doesn't call that service directly. Instead, it sends a message to a specific queue. This message includes the data being sent (the payload) and details about the message (the metadata), such as its type, priority, and sender. The producer uses a client library from the message queue system to connect and send the message. The message queue broker acknowledges the producer after successfully receiving the message. This decoupling allows the producer to continue working without waiting for the consumer to handle the message.
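The publish step can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real client library: the `Message` and `InMemoryBroker` classes, the `orders` queue name, and the metadata fields are all hypothetical names chosen for the example.

```python
import uuid
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Message:
    payload: dict    # the data being sent
    metadata: dict   # details about the message: type, priority, sender
    message_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class InMemoryBroker:
    """Hypothetical stand-in for a real broker; it only stores messages."""

    def __init__(self):
        self.queues = {}

    def publish(self, queue_name, message):
        # Store the message first, then acknowledge receipt to the producer.
        self.queues.setdefault(queue_name, deque()).append(message)
        return {"ack": True, "message_id": message.message_id}


broker = InMemoryBroker()
msg = Message(
    payload={"order_id": 42, "total": 19.99},
    metadata={"type": "order.created", "priority": "normal", "sender": "checkout"},
)
ack = broker.publish("orders", msg)
# The producer can move on once it has the acknowledgement;
# no consumer has touched the message yet.
```

Note that the acknowledgement only confirms that the broker has the message. It says nothing about whether a consumer has processed it.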

Message Storage

Once a message reaches the message queue broker, it must be stored efficiently and reliably. Many modern message queue systems use topics to achieve this. A topic acts as a logical channel for messages of a specific type. Producers publish messages to a specific topic, and consumers subscribe to the topics they are interested in.

Messages within a topic are divided into smaller groups called shards or partitions. Sharding allows the message queue to spread the incoming load across multiple servers or storage units, significantly improving throughput and scalability. Each shard is an independent, ordered list of messages. When a producer publishes a message to a topic, the message queue system uses a partitioning key (often derived from the message content) to determine which shard the message should be written to.
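One simple way to map a partitioning key to a shard is to hash the key and take the result modulo the number of shards. The sketch below assumes that scheme; real systems may use a different hash function or a different assignment strategy altogether. `Topic`, `shard_for`, and the key `customer-17` are hypothetical names for illustration.

```python
import hashlib


def shard_for(partition_key: str, num_shards: int) -> int:
    # A stable hash, so the same key always maps to the same shard.
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Topic:
    def __init__(self, name, num_shards):
        self.name = name
        # Each shard is an independent, ordered list of messages.
        self.shards = [[] for _ in range(num_shards)]

    def append(self, partition_key, payload):
        index = shard_for(partition_key, len(self.shards))
        self.shards[index].append(payload)
        # Return the shard and the message's position (offset) within it.
        return index, len(self.shards[index]) - 1


topic = Topic("orders", num_shards=4)
first = topic.append("customer-17", "order created")
second = topic.append("customer-17", "order paid")
# Both messages share a key, so they land in the same shard, in publish order.
```

Because ordering is only maintained within a shard, choosing a key such as a customer ID keeps related messages in order relative to each other.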

Consuming Messages

Consumers subscribe to one or more topics to receive messages. When a new message arrives in a topic that a consumer is subscribed to, the message queue system delivers the message to the consumer. This delivery can happen in two primary modes:

Push Mode: In push mode, the message queue broker actively pushes messages to the consumer as they arrive.
Pull Mode: In pull mode, the consumer periodically polls the message queue broker for new messages. The pull model offers finer control over message consumption rates. For example, a slow consumer can take its time to process a message before polling for the next message from the queue.

Fig. Push mode and pull mode in message queues
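The difference between the two modes can be sketched with a toy broker. In this hypothetical `Broker` class, a registered callback stands in for push delivery, and `poll` stands in for pull delivery. The method names are assumptions for the example, not the API of any particular system.

```python
from collections import deque


class Broker:
    def __init__(self):
        self.messages = deque()
        self.subscribers = []  # callbacks used for push delivery

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        if self.subscribers:
            # Push mode: the broker calls each consumer as the message arrives.
            for callback in self.subscribers:
                callback(message)
        else:
            # Pull mode: the message waits until a consumer asks for it.
            self.messages.append(message)

    def poll(self, max_messages=1):
        # The consumer decides when to ask and how many messages to take.
        batch = []
        while self.messages and len(batch) < max_messages:
            batch.append(self.messages.popleft())
        return batch


pushed = []
push_broker = Broker()
push_broker.subscribe(pushed.append)
push_broker.publish("a")
push_broker.publish("b")

pull_broker = Broker()
for m in ("a", "b", "c"):
    pull_broker.publish(m)
first_batch = pull_broker.poll(max_messages=2)
second_batch = pull_broker.poll(max_messages=2)
```

In the pull case, the consumer controls its own pace through how often it calls `poll` and how large a batch it requests.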

Fault Tolerance with Message Leases and Retries

A key part of message queues is ensuring messages are delivered and processed even if something goes wrong. This is where message leases and retries are important.

When a consumer gets a message, the message queue system grants it a lease on that message. This lease acts as a time-limited hold, stopping other consumers from simultaneously processing the same message. A message with an active lease stays in the queue but is not visible to other consumers. The consumer that has the lease must process the message before the lease expires. If the processing is successful, the consumer informs the message queue system, which then deletes the message from the queue.

Fig. Message queues grant consumers a timed lease for processing a message

If a consumer doesn't finish processing the message within the lease period, the lease expires. This can happen due to a crash, timeout, or other error. The message queue system then makes the message available to another consumer. This way of retrying messages provides fault tolerance in case of issues.
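The lease lifecycle described above can be sketched as a small in-memory queue. `LeasedQueue` and its methods are hypothetical names, and the injectable `clock` exists only so the example can simulate time passing without actually waiting.

```python
import time


class LeasedQueue:
    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.messages = {}      # message_id -> body
        self.lease_expiry = {}  # message_id -> time the current lease ends

    def send(self, message_id, body):
        self.messages[message_id] = body

    def receive(self):
        now = self.clock()
        for message_id, body in self.messages.items():
            expiry = self.lease_expiry.get(message_id)
            # Visible if never leased, or if the previous lease has expired.
            if expiry is None or expiry <= now:
                self.lease_expiry[message_id] = now + self.lease_seconds
                return message_id, body
        return None

    def ack(self, message_id):
        # Successful processing: delete the message from the queue.
        self.messages.pop(message_id, None)
        self.lease_expiry.pop(message_id, None)


now = [0.0]  # fake clock for the example
queue = LeasedQueue(lease_seconds=30, clock=lambda: now[0])
queue.send("m1", "resize image 7")

first = queue.receive()   # consumer A takes the lease
hidden = queue.receive()  # consumer B sees nothing while the lease is active
now[0] = 31.0             # consumer A crashed; the lease runs out
retry = queue.receive()   # the message is visible again
queue.ack("m1")           # processing succeeded, so the message is deleted
```

This sketch leaves out details a real system would need, such as checking that the consumer calling `ack` still holds the lease.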

However, retries need limits. Strategies such as exponential backoff with jitter, which spaces out repeated attempts, and dead-letter queues (DLQs), which set aside messages that keep failing, are essential to avoid infinite retry loops.
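A bounded retry loop might look like the sketch below. It assumes one common variant, exponential backoff with "full jitter", where each delay is a random value between zero and an exponentially growing ceiling. The function names, the default limits, and the use of a plain list as the dead-letter queue are all assumptions for illustration.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0):
    # Ceiling doubles with each attempt, up to `cap`; jitter picks a random
    # point below it so that many consumers don't retry in lockstep.
    return random.uniform(0, min(cap, base * 2 ** attempt))


def handle_with_retries(message, handler, dead_letter_queue,
                        max_attempts=4, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            handler(message)
            return True
        except Exception:
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    # Out of attempts: park the message instead of retrying forever.
    dead_letter_queue.append(message)
    return False


def always_fails(message):
    raise ValueError("cannot parse message")


dlq = []
delays = []  # record the delays instead of actually sleeping
ok = handle_with_retries("bad payload", always_fails, dlq, sleep=delays.append)
```

Messages in the dead-letter queue can then be inspected or replayed separately, without blocking the rest of the queue.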


The Scalable Thread
