Understanding Stream Processing Guarantees in Apache Kafka
Data Processing Guarantees Provided by Different Stream Processing Systems
What Is a Processing Guarantee?
Like any other distributed system, stream processing systems are prone to failures such as network errors, message broker crashes, and so on. When a failure occurs, a system either crashes or retries, and messages can be lost or duplicated as a result. This behavior can have varied impacts on downstream consumers. For example, a consumer that sends emails and does not expect duplicates will send the same email multiple times if an event is duplicated. This is why all the entities in a streaming system (producer, message broker, and consumer) need to align on what to expect from the system. These expectations are known as processing guarantees. They apply to most stream processing systems, such as Apache Kafka, Google Cloud Pub/Sub, AWS Kinesis, and Azure Event Hubs.
At Least Once Guarantee
This guarantees that consumers will definitely receive each message, but they may receive it more than once. On the producer side, duplicates can arise when the producer publishes a message, fails to receive the broker's acknowledgment (for example, because of a network error), and retries, causing the broker to write the same message a second time.
On the consumer side, in at-least-once mode, consumers process messages first and only then save (commit) the position up to which they have read from the stream, as the sketch below shows. If a consumer crashes after processing but before saving its position, it will reprocess some or all of those messages when it restarts.
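To make the consumer-side ordering concrete, here is a minimal sketch using the Java Kafka client. The broker address, topic name, and consumer group are hypothetical; on the producer side, at-least-once delivery corresponds to waiting for acknowledgments (acks=all) and allowing retries.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AtLeastOnceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "email-sender");            // hypothetical consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false");         // commit positions manually

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));        // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record);                      // do the work first...
                }
                // ...then save the position. A crash before this line means the
                // batch is read and processed again on restart: at least once.
                consumer.commitSync();
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("processing offset %d: %s%n", record.offset(), record.value());
    }
}
```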
At Most Once Guarantee
This guarantees that consumers will receive each message at most once, but there are scenarios in which they do not receive it at all (best effort). On the producer side, messages can be lost when producers do not wait for an acknowledgment of their writes: data can be lost in flight, or after being written if it had not yet been replicated to other nodes and the node holding it crashed.
On the consumer side, in at-most-once mode, consumers save the position up to which they have read from the stream before processing the messages. If a consumer crashes after saving its position but before generating the output, those messages are skipped and never processed.
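As a sketch of the producer side, the fire-and-forget configuration below disables acknowledgments and retries so each message is written at most once (broker address, topic, and payload are hypothetical). On the consumer side, the only change from the previous sketch would be calling commitSync() before the processing loop rather than after it.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AtMostOnceProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "0");                   // fire and forget: do not wait for the broker
        props.put("retries", "0");                // never retry a failed send
        props.put("enable.idempotence", "false"); // idempotence requires acks=all, so turn it off

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // If this send is lost in flight, or the receiving node crashes before
            // the write is replicated, the message is gone: at most once.
            producer.send(new ProducerRecord<>("orders", "order-42", "shipped")); // hypothetical record
        }
    }
}
```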
Exactly Once Guarantee
This guarantees that consumers will receive each message exactly once. It is the ideal scenario and the most complex of the three to implement. On the producer side, it means transactional delivery: the producer waits for acknowledgment of its writes, and retries are idempotent, so resending a message can never create a duplicate.
On the consumer side, it can be achieved by saving the consumer's read position and the output produced from the data atomically, usually using a two-phase commit approach. If a failure occurs, the transaction is aborted, and the consumer simply re-reads the data.
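In Kafka specifically, this combination is exposed through the idempotent producer and the transactions API. Below is a minimal sketch of a transactional producer; the broker address, transactional.id, topic, and payload are hypothetical.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ExactlyOnceProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("enable.idempotence", "true");          // retries can never create duplicates
        props.put("transactional.id", "order-writer-1");  // hypothetical id; must be stable across restarts

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            try {
                producer.send(new ProducerRecord<>("orders", "order-42", "shipped")); // hypothetical record
                producer.commitTransaction(); // records become visible atomically, or not at all
            } catch (Exception e) {
                producer.abortTransaction();  // aborted records are never exposed to read_committed consumers
                throw e;
            }
        }
    }
}
```

Downstream consumers must set isolation.level=read_committed to skip records from aborted transactions, and in a consume-process-produce pipeline the producer's sendOffsetsToTransaction() folds the consumer's read position into the same transaction, committing position and output together as described above.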
References
Message Delivery Guarantees for Apache Kafka. (n.d.). Confluent Documentation. https://docs.confluent.io/kafka/design/delivery-semantics.html