Consistency Models in Distributed Systems
Understanding How Data is Replicated With Strict & Eventual Consistency in Distributed Systems
Consistency & Consistency Model
In a distributed system, the network of nodes appears as a single unit from the outside, but internally the nodes must follow specific communication protocols to share data with one another. These protocols determine the behavior a client can expect when reading or writing data, which is known as Consistency.
When interacting with such a system, the client must understand its Consistency behavior to meet the desired goals for scalability, performance, and other design requirements. The agreement between the consistency the client requires and the consistency the system provides is expressed as a contract called the Consistency Model.
Strict Consistency Model
As the strongest consistency guarantee, the Strict Consistency model ensures that all client operations, even concurrent ones, take effect in the same order on every node, as if they had been executed sequentially on a single-processor system. This guarantees that every read returns the most recent write, irrespective of which replica serves it.
However, this strong data guarantee comes at a cost. Maintaining strict consistency requires mechanisms such as holding locks on resources, or protocols like Two-Phase Commit to propagate changes linearly across all participating nodes. This coordination adds latency, especially when there are network delays or node failures, or when updates must reach systems spread across multiple geo-locations.
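The cost of coordinating every replica can be illustrated with a minimal sketch of a Two-Phase Commit style write. Everything here is hypothetical and simplified: the `Replica` class, the `strict_write` function, and the `delay_ms` round-trip figures are assumptions made for illustration, replicas are assumed to be contacted in parallel, and failure handling is reduced to a single abort path.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica with a simulated round-trip delay."""
    name: str
    delay_ms: int
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, key, value) -> bool:
        # Phase 1: stage the write and vote to commit.
        self.staged[key] = value
        return True

    def commit(self, key) -> None:
        # Phase 2: make the staged write visible to readers.
        self.data[key] = self.staged.pop(key)


def strict_write(replicas, key, value) -> int:
    """Acknowledge a write only after every replica has applied it.

    Returns the simulated latency in milliseconds, assuming the replicas
    are contacted in parallel so each phase waits for the slowest one.
    """
    votes = [r.prepare(key, value) for r in replicas]
    if not all(votes):
        # Abort: discard whatever was staged and report the failure.
        for r in replicas:
            r.staged.pop(key, None)
        raise RuntimeError("a replica voted to abort")
    for r in replicas:
        r.commit(key)
    slowest = max(r.delay_ms for r in replicas)
    return 2 * slowest  # one round trip for prepare, one for commit


replicas = [Replica("us-east", 5), Replica("eu-west", 80), Replica("ap-south", 200)]
latency = strict_write(replicas, "balance", 100)
# Every replica now holds the same value, so a read from any of them is fresh.
values = [r.data["balance"] for r in replicas]
```

In this sketch the write latency is bounded by the slowest replica, which is why a single distant or unhealthy node slows down every write under a strict model.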
Eventual Consistency Model
In some situations, a client may not need the most recent data to operate correctly and may instead prioritize low latency and high availability. These scenarios call for a weaker consistency guarantee known as the Eventual Consistency model, which ensures that, once writes stop arriving, all nodes in the system will eventually reflect the most recent write. This model may result in stale reads if the data is read from a replica that has not yet received the most recent write.
Because the nodes converge only eventually, replicas may temporarily diverge during a network partition or delay. On the bright side, increasing the node count in the system has little effect on write latency, as this consistency model doesn't require all nodes to be updated before a write operation is acknowledged.
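The stale-read behavior described above can be sketched in a few lines. This is a toy model under stated assumptions, not a real replication protocol: `EventualStore`, its `write`, `read`, and `sync` methods, and the single global version counter used for last-writer-wins ordering are all hypothetical names and simplifications.

```python
from collections import deque


class EventualStore:
    """Toy eventually consistent store with asynchronous replication."""

    def __init__(self, replica_names):
        self.replicas = {name: {} for name in replica_names}
        self.pending = deque()  # queued updates: (target, key, value, version)
        self.version = 0        # assumption: one global counter orders writes

    def write(self, coordinator, key, value):
        # Acknowledge as soon as one replica has the write; the rest
        # receive it later, so latency does not grow with replica count.
        self.version += 1
        self.replicas[coordinator][key] = (self.version, value)
        for name in self.replicas:
            if name != coordinator:
                self.pending.append((name, key, value, self.version))

    def read(self, replica, key):
        # Reads are served locally and may therefore be stale.
        entry = self.replicas[replica].get(key)
        return None if entry is None else entry[1]

    def sync(self):
        # Deliver queued updates; the newer version wins on each replica.
        while self.pending:
            name, key, value, version = self.pending.popleft()
            current = self.replicas[name].get(key)
            if current is None or current[0] < version:
                self.replicas[name][key] = (version, value)


store = EventualStore(["node-a", "node-b", "node-c"])
store.write("node-a", "cart", ["book"])
fresh = store.read("node-a", "cart")         # the coordinator sees the write
stale = store.read("node-b", "cart")         # not yet replicated here
store.sync()
converged = store.read("node-b", "cart")     # replicas agree after sync
```

Between `write` and `sync` the replicas disagree, which is the temporary divergence the model permits; after `sync` every replica returns the same value.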
Here are some common consistency models:
- Strong Consistency
- Eventual Consistency
- Causal Consistency
- Sequential Consistency
- Linearizability
- Session Consistency
- Read-Your-Writes Consistency