Table of Contents
Overview
In simple terms, a system is strongly consistent if it is reading the most updated write. It never returns out of date or stale values. As per the Wikipedia, the definition of Strong Consistency is
The protocol is said to support strong consistency if: All accesses are seen by all parallel processes (or nodes, processors, etc.) in the same order (sequentially). Therefore, only one consistent state can be observed, as opposed to weak consistency, where different parallel processes (or nodes, etc.) can perceive variables in different states.
The formula for Strong Consistency
To understand strong consistency in Databases we need to understand three terms first
- Number of Nodes or Replicas
- Write Quoram
- Read Quoram
Number of Nodes or Replicas
This is the number of Nodes that exist in your system. When we say node here then that essentially means the number of replicas that have the same data
Write Quoram
It is the minimum number of nodes to which write will happen before it is returned as a success.
Read Quoram
It is the minimum number of nodes from which the read will happen before it is returned as a success. If the value returned by those minimum number of nodes is not the same then that read will be rejected
Mathematically if the sum of Write Quoram and Read Quoram is greater than the number of read nodes then the system is said to be strongly consistent.
That is if Write Quoram is W and Read Quoram is R and the number of nodes is N, then
- The system is strongly consistent if W+R > N
- The system is not guaranteed to be strongly consistent if W+R<=N
In other words, if there is a node common between Write Quoram nodes and Read Quoram nodes then the system will be strongly consistent. That common node will reject the stale reads. A node is always common between Write Quoram and Read Quoram whenever W+R > N. It is not mathematically difficult to know why.
Let’s see by some examples how this formula holds. We are going to see the following cases.
- The number of nodes is 1
- The number of nodes is 2
- The number of nodes is 3
The number of nodes is 1
In this case, the only combination of Write and Read Quoram possible is 1 and 1. The Sum of Write Quoram and Read Quoram is greater than the read node. Since there is a single instance, read is always the latest because read and write are happening on the same instance
The number of nodes is 2
In this, there are further four cases.
- Write Quoram is 2 and Read Quoram is 1 – Strongly Consistent
- Write Quoram is 1 and Read Quoram is 2 – Strongly Consistent
- Write Quoram is 2 and Read Quoram is 2 – Strongly Consistent
- Write Quoram is 1 and Read Quoram is 1 – Not Strongly Consistent
Write Quoram is 2 and Read Quoram is 1
In this case, the sum of Write Quoram and Read Quoram is 3 which is greater than the number of nodes hence system should be strongly consistent. In this case, since write is happening on both the nodes, hence read from either of the nodes will return the same data and the system overall will be strongly consistent
Write Quoram is 1 and Read Quoram is 2
Again, in this case, the sum of Write Quoram and Read Quoram is 3 which is greater than the number of nodes hence system should be strongly consistent. In this case, write is happening only on a single instance. But read is happening from both the nodes. Imagine a case where the initial value of a data named A is 1. Both the nodes have a value of A as 1. Now write happened on the first node and the value of A is changed to 2. Since Write Quoram is 1 hence write will happen on the first node and it will return successfully. Imaging before node 2 could sync up with the latest data on node 1, a read happened. Since Read Quoram is 2 it is going to read from both the nodes. The first node will return the value of A as 2 while the second node will return the value of A as 1. Since the value returned by both the nodes is not the same, the system will reject that read to maintain strong consistency.
Write Quoram is 2 and Read Quoram is 2
Again, in this case, the sum of Write Quoram and Read Quoram is 4 which is greater than the number of nodes hence system should be strongly consistent. In this case, since write is happening on both the nodes and read also from both the nodes. Therefore the read will be the latest every time.
Write Quoram is 1 and Read Quoram is 1
Again, in this case, the sum of Write Quoram and Read Quoram is 2 which is equal to the Number of nodes hence system is not strongly consistent. Write happened on node 1. A read happened on node 2 before data could be synced to node 2. This read is stale data.
The number of nodes is 3
In this scenario, there can be multiple cases but we are only going to discuss a few cases.
- Write Quoram is 2 and Read Quoram is 2 – Strongly Consistent
- Write Quoram is 2 and Read Quora is 1 – Not Strongly Consistent
- Write Quoram is 1 and Read Quoram is 2 – Not Strongly Consistent
Write Quoram is 2 and Read Quoram is 2
In this case, the sum of Write Quoram and Read Quoram is 4 which is greater than the number of nodes hence in the system should be strongly consistent. Imagine there are three nodes: node1, node2, and node3.
Again imagine a case where the initial value of a data named A is 1. A Write happened on node 1 and node 2 and the value of A is increased to 2. Before the data could be synced to node 3, a read happened. For read, there could be three cases.
- Read happened from node 1 and node 2
- Read happened from node 1 and node 3
- Read happened from node 2 and node 3
a. Read happened from node 1 and node 2
In this case, there is no issue, the value of A will be returned as 2 from both the two nodes and the system will be strongly consistent.
b. Read happened from node 1 and node 3
In this case node 1 will return the value as 2 while node 3 will return the value of A as 1. Since both the values are not the same the read will be rejected and the system will be strongly consistent.
c. Read happened from node 2 and node 3
In this case, also node 2 will return the value as 2 while node 3 will return the value as 1. Since both the values are not the same the read will be rejected and the system will be strongly consistent.
Write Quoram is 2 and Read Quoram is 1
In this case, the sum of Write Quoram and Read Quoram is 3 which is not greater than the number of nodes in the system should not be strongly consistent
Write happens on node 1 and node 2 and the value of A is updated to 2. Read happened from node3 and value is returned as 1 making the system not strongly consistent or weak consistent
Write Quoram is 1 and Read Quoram is 2
Write happened on node1 and the value of A is updated to 2. Read happened from node2 and node3 and both nodes return the value of A as 1. Since both the return values are the same, the read is not rejected and the system is not strongly consistent.
Disadvantages of Strong Consistency
It is only possible to achieve strong consistency at the cost of availability in the case of a network partition. This is as per the CAP theorem as well. CAP theorem states that in a distributed system, you will only be able to achieve two of the three properties below
- Consistency
- Availability
- Partition Tolerance
It is not practically possible to achieve partition tolerance because the network is bound to be failed and communication between nodes is bound to be lost. That is why in a distributed system that is not partitioned tolerant, it is only possible to achieve either consistency or availability.
Let’s take an example to understand why one has to be sacrificed. Assume you have two nodes in the system. Both are connected to each other and both are in sync.
Now let’s say a network partition happened between the two nodes
How consistency is achieved
Here the number of nodes is 2. Hence two achieve strong consistency we have three options
- Write Quoram is 2 and Read Quoram is 1
- Write Quoram is 1 and Read Quoram is 2
- Write Quoram is 2 and Read Quoram is 2
In the first case since Write Quoram is 2 hence it must write to both the nodes. But since the second node is not available. Hence it will reject the write. Hence the system is not available for write
In the second case since Read Quoram is 2 hence it must read from both the nodes. But since the second node is not available. Hence it will reject the read. Hence the system is not available for reading
In the third case since Write Quoram is 2 hence it must write to both the nodes and since Read Quoram is 2 hence it must read from both the nodes. But since the second node is not available. Hence it will reject both read and write. Hence the system is not available for reading as well as writing
From the three cases you can deduce that we get strong consistency at the cost of availability in case of network partitions and network partitions are bound to happen
How availability is achieved
To achieve availability we must give up consistency. Here the number of nodes is 2. Hence there is one option in which the system will not be strongly consistent.
- Write Quoram is 1 and Read Quoram is 1
In the first case since Write Quoram is 1, hence it can write to node 1. Also, the Read Quoram is 1, and hence it can read from node 1. Hence the system is available. The sum of Write Quoram and Read Quoram is 2 which is equal to the number of nodes. We know by the formula that when the sum of Write Quoram and Read Quoram is less than equal to the number of nodes, then the system is not strongly consistent. Hence in this case we have availability but the system is not consistent.
When is strong consistency desirable?
There are a couple of use cases when strong consistency is desirable. For example in the case of banking transactions, it is desirable to get the latest value every time you read.
This is all about Strong Consistency in Databases. Hope you have liked this article.
Note: Also, check out our system design tutorial series here – System Design Tutorial Series