While relational databases rely on ACID principles, most NoSQL stores are built around a different set of trade-offs, the ones described by the CAP theorem. For starters, it names three basic properties that a distributed data store may try to provide:
- Consistency. Every read sees the most recent successful write, so all nodes in the cluster show the same data at the same time.
- Availability. Strictly speaking, every request (write or read) sent to a working node gets a response. In practice this is usually felt as how quickly the server answers us.
- Partition tolerance. This means that if the network splits the system into several parts, each of them, if reachable, should be able to keep working autonomously, giving a correct response and serving its data. A broken link in the cluster should not bring the whole system down.
The CAP theorem tells us that we can only get two of these three properties at once. Or, in that same two-to-one spirit, we can strike a balance between them: improving one property entails degrading another. To appreciate the force of this theorem, imagine a distributed store where you are trying to provide 100% consistent data (reads and writes) with no slowdowns: every write must reach every node before it is acknowledged, so a single slow or unreachable node holds everyone up.
In a single-server architecture, if the server is running, it is available. And the database on it, if the designer knew what they were doing, is consistent. There is no need to worry about partitions between nodes, since the system is physically indivisible. It is under such conditions that classical relational systems emerged. And this is why they are not partition-tolerant: it makes no sense to design a separable structure for an indivisible environment.
In fact, almost all NoSQL technologies were born to solve the problem of partition tolerance, that is, to run efficiently on clusters. The relational model is not up to the task because it was created for other purposes and in other environments. You will not be able to “just saw off two or three tables or quietly move them to a neighboring node” and then go for coffee. Welcome to Hell.
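Many NoSQL stores, by their very nature, are easy to spread across a cluster because of how they store data: records are self-contained units (key-value pairs, documents) addressed by a key, so each one can live on whichever node its key maps to.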
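To make the idea of key-addressed records concrete, here is a minimal sketch of hash-based sharding. It is an illustration only, not how any particular NoSQL product works: the names `shard_for` and `ShardedStore` are hypothetical, each "node" is just an in-memory dictionary, and real stores typically use more elaborate schemes (such as consistent hashing) so that adding a node does not reshuffle most keys.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce it
    # to the range [0, num_shards).
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store whose records are spread over several shards.

    Each shard is a plain dict standing in for a separate node.
    """

    def __init__(self, num_shards: int) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # The whole record goes to exactly one shard, chosen by its key.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        # A lookup touches only the shard the key maps to.
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=3)
    store.put("user:1", {"name": "Alice"})
    store.put("user:2", {"name": "Bob"})
    print(store.get("user:1"))
    print([len(shard) for shard in store.shards])
```

Because every record is found by its key alone, no lookup needs to consult more than one shard. A relational join across tables that live on different nodes has no such shortcut.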
It is in a distributed environment that the true essence of the CAP theorem becomes apparent. Obviously, a cluster that cannot tolerate partitions is of no practical use. That is, the cluster must be designed as partition-tolerant from the start. Understanding this fact allows us to see the CAP theorem in a new light: when a partition happens, one can choose only one of consistency and availability, or settle on a reasonable compromise between these two items (not three, as one might think from the original definition).
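The choice between consistency and availability during a partition can be sketched with a toy two-replica cluster. Everything here is a simplifying assumption made for illustration: the `Replica` and `Cluster` classes and the `"CP"`/`"AP"` mode flag are hypothetical, replication is instantaneous when the link is up, and a real system would use quorums and conflict resolution rather than a single boolean.

```python
class Replica:
    """One node holding its own copy of the data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}


class Cluster:
    """Toy two-node cluster that is either consistency-first or availability-first."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True while the link between a and b is down

    def write(self, replica: Replica, key: str, value) -> bool:
        """Try to write through the given replica; return True if accepted."""
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: the write cannot reach the other
                # replica, so it is refused and availability is lost.
                return False
            # Availability first: accept the write locally and let the
            # replicas diverge until the partition heals.
            replica.data[key] = value
            return True
        # Healthy link: apply the write to both replicas.
        replica.data[key] = value
        other.data[key] = value
        return True


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        cluster = Cluster(mode)
        cluster.write(cluster.a, "x", 1)
        cluster.partitioned = True
        accepted = cluster.write(cluster.a, "x", 2)
        print(mode, "accepted:", accepted,
              "a:", cluster.a.data, "b:", cluster.b.data)
```

In the `"CP"` run the second write is rejected and both replicas still agree. In the `"AP"` run the write succeeds, but the two replicas now hold different values for the same key, which is exactly the compromise the paragraph above describes.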
The second task that NoSQL proponents try to solve is increasing availability, i.e. getting a fast response from the server.