Database Replication
Question:
What exactly is replication? What purpose does it serve? What is the purpose of its use? What are the most pressing issues?.
Summary:
- We need to define database replication.
- To show and for what it is used.
- And to state the issues of replication.
Explanation:
Database Replication:
Database replication means retaining multiple copies of the same data. The databases which keep copies of the same data are called replicas. We can read, recover the data from the replica.
For example, a user submits a form with some useful information. The user’s browser hits our backend server with this data, the server then writes the data to the main database so that it is persistent. The main database writes the data to the two replicas of different machines. Now all three databases will have the user information.
WHY DO WE NEED DATABASE REPLICATION?
- Lower latency.
- Keeping data close to the users.
- Data safety.
Two types of database replications:
1)Synchronous replication:
The server writes the data to the main database. And then the main database writes to the replica database. The replica database acknowledges back to the main database by confirming that it wrote data safely.
And finally, the main database tells the server that the write was successful. Now both the main database and replica database has the same data.
Problems with synchronous replication:
1)Slower writes:
For example, let us say there are two replicas and one main database. As we can see the two replicas have taken two different times to confirm that they received the data. One replica has taken 100 milliseconds to confirm and the other has taken 20 milliseconds to confirm.
So even though the main database has written the data successfully, we must have to wait 100ms before the server move on to the next request. This is with two replicas, imagine if there are thousands of replicas this delay will be longer.
2) Chances of failure:
Let’s say one of the two replicas is down for some reason. So it cannot acknowledge, in that case, the whole write operation fails. we don’t consider that a successful write.
2)Asynchronous replication:
Here the server writes to the main database. The main database tells the server the write was successful. After that, the main database writes the data to the replica without needing any acknowledgments. The replication in the replica is done in the background. So the write performance won’t suffer.
Problems with asynchronous replication:
1) Data loss:
Let’s say that the replica database is down when the write happens. Now the main database will have the data. However, the replica won’t have the data, so there will be some data loss.
2) Data inconsistency:
There will be a lot of data inconsistency due to the loss of data in the replicas.
Also, read Using python regular expressions, write python script.