Parallel Replication
Overview
During primary-secondary replication, the I/O thread of the secondary database receives the binlog and writes it to the relay log. The SQL thread of the secondary database replays the relay log and writes the update to the local drive. By default, one secondary database has only one SQL thread. In the case of high concurrency on the client, data gaps will exist between the primary and secondary databases if the speed of generating binary logs (binlogs) on the primary database is higher than the speed of consuming binlogs on the secondary database. Parallel replication reduces the data gap between the primary and secondary databases by increasing the degree of parallelism (DOP) of the SQL thread in the secondary database.
Service Process
The parallel replication solution based on the logic clock is introduced in MySQL 5.7. The solution supports two modes: Commit-Parent-Based and Lock-Based. In versions later than MySQL 5.7.21, WriteSet replication is added.
The MySQL transaction commit modes include:
- MySQL transaction commit
The MySQL transaction commit process consists of two stages: preparation and commit.
Preparation stage: After SQL statements are successfully executed, the xid information, redo log, and undo log are generated. The prepare method is called to set the transaction status to TRX_PREPARED and write the redo log to the drive.
Commit stage: After the preparations of all storage engines involved in the transaction are complete, the TC_LOG_BINLOG::log_xid method is called to write the SQL statements to the binlog. Then, the commit stage of the storage engines is called to commit the transaction and change the transaction status to TRX_NOT_STARTED.
- MySQL transaction group commit
The transaction group commit process is introduced in MySQL 5.6. The process consists of three stages: flush, sync, and commit.
- Flush stage: All registered threads are written to the binlog for caching.
- Sync stage: The binlog is synchronized to the drive for caching.
- Commit stage: The leader thread calls the storage engine to commit transactions in sequence.
Each stage has its own queue. When a thread is registered with an empty queue, the thread functions as the leader of the queue. Then, the threads that are registered with the queue are followers. The leader thread controls the operations of the follower threads in the queue. That is, the leader thread leads all follower threads in the current queue to the next stage.
In the commit stage, storage engine commit is performed. The binlog_order_commits parameter affects the commit behavior. If this parameter is set to On, the commit behavior of the storage engine is changed to a serial operation, and the commit sequence is based on the queue sequence. If this parameter is set to OFF, each transaction in the group performs the commit operation at the storage engine layer after the leader thread completes the commit operation.
Group commit is the basis of parallel replication of logic clocks. It groups all transactions and assigns last_committed and sequence_number to each transaction. last_committed indicates the commit number of the last transaction in the database. The transactions in the same group have the same last_committed value. The sequence_number values increase in sequence, and each transaction corresponds to a sequence number.
- Commit-Parent-Based and Lock-Based
In Figure 1, C indicates last_committed, indicating the maximum logical commit time obtained before transactions enter the prepare stage.
S indicates sequence_number, that is, the logical time after each transaction is committed.
- Commit-Parent-Base
- Transactions with the same C value belong to the same group. Transactions in the same group do not conflict with each other and can be executed concurrently.
- Trx1, Trx2, Trx3, and Trx4 do not belong to the same transaction group and cannot be executed concurrently.
- Trx5, Trx6, and Trx7 belong to the same transaction group. They have the same C value 4, and can be executed concurrently.
- Trx8 and Trx9 belong to the same transaction group. They have the same C value 6, and can be executed concurrently. However, they can be executed only after Trx7 is complete.
In Commit-Parent-Base mode, fewer group transactions of the primary database indicate lower DOP.
- Lock-Based
If overlapping transactions exist (that is, there is an intersection between the C and S values of transactions), there is no lock conflict, and these transactions can be executed concurrently at the prepare stage.
- Trx1, Trx2, Trx3, and Trx4 do not overlap and cannot be performed concurrently.
- Trx5, Trx6, and Trx7 overlap with each other and can be performed concurrently.
- Trx7, Trx8, and Trx9 overlap with each other, but they are not in the same group and can be executed in parallel. After Trx5 and Trx6 are executed, Trx7, Trx8, and Trx9 can be executed in parallel. The parallel granularity of Lock-Based is finer than that of Commit-Parent-Base.
In the parallel replication process, the binlog_prepare function is called to generate a last_committed value for each transaction. The last_committed value of the first transaction in the Flush queue is used as the last_commit value of the queue. The binlog_cache_data::flush function can be called to obtain the sequence_number value of each transaction, the Mts_submode_logical_clock:Mts_submode function can be called to implement parallel replication of secondary databases.
- Commit-Parent-Base
- WriteSet Replication
WriteSet checks whether write conflicts exist between different transactions. When the relay log is written, last_committed of transactions that do not conflict with each other are set to the same value, and the parallel replay of transactions is replanned. Therefore, the concurrency of secondary databases does not depend on the primary database. Instead, the parallel relationship is determined based on the update conflict of transactions. WriteSet performs hash calculation on the modified primary key and non-null unique index values of tables which are contained in committed transactions, and compares the calculation result with the values in the hash map (which stores recently committed transactions) in the memory to determine whether the current transaction conflicts with committed transactions. If no conflict exists, the current transaction and committed transactions share the same last_committed value. If a conflict exists, all WriteSets committed before the conflict transaction are deleted from the committed WriteSet in the hash map, and the last_committed value is calculated in commit_order mode.
Figure 2 WriteSet conflict detection process
- If the hash value of row 1 in the WriteSet can be found in the hash map which stores recently committed transactions, row 1 is in conflict with the hash map. The SQL thread changes the sequence_number value of row 1 in the hash map to 130 and the last_committed value of the WriteSet to 120.
- If the hash value of row 7 in the WriteSet can be found in the hash map which stores recently committed transactions, row 7 is in conflict with the hash map. The SQL thread changes the sequence_number value of row 7 in the hash map to 130. The sequence_number value of row 7 in the hash map is 114, which is less than 120 (the last_committed value of the WriteSet). Therefore, the SQL thread does not change the last_committed value of the WriteSet.
- If the hash value of row 6 in the WriteSet can be found in the hash map which stores recently committed transactions, row 6 is in conflict with the hash map. The SQL thread changes the sequence_number value of row 6 in the hash map to 130. The sequence_number value of row 6 in the hash map is 105, which is less than 120 (the last_committed value of the WriteSet). Therefore, the SQL thread does not change the last_committed value of the WriteSet.
- The hash value row 10 in the WriteSet cannot be found in the hash map which stores recently committed transactions. The SQL thread inserts a new record into the hash map.
Row 1, row 7, and row 6 in the WriteSet are in conflict with the hash map and therefore cannot be executed concurrently on secondary databases. Row 10 is not conflict with the hash map and therefore can be executed on secondary databases.
Application Scenarios
Parallel replication enhances the performance of replaying new data on the secondary database, facilitating the query of latest data. However, parallel replication does not ensure that real-time updated data can be queried.
