MyCAT Working Principles
After intercepting a SQL statement from a user, MyCAT performs sharding, route, read/write splitting, and cache analyses on the SQL statement, sends it to the back-end database for proper processing, and then returns the result to the user.
Specifically, after receiving an SQL statement, MyCAT parses the SQL statement, searches for the involved table, and checks the definition of the table. If there is a sharding rule, MyCAT obtains the value of the sharding key in the SQL statement and matches the sharding function to obtain the shard list corresponding to the SQL statement. Then, the SQL statement is sent to these shards for execution. Finally, the result data returned by all shards is collected and processed, and then output to the client.
Take the select * from Orders where prov=? statement as an example. If prov=wuhan is found, dn1 is returned based on the sharding function. Therefore, the SQL statement is sent to MySQL, and the query result is obtained from DB1 and then returned to the user.
As open source distributed database middleware, MyCAT aims to enhance and extend MySQL functions. It can implement the read/write splitting for the MySQL database. MyCAT can also detect the status of the back-end MySQL database, and supports the failover mechanism to ensure high availability of MySQL services.
MyCAT for MySQL Read/Write Splitting
MyCAT implements read/write splitting by leveraging the primary-secondary replication architecture of MySQL. In the read/write splitting configuration, writeHost serves as the primary node, and readHost serves as the secondary node. In normal cases, MyCAT uses the first writeHost as the writer node, and all DML SQL statements are sent to this node. After data is written to the writer node, it is synchronized to the secondary node through primary-secondary replication, ensuring data consistency. If read/write splitting is enabled on MyCAT, SELECT SQL statements are sent to the readHost for execution based on the read/write splitting policy.
MyCAT for MySQL HA
MyCAT is middleware at the proxy layer. The high availability (HA) of the MyCAT system involves the HA of MyCAT itself and that of MySQL at the back end. The MySQL HA solution can be used to ensure the availability of the back-end MySQL service connected to MyCAT. In most cases, you are advised to use the standard MySQL primary-secondary replication HA configuration and use MyCAT to implement automatic primary/secondary switchover of back-end MySQL nodes.
In the figure above, writeHost indicates the primary node, and readHost indicates the secondary node. MyCAT periodically detects heartbeats of all writeHosts and readHosts in a dataHost. If multiple writeHosts are configured in a dataHost and the first writeHost fails, MyCAT automatically switches to the next available writeHost after the heartbeat detection fails for three times (default), and records the index of the current writeHost in the conf/dnindex.properties file. MyCAT only distributes SQL statements. The proactive synchronization of data sources depends on the functions of the MySQL database. If data consistency is strongly required, you are advised to set switchType to 2. After the first MySQL writeHost recovers, it is re-added to MyCAT and functions as a secondary node, and primary-secondary synchronization is reconfigured.