MySQL Read/Write Deadlock Due to ARM Weak Memory Sequence
Problem Description
Sysbench is used to perform read and write tests on MySQL 5.7 Community Edition. After a period of time, MySQL 5.7 deadlock occurs and the test stops.
Test Environment
- Network topology
Sysbench is installed on an x86 server and MySQL database is installed on an ARM server and connected via a switch.
Software Type
CPU
DDR
Drive Type
OS
Sysbench
E5-2695 v4/2.10 GHz/72-core
384GB
SATA
Euler 2.0 (4.16.2)
MySQL
Kunpeng 920/2.5 GHz/96-core
512GB
SSD
Ubuntu 16.04 (4.4.0)
- Dataset
The test database contains 64 tables, and each table contains 10,000,000 rows of data.
- Test script
Use the oltp.lua file of sysbench to perform the test. The test command is as follows:
1/ssd/zqsource/rds-perf/source/sysbench-0.5/sysbench/sysbench --test=/ssd/zqsource/rds-perf/source/sysbench-0.5/sysbench/tests/db/oltp.lua --oltp-tables-count=64 --oltp-table-size=10000000 --mysql-host=$HOST --mysql-port=$PORT --mysql-db=$TABLE --mysql-user=$USER --mysql-password=$PASSWD --oltp-read-only=off --oltp-point-selects=10 --oltp-simple-ranges=1 --oltp-sum-ranges=1 --oltp-order-ranges=1 --oltp-distinct-ranges=1 --max-requests=0 --max-time=$TIME --report-interval=5 --num-threads=$THREAD run
- MySQL configuration file
[mysqld_safe] log-error=/data/mysql/log/mysql.log pid-file=/data/mysql/run/mysqld.pid [client] socket=/data/mysql/run/mysql.sock default-character-set=utf8mb4 [mysqld] basedir=/home/mysql-8.0.17-linux-glibc2.12-x86_64 basedir=/usr/local/mysql tmpdir=/data/mysql/tmp datadir=/data/mysql/data socket=/data/mysql/run/mysql.sock port=3306 user=root default_authentication_plugin=mysql_native_password innodb_page_size=4k max_connections=2000 back_log=2048 performance_schema=OFF max_prepared_stmt_count=128000 file innodb_file_per_table innodb_log_file_size=2048M innodb_log_files_in_group=32 innodb_open_files=10000 table_open_cache_instances=64 buffers innodb_buffer_pool_size=230G innodb_buffer_pool_instances=16 innodb_log_buffer_size=2048M tune sync_binlog=1 innodb_flush_log_at_trx_commit=1 innodb_use_native_aio=1 innodb_spin_wait_delay=180 innodb_sync_spin_loops=25 innodb_flush_method=O_DIRECT innodb_io_capacity=30000 innodb_io_capacity_max=40000 innodb_lru_scan_depth=9000 innodb_page_cleaners=16 innodb_spin_wait_pause_multiplier=25 perf special innodb_flush_neighbors=0 innodb_write_io_threads=24 innodb_read_io_threads=16 innodb_purge_threads=32 sql_mode=STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,NO_AUTO_VALUE_ON_ZERO,STRICT_ALL_TABLES log-bin=mysql-bin ssl=0 table_open_cache=15000
Reproduction Procedure
Procedure for reproducing the fault:
- Run the mysqld_safe command to start MySQL.
- Run the test command on the sysbench server to perform the read/write test.
- After a long-time test, the TPS returned by sysbench is 0. Connect to the MySQL database on the MySQL server to query data in the test table. No data is returned for a long time.
Analysis
When a deadlock occurs, the gdb command is used to view the thread stack information of the MySQL database. Most threads are waiting for locks.

According to code analysis, ARM uses to relaxed memory models, while x86 use a stronger isotonic model based on the total store ordering (TSO) model.
For the following pseudocode:

After abstraction, the following pseudocode is obtained:

On x86 servers, r1 = 1 and r2 = 0 are impossible. However, on ARM servers, r1 = 1 and r2 = 0 may occur. MySQL Mutex and RWLock are implemented based on the atomic operation and lock-free algorithm, the out-of-order problem of this module is especially serious. For other modules or programs, if the critical section is protected by a correct lock, the modules or programs can run properly. To ensure the logic correctness, a memory barrier needs to be added to the ARM version, and the read/write sequence of keywords needs to be focused on.

After being tested on Qualcomm ARM servers in 2017, MariaDB uses the __atomic functions to modify the implementation of RWLock and Mutex. The __atomic functions can ensure SC (memory sequence needs to be specified) and can be used across platforms. The read/write sequence of keywords such as lock_word and lock->waiters can be protected to resolve the MySQL deadlock problem.
Conclusion
- Currently, only MariaDB supports ARM servers.
https://mariadb.com/resources/blog/mariadb-server-102-now-available-qualcomm-centriqtm-2400-server-processor
- MySQL 8.0 supports ARM servers, which is explicitly stated in the MySQL official document:
https://www.mysql.com/support/supportedplatforms/database.html
- Available MariaDB versions:
