Encountering a Core Dump During an Atomic Operation on a Structure Variable
Symptom
When a program performs an atomic operation (for example, by calling __sync_add_and_fetch) on a member variable of a structure, the program receives a SIGBUS signal and a core dump occurs. The GDB output is as follows:
Program received signal SIGBUS, Bus error.
0x000000000040083c in main () at /root/test/src/main.c:19
19          __sync_add_and_fetch(&a.count, step);
(gdb) disassemble
Dump of assembler code for function main:
   0x0000000000400824 <+0>:     sub     sp, sp, #0x10
   0x0000000000400828 <+4>:     mov     x0, #0x1                        // #1
   0x000000000040082c <+8>:     str     x0, [sp, #8]
   0x0000000000400830 <+12>:    adrp    x0, 0x420000 <__libc_start_main@got.plt>
   0x0000000000400834 <+16>:    add     x0, x0, #0x31                   // Puts the address of the variable into the x0 register.
   0x0000000000400838 <+20>:    ldr     x1, [sp, #8]                    // Loads the value to be added (step) into the x1 register.
=> 0x000000000040083c <+24>:    ldxr    x2, [x0]                        // LDXR loads 8 bytes (x2 is a 64-bit register) from the memory address held in the x0 register.
   0x0000000000400840 <+28>:    add     x2, x2, x1
   0x0000000000400844 <+32>:    stlxr   w3, x2, [x0]
   0x0000000000400848 <+36>:    cbnz    w3, 0x40083c <main+24>
   0x000000000040084c <+40>:    dmb     ish
   0x0000000000400850 <+44>:    mov     w0, #0x0                        // #0
   0x0000000000400854 <+48>:    add     sp, sp, #0x10
   0x0000000000400858 <+52>:    ret
End of assembler dump.
(gdb) p /x $x0
$4 = 0x420039    // The variable address stored in the x0 register is not 8-byte-aligned.
Cause
On the Arm64 platform, atomic operations and lock operations are implemented with exclusive load/store instructions such as LDXR/LDAXR and STXR/STLXR. These instructions require the target address to be naturally aligned, that is, a multiple of the access size (8 bytes for a 64-bit variable). Otherwise, executing the instruction raises an alignment fault, which is reported to the process as SIGBUS and causes a core dump.
Generally, the cause is that the structure is packed in the code (for example, forced to 1-byte alignment), which removes the padding that the compiler would otherwise insert. As a result, members of the structure can land at unaligned addresses. If atomic operations or lock operations are performed on these members, the problem is triggered.
Procedure
- Search for #pragma pack in the code. (This directive changes the compiler's default structure alignment. In GCC, __attribute__((packed)) has a similar effect.)
- Find the packed structures and modify the code.
If members of such a structure are passed to atomic operations or used as spin locks, mutexes, semaphores, or read/write locks, modify the code so that each of these members is aligned to its own size.