
Encountering a Core Dump During an Atomic Operation on a Structure Variable

Symptom

When a program calls an atomic built-in function such as __sync_add_and_fetch on a member of a structure, the program receives SIGBUS and a core dump is generated. The gdb output is as follows:

Program received signal SIGBUS, Bus error. 
0x000000000040083c in main () at /root/test/src/main.c:19 
19          __sync_add_and_fetch(&a.count, step); 
(gdb) disassemble 
Dump of assembler code for function main: 
   0x0000000000400824 <+0>:     sub     sp, sp, #0x10 
   0x0000000000400828 <+4>:     mov     x0, #0x1                        // #1 
   0x000000000040082c <+8>:     str     x0, [sp, #8] 
   0x0000000000400830 <+12>:    adrp    x0, 0x420000 <__libc_start_main@got.plt> 
   0x0000000000400834 <+16>:    add     x0, x0, #0x31 // Puts the address of the variable into the x0 register.
   0x0000000000400838 <+20>:    ldr     x1, [sp, #8] // Loads the addend (step) from the stack into the x1 register.
=> 0x000000000040083c <+24>:    ldxr    x2, [x0] // LDXR loads 8 bytes (x2 is a 64-bit register) from the memory address held in the x0 register.
   0x0000000000400840 <+28>:    add     x2, x2, x1 
   0x0000000000400844 <+32>:    stlxr   w3, x2, [x0] 
   0x0000000000400848 <+36>:    cbnz    w3, 0x40083c <main+24> 
   0x000000000040084c <+40>:    dmb     ish 
   0x0000000000400850 <+44>:    mov     w0, #0x0                        // #0 
   0x0000000000400854 <+48>:    add     sp, sp, #0x10 
   0x0000000000400858 <+52>:    ret 
End of assembler dump. 
(gdb) p /x $x0 
$4 = 0x420039 //The variable address stored in the x0 register is not 8-byte-aligned.

Cause

On Arm64, atomic operations and lock operations are implemented with exclusive load/store instructions such as LDXR/LDAXR and STXR/STLXR. These instructions require the target address to be naturally aligned, that is, aligned to a multiple of the size of the data being accessed (8 bytes for a 64-bit variable). Otherwise, the instruction triggers an alignment fault, which the process receives as SIGBUS, causing a core dump.
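The alignment rule can be checked with simple arithmetic on the address that gdb reports. The following is a minimal sketch; the helper name is_naturally_aligned is hypothetical and is not part of any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when addr is a multiple of size,
 * that is, when an object of that size placed at addr is naturally aligned.
 * size must be non-zero.
 */
static inline bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0);
    return addr % size == 0;
}
```

For the address in the symptom, is_naturally_aligned(0x420039, 8) returns false because 0x420039 leaves a remainder of 1 when divided by 8.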

Generally, the cause is that the code forces the structure to 1-byte alignment (packs it), so its members are no longer naturally aligned. Performing atomic operations or lock operations on such members triggers the problem.
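The effect of packing on member offsets can be illustrated with a small sketch. The structure names and members below are hypothetical (the original structure is not shown in this document), and the offsets assume GCC or Clang on a 64-bit target such as Arm64, where a uint64_t member is normally aligned to 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure forced to 1-byte alignment. */
#pragma pack(push, 1)
struct packed_stats {
    uint8_t  flag;
    uint64_t count;   /* no padding is inserted: offset 1, not a multiple of 8 */
};
#pragma pack(pop)

/* The same members with the compiler's default alignment. */
struct natural_stats {
    uint8_t  flag;
    uint64_t count;   /* 7 padding bytes are inserted before it: offset 8 */
};

/* Compile-time checks of the layouts described in the comments above. */
static_assert(offsetof(struct packed_stats, count) == 1,
              "packed: count directly follows the 1-byte flag");
static_assert(offsetof(struct natural_stats, count) == 8,
              "default: count is padded to an 8-byte boundary");
```

With the packed layout, the address of count is at an odd offset from the start of the structure, so an exclusive load such as LDXR on that member is likely to hit an unaligned address.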

Procedure

  1. Search for #pragma pack in the code. (This directive changes the compiler's default alignment.)
  2. Find the packed structures and modify the code.

    If a variable in such a structure is passed to an atomic operation or is used as a spin lock, mutex, semaphore, or read/write lock, modify the code so that the variable is aligned to its size.
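One possible way to apply the modification is sketched below. It assumes that packing is needed only for some structures, so the packing is limited to those structures with #pragma pack(push, 1) and #pragma pack(pop), and the structure that holds the atomically updated variable keeps the default alignment. All type, variable, and function names are hypothetical; the checks assume GCC or Clang on a 64-bit target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure that really needs 1-byte packing. */
#pragma pack(push, 1)
struct wire_header {
    uint8_t  type;
    uint32_t length;
};
#pragma pack(pop)   /* restore the default alignment for what follows */

/* Hypothetical shared state declared with the default alignment. */
struct counters {
    uint8_t  flag;
    uint64_t count;   /* updated atomically */
};

/* Fail the build if a later change breaks the alignment of count. */
static_assert(offsetof(struct counters, count) % sizeof(uint64_t) == 0,
              "count must sit at a multiple of its size");
static_assert(_Alignof(struct counters) >= sizeof(uint64_t),
              "struct counters must be at least 8-byte aligned");

static struct counters g_counters;

/* Atomically adds step to the counter and returns the new value. */
static inline uint64_t counters_add(uint64_t step)
{
    return __sync_add_and_fetch(&g_counters.count, step);
}
```

The static_assert lines are optional, but they turn a misalignment that would otherwise surface as a runtime SIGBUS into a compile-time error.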