Replacing the x86 movdqu Assembly Instruction

Symptom

Error: unknown mnemonic 'movdqu' -- 'movdqu'

Cause

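movdqu is an SSE2 instruction in the x86 instruction set and cannot be used on the Kunpeng platform, which is based on the Arm architecture. It moves 128 bits of data between XMM registers, or between an XMM register and a memory address. The memory address does not need to be 16-byte aligned.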

On x86 servers, movdqu is used to:

  • Copy the content of the xmm2 register or a 128-bit memory location to the xmm1 register.
    MOVDQU xmm1, xmm2/m128
  • Copy the content of the xmm1 register to a 128-bit memory location or the xmm2 register.
    MOVDQU xmm2/m128, xmm1

For more details, see:

https://c9x.me/x86/html/file_module_x86_id_184.html
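
The two forms listed above amount to a 16-byte copy with no alignment requirement. As a rough mental model only (not how either platform implements it), the behavior can be sketched in portable C. The type vec128 and the helpers load_unaligned_128 and store_unaligned_128 are hypothetical names introduced for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit value type, standing in for an XMM register. */
typedef struct { uint8_t bytes[16]; } vec128;

/* Models the load form (MOVDQU xmm1, m128): read 16 bytes from any address. */
static vec128 load_unaligned_128(const void *src)
{
    vec128 v;
    memcpy(v.bytes, src, sizeof v.bytes);  /* memcpy has no alignment requirement */
    return v;
}

/* Models the store form (MOVDQU m128, xmm1): write 16 bytes to any address. */
static void store_unaligned_128(void *dst, vec128 v)
{
    memcpy(dst, v.bytes, sizeof v.bytes);
}
```

Any replacement chosen on the Kunpeng platform needs to preserve this property: the source and destination addresses may be unaligned.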

Procedure

Example:

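/*x86*/
/* Bitwise AND of a[] and b[], four ints (128 bits) per iteration.
   len must be a positive multiple of 4. */
void and_x86_asm(int* result, int* a, int* b, int len)
{
    __asm__ volatile("\n\t"
    "1:                                     \n\t"
    "movdqu (%[a]), %%xmm0                  \n\t"
    "movdqu (%[b]), %%xmm1                  \n\t"
    "pand %%xmm0, %%xmm1                    \n\t"
    "movdqu %%xmm1, (%[result])             \n\t"
    "add $16, %[a]                          \n\t"
    "add $16, %[b]                          \n\t"
    "add $16, %[result]                     \n\t"
    "sub $4, %[len]                         \n\t"
    "jg 1b                                  \n\t"
    :[result]"+r"(result),[a]"+r"(a),[b]"+r"(b),[len]"+r"(len)  //read-write: updated in the loop
    :                                         //no input-only operands
    :"memory","cc","xmm0","xmm1"
    );
    return;
}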
 
 
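/*Kunpeng*/
/* Bitwise AND of a[] and b[], four ints (128 bits) per iteration.
   len must be a positive multiple of 4. */
void and_neon_asm(int* result, int* a, int* b, int len)
{
    __asm__ volatile("\n\t"
    "1:                               \n\t"
    "ld1 {v0.16b}, [%[a]], #16        \n\t"
    "ld1 {v1.16b}, [%[b]], #16        \n\t"
    "and v0.16b, v0.16b, v1.16b       \n\t"
    "subs %w[len],%w[len],#4          \n\t"
    "st1 {v0.16b}, [%[result]], #16   \n\t"
    "bgt 1b                           \n\t"
    :[result]"+r"(result),[a]"+r"(a),[b]"+r"(b),[len]"+r"(len)  //read-write: post-incremented or decremented in the loop
    :                                         //no input-only operands
    :"memory","cc","v0","v1"                  //subs changes the condition flags
    );
    return;
}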
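
As an alternative to hand-written assembly, the same operation can be sketched with compiler intrinsics so that a single source file builds on both platforms. This is only an illustration under stated assumptions, not part of the original procedure: the function name and_portable is hypothetical, the compiler is assumed to be GCC or Clang, and the NEON path (vld1q_s32, vandq_s32, vst1q_s32 from arm_neon.h) is selected on AArch64 while the SSE2 path (_mm_loadu_si128, _mm_and_si128, _mm_storeu_si128 from emmintrin.h) is selected on x86.

```c
#if defined(__aarch64__)
#include <arm_neon.h>    /* NEON intrinsics (Kunpeng / AArch64) */
#elif defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics (x86) */
#endif

/* Hypothetical helper: result[i] = a[i] & b[i] for i in [0, len). */
void and_portable(int *result, const int *a, const int *b, int len)
{
    int i = 0;
#if defined(__aarch64__)
    /* Four ints (128 bits) per iteration; vld1q/vst1q accept unaligned addresses. */
    for (; i + 4 <= len; i += 4) {
        int32x4_t va = vld1q_s32(a + i);
        int32x4_t vb = vld1q_s32(b + i);
        vst1q_s32(result + i, vandq_s32(va, vb));
    }
#elif defined(__SSE2__)
    /* The intrinsic counterparts of movdqu and pand. */
    for (; i + 4 <= len; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(result + i), _mm_and_si128(va, vb));
    }
#endif
    /* Scalar tail: handles leftover elements, or everything on other targets. */
    for (; i < len; i++)
        result[i] = a[i] & b[i];
}
```

Unlike the assembly versions, this sketch does not require len to be a multiple of 4, because the scalar loop processes any remaining elements.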