Replacing the x86 movdqu Assembly Instruction
Symptom
Error: unknown mnemonic 'movdqu' -- 'movdqu'
Cause
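movdqu (MOVDQU, move unaligned double quadword) is an x86 SSE2 instruction and cannot be used on the Kunpeng platform, which is based on the ARMv8 (AArch64) architecture. The instruction copies 128 bits of data between XMM registers, or between an XMM register and a memory address that does not have to be 16-byte aligned.
On x86 servers, movdqu is used to: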
- Copy the content of the xmm2 register or of a 128-bit memory location to the xmm1 register.
MOVDQU xmm1, xmm2/m128
- Copy the content of the xmm1 register to a 128-bit memory location or to the xmm2 register.
MOVDQU xmm2/m128, xmm1
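In C terms, the two memory forms listed above behave like a 16-byte load and a 16-byte store with no alignment requirement. The following is only a portable model of that behavior, not how the instruction is implemented; the type vec128 and the helpers load_unaligned_128 and store_unaligned_128 are hypothetical names introduced for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 128-bit XMM register. */
typedef struct {
    uint8_t bytes[16];
} vec128;

/* Models "MOVDQU xmm1, m128": read 16 bytes from any address. */
static vec128 load_unaligned_128(const void *src)
{
    vec128 v;
    memcpy(v.bytes, src, sizeof v.bytes);
    return v;
}

/* Models "MOVDQU m128, xmm1": write 16 bytes to any address. */
static void store_unaligned_128(void *dst, vec128 v)
{
    memcpy(dst, v.bytes, sizeof v.bytes);
}
```

Because the copy goes through memcpy, the source and destination pointers may have any alignment, which is the property that any replacement on Kunpeng has to preserve.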
For more details, see:
Procedure
- For the first usage (loading from memory into a register), use the NEON ld1 instruction instead.
ld1 instruction: Load multiple 1-element structures to one, two, three, or four registers.
LD1 { Vt.T }, [Xn|SP]
For details, see section 9.98 in the instruction set manual (http://infocenter.arm.com/help/topic/com.arm.doc.dui0802a/DUI0802A_armasm_reference_guide.pdf).
- For the second usage (storing from a register to memory), use the NEON st1 instruction instead.
st1 instruction: Store multiple 1-element structures from one, two, three, or four registers.
ST1 { Vt.T }, [Xn|SP]
For details, see section 9.202 in the instruction set manual (http://infocenter.arm.com/help/topic/com.arm.doc.dui0802a/DUI0802A_armasm_reference_guide.pdf).
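Where the surrounding code is C rather than hand-written assembly, the same load/store replacement can also be expressed with intrinsics. The sketch below is one possible arrangement, not a required one: vld1q_u8 and vst1q_u8 are the NEON intrinsics from arm_neon.h that correspond to ld1 and st1 on a single 128-bit register, _mm_loadu_si128 and _mm_storeu_si128 are the SSE2 intrinsics for movdqu, and copy_block_128 is a hypothetical helper name. With intrinsics the compiler chooses the actual instruction, so the generated code may not contain ld1 or st1 literally.

```c
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#elif defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copies one unaligned 128-bit block from src to dst. */
static void copy_block_128(uint8_t *dst, const uint8_t *src)
{
#if defined(__aarch64__)
    /* Kunpeng / AArch64: NEON load and store of sixteen bytes. */
    uint8x16_t v = vld1q_u8(src);
    vst1q_u8(dst, v);
#elif defined(__SSE2__)
    /* x86: unaligned SSE2 load and store (movdqu). */
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, v);
#else
    /* Any other target: plain byte copy. */
    memcpy(dst, src, 16);
#endif
}
```

Keeping both paths behind one function lets the same source build on x86 and Kunpeng servers while the port is in progress.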
Example:
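/*x86*/
/* ANDs one 128-bit block (the first four ints) of a and b and writes it to result.
   Named and_x86_asm because the operation is a bitwise AND (pand), not an addition. */
void and_x86_asm(int* result, int* a, int* b, int len)
{
    /* volatile keeps the compiler from discarding the asm statement */
    __asm__ __volatile__("\n\t"
    "movdqu (%[a]), %%xmm0 \n\t"        /* unaligned load of a[0..3] */
    "movdqu (%[b]), %%xmm1 \n\t"        /* unaligned load of b[0..3] */
    "pand %%xmm0, %%xmm1 \n\t"          /* xmm1 = xmm0 & xmm1 */
    "movdqu %%xmm1, (%[result]) \n\t"   /* unaligned store to result[0..3] */
    :[result]"+r"(result) //output
    :[a]"r"(a),[b]"r"(b),[len]"r"(len) //input (len is not used by this block)
    :"memory","xmm0","xmm1"
    );
    return;
}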
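/*Kunpeng*/
/* ANDs len ints of a and b into result, four ints (128 bits) per iteration.
   len must be a positive multiple of 4. */
void and_neon_asm(int* result, int* a, int* b, int len)
{
    /* volatile keeps the compiler from discarding the asm statement */
    __asm__ __volatile__("\n\t"
    "1: \n\t"
    "ld1 {v0.16b}, [%[a]], #16 \n\t"        /* load 16 bytes from a, then a += 16 */
    "ld1 {v1.16b}, [%[b]], #16 \n\t"        /* load 16 bytes from b, then b += 16 */
    "and v0.16b, v0.16b, v1.16b \n\t"       /* v0 = v0 & v1 */
    "subs %w[len], %w[len], #4 \n\t"        /* four ints done; 32-bit subtract sets the flags */
    "st1 {v0.16b}, [%[result]], #16 \n\t"   /* store 16 bytes, then result += 16 */
    "bgt 1b \n\t"                           /* loop while len > 0 */
    :[result]"+r"(result),[a]"+r"(a),[b]"+r"(b),[len]"+r"(len) //output (all four are modified)
    : //input
    :"memory","cc","v0","v1"
    );
    return;
}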
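The Kunpeng routine above handles four ints per loop iteration and always runs at least once, so it assumes that len is a positive multiple of 4. For other lengths, one option is a wrapper that processes full 128-bit blocks with NEON and finishes the remainder element by element. The sketch below uses the arm_neon.h intrinsics vld1q_s32, vandq_s32, and vst1q_s32 and falls back to scalar code on other targets; and_int_arrays is a hypothetical name, and the sketch assumes int is 32 bits wide.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper: result[i] = a[i] & b[i] for any len >= 0. */
static void and_int_arrays(int *result, const int *a, const int *b, int len)
{
    int i = 0;
#if defined(__aarch64__)
    /* Full blocks of four ints (128 bits) at a time. */
    for (; i + 4 <= len; i += 4) {
        int32x4_t va = vld1q_s32((const int32_t *)(a + i));
        int32x4_t vb = vld1q_s32((const int32_t *)(b + i));
        vst1q_s32((int32_t *)(result + i), vandq_s32(va, vb));
    }
#endif
    /* Remaining elements (or every element on non-AArch64 targets). */
    for (; i < len; i++) {
        result[i] = a[i] & b[i];
    }
}
```

A scalar loop of this kind is also a convenient reference for checking the output of the assembly versions on test data.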