Replacing the x86 pand Assembly Instruction

Symptom

Error: unknown mnemonic 'pand' -- 'pand'

Cause

pand is an x86 (MMX/SSE2) instruction, so the assembler for Kunpeng (AArch64) processors does not recognize it. It performs a bitwise AND operation in either of the following forms:

  • Perform a bitwise AND operation on the source operand (an xmm2 register or a 128-bit memory location) and the destination operand (xmm1), and store the result in xmm1.
    PAND xmm1, xmm2/m128
  • Perform a bitwise AND operation on the source operand (an mm2 register or a 64-bit memory location) and the destination operand (mm1), and store the result in mm1.
    PAND mm1, mm2/m64
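
As a reference point for the port, the 128-bit form can be modeled in portable C. This is only a sketch of the semantics: the type vec128 and the function pand_model are hypothetical names introduced here, and a real xmm register is not laid out as a C struct. The first operand is both an input and the destination; the second operand is read only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit register as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} vec128;

/* Models "PAND xmm1, xmm2/m128": dst stands for xmm1, src for xmm2/m128.
 * dst is overwritten with (dst AND src); src is left unchanged. */
static void pand_model(vec128 *dst, const vec128 *src)
{
    assert(dst != NULL && src != NULL);
    dst->lo &= src->lo;
    dst->hi &= src->hi;
}
```

Because a bitwise AND treats every bit independently, the result does not depend on how the 128 bits are split into lanes. This is why a single byte-wise vector AND on the target platform can stand in for pand regardless of the element type in use.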

For more details, see:

https://c9x.me/x86/html/file_module_x86_id_230.html

Procedure

On the Kunpeng platform, replace pand with the NEON instruction AND, which operates on 64-bit or 128-bit vector registers.

AND Vd.<T>, Vn.<T>, Vm.<T>

Bitwise AND (vector). Vn and Vm are the source registers and Vd is the destination register. <T> is the arrangement specifier: 8B selects the lower 64 bits of each register (eight byte lanes) and 16B selects all 128 bits (sixteen byte lanes). An assembler should also accept any other valid arrangement.

For details, see section 9.7 in the instruction set manual (http://infocenter.arm.com/help/topic/com.arm.doc.dui0802a/DUI0802A_armasm_reference_guide.pdf).
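
If hand-written assembly is not required, the same operation can be expressed with NEON intrinsics from arm_neon.h. The sketch below is one possible approach and is not taken from the manual above. The function name and_bytes is hypothetical, and the scalar branch exists only so that the code also builds on machines without NEON. On AArch64 a compiler typically lowers vandq_u8 to AND Vd.16B, Vn.16B, Vm.16B.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: result[i] = a[i] & b[i] for len bytes.
 * len must be a multiple of 16 because each step handles one 128-bit vector. */
static void and_bytes(uint8_t *result, const uint8_t *a, const uint8_t *b, size_t len)
{
    assert(len % 16 == 0);
#if defined(__ARM_NEON)
    for (size_t i = 0; i < len; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);          /* load 16 bytes from a */
        uint8x16_t vb = vld1q_u8(b + i);          /* load 16 bytes from b */
        vst1q_u8(result + i, vandq_u8(va, vb));   /* AND and store 16 bytes */
    }
#else
    /* Portable fallback for builds without NEON (an assumption for testing). */
    for (size_t i = 0; i < len; ++i) {
        result[i] = (uint8_t)(a[i] & b[i]);
    }
#endif
}
```

Intrinsics leave register allocation and instruction scheduling to the compiler, which avoids the operand-constraint pitfalls of inline assembly.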

The following simple example uses the NEON instruction AND to perform a bitwise AND operation on two arrays.

 
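/*
 * Function: performs a bitwise AND operation on arrays a and b and stores the result in result.
 * Each iteration processes 16 bytes (four int elements). Therefore, len (the number of
 * int elements) must be a positive multiple of 4.
 */
void and_neon_asm(int* result, int* a, int* b, int len)
{
    __asm__ volatile("\n\t"
    "1:                               \n\t"
    "ld1 {v0.16b}, [%[a]], #16        \n\t"   // load 16 bytes from a, then a += 16
    "ld1 {v1.16b}, [%[b]], #16        \n\t"   // load 16 bytes from b, then b += 16
    "and v0.16b, v0.16b, v1.16b       \n\t"   // v0 = v0 & v1
    "subs %w[len],%w[len],#4          \n\t"   // len -= 4 (32-bit register), sets flags
    "st1 {v0.16b}, [%[result]], #16   \n\t"   // store 16 bytes, then result += 16
    "bgt 1b                           \n\t"   // loop while len > 0
    // The loop updates all four operands, so they must be read-write ("+r") outputs.
    :[result]"+r"(result),[a]"+r"(a),[b]"+r"(b),[len]"+r"(len)
    :                                         // no input-only operands
    :"memory","cc","v0","v1"                  // subs changes the condition flags
    );
    return;
}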
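
The routine above always runs its loop at least once and consumes four int elements per iteration, so callers with arbitrary lengths need a small wrapper. The following is a minimal sketch under that assumption. The names and_kernel_fn, and_any_len, and and_scalar are hypothetical, and and_scalar is only a portable stand-in so that the wrapper can be tried on any machine.

```c
#include <assert.h>
#include <stddef.h>

/* Function-pointer type matching the signature of and_neon_asm. */
typedef void (*and_kernel_fn)(int *result, int *a, int *b, int len);

/* Hypothetical wrapper: ANDs len ints for any len >= 0.
 * The bulk kernel receives the largest multiple of 4 (skipped if that is 0);
 * the remaining 0 to 3 elements are handled with scalar code. */
static void and_any_len(and_kernel_fn bulk, int *result, int *a, int *b, int len)
{
    assert(bulk != NULL && len >= 0);
    int bulk_len = len - (len % 4);
    if (bulk_len > 0) {
        bulk(result, a, b, bulk_len);
    }
    for (int i = bulk_len; i < len; ++i) {
        result[i] = a[i] & b[i];
    }
}

/* Portable stand-in kernel with the same signature (for testing only). */
static void and_scalar(int *result, int *a, int *b, int len)
{
    for (int i = 0; i < len; ++i) {
        result[i] = a[i] & b[i];
    }
}
```

On a Kunpeng machine the call would be and_any_len(and_neon_asm, result, a, b, len); elsewhere, and_scalar can be passed to check the wrapper logic.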