Replacing the x86 pcmpestri Assembly Instruction
Symptom
Error: unknown mnemonic 'pcmpestri' -- 'pcmpestri'
Cause
Like pcmpestrm, pcmpestri is an x86 instruction from the SSE4.2 instruction set, so an assembler that targets another architecture does not recognize it. The instruction compares the elements of two strings, str1 and str2, whose lengths are passed explicitly, according to the comparison mode encoded in its immediate operand, and returns the index of the selected position in ECX. In the mode used below, this is the first position at which the two strings do not match. As with pcmpestrm, you need to understand exactly what the instruction does and re-implement that behavior in C/C++ code.
For details about this instruction, see:
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=SSE4_2&expand=834
https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2010/bb531465(v=vs.100)
Procedure
Impala calls the pcmpestri instruction through inline assembly. The wrapper function SSE4_cmpestri is modeled on Intel's _mm_cmpestri intrinsic:
template<int MODE>
static inline int SSE4_cmpestri(__m128i str1, int len1, __m128i str2, int len2) {
  int result;
  __asm__ __volatile__("pcmpestri %5, %2, %1"
      : "=c"(result)
      : "x"(str1), "xm"(str2), "a"(len1), "d"(len2), "i"(MODE)
      : "cc");
  return result;
}
According to the instruction description, the behavior differs for each comparison mode, so a complete re-implementation would require a large amount of code. The calling code uses only the mode PCMPSTR_EQUAL_EACH | PCMPSTR_UBYTE_OPS | PCMPSTR_NEG_POLARITY, so it is sufficient to implement that mode. In this mode, the operands are treated as unsigned bytes, and the bytes at corresponding positions of str1 and str2 are compared for equality. Negative polarity then inverts the comparison result, so a bit is set to 1 where the bytes differ, and the instruction returns the position of the first bit that is 1, that is, the index of the first mismatch. If no bit is set, the instruction returns 16, the number of bytes in the register.
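To make this mode concrete, the following portable sketch models in scalar code what pcmpestri returns for PCMPSTR_EQUAL_EACH | PCMPSTR_UBYTE_OPS | PCMPSTR_NEG_POLARITY, as the Intel documentation describes it. It is only a reference for checking a port, not the replacement function itself. The function name cmpestri_strcmp_model is hypothetical, the numeric values of the mode constants are assumed to match Intel's _SIDD_* immediates, and both lengths are assumed to be in the range 0 to 16 already.

```cpp
#include <cassert>
#include <cstdint>

// Mode bits named in the text. The numeric values are an assumption: they
// are taken to match Intel's _SIDD_* immediates.
constexpr int PCMPSTR_UBYTE_OPS = 0x00;     // elements are unsigned bytes
constexpr int PCMPSTR_EQUAL_EACH = 0x08;    // compare byte i with byte i
constexpr int PCMPSTR_NEG_POLARITY = 0x10;  // invert the comparison result
constexpr int STRCMP_MODE =
    PCMPSTR_EQUAL_EACH | PCMPSTR_UBYTE_OPS | PCMPSTR_NEG_POLARITY;

// Hypothetical scalar model of pcmpestri for STRCMP_MODE only.
// Returns the lowest index whose inverted "equal" bit is set, or 16 if none.
inline int cmpestri_strcmp_model(const std::uint8_t* str1, int len1,
                                 const std::uint8_t* str2, int len2) {
  constexpr int kBytes = 16;  // bytes in one 128-bit register
  assert(len1 >= 0 && len1 <= kBytes && len2 >= 0 && len2 <= kBytes);
  for (int i = 0; i < kBytes; ++i) {
    const bool valid1 = i < len1;  // position i lies inside str1
    const bool valid2 = i < len2;  // position i lies inside str2
    // Both valid: compare the bytes. Both past the end: treated as equal.
    // Exactly one past the end: treated as not equal.
    const bool equal =
        (valid1 && valid2) ? (str1[i] == str2[i]) : (valid1 == valid2);
    if (!equal) {
      return i;  // negative polarity: report the first "not equal" position
    }
  }
  return kBytes;  // no mismatch anywhere in the register
}
```

For two full 16-byte inputs that first differ at byte 5, the model returns 5; for two identical 16-byte inputs it returns 16.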
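The replacement for SSE4_cmpestri is implemented as follows:
#include <arm_neon.h>
#include <stdint.h>

// Byte-wise view of a 128-bit NEON register. This definition is an assumption;
// if __oword is already defined for the pcmpestrm replacement, reuse that one.
typedef union __attribute__((aligned(16))) __oword {
    int32x4_t m128i;
    uint8_t m128i_u8[16];
} __oword;

// Implements only STRCMP_MODE =
//     PCMPSTR_EQUAL_EACH | PCMPSTR_UBYTE_OPS | PCMPSTR_NEG_POLARITY
template <int MODE>
static inline int SSE4_cmpestri(int32x4_t str1, int len1, int32x4_t str2, int len2)
{
    __oword a, b;
    a.m128i = str1;
    b.m128i = str2;

    // len_s is the shorter of the two lengths, len_l the longer one.
    int len_s, len_l;
    if (len1 > len2)
    {
        len_s = len2;
        len_l = len1;
    }
    else
    {
        len_s = len1;
        len_l = len2;
    }

    // Negative polarity: stop at the first position where the bytes differ.
    int result;
    int i;
    for (i = 0; i < len_s; i++)
    {
        if (a.m128i_u8[i] != b.m128i_u8[i])
        {
            break;
        }
    }
    result = i;
    // No mismatch within the shorter length: return the longer length.
    // When both lengths are 16 (a full register), this yields 16, as the
    // instruction does.
    if (result == len_s)
    {
        result = len_l;
    }
    return result;
}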
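The returned index is typically consumed by a block-wise string comparison. The sketch below shows one way a caller might use it. It is an illustration under stated assumptions, not Impala's code: first_diff16 is a hypothetical, portable stand-in for SSE4_cmpestri called with both lengths set to 16, and compare_prefix is a hypothetical caller.

```cpp
#include <cassert>

// Hypothetical stand-in for SSE4_cmpestri<STRCMP_MODE>(x, 16, y, 16):
// index of the first differing byte in two 16-byte blocks, or 16 if none.
inline int first_diff16(const unsigned char* s1, const unsigned char* s2) {
  for (int i = 0; i < 16; ++i) {
    if (s1[i] != s2[i]) {
      return i;
    }
  }
  return 16;
}

// Hypothetical caller: compares the first len bytes of s1 and s2.
// Returns a negative value, zero, or a positive value, like memcmp.
inline int compare_prefix(const char* s1, const char* s2, int len) {
  assert(len >= 0);
  const unsigned char* p1 = reinterpret_cast<const unsigned char*>(s1);
  const unsigned char* p2 = reinterpret_cast<const unsigned char*>(s2);
  // Process full 16-byte blocks with the index-returning comparison.
  while (len >= 16) {
    const int idx = first_diff16(p1, p2);
    if (idx != 16) {
      return p1[idx] - p2[idx];  // first mismatch decides the ordering
    }
    p1 += 16;
    p2 += 16;
    len -= 16;
  }
  // Compare the remaining bytes (fewer than 16) one at a time.
  for (int i = 0; i < len; ++i) {
    if (p1[i] != p2[i]) {
      return p1[i] - p2[i];
    }
  }
  return 0;
}
```

Because the caller only tests the result against 16 and otherwise uses it as a byte offset, the replacement function has to return 16 for two identical full blocks and the offset of the first mismatch otherwise.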