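Replacing the _mm_extract_ps function
Function: extracts the 32-bit values in lanes 0, 1, 2, and 3 of results and stores them in a, b, c, and d, respectively.
For details about _mm_extract_ps, see the Intrinsics Guide.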
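Note that _mm_extract_ps returns the selected float lane as an int holding its raw bit pattern, not a numerically converted value. The minimal sketch below illustrates that behavior in portable C. float_bits is a hypothetical helper introduced only for this illustration and is not part of any intrinsics API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (illustration only): returns the raw IEEE-754 bit
 * pattern of a float, i.e. the kind of value _mm_extract_ps yields for
 * the lane it selects. */
static uint32_t float_bits(float value)
{
    uint32_t bits;

    /* The reinterpretation only makes sense if both types are 32 bits wide. */
    assert(sizeof(float) == sizeof(uint32_t));

    /* memcpy reinterprets the bytes without breaking aliasing rules. */
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

For example, float_bits(1.0f) gives 0x3F800000 rather than 1. This is why a port should reinterpret the lanes instead of converting them.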
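- Code snippet on x86:
    #include <smmintrin.h> /* SSE4.1, declares _mm_extract_ps */
    #include <stdint.h>

    inline void DoExtractM128(__m128i results, uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
    {
        /* Each call returns the raw 32-bit pattern of the selected lane. */
        *a = _mm_extract_ps((__v4sf)results, 0);
        *b = _mm_extract_ps((__v4sf)results, 1);
        *c = _mm_extract_ps((__v4sf)results, 2);
        *d = _mm_extract_ps((__v4sf)results, 3);
    }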
- After replacement on Kunpeng:
    #include <arm_neon.h>

    inline void DoExtractM128(int8x16_t results, uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
    {
        /* Reinterpret the 16 bytes as four 32-bit lanes; no value conversion takes place. */
        uint32x4_t lanes = vreinterpretq_u32_s8(results);

        *a = vgetq_lane_u32(lanes, 0);
        *b = vgetq_lane_u32(lanes, 1);
        *c = vgetq_lane_u32(lanes, 2);
        *d = vgetq_lane_u32(lanes, 3);
    }
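To check that the ported function matches the x86 original, it can help to compare both against a scalar reference. The sketch below is one possible reference, not part of the original case. DoExtractScalar is a hypothetical name. It assumes the 128-bit value has first been stored to a 16-byte buffer (for example with _mm_storeu_si128 on x86 or vst1q_s8 on Kunpeng) and that the target is little-endian, so lane i occupies bytes 4*i to 4*i+3.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar reference (illustration only): treats the 128-bit
 * value as 16 bytes in memory order and copies out four 32-bit lanes. */
static void DoExtractScalar(const uint8_t bytes[16],
                            uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
    assert(bytes && a && b && c && d);

    memcpy(a, bytes + 0,  sizeof *a);  /* lane 0 */
    memcpy(b, bytes + 4,  sizeof *b);  /* lane 1 */
    memcpy(c, bytes + 8,  sizeof *c);  /* lane 2 */
    memcpy(d, bytes + 12, sizeof *d);  /* lane 3 */
}
```

Feeding the same 16 bytes to this reference and to DoExtractM128 on each platform should produce identical values in a, b, c, and d.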