Replacing the _mm_extract_ps Function

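This function extracts the four 32-bit lanes (0, 1, 2, and 3) of results and stores them in a, b, c, and d, respectively. _mm_extract_ps returns the raw bit pattern of the selected single-precision lane as a 32-bit integer, so the replacement reads each lane as an unsigned 32-bit value.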

For details about _mm_extract_ps, see the Intrinsics Guide.

  • Code on x86:
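    #include <stdint.h>
    #include <smmintrin.h>  // SSE4.1: declares _mm_extract_ps
    inline void DoExtractM128(__m128i results, uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d) {
        // Reinterpret the integer vector as four floats; no data is converted.
        __m128 lanes = _mm_castsi128_ps(results);
        // _mm_extract_ps returns the lane's bit pattern as an int.
        *a = (uint32_t)_mm_extract_ps(lanes, 0);
        *b = (uint32_t)_mm_extract_ps(lanes, 1);
        *c = (uint32_t)_mm_extract_ps(lanes, 2);
        *d = (uint32_t)_mm_extract_ps(lanes, 3);
    }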
  • Alternative for Kunpeng processors:
    #include <arm_neon.h>
    inline void DoExtractM128(int8x16_t results, uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d) {
        // Reinterpret the 16 x int8 vector as 4 x uint32; no data is converted.
        uint32x4_t lanes = vreinterpretq_u32_s8(results);
        *a = vgetq_lane_u32(lanes, 0);
        *b = vgetq_lane_u32(lanes, 1);
        *c = vgetq_lane_u32(lanes, 2);
        *d = vgetq_lane_u32(lanes, 3);
    }
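
A minimal sketch of how the two variants might be kept behind one interface and checked against known lane values. The vec128_t alias, the LoadVec helper, and the scalar fallback branch are hypothetical names introduced only for this illustration; they are not part of the original code. The sketch assumes that __ARM_NEON (or __aarch64__) identifies the Kunpeng build and that __SSE4_1__ is defined when the x86 build enables SSE4.1.

```c
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__) || defined(__ARM_NEON)
#include <arm_neon.h>
typedef int8x16_t vec128_t;  /* hypothetical alias for this sketch */

/* Hypothetical helper: load four 32-bit values into the vector type. */
static inline vec128_t LoadVec(const uint32_t src[4]) {
    return vreinterpretq_s8_u32(vld1q_u32(src));
}

static inline void DoExtractM128(vec128_t results, uint32_t *a, uint32_t *b,
                                 uint32_t *c, uint32_t *d) {
    uint32x4_t lanes = vreinterpretq_u32_s8(results);
    *a = vgetq_lane_u32(lanes, 0);
    *b = vgetq_lane_u32(lanes, 1);
    *c = vgetq_lane_u32(lanes, 2);
    *d = vgetq_lane_u32(lanes, 3);
}

#elif defined(__SSE4_1__)
#include <smmintrin.h>
typedef __m128i vec128_t;  /* hypothetical alias for this sketch */

static inline vec128_t LoadVec(const uint32_t src[4]) {
    return _mm_loadu_si128((const __m128i *)src);
}

static inline void DoExtractM128(vec128_t results, uint32_t *a, uint32_t *b,
                                 uint32_t *c, uint32_t *d) {
    __m128 lanes = _mm_castsi128_ps(results);
    *a = (uint32_t)_mm_extract_ps(lanes, 0);
    *b = (uint32_t)_mm_extract_ps(lanes, 1);
    *c = (uint32_t)_mm_extract_ps(lanes, 2);
    *d = (uint32_t)_mm_extract_ps(lanes, 3);
}

#else
/* Assumed scalar fallback so the sketch builds without SIMD support. */
typedef struct { uint32_t lane[4]; } vec128_t;

static inline vec128_t LoadVec(const uint32_t src[4]) {
    vec128_t v;
    memcpy(v.lane, src, sizeof v.lane);
    return v;
}

static inline void DoExtractM128(vec128_t results, uint32_t *a, uint32_t *b,
                                 uint32_t *c, uint32_t *d) {
    *a = results.lane[0];
    *b = results.lane[1];
    *c = results.lane[2];
    *d = results.lane[3];
}
#endif
```

Calling DoExtractM128(LoadVec(src), &a, &b, &c, &d) with a known four-element src array should return src[0] through src[3] in a through d on every branch, which gives a quick way to confirm that the ported function preserves lane order.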