Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 7/13/2023
Public

Miscellaneous Intrinsics

The prototypes for Intel® Streaming SIMD Extensions (Intel® SSE) intrinsics for miscellaneous operations are in the xmmintrin.h header file.

To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>

The results of each intrinsic operation are placed in registers. The information about what is placed in each register appears in the tables below, in the detailed explanation of each intrinsic. R, R0, R1, R2, and R3 represent the registers in which results are placed.

| Intrinsic Name | Operation | Corresponding Intel® SSE Instruction |
|---|---|---|
| _mm_shuffle_ps | Shuffle | SHUFPS |
| _mm_unpackhi_ps | Unpack High | UNPCKHPS |
| _mm_unpacklo_ps | Unpack Low | UNPCKLPS |
| _mm_move_ss | Set low value, pass through three high values | MOVSS |
| _mm_movehl_ps | Move High to Low | MOVHLPS |
| _mm_movelh_ps | Move Low to High | MOVLHPS |
| _mm_movemask_ps | Create four-bit mask | MOVMSKPS |
| _mm_undefined_ps | Return vector of type __m128 with undefined elements | None. This is a utility intrinsic that returns some arbitrary value. |

_mm_shuffle_ps

__m128 _mm_shuffle_ps(__m128 a, __m128 b, unsigned int imm8);

Selects four specific SP FP values from a and b, based on the mask imm8. The mask must be an immediate. See Macro Function for Shuffle Using Intel® Streaming SIMD Extensions for a description of the shuffle semantics.
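As a minimal sketch of the selection rule, assuming the mask is built with the _MM_SHUFFLE helper macro from xmmintrin.h: the two low fields of imm8 pick elements of a for R0 and R1, and the two high fields pick elements of b for R2 and R3. The function name shuffle_example and the input values are illustrative only.

```c
#include <immintrin.h>

/* Illustrative helper (not part of the intrinsics API):
   writes { a2, a3, b0, b1 } to out. */
void shuffle_example(float out[4])
{
    __m128 a = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);     /* a0..a3 */
    __m128 b = _mm_setr_ps(10.0f, 20.0f, 30.0f, 40.0f); /* b0..b3 */

    /* _MM_SHUFFLE(z, y, x, w) encodes R0 = a[w], R1 = a[x],
       R2 = b[y], R3 = b[z]. The mask is a compile-time constant,
       as the intrinsic requires. */
    __m128 r = _mm_shuffle_ps(a, b, _MM_SHUFFLE(1, 0, 3, 2));

    _mm_storeu_ps(out, r); /* out = { 3, 4, 10, 20 } */
}
```

Passing the same vector for both a and b turns the shuffle into a permutation of a single register.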

_mm_unpackhi_ps

__m128 _mm_unpackhi_ps(__m128 a, __m128 b);

Selects and interleaves the upper two SP FP values from a and b.

| R0 | R1 | R2 | R3 |
|---|---|---|---|
| a2 | b2 | a3 | b3 |
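The result layout can be checked with a short sketch. The function name unpackhi_example and the sample values are assumptions made for illustration.

```c
#include <immintrin.h>

/* Illustrative helper: writes { a2, b2, a3, b3 } to out. */
void unpackhi_example(float out[4])
{
    __m128 a = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);     /* a0..a3 */
    __m128 b = _mm_setr_ps(10.0f, 20.0f, 30.0f, 40.0f); /* b0..b3 */

    __m128 r = _mm_unpackhi_ps(a, b);

    _mm_storeu_ps(out, r); /* out = { 3, 30, 4, 40 } */
}
```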

_mm_unpacklo_ps

__m128 _mm_unpacklo_ps(__m128 a, __m128 b);

Selects and interleaves the lower two SP FP values from a and b.

| R0 | R1 | R2 | R3 |
|---|---|---|---|
| a0 | b0 | a1 | b1 |
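One possible use of this layout is interleaving two coordinate streams. The sketch below assumes separate x and y arrays of at least four floats each; the function name interleave_low_xy is hypothetical.

```c
#include <immintrin.h>

/* Illustrative helper: interleaves the first two x and y values,
   producing { x0, y0, x1, y1 } in out. */
void interleave_low_xy(const float xs[4], const float ys[4], float out[4])
{
    __m128 x = _mm_loadu_ps(xs); /* unaligned load of x0..x3 */
    __m128 y = _mm_loadu_ps(ys); /* unaligned load of y0..y3 */

    _mm_storeu_ps(out, _mm_unpacklo_ps(x, y));
}
```

Calling it with xs = { 1, 2, 3, 4 } and ys = { 10, 20, 30, 40 } stores { 1, 10, 2, 20 }.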

_mm_move_ss

__m128 _mm_move_ss(__m128 a, __m128 b);

Sets the lowest SP FP value of the result to the lowest SP FP value of b. The upper three SP FP values are passed through from a.

| R0 | R1 | R2 | R3 |
|---|---|---|---|
| b0 | a1 | a2 | a3 |
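A minimal sketch of replacing only the lowest element of a vector follows. The function name replace_low and its parameters are assumptions for illustration.

```c
#include <immintrin.h>

/* Illustrative helper: copies v to out with element 0 replaced
   by new_low, leaving elements 1..3 unchanged. */
void replace_low(const float v[4], float new_low, float out[4])
{
    __m128 a = _mm_loadu_ps(v);       /* { a0, a1, a2, a3 } */
    __m128 b = _mm_set_ss(new_low);   /* { new_low, 0, 0, 0 } */

    /* Result is { b0, a1, a2, a3 }. */
    _mm_storeu_ps(out, _mm_move_ss(a, b));
}
```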

_mm_movehl_ps

__m128 _mm_movehl_ps(__m128 a, __m128 b);

Moves the upper two SP FP values of b to the lower two SP FP values of the result. The upper two SP FP values of a are passed through to the result.

| R0 | R1 | R2 | R3 |
|---|---|---|---|
| b2 | b3 | a2 | a3 |
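The sketch below first reproduces the result layout, then shows one common use: folding the upper half of a vector onto the lower half as the first step of a horizontal sum. Both function names are hypothetical, and the horizontal-sum sequence is one possible approach rather than a recommended one.

```c
#include <immintrin.h>

/* Illustrative helper: writes { b2, b3, a2, a3 } to out. */
void movehl_example(float out[4])
{
    __m128 a = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);     /* a0..a3 */
    __m128 b = _mm_setr_ps(10.0f, 20.0f, 30.0f, 40.0f); /* b0..b3 */

    _mm_storeu_ps(out, _mm_movehl_ps(a, b)); /* { 30, 40, 3, 4 } */
}

/* Illustrative helper: returns p[0] + p[1] + p[2] + p[3]. */
float horizontal_sum(const float p[4])
{
    __m128 v    = _mm_loadu_ps(p);
    __m128 hi   = _mm_movehl_ps(v, v);   /* { v2, v3, v2, v3 } */
    __m128 sum2 = _mm_add_ps(v, hi);     /* { v0+v2, v1+v3, ... } */

    /* Broadcast element 1 of sum2, then add it to element 0. */
    __m128 odd  = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
    return _mm_cvtss_f32(_mm_add_ss(sum2, odd));
}
```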

_mm_movelh_ps

__m128 _mm_movelh_ps(__m128 a, __m128 b);

Moves the lower two SP FP values of b to the upper two SP FP values of the result. The lower two SP FP values of a are passed through to the result.

| R0 | R1 | R2 | R3 |
|---|---|---|---|
| a0 | a1 | b0 | b1 |
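As a small sketch, the lower halves of two vectors can be joined into one. The function name join_low_halves is an assumption for illustration.

```c
#include <immintrin.h>

/* Illustrative helper: writes { a0, a1, b0, b1 } to out. */
void join_low_halves(const float pa[4], const float pb[4], float out[4])
{
    __m128 a = _mm_loadu_ps(pa);
    __m128 b = _mm_loadu_ps(pb);

    _mm_storeu_ps(out, _mm_movelh_ps(a, b));
}
```

With pa = { 1, 2, 3, 4 } and pb = { 10, 20, 30, 40 }, the stored result is { 1, 2, 10, 20 }.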

_mm_movemask_ps

int _mm_movemask_ps(__m128 a);

Creates a 4-bit mask from the most significant bits of the four SP FP values.

R = (sign(a3) << 3) | (sign(a2) << 2) | (sign(a1) << 1) | sign(a0)
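Two short sketches of the mask in use follow: reading the sign bits directly, and testing the result of a packed comparison (which sets all bits, including the sign bit, in each element where the comparison holds). The function names sign_mask and any_below are hypothetical.

```c
#include <immintrin.h>

/* Illustrative helper: bit i of the result is the sign bit of v[i]. */
int sign_mask(const float v[4])
{
    return _mm_movemask_ps(_mm_loadu_ps(v));
}

/* Illustrative helper: returns 1 if any element of v is less
   than limit, otherwise 0. */
int any_below(const float v[4], float limit)
{
    /* Each element of lt is all ones where v[i] < limit, else zero. */
    __m128 lt = _mm_cmplt_ps(_mm_loadu_ps(v), _mm_set1_ps(limit));
    return _mm_movemask_ps(lt) != 0;
}
```

For v = { -1, 2, -3, 4 }, sign_mask returns 5 (bits 0 and 2 set). Note that the mask reflects the raw sign bit, so an element holding -0.0f also sets its bit.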

_mm_undefined_ps

extern __m128 _mm_undefined_ps(void);

Returns a vector of four single-precision floating-point elements whose content is not specified. The result is typically passed to another intrinsic that requires all operands to be initialized, in cases where the content of that operand does not matter. This intrinsic is declared in the immintrin.h header file. It typically maps to a read of some XMM register and yields whatever value that register holds at the time of the read.

For example, you can use this intrinsic when you need to calculate a sum of packed single-precision floating-point values located in an XMM register.
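One way this might look in practice is sketched below, assuming only two floats need to be summed: _mm_undefined_ps supplies the upper half of the vector, which is never read afterwards. The function name sum_pair is hypothetical, and the use of _mm_loadl_pi is a choice made for this example.

```c
#include <immintrin.h>

/* Illustrative helper: returns pair[0] + pair[1]. */
float sum_pair(const float pair[2])
{
    /* Load two floats into the low half. The upper half comes from
       _mm_undefined_ps() and holds unspecified values. */
    __m128 v = _mm_loadl_pi(_mm_undefined_ps(), (const __m64 *)pair);

    /* Broadcast element 1, then add it to element 0 only.
       The unspecified upper elements never reach the result. */
    __m128 p1 = _mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 1, 1, 1));
    return _mm_cvtss_f32(_mm_add_ss(v, p1));
}
```

Because only the lowest element of the final vector is read, the result does not depend on what the undefined elements contain.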
