- _mm_storeu_si128() is not the right intrinsic: its destination parameter is typed __m128i *, a type that expects 128-bit (16-byte) alignment
- The natural choice would be _mm_storeu_epi32(), but that requires AVX-512
- Instead use _mm_storeu_ps(), casting the value from __m128i to __m128 and the destination pointer from INT4 * to REAL4 *. Since __m128 and __m128i presumably have the same layout, and REAL4 and INT4 have the same size and alignment, this should be safe: the storeu intrinsic just copies bits and shouldn't care what those bits represent
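The approach in the bullets above can be sketched roughly as follows. This is only an illustration, not the code in this change: the helper name store4_int4_unaligned is hypothetical, and the INT4 and REAL4 typedefs are assumed to match the usual 32-bit integer and single-precision float definitions. _mm_castsi128_ps() only reinterprets the register contents, so the integers should reach memory bit for bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2; also provides the SSE _mm_storeu_ps() */

/* Assumed stand-ins for the project's typedefs. */
typedef int32_t INT4;
typedef float REAL4;

/* Hypothetical helper: store four 32-bit integers to a buffer that is
 * only guaranteed to have INT4 (4-byte) alignment. */
static void store4_int4_unaligned(INT4 *dst, __m128i v)
{
    assert(dst != NULL);
    /* Reinterpret the integer register as packed floats (no value
     * conversion), then use the unaligned float store, whose pointer
     * parameter has the same alignment requirement as INT4. */
    _mm_storeu_ps((REAL4 *)dst, _mm_castsi128_ps(v));
}
```

Calling it with a pointer offset by one element from the start of an INT4 array exercises the case where the destination may not be 16-byte aligned.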
API Changes and Justification
Backwards Compatible Changes
- This change introduces no API changes
- This change adds new API calls
Backwards Incompatible Changes
- This change modifies an existing API
- This change removes an existing API
If any of the Backwards Incompatible check boxes are ticked, please provide a justification for why this change is necessary and why it needs to be done in a backwards incompatible way.
Please provide details on any reviews related to this change and the associated reviewers.