c - Segmentation fault using OpenMP and SSE
I'm just getting started experimenting with adding OpenMP to some SSE code.
My first test program crashes in _mm_set_ps, but works when I set if (0).
It looks so simple that I must be missing something obvious. I'm compiling with gcc -fopenmp -g -march=core2 -pthreads
#include <stdio.h>
#include <stdlib.h>
#include <immintrin.h>

int main()
{
    #pragma omp parallel if (1)
    {
        #pragma omp sections
        {
            #pragma omp section
            {
                __m128 x1 = _mm_set_ps ( 1.1f, 2.1f, 3.1f, 4.1f );
            }
            #pragma omp section
            {
                __m128 x2 = _mm_set_ps ( 1.2f, 2.2f, 3.2f, 4.2f );
            }
        } // end omp sections
    } //end omp parallel
    return 0;
}
This is a bug in the OpenMP implementation. I was having the same problem with gcc on Windows (MinGW). The -mstackrealign command line option solved the problem for me. It adds an instruction to the prolog of every function to realign the stack at a 16-byte boundary. I didn't notice any performance penalty. You can also try adding __attribute__ ((force_align_arg_pointer)) to a function declaration, which should do the same thing, but only for that specific function. You might have to put the SSE code in a separate function and then call that function from within #pragma omp, so that the stack has a chance to be realigned.
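The "separate function" suggestion could look roughly like the sketch below. It is only an illustration under some assumptions: GCC (or a compatible compiler) on an x86 target, and the names sum_lanes and run_sections are made up for this example and do not come from the question. The noinline attribute is my addition, so that the compiler does not fold the helper back into the OpenMP region and skip the realigning prolog.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_set_ps, _mm_storeu_ps */

/* Hypothetical helper (name invented for this sketch). All SSE work lives
   here. force_align_arg_pointer asks GCC to realign the stack on entry;
   noinline is an added assumption so the function keeps its own prolog. */
__attribute__((noinline, force_align_arg_pointer))
float sum_lanes(float a, float b, float c, float d)
{
    __m128 x = _mm_set_ps(a, b, c, d);

    /* Sanity check: the vector local should sit on a 16-byte boundary. */
    assert(((uintptr_t)&x & 15u) == 0);

    float lanes[4];
    _mm_storeu_ps(lanes, x);  /* unaligned store, so lanes needs no alignment */
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

/* Hypothetical driver (name invented for this sketch): the OpenMP sections
   only call the helper and never touch __m128 values themselves. */
void run_sections(float out[2])
{
    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            {
                out[0] = sum_lanes(1.1f, 2.1f, 3.1f, 4.1f);
            }
            #pragma omp section
            {
                out[1] = sum_lanes(1.2f, 2.2f, 3.2f, 4.2f);
            }
        }
    }
}
```

Calling run_sections from main with a two-element float array should leave sums of roughly 10.4 and 10.8 in it. Without -fopenmp the pragmas are ignored and the two calls simply run one after the other, which is a handy way to compare behaviour.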
I stopped having the problem when I moved on to compiling for a 64-bit target (MinGW64, such as the TDM GCC build).
I was also playing with AVX instructions, which require 32-byte alignment, but gcc doesn't support that on Windows at all. This forced me to fix the produced assembly code using a Python script, but it works.
c gcc openmp sse