Segmentation fault: LLVM-generated vector math assembly segfaulting
The following LLVM function takes three pointers to arrays of 5 doubles and calculates c = a*b + c:
    define void @my_vector_math(double* %a1, double* %b2, double* %c3) {
    entry:
      %0 = bitcast double* %c3 to <5 x double>*
      %1 = load <5 x double>* %0, align 64
      %2 = bitcast double* %b2 to <5 x double>*
      %3 = load <5 x double>* %2, align 64
      %4 = bitcast double* %a1 to <5 x double>*
      %5 = load <5 x double>* %4, align 64
      %6 = fmul <5 x double> %3, %5
      %7 = fadd <5 x double> %1, %6
      store <5 x double> %7, <5 x double>* %0, align 64
      ret void
    }
It compiles down to the following assembly on my machine (64-bit Ubuntu 13.10, LLVM 3.4):
    30: c5 fb 10 42 20    vmovsd 0x20(%rdx),%xmm0
    35: c5 fb 10 4e 20    vmovsd 0x20(%rsi),%xmm1
    3a: c5 fd 28 16       vmovapd (%rsi),%ymm2
    3e: c5 fb 10 5f 20    vmovsd 0x20(%rdi),%xmm3
    43: c5 f5 59 cb       vmulpd %ymm3,%ymm1,%ymm1
    47: c5 ed 59 17       vmulpd (%rdi),%ymm2,%ymm2
    4b: c5 fd 58 c1       vaddpd %ymm1,%ymm0,%ymm0
    4f: c5 ed 58 0a       vaddpd (%rdx),%ymm2,%ymm1
    53: c5 fd 29 0a       vmovapd %ymm1,(%rdx)
    57: c5 f9 13 42 20    vmovlpd %xmm0,0x20(%rdx)
    5c: c5 f8 77          vzeroupper
    5f: c3                retq
When called with arrays of size 5 it segfaults, but it does produce correct results with oversized arrays:
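    #include <stdio.h>

    /* Defined in the LLVM-generated object file. */
    void my_vector_math(double *a, double *b, double *c);

    int main() {
        double a[] = {1,1,1,1,1};
        double b[] = {2,2,2,2,2};
        double c[] = {3,3,3,3,3};
        my_vector_math(a, b, c); // segfaults
        for (int i = 0; i < 5; i++) {
            printf("%f\n", c[i]);
        }
        return 0;
    }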
I'm having a lot of trouble figuring out why this is happening; any pointers would be appreciated.
Edit:
The LLVM IR without optimization passes:
    define void @my_vector_math(double* %a1, double* %b2, double* %c3) {
    entry:
      %a = alloca <5 x double>
      %b = alloca <5 x double>
      %c = alloca <5 x double>
      %c = alloca double*
      %b = alloca double*
      %a = alloca double*
      store double* %a1, double** %a
      store double* %b2, double** %b
      store double* %c3, double** %c
      %c4 = load double** %c
      %0 = bitcast double* %c4 to <5 x double>*
      %1 = load <5 x double>* %0
      store <5 x double> %1, <5 x double>* %c
      %b5 = load double** %b
      %2 = bitcast double* %b5 to <5 x double>*
      %3 = load <5 x double>* %2
      store <5 x double> %3, <5 x double>* %b
      %a6 = load double** %a
      %4 = bitcast double* %a6 to <5 x double>*
      %5 = load <5 x double>* %4
      store <5 x double> %5, <5 x double>* %a
      %c7 = load <5 x double>* %c
      %a8 = load <5 x double>* %a
      %b9 = load <5 x double>* %b
      %6 = fmul <5 x double> %a8, %b9
      %7 = fadd <5 x double> %c7, %6
      %c10 = load double** %c
      %8 = bitcast double* %c10 to <5 x double>*
      store <5 x double> %7, <5 x double>* %8
      ret void
    }
Your IR claims that the arrays of doubles have 64-byte alignment, which causes the compiler to generate an aligned load. You probably intended to specify 8-byte alignment instead (the natural alignment of double on many platforms).
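To make the alignment point concrete, here is a rough C sketch of the other way around the crash: leaving the IR as it is and making the caller's arrays satisfy the 64-byte promise. The `vmovapd` instructions in the listing fault when a 256-bit memory operand is not 32-byte aligned, and ordinary stack arrays of `double` are typically only 8- or 16-byte aligned. The names `is_aligned`, `my_vector_math_ref` and `run_aligned_example` are made up for illustration, and `my_vector_math_ref` is only a plain C stand-in for the generated function so that the snippet compiles on its own. It assumes a C11 compiler that accepts `_Alignas(64)` on local arrays (GCC and Clang do).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is a multiple of `alignment` bytes (alignment must be nonzero). */
int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical plain C stand-in for the IR function: c = a*b + c, element-wise.
   Unlike the generated code, it needs no more than the alignment of double. */
void my_vector_math_ref(const double *a, const double *b, double *c) {
    for (int i = 0; i < 5; i++) {
        c[i] = a[i] * b[i] + c[i];
    }
}

/* Declares the arrays with 64-byte alignment, which is what the IR's
   `align 64` promises, checks that promise, and returns c[0]. */
double run_aligned_example(void) {
    _Alignas(64) double a[5] = {1, 1, 1, 1, 1};
    _Alignas(64) double b[5] = {2, 2, 2, 2, 2};
    _Alignas(64) double c[5] = {3, 3, 3, 3, 3};

    /* With arrays like these, the aligned loads and stores would be legal. */
    assert(is_aligned(a, 64) && is_aligned(b, 64) && is_aligned(c, 64));

    my_vector_math_ref(a, b, c);
    return c[0];
}
```

With arrays declared like this, the real `my_vector_math` could be swapped in for the stand-in. Lowering the alignment in the IR is still the more robust fix, since it does not rely on every caller remembering to over-align its buffers.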