c - How to tell GCC that a pointer argument is always double-word-aligned?


In my program I have a function that does a simple vector addition, c[0:15] = a[0:15] + b[0:15]. The function prototype is:

void vecadd(float * restrict a, float * restrict b, float * restrict c); 

On our 32-bit embedded architecture there is a load/store option for loading/storing double words, like:

r16 = 0x4000  ;
strd r0,[r16] ; stores r0 in [0x4000] and r1 in [0x4004]

The GCC optimizer recognizes the vector nature of the loop and generates two branches of code: one for the case where all three arrays are double-word aligned (so it uses the double load/store instructions), and the other for the case where the arrays are only word-aligned (where it uses the single load/store option).
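To make the cost concrete, the two branches can be modelled by hand in C. This is only an illustration of the idea, not the code GCC emits; the names all_aligned8 and vecadd_model are hypothetical and do not come from the original program.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero when all three pointers are 8-byte aligned. */
static int all_aligned8(const float *a, const float *b, const float *c)
{
    /* OR the addresses together: any set bit among the low three bits
       means at least one pointer is not a multiple of 8. */
    return ((((uintptr_t)a) | ((uintptr_t)b) | ((uintptr_t)c)) & 7u) == 0;
}

/* Hand-written model of the two code versions described above. */
void vecadd_model(float *restrict a, float *restrict b, float *restrict c)
{
    if (all_aligned8(a, b, c)) {
        /* Fast path: two floats per step, a candidate for
           double-word loads and stores. */
        for (int i = 0; i < 16; i += 2) {
            c[i]     = a[i]     + b[i];
            c[i + 1] = a[i + 1] + b[i + 1];
        }
    } else {
        /* Fallback: one float per step, single-word loads and stores. */
        for (int i = 0; i < 16; i++)
            c[i] = a[i] + b[i];
    }
}
```

With only 16 elements per call, the test in all_aligned8 is a noticeable fraction of the total work, which is why removing it matters here.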

The problem is that the address alignment check is costly relative to the addition part, and I want to eliminate it by hinting to the compiler that a, b and c are always 8-aligned. Is there a modifier to add to the pointer declaration to tell this to the compiler?

The arrays used for calling this function have the aligned(8) attribute, but that is not reflected in the function code itself. Is it possible to add this attribute to the function parameters?

Following a piece of example code I've found on my system, I tried the following solution, which incorporates ideas from a few of the answers given earlier: basically, create a union of a small array of floats with a 64-bit type - in this case a SIMD vector of floats - and call the function with a cast of the operand float arrays:

typedef float f2 __attribute__((vector_size(8)));
typedef union { f2 v; float f[2]; } simdfu;

void vecadd(f2 * restrict a, f2 * restrict b, f2 * restrict c);

float a[16] __attribute__((aligned(8)));
float b[16] __attribute__((aligned(8)));
float c[16] __attribute__((aligned(8)));

int main()
{
    vecadd((f2 *) a, (f2 *) b, (f2 *) c);
    return 0;
}

Now the compiler does not generate the 4-aligned branch.
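The body of vecadd is not shown in the post. A minimal sketch of what it might look like, assuming GCC's vector extension (which allows + directly on vector_size types) and eight 2-float vectors covering the 16 floats:

```c
/* Same 8-byte vector type as in the post: two floats per element. */
typedef float f2 __attribute__((vector_size(8)));

/* Assumed body, not taken from the original program. */
void vecadd(f2 *restrict a, f2 *restrict b, f2 *restrict c)
{
    /* 16 floats = 8 vectors of 2 floats; each + adds both lanes at once. */
    for (int i = 0; i < 8; i++)
        c[i] = a[i] + b[i];
}
```

Because the pointed-to type is itself 8 bytes wide, every element access is a natural fit for one double-word load or store, so the compiler has no reason to emit a word-aligned variant.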

However, __builtin_assume_aligned() would be the preferable solution, since it avoids the cast and its possible side effects, if only it worked...
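For reference, this is roughly how the builtin is meant to be used. It is a sketch under the assumption of a GCC version that provides __builtin_assume_aligned; the name vecadd_hinted is hypothetical, and whether the alignment check actually disappears depends on the compiler version and target.

```c
/* Keeps the original float * interface; the alignment promise is made inside. */
void vecadd_hinted(float *restrict a, float *restrict b, float *restrict c)
{
    /* The builtin returns its first argument; only the returned pointer
       carries the "aligned to 8 bytes" assumption, so use that one. */
    float *pa = __builtin_assume_aligned(a, 8);
    float *pb = __builtin_assume_aligned(b, 8);
    float *pc = __builtin_assume_aligned(c, 8);

    for (int i = 0; i < 16; i++)
        pc[i] = pa[i] + pb[i];
}
```

The callers must really pass 8-aligned arrays: the builtin is a promise to the compiler, not a runtime check, and breaking it is undefined behavior.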

Edit: I noticed that the builtin function is buggy on our implementation (i.e., not only does it not work, it also causes calculation errors later in the code).

