[Insight-developers] Empty FixedArray destructor: Performance hit using gcc (times 2) : __attribute__ ((aligned (8)))

Tom Vercauteren tom.vercauteren at m4x.org
Fri Jun 6 10:23:10 EDT 2008


Hi Luis,


> Trying to understand why this alignment happens, we have reduced
> the test to a minimalistic implementation of FixedArray:
>
>  a) When the destructor exists, an array of MyArray(s) is
>     allocated on a 4-byte boundary
>
>  b) When the destructor does not exist, an array of
>     MyArray(s) is allocated on an 8-byte boundary
>
>
> Then, by Googling about it we found this GCC flag:
>
>              -malign-double
>
> When compiling with this flag, your test is always aligned
> to 8 bytes, regardless of whether the destructor is present
> or not.

Thanks for this interesting observation and for the compilation flag hint.
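
The reduced test described above can be sketched roughly as follows. This is
only an illustration: WithDtor, WithoutDtor, OffsetFromBoundary and
ArrayOffsetFrom8 are hypothetical names, not the actual test code or ITK code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the reduced MyArray test; not ITK code.
struct WithDtor {
  double data[2];
  ~WithDtor() {}  // empty, but user-provided
};

struct WithoutDtor {
  double data[2];
};

// Distance in bytes between an address and the previous multiple of `boundary`.
inline std::size_t OffsetFromBoundary(const void* p, std::size_t boundary) {
  assert(boundary != 0);
  return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(p) % boundary);
}

// Allocates `count` elements with new[] and reports where the first element
// lands relative to an 8-byte boundary (0 means 8-byte aligned).
template <typename T>
std::size_t ArrayOffsetFrom8(std::size_t count) {
  T* array = new T[count];
  const std::size_t offset = OffsetFromBoundary(array, 8);
  delete[] array;
  return offset;
}
```

Comparing ArrayOffsetFrom8<WithDtor>(4) with ArrayOffsetFrom8<WithoutDtor>(4)
is the experiment in question. The 4-byte versus 8-byte difference reported in
this thread was seen with gcc; on other platforms or ABIs both calls may well
return 0.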


> We still have not answered the fundamental question:
>
>   Why is it that the presence of a non-virtual destructor
>   changes the alignment?

I agree that this is the right question.


> The Attribute:
>         __attribute__ ((aligned (8)))
> also does the trick.

Thanks again!
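
For reference, a minimal sketch of how the attribute might be applied to a
small structure that keeps its destructor. AlignedArray, IsAlignedTo and
HeapArrayIsAligned8 are hypothetical names, the attribute syntax is a GCC
extension, and static_assert/alignof need a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical type, not ITK code. The attribute sits between the
// struct keyword and the type name, which is the form GCC prefers.
struct __attribute__((aligned(8))) AlignedArray {
  double data[2];
  ~AlignedArray() {}  // destructor deliberately kept
};

// The attribute can only raise the alignment, so the type is 8-byte
// aligned wherever double itself needs at most 8 bytes.
static_assert(alignof(AlignedArray) == 8, "type should be 8-byte aligned");
static_assert(sizeof(AlignedArray) % 8 == 0, "size should be a multiple of 8");

// True when the address is a multiple of `boundary`.
inline bool IsAlignedTo(const void* p, std::size_t boundary) {
  assert(boundary != 0);
  return reinterpret_cast<std::uintptr_t>(p) % boundary == 0;
}

// Allocates a small array with new[] and checks its first element.
inline bool HeapArrayIsAligned8() {
  AlignedArray* array = new AlignedArray[3];
  const bool aligned = IsAlignedTo(array, 8);
  delete[] array;
  return aligned;
}
```

The static_assert lines check the type's declared alignment at compile time,
while HeapArrayIsAligned8() checks what new[] actually returns at run time.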


> We can compile without -malign-double and the structure
> is still aligned to 8 bytes, despite the fact that the
> destructor is still present.
>
>
> We could create ITK macros for this attribute option,
> and define the macros at configuration time by using
> TRY_COMPILES.
>
>
> At first sight, it is much better than the global
> -malign-double option, and we can apply it only
> to structures that we know must be aligned.
>
>
> One challenge here is that although we want FixedArray<double,N>
> to be 8-byte aligned, we don't always want the FixedArray<T,N>
> to be aligned this way.  For example: FixedArray<char,3>  ??
>
>
> One option could be to create your pixel type as a class derived
> from FixedArray<double,2>, and see if we can apply the attribute
> just to the derived class....
>
> In this way, this will be an application specific issue, as opposed
> to something that has to be done pervasively in ITK.
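
A rough sketch of the macro plus derived-class idea quoted above, under stated
assumptions: MY_ALIGN8, MiniFixedArray and AlignedPixel are hypothetical names,
and the compiler check below merely stands in for a real configure-time
try-compile test.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro. In the proposal it would be set at configuration
// time; a plain compiler check is used here as a stand-in.
#if defined(__GNUC__)
#define MY_ALIGN8 __attribute__((aligned(8)))
#else
#define MY_ALIGN8
#endif

// Simplified stand-in for a fixed-size array class; not itk::FixedArray.
template <typename T, unsigned int N>
class MiniFixedArray {
public:
  ~MiniFixedArray() {}  // the empty destructor under discussion
  T& operator[](unsigned int i) {
    assert(i < N);
    return m_Data[i];
  }

private:
  T m_Data[N];
};

// Application-specific pixel type: only this class gets the attribute.
class MY_ALIGN8 AlignedPixel : public MiniFixedArray<double, 2> {};

#if defined(__GNUC__)
static_assert(alignof(AlignedPixel) == 8, "pixel type should be 8-byte aligned");
#endif
// The generic template is untouched, so a char array stays byte aligned.
static_assert(alignof(MiniFixedArray<char, 3>) == 1, "char array stays unpadded");
static_assert(sizeof(MiniFixedArray<char, 3>) == 3, "char array stays 3 bytes");
```

Keeping the attribute on the derived class leaves MiniFixedArray<char, 3> at
3 bytes, which is the concern raised about aligning every instantiation.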

Well, I would really vote for removing the empty destructor
implementation. This seems to be the cleanest way to solve the
problem. The empty destructor is useless and only interferes here. I
don't really like changing compilation flags or adding macros, and
adding a new derived class would be confusing for the user.

Tom

