author | eschnett <eschnett@105869f7-3296-0410-a4ea-f4349344b45a> | 2011-06-06 10:11:44 +0000 |
---|---|---|
committer | eschnett <eschnett@105869f7-3296-0410-a4ea-f4349344b45a> | 2011-06-06 10:11:44 +0000 |
commit | 2ab4d61cd4b632c0e991c781f3c15f3b054d1bbd (patch) | |
tree | 6664b1e9ee360ee0abf9df6b9a5562eb5bdc88c5 /README | |
parent | 5d4858e0736a0c0881c65b9e9ac0983d3b5bb24b (diff) |
Introduce Cactus options for vectorisation

Introduce configuration-time options for vectorisation, including
options to allow architecture-specific choices that may influence
performance.

Introduce "middle" masked stores for large vector sizes and small
loops.

Clean up and simplify some of the implementation code.

git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@10 105869f7-3296-0410-a4ea-f4349344b45a
Diffstat (limited to 'README')
-rw-r--r-- | README | 47 |
1 files changed, 46 insertions, 1 deletions
@@ -6,4 +6,49 @@ Licence : GPL
 
 1. Purpose
 
-Provide a C++ class template that helps vectorisation.
+Provide C macro definitions and a C++ class template that help
+vectorisation.
+
+
+
+2. Build-time choices
+
+Several choices can be made via configuration options, which can be
+set to "yes" or "no":
+
+VECTORISE (default "no"): Vectorise. Otherwise, scalar code is
+generated, and the other options have no effect.
+
+
+
+VECTORISE_ALIGNED_ARRAYS (default "no", experimental): Assume that all
+arrays have an extent in the x direction that is a multiple of the
+vector size. This allows aligned load operations e.g. for finite
+differencing operators in the y and z directions. (Setting this
+produces faster code, but may lead to segfaults if the assumption is
+not true.)
+
+VECTORISE_ALWAYS_USE_UNALIGNED_LOADS (default "no", experimental):
+Replace all aligned load operations with unaligned load operations.
+This may simplify some code where alignment is unknown at compile
+time. This should never lead to better code, since the default is to
+use aligned load operations iff the alignment is known to permit this
+at build time. This options is probably useless.
+
+VECTORISE_ALWAYS_USE_ALIGNED_LOADS (default "no", experimental):
+Replace all unaligned load operations by (multiple) aligned load
+operations and corresponding vector-gather operations. This may be
+beneficial if unaligned load operations are slow, and if vector-gather
+operations are fast.
+
+VECTORISE_INLINE (default "yes"): Inline functions into the loop body
+as much as possible. (Disabling this may reduce code size, which can
+improve performance if the instruction cache is small.)
+
+VECTORISE_STREAMING_STORES (default "yes"): Use streaming stores, i.e.
+use store operations that bypass the cache. (Disabling this produces
+slower code.)
+
+VECTORISE_EMULATE_AVX (default "no", experimental): Emulate AVX
+instructions with SSE2 instructions. This produces slower code, but
+can be used to test AVX code on systems that don't support AVX.