From 2ab4d61cd4b632c0e991c781f3c15f3b054d1bbd Mon Sep 17 00:00:00 2001
From: eschnett <eschnett@105869f7-3296-0410-a4ea-f4349344b45a>
Date: Mon, 6 Jun 2011 10:11:44 +0000
Subject: Introduce Cactus options for vectorisation

Introduce configuration-time options for vectorisation, including
options to allow architecture-specific choices that may influence
performance.

Introduce "middle" masked stores for large vector sizes and small
loops.

Clean up and simplify some of the implementation code.


git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@10 105869f7-3296-0410-a4ea-f4349344b45a
---
 README | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/README b/README
index a49408d..40a19a7 100644
--- a/README
+++ b/README
@@ -6,4 +6,49 @@ Licence      : GPL
 
 1. Purpose
 
-Provide a C++ class template that helps vectorisation.
+Provide C macro definitions and a C++ class template that help
+vectorisation.
+
+
+
+2. Build-time choices
+
+Several choices can be made via configuration options, which can be
+set to "yes" or "no":
+
+VECTORISE (default "no"): Enable vectorisation. If set to "no", scalar
+code is generated, and the other options have no effect.
+
+
+
+VECTORISE_ALIGNED_ARRAYS (default "no", experimental): Assume that all
+arrays have an extent in the x direction that is a multiple of the
+vector size. This allows aligned load operations e.g. for finite
+differencing operators in the y and z directions. (Setting this
+produces faster code, but may lead to segfaults if the assumption is
+not true.)
+
+VECTORISE_ALWAYS_USE_UNALIGNED_LOADS (default "no", experimental):
+Replace all aligned load operations with unaligned load operations.
+This may simplify some code where alignment is unknown at compile
+time. This should never lead to better code, since the default is to
+use aligned load operations iff the alignment is known to permit this
+at build time. This option is probably useless.
+
+VECTORISE_ALWAYS_USE_ALIGNED_LOADS (default "no", experimental):
+Replace all unaligned load operations by (multiple) aligned load
+operations and corresponding vector-gather operations. This may be
+beneficial if unaligned load operations are slow, and if vector-gather
+operations are fast.
+
+VECTORISE_INLINE (default "yes"): Inline functions into the loop body
+as much as possible. (Disabling this may reduce code size, which can
+improve performance if the instruction cache is small.)
+
+VECTORISE_STREAMING_STORES (default "yes"): Use streaming stores, i.e.
+use store operations that bypass the cache. (Disabling this produces
+slower code.)
+
+VECTORISE_EMULATE_AVX (default "no", experimental): Emulate AVX
+instructions with SSE2 instructions. This produces slower code, but
+can be used to test AVX code on systems that don't support AVX.
-- 
cgit v1.2.3
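
Note on the options above: the following is a minimal sketch of how
build-time switches of this kind could select between scalar code,
aligned and unaligned loads, and streaming stores. It is not the
thorn's implementation. It assumes that the options reach the code as
0/1 preprocessor symbols with the README's names (the actual mapping is
not shown in this patch), and the kernel scale_add and the macro
VEC_SIZE are hypothetical names used only for illustration. The vector
path uses SSE2 intrinsics and falls back to scalar code elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the configuration options arrive as 0/1 macros.
   The names mirror the README; the mapping itself is hypothetical. */
#ifndef VECTORISE
#define VECTORISE 1
#endif
#ifndef VECTORISE_STREAMING_STORES
#define VECTORISE_STREAMING_STORES 1
#endif

#if VECTORISE && defined(__SSE2__)
#include <emmintrin.h>
#define VEC_SIZE 2 /* doubles per SSE2 vector */
#else
#define VEC_SIZE 1 /* scalar fallback */
#endif

/* Hypothetical kernel: y[i] = a * x[i] + y[i].
   x may have any alignment. In the vector path, y must be 16-byte
   aligned and n must be a multiple of VEC_SIZE. */
static void scale_add(double *y, const double *x, double a, size_t n)
{
    assert(n % VEC_SIZE == 0);
#if VEC_SIZE == 2
    assert(((uintptr_t)y & 15u) == 0);
    const __m128d va = _mm_set1_pd(a);
    for (size_t i = 0; i < n; i += VEC_SIZE) {
        /* Alignment of x is unknown: unaligned load. */
        const __m128d vx = _mm_loadu_pd(&x[i]);
        /* Alignment of y is known: aligned load. */
        const __m128d vy = _mm_load_pd(&y[i]);
        const __m128d r = _mm_add_pd(_mm_mul_pd(va, vx), vy);
#if VECTORISE_STREAMING_STORES
        _mm_stream_pd(&y[i], r); /* store that bypasses the cache */
#else
        _mm_store_pd(&y[i], r);  /* ordinary aligned store */
#endif
    }
#if VECTORISE_STREAMING_STORES
    _mm_sfence(); /* make the streaming stores visible before returning */
#endif
#else
    /* Scalar code: the other options have no effect here. */
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
#endif
}
```

Compiling the same file with -DVECTORISE=0 selects the scalar loop, and
-DVECTORISE_STREAMING_STORES=0 selects ordinary stores; the results are
the same in every case, only the generated instructions differ.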