aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorschnetter <schnetter@2e825fa2-fb71-486d-8b7f-a5ff3f0f6cb8>2002-10-22 14:58:59 +0000
committerschnetter <schnetter@2e825fa2-fb71-486d-8b7f-a5ff3f0f6cb8>2002-10-22 14:58:59 +0000
commit2f751a2911f964a8e471b4206352d913b968a7bf (patch)
treec86ae0eba378ddc832c1b17a7e3cd1423bb08446
parente4f4fa422aa52730e0a417120d527694d13e09c3 (diff)
A generic slabbing thorn.
git-svn-id: http://svn.cactuscode.org/arrangements/CactusNumerical/Slab/trunk@2 2e825fa2-fb71-486d-8b7f-a5ff3f0f6cb8
-rw-r--r--README11
-rw-r--r--doc/documentation.tex147
-rw-r--r--interface.ccl6
-rw-r--r--param.ccl2
-rw-r--r--schedule.ccl2
-rw-r--r--src/make.code.defn9
-rw-r--r--src/slab.c679
-rw-r--r--src/slab.h101
8 files changed, 957 insertions, 0 deletions
diff --git a/README b/README
new file mode 100644
index 0000000..bfd0c01
--- /dev/null
+++ b/README
@@ -0,0 +1,11 @@
+CVS info : $Header$
+
+Cactus Code Thorn Slab
+Thorn Author(s) : Erik Schnetter <schnetter@uni-tuebingen.de>
+Thorn Maintainer(s) : Erik Schnetter <schnetter@uni-tuebingen.de>
+--------------------------------------------------------------------------
+
+Purpose of the thorn:
+
+This thorn provides routines that use MPI to transfer slabs of arrays
+between processors.
diff --git a/doc/documentation.tex b/doc/documentation.tex
new file mode 100644
index 0000000..4c5b203
--- /dev/null
+++ b/doc/documentation.tex
@@ -0,0 +1,147 @@
+% *======================================================================*
+% Cactus Thorn template for ThornGuide documentation
+% Author: Ian Kelley
+% Date: Sun Jun 02, 2002
+% $Header$
+%
+% Thorn documentation in the latex file doc/documentation.tex
+% will be included in ThornGuides built with the Cactus make system.
+% The scripts employed by the make system automatically include
+% pages about variables, parameters and scheduling parsed from the
+% relevent thorn CCL files.
+%
+% This template contains guidelines which help to assure that your
+% documentation will be correctly added to ThornGuides. More
+% information is available in the Cactus UsersGuide.
+%
+% Guidelines:
+% - Do not change anything before the line
+% % START CACTUS THORNGUIDE",
+% except for filling in the title, author, date etc. fields.
+% - Each of these fields should only be on ONE line.
+% - Author names should be sparated with a \\ or a comma
+% - You can define your own macros are OK, but they must appear after
+% the START CACTUS THORNGUIDE line, and do not redefine standard
+% latex commands.
+% - To avoid name clashes with other thorns, 'labels', 'citations',
+% 'references', and 'image' names should conform to the following
+% convention:
+% ARRANGEMENT_THORN_LABEL
+% For example, an image wave.eps in the arrangement CactusWave and
+% thorn WaveToyC should be renamed to CactusWave_WaveToyC_wave.eps
+% - Graphics should only be included using the graphix package.
+% More specifically, with the "includegraphics" command. Do
+% not specify any graphic file extensions in your .tex file. This
+% will allow us (later) to create a PDF version of the ThornGuide
+% via pdflatex. |
+% - References should be included with the latex "bibitem" command.
+% - use \begin{abstract}...\end{abstract} instead of \abstract{...}
+% - For the benefit of our Perl scripts, and for future extensions,
+% please use simple latex.
+%
+% *======================================================================*
+%
+% Example of including a graphic image:
+% \begin{figure}[ht]
+% \begin{center}
+% \includegraphics[width=6cm]{MyArrangement_MyThorn_MyFigure}
+% \end{center}
+% \caption{Illustration of this and that}
+% \label{MyArrangement_MyThorn_MyLabel}
+% \end{figure}
+%
+% Example of using a label:
+% \label{MyArrangement_MyThorn_MyLabel}
+%
+% Example of a citation:
+% \cite{MyArrangement_MyThorn_Author99}
+%
+% Example of including a reference
+% \bibitem{MyArrangement_MyThorn_Author99}
+% {J. Author, {\em The Title of the Book, Journal, or periodical}, 1 (1999),
+% 1--16. {\tt http://www.nowhere.com/}}
+%
+% *======================================================================*
+
+% If you are using CVS use this line to give version information
+% $Header$
+
+\documentclass{article}
+
+% Use the Cactus ThornGuide style file
+% (Automatically used from Cactus distribution, if you have a
+% thorn without the Cactus Flesh download this from the Cactus
+% homepage at www.cactuscode.org)
+\usepackage{../../../../doc/ThornGuide/cactus}
+
+\begin{document}
+
+% The author of the documentation
+\author{Erik Schnetter \textless schnetter@uni-tuebingen.de\textgreater}
+
+% The title of the document (not necessarily the name of the Thorn)
+\title{Generic Hyperslabbing}
+
+% the date your document was last changed, if your document is in CVS,
+% please use:
+\date{$ $Date$ $}
+
+\maketitle
+
+% Do not delete next line
+% START CACTUS THORNGUIDE
+
+% Add all definitions used in this documentation here
+% \def\mydef etc
+
+% Add an abstract for this thorn's documentation
+\begin{abstract}
+The Slab thorn provides generic slabbing facilities. A slab is a
+sub-array of another array. Both can be multidimensional, and the
+slab can have a non-unit stride. The Slab thorn provides a routine to
+copy a slab from one array into a slab of another array, while
+possibly transposing or inverting the slab. This combines get-slab
+and put-slab interfaces.
+
+The Slab thorn is driver independent, i.e.\ not tied to PUGH or
+Carpet, and does not require MPI.
+\end{abstract}
+
+% The following sections are suggestive only.
+% Remove them or add your own.
+
+\section{Introduction}
+
+\section{Physical System}
+
+\section{Numerical Implementation}
+
+\section{Using This Thorn}
+
+\subsection{Obtaining This Thorn}
+
+\subsection{Basic Usage}
+
+\subsection{Special Behaviour}
+
+\subsection{Interaction With Other Thorns}
+
+\subsection{Support and Feedback}
+
+\section{History}
+
+\subsection{Thorn Source Code}
+
+\subsection{Thorn Documentation}
+
+\subsection{Acknowledgements}
+
+
+\begin{thebibliography}{9}
+
+\end{thebibliography}
+
+% Do not delete next line
+% END CACTUS THORNGUIDE
+
+\end{document}
diff --git a/interface.ccl b/interface.ccl
new file mode 100644
index 0000000..e1bed10
--- /dev/null
+++ b/interface.ccl
@@ -0,0 +1,6 @@
+# Interface definition for thorn Slab
+# $Header$
+
+IMPLEMENTS: Slab
+
+INCLUDES HEADER: slab.h IN Slab.h
diff --git a/param.ccl b/param.ccl
new file mode 100644
index 0000000..31b7f9c
--- /dev/null
+++ b/param.ccl
@@ -0,0 +1,2 @@
+# Parameter definitions for thorn Slab
+# $Header$
diff --git a/schedule.ccl b/schedule.ccl
new file mode 100644
index 0000000..2a19d26
--- /dev/null
+++ b/schedule.ccl
@@ -0,0 +1,2 @@
+# Schedule definitions for thorn Slab
+# $Header$
diff --git a/src/make.code.defn b/src/make.code.defn
new file mode 100644
index 0000000..427548a
--- /dev/null
+++ b/src/make.code.defn
@@ -0,0 +1,9 @@
+# Main make.code.defn file for thorn Slab
+# $Header$
+
+# Source files in this directory
+SRCS = slab.c
+
+# Subdirectories containing source files
+SUBDIRS =
+
diff --git a/src/slab.c b/src/slab.c
new file mode 100644
index 0000000..a5800a7
--- /dev/null
+++ b/src/slab.c
@@ -0,0 +1,679 @@
+/* $Header$ */
+
+/*
+ TODO:
+ Provide facilities for dim != 3
+ Set up the slab exchange information in advance
+ Test slabbing without MPI
+*/
+
+
+
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "cctk.h"
+#include "cctk_DefineThorn.h"
+
+#ifdef CCTK_MPI
+# include <mpi.h>
+#endif
+
+#include "slab.h"
+
+
+
+static const char * rcsid = "$Header$";
+
+CCTK_FILEVERSION(TAT_Slab_slab_c);
+
+
+
+/* Print debug information? */
+#undef DEBUG
+
+/* Perform expensive self-checks? */
+#undef CHECK
+
+
+
+#ifdef DEBUG
+# define ifdebug
+#else
+# define ifdebug while (0)
+#endif
+
+#ifdef CHECK
+# define ifcheck
+#else
+# define ifcheck while (0)
+#endif
+
+
+
+/* Find out which driver to use */
+#ifdef CCTK_MPI
+# if defined CARPET_CARPET
+# include "Carpet/Carpet/src/carpet_public.h"
+# elif defined CACTUSPUGH_PUGH
+# include "CactusPUGH/PUGH/src/include/pugh.h"
+# else
+# error "No supported driver thorn included"
+# endif
+#endif
+
+
+
+/* Replace MPI functions if MPI is disabled */
+#ifndef CCTK_MPI
+
+typedef int MPI_Datatype;
+#define MPI_DOUBLE CCTK_VARIABLE_REAL8
+#define MPI_INT CCTK_VARIABLE_INT4
+
+typedef int MPI_Comm;
+
+static int
+MPI_Barrier (MPI_Comm comm)
+{
+ return 0;
+}
+
+static int
+MPI_Comm_Size (MPI_Comm comm, int * size)
+{
+ *size = 1;
+ return 0;
+}
+
+static int
+MPI_Comm_Rank (MPI_Comm comm, int * rank)
+{
+ *rank = 0;
+ return 0;
+}
+
+static int
+MPI_Allgather (void * sendbuf, int sendcnt, int sendtype,
+ void * recvbuf, int recvcnt, int recvtype,
+ MPI_Comm comm)
+{
+ assert (sendbuf);
+ assert (recvbuf);
+ assert (sendcnt == recvcnt);
+ assert (sendtype == recvtype);
+ int const recvsize = CCTK_VarTypeSize (recvtype);
+ assert (size > 0);
+ memcpy (recvbuf, sendbuf, recvcnt * recvsize);
+ return 0;
+}
+
+static int
+MPI_Alltoall (void * sendbuf, int sendcnt, int sendtype,
+ void * recvbuf, int recvcnt, int recvtype,
+ MPI_Comm comm)
+{
+ assert (sendbuf);
+ assert (recvbuf);
+ assert (sendcnt == recvcnt);
+ assert (sendtype == recvtype);
+ int const recvsize = CCTK_VarTypeSize (recvtype);
+ assert (size > 0);
+ memcpy (recvbuf, sendbuf, recvcnt * recvsize);
+ return 0;
+}
+
+static int
+MPI_Alltoallv (void * sendbuf, int * sendcnt, int * sendoff, int sendtype,
+ void * recvbuf, int * recvcnt, int * recvoff, int recvtype,
+ MPI_Comm comm)
+{
+ assert (sendbuf);
+ assert (recvbuf);
+ assert (sendcnt);
+ assert (recvcnt);
+ assert (*sendcnt == *recvcnt);
+ assert (sendoff);
+ assert (recvoff);
+ assert (*sendoff == 0);
+ assert (*recvoff == 0);
+ assert (sendtype == recvtype);
+ int const recvsize = CCTK_VarTypeSize (recvtype);
+ assert (size > 0);
+ memcpy (recvbuf, sendbuf, *recvcnt * recvsize);
+ return 0;
+}
+
+#endif
+
+
+
+struct bbox {
+ int off, len, str;
+};
+
+struct arrays {
+ struct bbox global, local, slab;
+};
+
+struct info {
+ struct arrays src, dst;
+ int xpose;
+ int flip;
+};
+
+
+
+static inline int min (int const x, int const y)
+{
+ return x < y ? x : y;
+}
+
+static inline int max (int const x, int const y)
+{
+ return x > y ? x : y;
+}
+
+static inline int roundup (int const x, int const y)
+{
+ assert (x >= 0);
+ assert (y > 0);
+ return (x + y - 1) / y * y;
+}
+
+
+
+static void bbox_print (struct bbox const * restrict const bbox)
+{
+ assert (bbox);
+ printf
+ ("[%d:%d:%d]",
+ bbox->off, bbox->off + (bbox->len - 1) * bbox->str, bbox->str);
+}
+
+static void bbox_check (struct bbox const * restrict const bbox)
+{
+ assert (bbox);
+ assert (bbox->len >= 0);
+ assert (bbox->str > 0);
+}
+
+static void global2bbox (struct slabinfo const * restrict const slab,
+ struct bbox * restrict const bbox)
+{
+ assert (slab);
+ assert (bbox);
+ assert (slab->gsh >= 0);
+ bbox->off = 0;
+ bbox->len = slab->gsh;
+ bbox->str = 1;
+ bbox_check (bbox);
+}
+
+static void local2bbox (struct slabinfo const * restrict const slab,
+ struct bbox * restrict const bbox)
+{
+ assert (slab);
+ assert (bbox);
+ assert (slab->lbnd >= 0);
+ assert (slab->lsh >= 0);
+ assert (slab->lbnd + slab->lsh <= slab->gsh);
+ assert (slab->lbbox == 0 || slab->lbbox == 1);
+ assert (slab->ubbox == 0 || slab->ubbox == 1);
+ assert (slab->nghostzones >= 0);
+ int const nlghostzones = slab->lbbox ? 0 : slab->nghostzones;
+ int const nughostzones = slab->ubbox ? 0 : slab->nghostzones;
+ bbox->off = slab->lbnd + nlghostzones;
+ bbox->len = slab->lsh - nlghostzones - nughostzones;
+ bbox->str = 1;
+ bbox_check (bbox);
+}
+
+static void slab2bbox (struct slabinfo const * restrict const slab,
+ struct bbox * restrict const bbox)
+{
+ assert (slab);
+ assert (bbox);
+ bbox->off = slab->off;
+ bbox->len = slab->len;
+ bbox->str = slab->str;
+ bbox_check (bbox);
+}
+
+static int bbox_iscontained (struct bbox const * restrict const inner,
+ struct bbox const * restrict const outer)
+{
+ bbox_check (inner);
+ bbox_check (outer);
+ int const inner_last = inner->off + (inner->len - 1) * inner->str;
+ int const outer_last = outer->off + (outer->len - 1) * outer->str;
+ return inner->off >= outer->off && inner_last <= outer_last;
+}
+
+static void bbox_clip (struct bbox * restrict const inner,
+ struct bbox const * restrict const outer)
+{
+ bbox_check (inner);
+ bbox_check (outer);
+ int inner_last = inner->off + (inner->len - 1) * inner->str;
+ int const outer_last = outer->off + (outer->len - 1) * outer->str;
+ if (inner->off < outer->off) {
+ inner->off += roundup (outer->off - inner->off, inner->str);
+ }
+ if (inner_last > outer_last) {
+ inner_last -= roundup (inner_last - outer_last, inner->str);
+ }
+ assert ((inner_last - inner->off) % inner->str == 0);
+ if (inner_last >= inner->off) {
+ inner->len = (inner_last - inner->off + inner->str) / inner->str;
+ } else {
+ inner->len = 0;
+ }
+ bbox_check (inner);
+}
+
+static void bbox_xform (struct bbox * restrict const ydst,
+ struct bbox const * restrict const ysrc,
+ struct bbox const * restrict const xdst,
+ struct bbox const * restrict const xsrc,
+ int const flip)
+{
+ assert (ydst);
+ bbox_check (ysrc);
+ bbox_check (xdst);
+ bbox_check (xsrc);
+ assert (ysrc->str == xsrc->str);
+ int const xsrc_last = xsrc->off + (xsrc->len - 1) * xsrc->str;
+ int const xdst_last = xdst->off + (xdst->len - 1) * xdst->str;
+ int const ysrc_last = ysrc->off + (ysrc->len - 1) * ysrc->str;
+ ydst->str = xdst->str;
+ assert ((ysrc->off - xsrc->off) % ysrc->str == 0);
+ int ydst_last;
+ ydst->off = xdst->off + (ysrc->off - xsrc->off) / ysrc->str * ydst->str;
+ ydst_last = xdst->off + (ysrc_last - xsrc->off) / ysrc->str * ydst->str;
+ if (flip) {
+ int const off = ydst->off;
+ int const last = ydst_last;
+ ydst->off = xdst->off + xdst_last - last;
+ ydst_last = xdst_last - (off - xdst->off);
+ }
+ assert ((ysrc_last - xsrc->off) % ysrc->str == 0);
+ assert (ydst_last - ydst->off + ydst->str >= 0);
+ ydst->len = (ydst_last - ydst->off + ydst->str) / ydst->str;
+ bbox_check (ydst);
+}
+
+
+
+int Slab_Transfer (cGH * const cctkGH,
+ int const dim,
+ struct xferinfo const * const xferinfo,
+ int const srctype,
+ void const * const srcptr,
+ int const dsttype,
+ void * const dstptr)
+{
+ /* Check arguments */
+ assert (cctkGH);
+ assert (dim >= 0);
+ assert (xferinfo);
+ assert (srctype == CCTK_VARIABLE_REAL);
+ assert (srcptr);
+ assert (dsttype == CCTK_VARIABLE_REAL);
+ assert (dstptr);
+
+ struct info info[dim];
+ for (int d=0; d<dim; ++d) {
+ global2bbox (&xferinfo[d].src, &info[d].src.global);
+ local2bbox (&xferinfo[d].src, &info[d].src.local);
+ assert (bbox_iscontained (&info[d].src.local, &info[d].src.global));
+ slab2bbox (&xferinfo[d].src, &info[d].src.slab);
+ assert (bbox_iscontained (&info[d].src.local, &info[d].src.global));
+
+ global2bbox (&xferinfo[d].dst, &info[d].dst.global);
+ local2bbox (&xferinfo[d].dst, &info[d].dst.local);
+ assert (bbox_iscontained (&info[d].dst.local, &info[d].dst.global));
+ slab2bbox (&xferinfo[d].dst, &info[d].dst.slab);
+ assert (bbox_iscontained (&info[d].dst.local, &info[d].dst.global));
+
+ info[d].xpose = xferinfo[d].xpose;
+ assert (info[d].xpose >= 0 && info[d].xpose < dim);
+ info[d].flip = xferinfo[d].flip;
+ assert (info[d].flip == 0 || info[d].flip == 1);
+ }
+
+ {
+ int iflag[dim];
+ for (int d=0; d<dim; ++d) {
+ iflag[d] = 0;
+ }
+ for (int d=0; d<dim; ++d) {
+ assert (! iflag[info[d].xpose]);
+ iflag[info[d].xpose] = 1;
+ }
+ for (int d=0; d<dim; ++d) {
+ assert (iflag[d]);
+ }
+ for (int d=0; d<dim; ++d) {
+ assert (info[info[d].xpose].src.slab.len == info[d].dst.slab.len);
+ }
+ }
+
+ size_t srclentot, dstlentot;
+ srclentot = 1;
+ dstlentot = 1;
+ for (int d=0; d<dim; ++d) {
+ srclentot *= info[d].src.local.len;
+ dstlentot *= info[d].dst.local.len;
+ }
+
+
+
+ MPI_Comm comm;
+#ifdef CCTK_MPI
+# if defined CARPET_CARPET
+ comm = CarpetMPIComm ();
+# elif defined CACTUSPUGH_PUGH
+ comm = PUGH_pGH(cctkGH)->PUGH_COMM_WORLD;
+# else
+# error "No supported driver thorn included"
+# endif
+#else
+ comm = 0;
+#endif
+
+ ifcheck {
+ ifdebug fflush (stdout);
+ MPI_Barrier (comm);
+ }
+
+ int size, rank;
+ MPI_Comm_size (comm, &size);
+ MPI_Comm_rank (comm, &rank);
+
+ assert (sizeof(CCTK_REAL) == sizeof(double));
+ MPI_Datatype srcdatatype, dstdatatype;
+ srcdatatype = MPI_DOUBLE;
+ dstdatatype = MPI_DOUBLE;
+
+
+
+ struct info allinfo[size][dim];
+ int const info_nints = sizeof(struct info) / sizeof(int);
+ ifdebug fflush (stdout);
+ MPI_Allgather
+ (info, dim * info_nints, MPI_INT,
+ allinfo, dim * info_nints, MPI_INT, comm);
+
+ for (int n = 0; n < size; ++n) {
+ for (int d=0; d<dim; ++d) {
+ assert (allinfo[n][d].src.global.off == info[d].src.global.off);
+ assert (allinfo[n][d].src.global.len == info[d].src.global.len);
+ assert (allinfo[n][d].src.global.str == info[d].src.global.str);
+ assert (allinfo[n][d].dst.global.off == info[d].dst.global.off);
+ assert (allinfo[n][d].dst.global.len == info[d].dst.global.len);
+ assert (allinfo[n][d].dst.global.str == info[d].dst.global.str);
+ assert (allinfo[n][d].src.local.str == info[d].src.local.str);
+ assert (allinfo[n][d].dst.local.str == info[d].dst.local.str);
+ assert (allinfo[n][d].src.slab.str == info[d].src.slab.str);
+ assert (allinfo[n][d].dst.slab.str == info[d].dst.slab.str);
+ assert (allinfo[n][d].xpose == info[d].xpose);
+ assert (allinfo[n][d].flip == info[d].flip);
+ }
+ }
+
+
+
+ struct bbox srcdetail[size][dim];
+ for (int n = 0; n < size; ++n) {
+ ifdebug printf ("srcdetail n=%d:\n", n);
+ for (int d=0; d<dim; ++d) {
+ srcdetail[n][d] = allinfo[n][d].src.slab;
+ ifdebug printf (" src.slab d=%d ", d);
+ ifdebug bbox_print (&srcdetail[n][d]);
+ ifdebug printf ("\n");
+ bbox_clip (&srcdetail[n][d], &info[d].src.local);
+ ifdebug printf (" clipped with src.local d=%d ", d);
+ ifdebug bbox_print (&srcdetail[n][d]);
+ ifdebug printf ("\n");
+ }
+ for (int d=0; d<dim; ++d) {
+ struct bbox whereto = allinfo[n][d].dst.slab;
+ ifdebug printf (" dst.slab d=%d ", info[d].xpose);
+ ifdebug bbox_print (&whereto);
+ ifdebug printf ("\n");
+ bbox_clip (&whereto, &allinfo[n][d].dst.local);
+ ifdebug printf (" whereto d=%d ", info[d].xpose);
+ ifdebug bbox_print (&whereto);
+ ifdebug printf ("\n");
+ struct bbox wherefrom;
+ bbox_xform
+ (&wherefrom, &whereto,
+ &allinfo[n][info[d].xpose].src.slab, &allinfo[n][d].dst.slab,
+ info[d].flip);
+ ifdebug printf (" wherefrom d=%d ", info[d].xpose);
+ ifdebug bbox_print (&wherefrom);
+ ifdebug printf ("\n");
+ bbox_clip (&srcdetail[n][info[d].xpose], &wherefrom);
+ ifdebug printf (" clipped with wherefrom d=%d ", info[d].xpose);
+ ifdebug bbox_print (&srcdetail[n][info[d].xpose]);
+ ifdebug printf ("\n");
+ }
+ }
+
+ int srccount[size];
+ int srcoffset[size+1];
+ srcoffset[0] = 0;
+ for (int n = 0; n < size; ++n) {
+ srccount[n] = 1;
+ for (int d=0; d<dim; ++d) {
+ srccount[n] *= srcdetail[n][d].len;
+ }
+ ifdebug printf
+ ("srccnt n=%d offset=%d count=%d\n", n, srcoffset[n], srccount[n]);
+ srcoffset[n+1] = srcoffset[n] + srccount[n];
+ }
+ void * restrict srcdata = malloc (srcoffset[size] * sizeof(CCTK_REAL));
+ assert (srcoffset[size] == 0 || srcdata);
+ ifcheck {
+ CCTK_REAL marker;
+ memset (&marker, -1, sizeof marker);
+ CCTK_REAL * restrict const srcptr = srcdata;
+ for (size_t i = 0; i < srcoffset[size]; ++i) {
+ memcpy (&srcptr[i], &marker, sizeof marker);
+ }
+ }
+
+ struct bbox dstdetail[size][dim];
+ for (int n = 0; n < size; ++n) {
+ ifdebug printf ("dstdetail n=%d:\n", n);
+ for (int d=0; d<dim; ++d) {
+ dstdetail[n][d] = allinfo[n][d].dst.slab;
+ ifdebug printf (" dst.slab d=%d ", d);
+ ifdebug bbox_print (&dstdetail[n][d]);
+ ifdebug printf ("\n");
+ bbox_clip (&dstdetail[n][d], &info[d].dst.local);
+ ifdebug printf (" clipped with dst.local d=%d ", d);
+ ifdebug bbox_print (&dstdetail[n][d]);
+ ifdebug printf ("\n");
+ struct bbox wherefrom = allinfo[n][info[d].xpose].src.slab;
+ ifdebug printf (" src.slab d=%d ", d);
+ ifdebug bbox_print (&dstdetail[n][d]);
+ ifdebug printf ("\n");
+ bbox_clip (&wherefrom, &allinfo[n][info[d].xpose].src.local);
+ ifdebug printf (" wherefrom d=%d ", d);
+ ifdebug bbox_print (&dstdetail[n][d]);
+ ifdebug printf ("\n");
+ struct bbox whereto;
+ bbox_xform
+ (&whereto, &wherefrom,
+ &allinfo[n][d].dst.slab, &allinfo[n][info[d].xpose].src.slab,
+ info[d].flip);
+ ifdebug printf (" whereto d=%d ", d);
+ ifdebug bbox_print (&dstdetail[n][d]);
+ ifdebug printf ("\n");
+ bbox_clip (&dstdetail[n][d], &whereto);
+ ifdebug printf (" clipped with whereto d=%d ", d);
+ ifdebug bbox_print (&dstdetail[n][d]);
+ ifdebug printf ("\n");
+ }
+ }
+
+ int dstcount[size];
+ int dstoffset[size+1];
+ dstoffset[0] = 0;
+ for (int n = 0; n < size; ++n) {
+ dstcount[n] = 1;
+ for (int d=0; d<dim; ++d) {
+ dstcount[n] *= dstdetail[n][d].len;
+ }
+ ifdebug printf
+ ("dstcnt n=%d offset=%d count=%d\n", n, dstoffset[n], dstcount[n]);
+ dstoffset[n+1] = dstoffset[n] + dstcount[n];
+ }
+ void * restrict dstdata = malloc (dstoffset[size] * sizeof(CCTK_REAL));
+ assert (dstoffset[size] == 0 || dstdata);
+ ifcheck {
+ CCTK_REAL marker;
+ memset (&marker, -1, sizeof marker);
+ CCTK_REAL * restrict const dstptr = dstdata;
+ for (size_t i = 0; i < dstoffset[size]; ++i) {
+ memcpy (&dstptr[i], &marker, sizeof marker);
+ }
+ }
+
+ ifcheck {
+ int src2count[size];
+ int dst2count[size];
+ ifdebug fflush (stdout);
+ MPI_Alltoall (srccount, 1, MPI_INT, src2count, 1, MPI_INT, comm);
+ MPI_Alltoall (dstcount, 1, MPI_INT, dst2count, 1, MPI_INT, comm);
+ for (int n = 0; n < size; ++n) {
+ assert (src2count[n] == dstcount[n]);
+ assert (dst2count[n] == srccount[n]);
+ }
+ }
+
+
+
+ for (int n = 0; n < size; ++n) {
+ assert (dim == 3);
+ for (int k = 0; k < srcdetail[n][info[2].xpose].len; ++k) {
+ for (int j = 0; j < srcdetail[n][info[1].xpose].len; ++j) {
+ for (int i = 0; i < srcdetail[n][info[0].xpose].len; ++i) {
+ int ipos[dim];
+ ipos[0] = i;
+ ipos[1] = j;
+ ipos[2] = k;
+ int srcipos[dim];
+ int bufipos[dim];
+ for (int d=0; d<dim; ++d) {
+ int const c = info[d].xpose;
+ srcipos[c] = srcdetail[n][c].off + ipos[d] * srcdetail[n][c].str;
+ assert (srcipos[c] >= info[c].src.local.off
+ && srcipos[c] < info[c].src.local.off + info[c].src.local.len);
+ if (! (srcipos[c] >= allinfo[n][c].src.slab.off
+ && srcipos[c] <= allinfo[n][c].src.slab.off + (allinfo[n][c].src.slab.len - 1) * allinfo[n][c].src.slab.str)) {
+ printf ("ssc n=%d ipos=[%d,%d,%d] d=%d srcipos=%d slab.off=%d slab.len=%d\n", n, i, j, k, d, srcipos[c], allinfo[n][c].src.slab.off, allinfo[n][c].src.slab.len);
+ }
+ assert (srcipos[c] >= allinfo[n][c].src.slab.off
+ && srcipos[c] <= allinfo[n][c].src.slab.off + (allinfo[n][c].src.slab.len - 1) * allinfo[n][c].src.slab.str);
+ assert ((srcipos[c] - allinfo[n][c].src.slab.off) % allinfo[n][c].src.slab.str == 0);
+ bufipos[d] = ipos[d];
+ assert (bufipos[d] >= 0 && bufipos[d] < srcdetail[n][c].len);
+ }
+ size_t srcind = 0;
+ size_t bufind = 0;
+ for (int d=dim-1; d>=0; --d) {
+ int const c = info[d].xpose;
+ srcind = srcind * info[d].src.local.len + srcipos[d] - info[d].src.local.off;
+ bufind = bufind * srcdetail[n][c].len + bufipos[d];
+ }
+ assert (srcind < srclentot);
+ assert (bufind < srccount[n]);
+ ((CCTK_REAL*)srcdata)[srcoffset[n] + bufind]
+ = ((const CCTK_REAL*)srcptr)[srcind];
+ }
+ }
+ }
+ }
+
+ ifcheck {
+ CCTK_REAL marker;
+ memset (&marker, -1, sizeof marker);
+ const CCTK_REAL * restrict const srcptr = srcdata;
+ for (size_t i = 0; i < srcoffset[size]; ++i) {
+ assert (memcmp(&srcptr[i], &marker, sizeof marker) != 0);
+ }
+ }
+
+ ifdebug fflush (stdout);
+ MPI_Alltoallv
+ (srcdata, srccount, srcoffset, srcdatatype,
+ dstdata, dstcount, dstoffset, dstdatatype, comm);
+
+ ifcheck {
+ CCTK_REAL marker;
+ memset (&marker, -1, sizeof marker);
+ const CCTK_REAL * restrict const dstptr = dstdata;
+ for (size_t i = 0; i < dstoffset[size]; ++i) {
+ assert (memcmp(&dstptr[i], &marker, sizeof marker) != 0);
+ }
+ }
+
+ for (int n = 0; n < size; ++n) {
+ assert (dim == 3);
+ for (int k = 0; k < dstdetail[n][2].len; ++k) {
+ for (int j = 0; j < dstdetail[n][1].len; ++j) {
+ for (int i = 0; i < dstdetail[n][0].len; ++i) {
+ int ipos[dim];
+ ipos[0] = i;
+ ipos[1] = j;
+ ipos[2] = k;
+ int bufipos[dim];
+ int dstipos[dim];
+ for (int d=0; d<dim; ++d) {
+ if (! info[d].flip) {
+ bufipos[d] = ipos[d];
+ } else {
+ bufipos[d] = dstdetail[n][d].len - 1 - ipos[d];
+ }
+ assert (bufipos[d] >= 0 && bufipos[d] < dstdetail[n][d].len);
+ dstipos[d] = dstdetail[n][d].off + ipos[d] * info[d].dst.slab.str;
+ assert (dstipos[d] >= info[d].dst.local.off
+ && dstipos[d] < info[d].dst.local.off + info[d].dst.local.len);
+ assert (dstipos[d] >= info[d].dst.slab.off
+ && dstipos[d] <= info[d].dst.slab.off + (info[d].dst.slab.len - 1) * info[d].dst.slab.str);
+ assert ((dstipos[d] - info[d].dst.slab.off) % info[d].dst.slab.str == 0);
+ }
+ size_t bufind = 0;
+ size_t dstind = 0;
+ for (int d=dim-1; d>=0; --d) {
+ bufind = bufind * dstdetail[n][d].len + bufipos[d];
+ dstind = dstind * info[d].dst.local.len + dstipos[d] - info[d].dst.local.off;
+ }
+ assert (bufind < dstcount[n]);
+ assert (dstind < dstlentot);
+ ((CCTK_REAL*)dstptr)[dstind]
+ = ((const CCTK_REAL*)dstdata)[dstoffset[n] + bufind];
+ }
+ }
+ }
+ }
+
+
+
+ free (srcdata);
+ free (dstdata);
+
+
+
+ ifcheck {
+ ifdebug fflush (stdout);
+ MPI_Barrier (comm);
+ }
+
+ return 0;
+}
diff --git a/src/slab.h b/src/slab.h
new file mode 100644
index 0000000..ef03a06
--- /dev/null
+++ b/src/slab.h
@@ -0,0 +1,101 @@
+/* $Header$ */
+
+#ifndef SLAB_H
+#define SLAB_H
+
+#include "cctk.h"
+
+struct slabinfo {
+ int gsh;
+ int lbnd, lsh;
+ int lbbox, ubbox, nghostzones;
+ int off, str, len;
+};
+
+struct xferinfo {
+ struct slabinfo src, dst;
+ int xpose;
+ int flip;
+};
+
+/*
+ Slab_Transfer copies a slab from one array into a slab of another
+ array.
+
+ The src and dst variables describe the source and the destination
+ slab, respectively.
+
+ The variables gsh, lbnd, and lsh describe the shape and distribution
+ of the array containing the slab. They are equivalent to the
+ corresponding quantities in the cGH structure.
+
+ off, str, and len describe the location and shape of the slab within
+ the array. off is the offset, i.e. the location of the "first"
+ corner of the slab. str is the stride, i.e. the distance between to
+ grid points in the slab. The stride can be negative. len is the
+ length, i.e. the number of grid points making up the slab. len does
+ not include the grid points that are skipped if the stride is larger
+ than one.
+
+ xpose describes a possible permutation of the coordinate axes
+ between the slabs. It is source-axis = xpose[destination-axis].
+
+ flip describes a possible inversion of the coordinate axes (from the
+ point of view of the destination slab). It is source-axis =
+ xpose[flip[destination-axis]].
+
+ The corresponding lengths of the source and the destination slabs
+ must be equal, i.e. for all d: src.len[xpose[d]] = dst.len[d].
+
+ The slabs are copied according to
+
+ . dst[dst.off + I * dst.str] = src[src.off + J * src.str]
+
+ where the multi-indices I and J have the ranges specified by dst.len
+ and src.len, respectively, and I and J are related by the
+ transposition
+
+ . J = xpose[flip[I]]
+
+
+
+ Restrictions:
+
+ . dim >= 0
+
+ . gsh >= 0
+ . lbnd >= 0
+ . lsh >= 0
+ . lbnd + lsh <= gsh
+ . lbbox and ubbox must be booleans, i.e. either 0 or 1
+ . nghostzones >= 0
+
+ . len >= 0
+ . str != 0
+ . off >= 0
+ . off < gsh
+ . off + (len-1) * str >= 0
+ . off + (len-1) * str < gsh
+
+ . xpose must be a permutation of 0 ... dim-1
+ . flip must be a boolean, i.e. either 0 or 1
+
+ The source and the destination arrays may be the same.
+
+
+
+ To do:
+
+ . ignore ghostzones
+ . use size_t instead of int
+*/
+
+int Slab_Transfer (cGH * const cctkGH,
+ int const dim,
+ struct xferinfo const * const xferinfo,
+ int const srctype,
+ void const * const srcptr,
+ int const dsttype,
+ void * const dstptr);
+
+#endif /* defined SLAB_H */