CarpetInterp: optimise global interpolator communication scheme

CarpetInterp used to use CarpetLib's data class to exchange interpolation information (interpolation coordinates and a source map as inputs, interpolation results and status/return codes as outputs) between processors. This point-to-point communication has been replaced by explicit collective MPI operations which should now make global interpolations faster. For a general overview on the implementation see CarpetInterp's thorn documentation. darcs-hash:20050826115157-776a0-d910b51d7a26cef12e13408a79f11ed2826f5ed1.gz
author: Thomas Radke <tradke@aei.mpg.de> 2005-08-26 11:51:00 +0000
committer: Thomas Radke <tradke@aei.mpg.de> 2005-08-26 11:51:00 +0000
commit: ed0dc00bca80c9a29ee27208cc5ce36ff2709b92 (patch)
tree: deefeb13cdeaa9c6a60103b042f3480ee8e60160 /Carpet/CarpetInterp
parent: d21cbeb047f2816c252845fd3b2c1d497a684626 (diff)
2 files changed, 889 insertions, 1034 deletions
diff --git a/Carpet/CarpetInterp/doc/documentation.tex b/Carpet/CarpetInterp/doc/documentation.tex
index 52848ad40..44e000a13 100644
--- a/Carpet/CarpetInterp/doc/documentation.tex
+++ b/Carpet/CarpetInterp/doc/documentation.tex
@@ -75,14 +75,15 @@
 \begin{document}
 
 % The author of the documentation
-\author{Erik Schnetter \textless schnetter@uni-tuebingen.de\textgreater}
+\author{Erik Schnetter \textless schnetter@aei.mpg.de\textgreater\\
+        Thomas Radke \textless tradke@aei.mpg.de\textgreater}
 
 % The title of the document (not necessarily the name of the Thorn)
 \title{CarpetInterp}
 
 % the date your document was last changed, if your document is in CVS, 
 % please use:
-\date{April 29, 2003}
+\date{26 August 2005}
 
 \maketitle
 
@@ -95,6 +96,8 @@
 % Add an abstract for this thorn's documentation
 \begin{abstract}
 
+Thorn {\bf CarpetInterp} provides a parallel interpolator for Carpet.
+
 \end{abstract}
 
 % The following sections are suggestive only.
@@ -102,36 +105,84 @@
 
 \section{Introduction}
 
-\section{Physical System}
-
-\section{Numerical Implementation}
-
-\section{Using This Thorn}
-
-\subsection{Obtaining This Thorn}
-
-\subsection{Basic Usage}
-
-\subsection{Special Behaviour}
-
-\subsection{Interaction With Other Thorns}
-
-\subsection{Examples}
-
-\subsection{Support and Feedback}
-
-\section{History}
-
-\subsection{Thorn Source Code}
-
-\subsection{Thorn Documentation}
-
-\subsection{Acknowledgements}
-
-
-\begin{thebibliography}{9}
+{\bf CarpetInterp} works similar to {\bf PUGHInterp}, the interpolation
+thorn of the standard unigrid driver in Cactus.
+
+\begin{enumerate}
+  \item Firstly, the points to interpolate at (which can vary from processor
+        to processor) are mapped onto processors.
+        Each processor accumulates the number of coordinates it needs to
+        send to other processors (which own the corresponding points).
+
+  \item In a collective communication step, all processors exchange the
+        coordinates of the interpolation points which are not located locally.
+
+  \item After each processor received the coordinates of (local) points to
+        interpolate at, it maps them onto components (also counting the
+        number of points for each component).
+
+  \item Now all processors do -- in parallel -- the actual interpolation
+        by calling the local interpolator (via {\tt CCTK\_InterpLocalUnform()})
+        on each component.
+
+  \item Finally, the interpolation results are sent back to the requesting
+        processors in another collective communication step.
+\end{enumerate}
+
+
+\section{Using CarpetInterp}
+
+{\bf CarpetInterp} overloads the flesh interpolation C API routine
+\begin{verbatim}
+  int CCTK_InterpGridArrays
+                     (const cGH *GH,
+                      int N_dims,
+                      int local_interp_handle,
+                      int param_table_handle,
+                      int coord_system_handle,
+                      int N_interp_points,
+                        int interp_coords_type,
+                        const void *const interp_coords[],
+                      int N_input_arrays,
+                        const CCTK_INT input_array_indices[],
+                      int N_output_arrays,
+                        const CCTK_INT output_array_types[],
+                        void *const output_arrays[])
+\end{verbatim}
+which is described in detail in the Cactus Reference Manual.
+
+It also provides an equivalent aliased function
+\begin{verbatim}
+  CCTK_INT FUNCTION DriverInterpolate
+                     (CCTK_POINTER_TO_CONST IN cctkGH,
+                      CCTK_INT IN N_dims,
+                      CCTK_INT IN local_interp_handle,
+                      CCTK_INT IN param_table_handle,
+                      CCTK_INT IN coord_system_handle,
+                      CCTK_INT IN N_interp_points,
+                      CCTK_INT IN interp_coords_type,
+                      CCTK_POINTER_TO_CONST ARRAY IN interp_coords,
+                      CCTK_INT IN N_input_arrays,
+                      CCTK_INT ARRAY IN input_array_indices,
+                      CCTK_INT IN N_output_arrays,
+                      CCTK_INT ARRAY IN output_array_types,
+                      CCTK_POINTER ARRAY IN output_arrays)
+\end{verbatim}
+which can be called by symmetry thorns to implement symmetry interpolations
+(see the thorn documentation of {\bf CactusBase/SymBase} for details).
+
+
+\subsection{Limitations}
+
+The current implementation of {\bf CarpetInterp} supports only 3-dimensional
+grid variables as inputs ({\tt N\_dims $==$ 3}).
+
+Inperpolation coordinates and output arrays must be of type
+{\tt CCTK\_VARIABLE\_REAL}.
+
+A valid handle for a parameter table mus be passed in so that {\bf CarpetInterp}
+can store in it the overall status code of the local interpolator.
 
-\end{thebibliography}
 
 % Do not delete next line
 % END CACTUS THORNGUIDE
diff --git a/Carpet/CarpetInterp/src/interp.cc b/Carpet/CarpetInterp/src/interp.cc
index 5d27b9988..bdc17ffaa 100644
--- a/Carpet/CarpetInterp/src/interp.cc
+++ b/Carpet/CarpetInterp/src/interp.cc
@@ -23,225 +23,135 @@
 #include "interp.hh"
 
 
+//////////////////////////////////////////////////////////////////////////
+// For a general overview on the implementation
+// please see CarpetInterp's thorn documentation.
+//////////////////////////////////////////////////////////////////////////
+
 
 namespace CarpetInterp {
-  
+
   using namespace std;
   using namespace Carpet;
-  
-  
-  
-  typedef vect<int,dim> ivect;
-  typedef vect<CCTK_REAL,dim> rvect;
-  
-  typedef bbox<int,dim> ibbox;
-  
-  
-  
-#define ind_rc(m,rl,c) ind_rc_(m,rl,minrl,maxrl,c,maxncomps,vhh)
-  static inline int ind_rc_(int const m,
-                            int const rl, int const minrl, int const maxrl,
-                            int const c, int const maxncomps,
-                            vector<gh*> const hh)
-  {
-    assert (m>=0 && m<maps);
-    assert (rl>=minrl && rl<maxrl);
-    assert (minrl>=0 && maxrl<=hh.at(m)->reflevels());
-    assert (c>=0 && c<maxncomps);
-    assert (maxncomps>=0 && maxncomps<=hh.at(m)->components(rl));
-    int const ind = ((rl-minrl) * maps + m) * maxncomps + c;
-    assert (ind>=0 && ind < (maxrl-minrl) * maps * maxncomps);
-    return ind;
-  }
-  
-#define ind_prc(p,m,rl,c) \
-  ind_prc_(p,nprocs,m,rl,minrl,maxrl,c,maxncomps,cgh,vhh)
-  static inline int ind_prc_(int const p, int const nprocs,
-                             int const m,
-                             int const rl, int const minrl, int const maxrl,
-                             int const c, int const maxncomps,
-                             cGH const * const cgh,
-                             vector<gh*> const hh)
+
+
+// return a unique index for component c on reflevel rl, map m and processor p
+#define component_idx(p,m,rl,c)    \
+        component_idx_ (p, m, rl, c, minrl, maxrl, maxncomps)
+  static inline int component_idx_ (int const p,
+                                    int const m,
+                                    int const rl,
+                                    int const c,
+                                    int const minrl, int const maxrl,
+                                    int const maxncomps)
   {
-    assert (p>=0 && p<nprocs);
-    assert (nprocs==CCTK_nProcs(cgh));
-    assert (m>=0 && m<maps);
-    assert (rl>=minrl && rl<maxrl);
-    assert (minrl>=0 && maxrl<=hh.at(m)->reflevels());
-    assert (c>=0 && c<maxncomps);
-    assert (maxncomps>=0 && maxncomps<=hh.at(m)->components(rl));
-    int const ind
-      = ((p * (maxrl-minrl) + (rl-minrl)) * maps + m) * maxncomps + c;
-    assert (ind>=0 && ind < nprocs * (maxrl-minrl) * maps * maxncomps);
-    return ind;
+    assert (p  >= 0     and p  < dist::size());
+    assert (m  >= 0     and m  < maps);
+    assert (rl >= minrl and rl < maxrl);
+    assert (c  >= 0     and c  < maxncomps);
+    int const local_idx  = ((rl-minrl) * maps + m) * maxncomps + c;
+    int const global_idx = p * (maxrl-minrl) * maps * maxncomps + local_idx;
+    return global_idx;
   }
-  
-  
-  
-  static void
-  find_time_interpolation_order (cGH const * const cgh,
-                                 int const param_table_handle,
-                                 int const N_output_arrays,
-                                 int & prolongation_order_time,
-                                 CCTK_REAL & current_time,
-                                 vector<CCTK_INT> & time_deriv_order,
-                                 bool & have_time_derivs);
-  
-  static void
-  assign_interpolation_points_to_components (vector<int> & rlev,
-                                             vector<int> & home,
-                                             vector<int> & homecnts,
-                                             int const ml,
-                                             vector<CCTK_INT> const & source_map,
-                                             int const interp_coords_type_code,
-                                             void const * const interp_coords [],
-                                             vector<rvect> const & lower,
-                                             vector<rvect> const & upper,
-                                             vector<rvect> const & delta,
-                                             int const minrl,
-                                             int const maxrl,
-                                             int const maxncomps,
-                                             int const N_interp_points);
-  
-  static void
-  create_coordinate_patches (cGH const * const cgh,
-                             int const nprocs,
-                             int const minrl,
-                             int const maxrl,
-                             int const maxncomps,
-                             vector<int> const & allhomecnts,
-                             vector<data<CCTK_INT> *> & allmaps,
-                             vector<data<CCTK_REAL> *> & allcoords);
-  
-  static void
-  fill_local_coordinate_patches (cGH const * const cgh,
-                                 int const nprocs,
-                                 int const minrl,
-                                 int const maxrl,
-                                 int const maxncomps,
-                                 CCTK_INT const N_interp_points,
-                                 vector<int> const & rlev,
-                                 vector<CCTK_INT> const & source_map,
-                                 vector<int> const & home,
-                                 vector<int> const & homecnts,
-                                 int const myproc,
-                                 vector<int> const & allhomecnts,
-                                 vector<data<CCTK_INT> *> & allmaps,
-                                 vector<data<CCTK_REAL> *> & allcoords,
-                                 CCTK_POINTER_TO_CONST const interp_coords []);
-  
+
+
+  static int
+  extract_parameter_table_options (cGH const * const cctkGH,
+                                   int const param_table_handle,
+                                   int const N_interp_points,
+                                   int const N_input_arrays,
+                                   int const N_output_arrays,
+                                   bool& have_source_map,
+                                   bool& have_time_derivs,
+                                   int & prolongation_order_time,
+                                   CCTK_REAL & current_time,
+                                   vector<CCTK_INT> & source_map,
+                                   vector<CCTK_INT> & operand_indices,
+                                   vector<CCTK_INT> & time_deriv_order);
+
   static void
-  transfer_coordinate_patches (cGH const * const cgh,
-                               int const nprocs,
-                               int const minrl,
-                               int const maxrl,
-                               int const maxncomps,
-                               vector<data<CCTK_INT> *> & allmaps,
-                               vector<data<CCTK_REAL> *> & allcoords);
-  
+  map_points (cGH const * const cctkGH,
+              int const ml,
+              int const minrl,
+              int const maxrl,
+              int const maxncomps,
+              int const npoints,
+              vector<CCTK_INT> const& source_map,
+              void const * const coords_list[],
+              vector<CCTK_REAL> const& coords,
+              vector<int>& procs,
+              vector<int>& N_points_to,
+              vector<int>& rlev,
+              vector<int>& home,
+              vector<int>& homecnts);
+
   static void
-  create_output_patches (cGH const * const cgh,
-                         int const nprocs,
-                         int const minrl,
-                         int const maxrl,
-                         int const maxncomps,
-                         CCTK_INT const N_output_arrays,
-                         vector<int> const & allhomecnts,
-                         vector<data<CCTK_REAL> *> & allcoords,
-                         vector<data<CCTK_REAL> *> & alloutputs,
-                         vector<data<CCTK_INT> *> & allstatuses);
-  
-  static int
-  do_local_interpolation (cGH const * const cgh,
+  interpolate_components (cGH const * const cctkGH,
                           int const minrl,
                           int const maxrl,
                           int const maxncomps,
                           bool const want_global_mode,
                           int const prolongation_order_time,
                           int const N_dims,
-                          vector<data<CCTK_REAL> *> & allcoords,
-                          vector<data<CCTK_REAL> *> & alloutputs,
-                          vector<data<CCTK_INT> *> & allstatuses,
+                          vector<int> & homecnts,
+                          vector<CCTK_REAL*> & coords,
+                          vector<CCTK_REAL*> & outputs,
+                          CCTK_INT* const per_proc_statuses,
+                          CCTK_INT* const per_proc_retvals,
+                          vector<CCTK_INT> & operand_indices,
                           vector<CCTK_INT> & time_deriv_order,
-                          vector<int> & allhomecnts,
+                          bool const have_time_derivs,
                           CCTK_INT const local_interp_handle,
                           CCTK_INT const param_table_handle,
                           CCTK_REAL const current_time,
                           int const interp_coords_type_code,
-                          bool const have_time_derivs,
                           int const N_input_arrays,
                           int const N_output_arrays,
                           CCTK_INT const output_array_type_codes [],
                           CCTK_INT const input_array_variable_indices []);
-  
-  static void
-  transfer_output_patches_back (cGH const * const cgh,
-                                int const nprocs,
-                                int const minrl,
-                                int const maxrl,
-                                int const maxncomps,
-                                vector<data<CCTK_REAL> *> & alloutputs,
-                                vector<data<CCTK_INT> *> & allstatuses);
-  
+
   static void
-  read_out_local_output_patches (cGH const * const cgh,
-                                 int const nprocs,
-                                 int const minrl,
-                                 int const maxrl,
-                                 int const maxncomps,
-                                 int const myproc,
-                                 CCTK_INT const N_interp_points,
-                                 vector<int> const & rlev,
-                                 vector<CCTK_INT> const & source_map,
-                                 vector<int> const & home,
-                                 CCTK_INT const N_output_arrays,
-                                 CCTK_INT const interp_coords_type_code,
-                                 vector<data<CCTK_REAL> *> const & alloutputs,
-                                 CCTK_POINTER const output_arrays [],
-                                 vector<data<CCTK_INT> *> const & allstatuses,
-                                 vector<int> const & allhomecnts,
-                                 vector<int> const & homecnts,
-                                 CCTK_INT const param_table_handle);
-  
-  static int
-  interpolate_within_component (cGH const * const cgh,
+  interpolate_single_component (cGH const * const cctkGH,
                                 int const minrl,
                                 int const maxrl,
                                 int const maxncomps,
                                 int const N_dims,
-                                vector<data<CCTK_REAL> *> & allcoords,
-                                vector<data<CCTK_REAL> *> & alloutputs,
-                                vector<data<CCTK_INT> *> & allstatuses,
+                                int const npoints,
+                                CCTK_REAL const* const coords,
+                                CCTK_REAL* const outputs,
+                                CCTK_INT& overall_status,
+                                CCTK_INT& overall_retval,
+                                vector<CCTK_INT> & operand_indices,
                                 vector<CCTK_INT> & time_deriv_order,
-                                vector<int> & allhomecnts,
+                                vector<CCTK_INT> & interp_num_time_levels,
                                 CCTK_INT const local_interp_handle,
                                 CCTK_INT const param_table_handle,
-                                CCTK_REAL const current_time,
-                                int const interp_coords_type_code,
                                 int const num_tl,
                                 bool const need_time_interp,
+                                CCTK_REAL const current_time,
                                 int const N_input_arrays,
                                 int const N_output_arrays,
+                                int const interp_coords_type_code,
                                 CCTK_INT const output_array_type_codes [],
                                 CCTK_INT const input_array_variable_indices []);
-  
-  
-  
+
+
+
   void CarpetInterpStartup ()
   {
     CCTK_OverloadInterpGridArrays (InterpGridArrays);
   }
-  
-  
-  
-  int InterpGridArrays (cGH const * const cgh,
+
+
+
+  int InterpGridArrays (cGH const * const cctkGH,
                         int const N_dims,
                         int const local_interp_handle,
                         int const param_table_handle,
                         int const coord_system_handle,
                         int const N_interp_points,
                         int const interp_coords_type_code,
-                        void const * const interp_coords [],
+                        void const * const coords [],
                         int const N_input_arrays,
                         CCTK_INT const input_array_variable_indices [],
                         int const N_output_arrays,
@@ -250,16 +160,16 @@ namespace CarpetInterp {
   {
     if (CCTK_IsFunctionAliased ("SymmetryInterpolate")) {
       return SymmetryInterpolate
-        (cgh, N_dims,
+        (cctkGH, N_dims,
          local_interp_handle, param_table_handle, coord_system_handle,
-         N_interp_points, interp_coords_type_code, interp_coords,
+         N_interp_points, interp_coords_type_code, coords,
          N_input_arrays, input_array_variable_indices,
          N_output_arrays, output_array_type_codes, output_arrays);
     } else {
       return Carpet_DriverInterpolate
-        (cgh, N_dims,
+        (cctkGH, N_dims,
          local_interp_handle, param_table_handle, coord_system_handle,
-         N_interp_points, interp_coords_type_code, interp_coords,
+         N_interp_points, interp_coords_type_code, coords,
          N_input_arrays, input_array_variable_indices,
          N_output_arrays, output_array_type_codes, output_arrays);
     }
@@ -268,46 +178,55 @@ namespace CarpetInterp {
 
 
   extern "C" CCTK_INT
-  Carpet_DriverInterpolate (CCTK_POINTER_TO_CONST const cgh_,
+  Carpet_DriverInterpolate (CCTK_POINTER_TO_CONST const cctkGH_,
                             CCTK_INT const N_dims,
                             CCTK_INT const local_interp_handle,
                             CCTK_INT const param_table_handle,
                             CCTK_INT const coord_system_handle,
                             CCTK_INT const N_interp_points,
                             CCTK_INT const interp_coords_type_code,
-                            CCTK_POINTER_TO_CONST const interp_coords [],
+                            CCTK_POINTER_TO_CONST const coords_list [],
                             CCTK_INT const N_input_arrays,
                             CCTK_INT const input_array_variable_indices [],
                             CCTK_INT const N_output_arrays,
                             CCTK_INT const output_array_type_codes [],
                             CCTK_POINTER const output_arrays [])
   {
-    cGH const * const cgh = static_cast<cGH const *> (cgh_);
-    int ierr;
-    
-    assert (cgh);
+    cGH const * const cctkGH = static_cast<cGH const *> (cctkGH_);
+    assert (cctkGH);
     assert (N_dims == dim);
-    
-    
-    
+
+
+
     if (is_meta_mode()) {
-      CCTK_WARN (0, "It is not possible to interpolate in meta mode");
+      CCTK_WARN (CCTK_WARN_ABORT,
+                 "It is not possible to interpolate in meta mode");
     }
-    
+
     bool const want_global_mode = is_global_mode();
-    
-    
-    
-    // Get some information
-    MPI_Comm const comm = CarpetMPIComm ();
-    int const myproc = CCTK_MyProc (cgh);
-    int const nprocs = CCTK_nProcs (cgh);
-    assert (myproc>=0 && myproc<nprocs);
-    
+
+
     // Multiple convergence levels are not supported
     assert (mglevels == 1);
     int const ml = 0;
-    
+
+    assert (N_interp_points >= 0);
+    assert (coords_list);
+    for (int d=0; d<dim; ++d) {
+      assert (N_interp_points == 0 or coords_list[d]);
+    }
+
+    assert (interp_coords_type_code == CCTK_VARIABLE_REAL);
+
+    assert (N_output_arrays >= 0);
+    if (N_interp_points > 0) {
+      assert (output_arrays);
+      for (int j=0; j<N_output_arrays; ++j) {
+        assert (output_arrays[j]);
+      }
+    }
+
+
     // Find range of refinement levels
     assert (maps > 0);
     for (int m=1; m<maps; ++m) {
@@ -315,696 +234,647 @@ namespace CarpetInterp {
     }
     int const minrl = want_global_mode ? 0                      : reflevel;
     int const maxrl = want_global_mode ? vhh.at(0)->reflevels() : reflevel+1;
-    
-    // Find source maps
-    vector<CCTK_INT> source_map (N_interp_points);
-    int const iret = Util_TableGetIntArray
-      (param_table_handle, N_interp_points, &source_map.front(), "source_map");
-    if (iret == UTIL_ERROR_TABLE_NO_SUCH_KEY) {
-      // No explicit source map specified
-      if (Carpet::map != -1) {
-        // Interpolate from the current map
-        for (int n=0; n<N_interp_points; ++n) {
-          source_map.at(n) = Carpet::map;
-        }
-      } else if (maps == 1) {
-        // Interpolate from map 0 if this is the only one
-        // (for backwards compatibility)
-        for (int n=0; n<N_interp_points; ++n) {
-          source_map.at(n) = 0;
-        }
-      } else {
-        CCTK_WARN (1, "No source map specified");
-        return -1;
-      }
-    } else if (iret < 0) {
-      CCTK_WARN (1, "internal error");
-      return -1;
-    } else if (iret != N_interp_points) {
-      CCTK_WARN (1, "Source map array has wrong size");
-      return -1;
-    }
-    
-    // Check source maps
-    for (int n=0; n<N_interp_points; ++n) {
-      assert (source_map.at(n) >=0 and source_map.at(n) < maps);
-    }
-    
+
+    // Find maximum number of components over all levels and maps
     int maxncomps = 0;
     for (int rl=minrl; rl<maxrl; ++rl) {
       for (int m=0; m<maps; ++m) {
         maxncomps = max(maxncomps, vhh.at(m)->components(rl));
       }
     }
-    
-    
-    
-    // Find the time interpolation order
+
+    //////////////////////////////////////////////////////////////////////
+    // Extract parameter table options:
+    //   - source map
+    //   - output array operand indices
+    //   - time interpolation order
+    //////////////////////////////////////////////////////////////////////
+    vector<CCTK_INT> source_map (N_interp_points);
+    vector<CCTK_INT> operand_indices (N_output_arrays);
     vector<CCTK_INT> time_deriv_order (N_output_arrays);
-    bool have_time_derivs;
+    bool have_source_map, have_time_derivs;
     CCTK_REAL current_time;
     int prolongation_order_time;
-    
-    find_time_interpolation_order
-      (cgh,
+
+    int iret = extract_parameter_table_options
+      (cctkGH,
        param_table_handle,
-       N_output_arrays,
+       N_interp_points, N_input_arrays, N_output_arrays,
+       have_source_map, have_time_derivs,
        prolongation_order_time, current_time,
-       time_deriv_order, have_time_derivs);
-    
-    
-    
-    // Find out about the coordinates: origin and delta for the Carpet
-    // grid indices
-#if 0
-    const char * coord_system_name
-      = CCTK_CoordSystemName (coord_system_handle);
-    assert (coord_system_name);
-    rvect lower, upper, delta;
-    for (int d=0; d<dim; ++d) {
-      ierr = CCTK_CoordRange
-        (cgh, &lower[d], &upper[d], d+1, 0, coord_system_name);
-      assert (!ierr);
-      const ibbox & baseext = vhh.at(m)->baseextent;
-      delta[d]
-        = (upper[d] - lower[d]) / (baseext.upper()[d] - baseext.lower()[d]);
+       source_map, operand_indices, time_deriv_order);
+    if (iret < 0) {
+      return iret;
     }
-#else
-    vector<rvect> lower (maps);
-    vector<rvect> delta (maps);
-    vector<rvect> upper (maps);
-    for (int m=0; m<maps; ++m) {
-      lower.at(m) = rvect::ref(cgh->cctk_origin_space);
-      delta.at(m) = rvect::ref(cgh->cctk_delta_space) / maxspacereflevelfact;
-      upper.at(m) = lower.at(m) + delta.at(m) * (vhh.at(m)->baseextent.upper() - vhh.at(m)->baseextent.lower());
+
+    //////////////////////////////////////////////////////////////////////
+    // Map interpolation points to processors
+    //////////////////////////////////////////////////////////////////////
+    vector<int> N_points_to (dist::size());
+    vector<int> rlev (N_interp_points); // refinement level of point n
+    vector<int> home (N_interp_points); // component of point n
+    vector<int> dstprocs (N_interp_points); // which processor owns point n
+    vector<int> allhomecnts ((maxrl-minrl) * maps * maxncomps * dist::size());
+                                            // number of points in component c
+    vector<CCTK_REAL> coords_buffer (dim * N_interp_points);
+
+    // each point from coord_list is mapped onto the processor
+    // that owns it (dstprocs)
+    // also accumulate the number of points per processor (N_points_to)
+    map_points (cctkGH,
+                ml, minrl, maxrl, maxncomps, N_interp_points,
+                source_map,
+                coords_list, coords_buffer,
+                dstprocs, N_points_to,
+                rlev, home, allhomecnts);
+
+    //////////////////////////////////////////////////////////////////////
+    // Communicate the number of points each processor is going to communicate
+    //////////////////////////////////////////////////////////////////////
+
+    // N_points_from denotes the number of points
+    // that this processor is to receive from others
+    vector<int> N_points_from (dist::size());
+    MPI_Alltoall (&N_points_to[0],   1, dist::datatype (N_points_to[0]),
+                  &N_points_from[0], 1, dist::datatype (N_points_from[0]),
+                  dist::comm);
+
+    //////////////////////////////////////////////////////////////////////
+    // Communicate the interpolation coordinates
+    //////////////////////////////////////////////////////////////////////
+    vector<int> sendcnt (dist::size());
+    vector<int> recvcnt (dist::size());
+    vector<int> senddispl (dist::size());
+    vector<int> recvdispl (dist::size());
+
+    // set up counts and displacements for MPI_Alltoallv()
+    sendcnt[0] = dim * N_points_to[0];
+    recvcnt[0] = dim * N_points_from[0];
+    senddispl[0] = recvdispl[0] = 0;
+    int N_points_local = recvcnt[0];
+    for (int p = 1; p < dist::size(); p++) {
+      sendcnt[p] = dim * N_points_to[p];
+      recvcnt[p] = dim * N_points_from[p];
+      N_points_local += recvcnt[p];
+      senddispl[p] = senddispl[p-1] + sendcnt[p-1];
+      recvdispl[p] = recvdispl[p-1] + recvcnt[p-1];
     }
-#endif
-    
-    assert (N_interp_points >= 0);
-    assert (interp_coords);
-    for (int d=0; d<dim; ++d) {
-      assert (N_interp_points==0 || interp_coords[d]);
+    assert (N_points_local % dim == 0);
+    // N_points_local is the total number of points to receive
+    // and thus the total number of points to interpolate on this processor
+    N_points_local /= dim;
+
+    // set up the per-component coordinates
+    // as offset into the single communication send buffer
+    vector<CCTK_REAL*> coords (allhomecnts.size());
+    int offset = 0;
+    for (int c = 0; c < allhomecnts.size(); c++) {
+      coords[c] = &coords_buffer.front() + dim*offset;
+      offset += allhomecnts[c];
     }
-    
-    assert (N_output_arrays >= 0);
-    if (N_interp_points > 0) {
-      assert (output_arrays);
-      for (int j=0; j<N_output_arrays; ++j) {
-        assert (output_arrays[j]);
+    assert (offset == N_interp_points);
+
+    // copy the input coordinates into the communication send buffer
+    // also remember the position of each point in the original input arrays
+    vector<int> indices (N_interp_points);
+    {
+      // totalhomecnts is the accumulated number of points over all components
+      vector<int> totalhomecnts (allhomecnts.size());
+      for (int idx = 0; idx < allhomecnts.size() - 1; idx++) {
+        totalhomecnts[idx + 1] = totalhomecnts[idx] + allhomecnts[idx];
+      }
+
+      vector<int> tmpcnts (allhomecnts.size());
+      for (int n = 0; n < N_interp_points; n++) {
+        int const idx = component_idx (dstprocs[n], source_map[n], rlev[n],
+                                       home[n]);
+        for (int d = 0; d < dim; d++) {
+          coords[idx][d + dim*tmpcnts[idx]] =
+            static_cast<CCTK_REAL const *>(coords_list[d])[n];
+        }
+        indices[n] = totalhomecnts[idx] + tmpcnts[idx];
+        tmpcnts[idx]++;
       }
+      assert (tmpcnts == allhomecnts);
     }
-    
-    
-    
-    // Assign interpolation points to components
-    vector<int> rlev (N_interp_points); // refinement level of point n
-    vector<int> home (N_interp_points); // component of point n
-    vector<int> homecnts ((maxrl-minrl) * maps * maxncomps); // number of points in component c
-    
-    assign_interpolation_points_to_components
-      (rlev, home, homecnts,
-       ml,
-       source_map,
-       interp_coords_type_code, interp_coords,
-       lower, upper, delta,
-       minrl, maxrl,
-       maxncomps,
-       N_interp_points);
-    
-    // Communicate counts
-    vector<int> allhomecnts(nprocs * (maxrl-minrl) * maps * maxncomps);
-    MPI_Allgather
-      (&homecnts   .at(0), (maxrl-minrl) * maps * maxncomps, MPI_INT,
-       &allhomecnts.at(0), (maxrl-minrl) * maps * maxncomps, MPI_INT, comm);
-    
-    
-    
-    // Create coordinate patches
-    vector<data<CCTK_INT> *>  allmaps   (nprocs * (maxrl-minrl) * maps * maxncomps);
-    vector<data<CCTK_REAL> *> allcoords (nprocs * (maxrl-minrl) * maps * maxncomps);
-    
-    create_coordinate_patches
-      (cgh, nprocs,
-       minrl, maxrl,
-       maxncomps,
-       allhomecnts,
-       allmaps, allcoords);
-    
-    fill_local_coordinate_patches
-      (cgh, nprocs,
-       minrl, maxrl,
-       maxncomps,
-       N_interp_points,
-       rlev,
-       source_map,
-       home,
-       homecnts,
-       myproc,
-       allhomecnts,
-       allmaps,
-       allcoords,
-       interp_coords);
-    
-    transfer_coordinate_patches
-      (cgh, nprocs,
-       minrl, maxrl,
-       maxncomps,
-       allmaps, allcoords);
-    
-    
-    
-    // Create output patches
-    vector<data<CCTK_REAL> *> alloutputs
-      (nprocs * (maxrl-minrl) * maps * maxncomps, static_cast<data<CCTK_REAL> *> (0));
-    vector<data<CCTK_INT> *> allstatuses
-      (nprocs * (maxrl-minrl) * maps * maxncomps, static_cast<data<CCTK_INT> *> (0));
-    
-    create_output_patches
-      (cgh, nprocs,
-       minrl, maxrl,
-       maxncomps,
-       N_output_arrays,
-       allhomecnts,
-       allcoords,
-       alloutputs,
-       allstatuses);
-    
-    
-    
-    //
-    // Do the local interpolation
+
+    // send this processor's points to owning processors,
+    // receive other processors' points for interpolation on this processor
+    {
+      vector<CCTK_REAL> tmp (dim * N_points_local);
+      MPI_Datatype const datatype = dist::datatype (tmp[0]);
+      MPI_Alltoallv (&coords_buffer[0], &sendcnt[0], &senddispl[0], datatype,
+                     &tmp[0],           &recvcnt[0], &recvdispl[0], datatype,
+                     dist::comm);
+      coords_buffer.swap (tmp);
+    }
+
+    //////////////////////////////////////////////////////////////////////
+    // Communicate the source map (if necessary)
+    //////////////////////////////////////////////////////////////////////
+    if (have_source_map) {
+      vector<CCTK_INT> tmp (N_interp_points);
+      vector<int> tmpcnts (N_points_to.size());
+      for (int n = 0; n < N_interp_points; n++) {
+        int const p = dstprocs[n];
+        int const offset = senddispl[p] + tmpcnts[p];
+        tmp[offset] = source_map[n];
+        tmpcnts[p]++;
+      }
+      assert (tmpcnts == N_points_to);
+
+      sendcnt[0] = N_points_to[0];
+      recvcnt[0] = N_points_from[0];
+      senddispl[0] = recvdispl[0] = 0;
+      for (int p = 1; p < dist::size(); p++) {
+        sendcnt[p] = N_points_to[p];
+        recvcnt[p] = N_points_from[p];
+        senddispl[p] = senddispl[p-1] + sendcnt[p-1];
+        recvdispl[p] = recvdispl[p-1] + recvcnt[p-1];
+      }
+
+      source_map.resize (N_points_local);
+      MPI_Datatype const datatype = dist::datatype (tmp[0]);
+      MPI_Alltoallv (&tmp[0],        &sendcnt[0], &senddispl[0], datatype,
+                     &source_map[0], &recvcnt[0], &recvdispl[0], datatype,
+                     dist::comm);
+    } else {
+      // No explicit source map specified
+      if (Carpet::map != -1) {
+        // Interpolate from the current map
+        source_map.resize (N_points_local, Carpet::map);
+      } else {
+        // Interpolate from map 0 if this is the only one
+        // (for backwards compatibility)
+        assert (maps == 1);
+        source_map.resize (N_points_local, 0);
+      }
+    }
+
+    //////////////////////////////////////////////////////////////////////
+    // Map (local) interpolation points to components
+    //////////////////////////////////////////////////////////////////////
+    rlev.resize (N_points_local);          // reflevel of (local) point n
+    home.resize (N_points_local);          // component of (local) point n
+    vector<int> srcprocs (N_points_local); // which processor requested point n
+    vector<int> homecnts (allhomecnts.size());  // points per components
+
+    // remember from where point n came from
+    offset = 0;
+    for (int p = 0; p < N_points_from.size(); p++) {
+      for (int n = 0; n < N_points_from[p]; n++) {
+        srcprocs[offset++] = p;
+      }
+    }
+    assert (offset == N_points_local);
+
+    // map each point in coords_buffer onto its component (srcproc, rlev, home)
+    // also accumulate the number of points in each component (homecnts)
     //
-    int const overall_ierr = do_local_interpolation
-      (cgh,
+    // In order to be somewhat more efficient, we could also map
+    // all processors' (rlev, home) points onto the same component
+    // and call the local interpolator on that (thus saving potentially
+    // nprocs-1 calls).
+    // The reason why this hasn't been implemented here is because
+    // CCTK_InterpGridArrays() is supposed to return a value which is
+    // the minimum over all local interpolator invocations' return values
+    // for the points requested by this processor. This overall minimum
+    // isn't defined anymore if we call the local interpolator on a component
+    // now containing points from different processors.
+    // (One could argue though that the per-point status codes as returned
+    // by the local interpolator could be used to determine the global
+    // interpolator return value instead.)
+    map_points (cctkGH,
+                ml, minrl, maxrl, maxncomps, N_points_local,
+                source_map,
+                NULL, coords_buffer,
+                srcprocs, N_points_to,
+                rlev, home, homecnts);
+
+    // free some memory
+    source_map.clear();
+    srcprocs.clear();
+
+    // reorder the coordinates from <dim>-tupels into <dim> vectors
+    // as expected by CCTK_InterpLocalUniform()
+    {
+      int offset = 0;
+      vector<CCTK_REAL> tmp (coords_buffer.size());
+      for (int c = 0; c < homecnts.size(); c++) {
+        for (int n = 0; n < homecnts[c]; n++) {
+          for (int d = 0; d < dim; d++) {
+            tmp[d*homecnts[c] + n + offset] = coords_buffer[n*dim + d + offset];
+          }
+        }
+        offset += dim * homecnts[c];
+      }
+      assert (offset == dim * N_points_local);
+      coords_buffer.swap (tmp);
+    }
+
+    //////////////////////////////////////////////////////////////////////
+    // Do the local interpolation on individual components
+    //////////////////////////////////////////////////////////////////////
+    vector<CCTK_REAL>  outputs_buffer (N_points_local * N_output_arrays);
+    vector<CCTK_REAL*> outputs  (homecnts.size());
+    vector<CCTK_INT>   status_and_retval_buffer (2 * dist::size());
+    CCTK_INT* per_proc_statuses = &status_and_retval_buffer.front();
+    CCTK_INT* per_proc_retvals  = per_proc_statuses + dist::size();
+
+    // set up the per-component coordinates and output arrays as offsets
+    // into the single communication buffers
+    offset = 0;
+    for (int c = 0; c < homecnts.size(); c++) {
+      coords[c]   = &coords_buffer.front()  + dim*offset;
+      outputs[c]  = &outputs_buffer.front() + N_output_arrays*offset;
+      offset += homecnts[c];
+    }
+    assert (offset == N_points_local);
+
+    interpolate_components
+      (cctkGH,
        minrl, maxrl,
        maxncomps,
        want_global_mode,
        prolongation_order_time,
        N_dims,
-       allcoords, alloutputs, allstatuses,
-       time_deriv_order,
-       allhomecnts,
+       homecnts,
+       coords, outputs, per_proc_statuses, per_proc_retvals,
+       operand_indices, time_deriv_order, have_time_derivs,
        local_interp_handle, param_table_handle,
        current_time,
        interp_coords_type_code,
-       have_time_derivs,
        N_input_arrays, N_output_arrays,
        output_array_type_codes, input_array_variable_indices);
-    
-    
-    
-    // Free coordinate patches
-    for (int p=0; p<nprocs; ++p) {
-      for (int rl=minrl; rl<maxrl; ++rl) {
-        for (int m=0; m<maps; ++m) {
-          for (int c=0; c<vhh.at(m)->components(rl); ++c) {
-            delete allmaps.at(ind_prc(p,m,rl,c));
-            allmaps.at(ind_prc(p,m,rl,c)) = NULL;
-            delete allcoords.at(ind_prc(p,m,rl,c));
-            allcoords.at(ind_prc(p,m,rl,c)) = NULL;
+
+    // free some memory
+    coords_buffer.clear();
+    coords.clear();
+    homecnts.clear();
+    home.clear();
+    rlev.clear();
+
+    //////////////////////////////////////////////////////////////////////
+    // Communicate interpolation results
+    //////////////////////////////////////////////////////////////////////
+    sendcnt[0] = N_output_arrays * N_points_from[0];
+    recvcnt[0] = N_output_arrays * N_points_to[0];
+    senddispl[0] = recvdispl[0] = 0;
+    for (int p = 1; p < dist::size(); p++) {
+      sendcnt[p] = N_output_arrays * N_points_from[p];
+      recvcnt[p] = N_output_arrays * N_points_to[p];
+      senddispl[p] = senddispl[p-1] + sendcnt[p-1];
+      recvdispl[p] = recvdispl[p-1] + recvcnt[p-1];
+    }
+
+    {
+      vector<CCTK_REAL> tmp (N_output_arrays * N_interp_points);
+      MPI_Datatype const datatype = dist::datatype (tmp[0]);
+      MPI_Alltoallv (&outputs_buffer[0], &sendcnt[0], &senddispl[0], datatype,
+                     &tmp[0],            &recvcnt[0], &recvdispl[0], datatype,
+                     dist::comm);
+      outputs_buffer.swap (tmp);
+    }
+
+    //////////////////////////////////////////////////////////////////////
+    // Communicate interpolation status codes and return values
+    //////////////////////////////////////////////////////////////////////
+    {
+      // a processor's overall status and return code
+      // is defined as the minimum over all local interpolator status and
+      // return codes across all processors for that processor
+      vector<CCTK_INT> tmp (status_and_retval_buffer.size());
+      MPI_Allreduce (&status_and_retval_buffer[0], &tmp[0], tmp.size(),
+                     dist::datatype (tmp[0]), MPI_MIN, dist::comm);
+      status_and_retval_buffer.swap (tmp);
+      per_proc_statuses = &status_and_retval_buffer.front();
+      per_proc_retvals  = per_proc_statuses + dist::size();
+    }
+
+    //////////////////////////////////////////////////////////////////////
+    // Finally, sort the received outputs back into the caller's arrays
+    //////////////////////////////////////////////////////////////////////
+    {
+      // sorting is done with the help of the indices vector
+      //
+      // It would be nice if this could be done in a single step
+      // without copying data into an intermediate buffer.
+      vector<CCTK_REAL> tmp(N_interp_points);
+      for (int d = 0; d < N_output_arrays; d++) {
+        for (int c = 0, i = 0, offset = 0; c < allhomecnts.size(); c++) {
+          for (int n = 0; n < allhomecnts[c]; n++, i++) {
+            tmp.at(i) = outputs_buffer.at(allhomecnts[c]*d + offset + n);
           }
+          offset += N_output_arrays * allhomecnts[c];
         }
-      }
-    }
-    
-    
-    
-    transfer_output_patches_back
-      (cgh, nprocs,
-       minrl, maxrl,
-       maxncomps,
-       alloutputs,
-       allstatuses);
-    
-    
-    
-    read_out_local_output_patches
-      (cgh, nprocs,
-       minrl, maxrl,
-       maxncomps,
-       myproc,
-       N_interp_points,
-       rlev,
-       source_map,
-       home,
-       N_output_arrays,
-       interp_coords_type_code,
-       alloutputs,
-       output_arrays,
-       allstatuses,
-       allhomecnts,
-       homecnts,
-       param_table_handle);
-    
-    
-    
-    // Free local output patches
-    for (int p=0; p<nprocs; ++p) {
-      for (int rl=minrl; rl<maxrl; ++rl) {
-        for (int m=0; m<maps; ++m) {
-          for (int c=0; c<vhh.at(m)->components(rl); ++c) {
-            delete alloutputs.at(ind_prc(p,m,rl,c));
-            alloutputs.at(ind_prc(p,m,rl,c)) = NULL;
-            delete allstatuses.at(ind_prc(p,m,rl,c));
-            allstatuses.at(ind_prc(p,m,rl,c)) = NULL;
+        for (int c = 0, i = 0; c < allhomecnts.size(); c++) {
+          for (int n = 0; n < allhomecnts[c]; n++, i++) {
+            static_cast<CCTK_REAL *>(output_arrays[d])[i] = tmp.at(indices[i]);
           }
         }
       }
     }
-    
-    
-    
-    int global_overall_ierr;
-    MPI_Allreduce
-      (const_cast<int *> (& overall_ierr), & global_overall_ierr,
-       1, MPI_INT, MPI_MIN, comm);
-    
-    
-    
+
+    // set this processor's overall local interpolator status code
+    int ierr = Util_TableSetInt (param_table_handle,
+                                 per_proc_statuses[dist::rank()],
+                                 "local_interpolator_status");
+    assert (ierr >= 0);
+
     // Done.
-    return global_overall_ierr;
+    return per_proc_retvals[dist::rank()];
   }
-  
-  
-  
-  static void
-  find_time_interpolation_order (cGH const * const cgh,
-                                 int const param_table_handle,
-                                 int const N_output_arrays,
-                                 int & prolongation_order_time,
-                                 CCTK_REAL & current_time,
-                                 vector<CCTK_INT> & time_deriv_order,
-                                 bool & have_time_derivs)
+
+
+
+  static int
+  extract_parameter_table_options (cGH const* const cctkGH,
+                                   int const param_table_handle,
+                                   int const N_interp_points,
+                                   int const N_input_arrays,
+                                   int const N_output_arrays,
+                                   bool& have_source_map,
+                                   bool& have_time_derivs,
+                                   int& prolongation_order_time,
+                                   CCTK_REAL& current_time,
+                                   vector<CCTK_INT>& source_map,
+                                   vector<CCTK_INT>& operand_indices,
+                                   vector<CCTK_INT>& time_deriv_order)
   {
+    // Find source map
+    assert (source_map.size() == N_interp_points);
+    int iret = Util_TableGetIntArray (param_table_handle, N_interp_points,
+                                      &source_map.front(), "source_map");
+    have_source_map = not (iret == UTIL_ERROR_TABLE_NO_SUCH_KEY);
+    if (not have_source_map) {
+      // No explicit source map specified
+      if (Carpet::map != -1) {
+        // Interpolate from the current map
+        source_map.assign (source_map.size(), Carpet::map);
+      } else if (maps == 1) {
+        // Interpolate from map 0 if this is the only one
+        // (for backwards compatibility)
+        source_map.assign (source_map.size(), 0);
+      } else {
+        CCTK_WARN (CCTK_WARN_ALERT, "No source map specified");
+        return -1;
+      }
+    } else if (iret < 0) {
+      CCTK_WARN (CCTK_WARN_ALERT, "internal error");
+      return -1;
+    } else if (iret != N_interp_points) {
+      CCTK_WARN (CCTK_WARN_ALERT, "Source map array has wrong size");
+      return -1;
+    } else {
+      iret = Util_TableGetIntArray (param_table_handle, source_map.size(),
+                                    &source_map.front(), "source_map");
+      assert (iret == source_map.size());
+      // Check source map
+      for (int n = 0; n < source_map.size(); ++n) {
+        assert (source_map[n] >= 0 and source_map[n] < maps);
+      }
+    }
+
     // Find the time interpolation order
-    int ierr;
-    
     int partype;
-    void const * const parptr
-      = CCTK_ParameterGet ("prolongation_order_time", "Carpet", &partype);
+    void const* const parptr = CCTK_ParameterGet ("prolongation_order_time",
+                                                  "Carpet", &partype);
     assert (parptr);
     assert (partype == PARAMETER_INTEGER);
-    prolongation_order_time = * (CCTK_INT const *) parptr;
-    
-    current_time = cgh->cctk_time / cgh->cctk_delta_time;
-    
-    ierr = Util_TableGetIntArray
-      (param_table_handle, N_output_arrays,
-       &time_deriv_order.front(), "time_deriv_order");
-    if (ierr == UTIL_ERROR_TABLE_NO_SUCH_KEY) {
-      for (int n=0; n<N_output_arrays; ++n) {
-        time_deriv_order.at(n) = 0;
+    prolongation_order_time = *(CCTK_INT const*) parptr;
+
+    current_time = cctkGH->cctk_time / cctkGH->cctk_delta_time;
+
+    iret = Util_TableGetIntArray (param_table_handle, N_output_arrays,
+                                 &time_deriv_order.front(), "time_deriv_order");
+    have_time_derivs = not (iret == UTIL_ERROR_TABLE_NO_SUCH_KEY);
+    if (not have_time_derivs) {
+      time_deriv_order.assign (time_deriv_order.size(), 0);
+    } else {
+      assert (iret == N_output_arrays);
+    }
+
+    // Find output variable indices
+    iret = Util_TableGetIntArray (param_table_handle, N_output_arrays,
+                                  &operand_indices.front(), "operand_indices");
+    if (iret == UTIL_ERROR_TABLE_NO_SUCH_KEY) {
+      assert (N_output_arrays == N_input_arrays);
+      for (int m = 0; m < N_output_arrays; ++m) {
+        operand_indices[m] = m;
       }
-      have_time_derivs = false;
     } else {
-      assert (ierr == N_output_arrays);
-      have_time_derivs = true;
+      assert (iret == N_output_arrays);
     }
+
+    return 0;
   }
-  
-  
-  
+
+
   static void
-  assign_interpolation_points_to_components (vector<int> & rlev,
-                                             vector<int> & home,
-                                             vector<int> & homecnts,
-                                             int const ml,
-                                             vector<CCTK_INT> const & source_map,
-                                             int const interp_coords_type_code,
-                                             void const * const interp_coords [],
-                                             vector<rvect> const & lower,
-                                             vector<rvect> const & upper,
-                                             vector<rvect> const & delta,
-                                             int const minrl,
-                                             int const maxrl,
-                                             int const maxncomps,
-                                             int const N_interp_points)
+  map_points (cGH const* const cctkGH,
+              int const ml,
+              int const minrl,
+              int const maxrl,
+              int const maxncomps,
+              int const npoints,
+              vector<CCTK_INT> const& source_map,
+              void const* const coords_list[],
+              vector<CCTK_REAL> const& coords,
+              vector<int>& procs,
+              vector<int>& N_points_to,
+              vector<int>& rlev,
+              vector<int>& home,
+              vector<int>& homecnts)
   {
-    for (int n=0; n<N_interp_points; ++n) {
-      
-      int const m = source_map.at(n);
-      
+    bool const map_onto_processors = coords_list != NULL;
+
+    if (not map_onto_processors) {
+      assert (coords.size() == dim * npoints);
+    }
+    assert (procs.size() == npoints);
+    assert (N_points_to.size() == dist::size());
+    assert (rlev.size() == npoints);
+    assert (home.size() == npoints);
+    assert (source_map.size() == npoints);
+
+
+    // Find out about the coordinates: origin and delta for the Carpet
+    // grid indices
+#if 0
+    const char * coord_system_name
+      = CCTK_CoordSystemName (coord_system_handle);
+    assert (coord_system_name);
+    rvect lower, upper, delta;
+    for (int d=0; d<dim; ++d) {
+      ierr = CCTK_CoordRange
+        (cctkGH, &lower[d], &upper[d], d+1, 0, coord_system_name);
+      assert (!ierr);
+      const ibbox & baseext = vhh.at(m)->baseextent;
+      delta[d]
+        = (upper[d] - lower[d]) / (baseext.upper()[d] - baseext.lower()[d]);
+    }
+#else
+    vector<rvect> lower (maps);
+    vector<rvect> delta (maps);
+    vector<rvect> upper (maps);
+    for (int m=0; m<maps; ++m) {
+      lower.at(m) = rvect::ref(cctkGH->cctk_origin_space);
+      delta.at(m) = rvect::ref(cctkGH->cctk_delta_space) / maxspacereflevelfact;
+      upper.at(m) = lower.at(m) + delta.at(m) * (vhh.at(m)->baseextent.upper() - vhh.at(m)->baseextent.lower());
+    }
+#endif
+
+    // Assign interpolation points to processors/components
+    for (int n = 0; n < npoints; ++n) {
+
+      int const m = source_map[n];
+
       // Find the grid point closest to the interpolation point
-      assert (interp_coords_type_code == CCTK_VARIABLE_REAL);
       rvect pos;
-      for (int d=0; d<dim; ++d) {
-        pos[d] = static_cast<CCTK_REAL const *>(interp_coords[d])[n];
+      for (int d = 0; d < dim; ++d) {
+        if (map_onto_processors) {
+          pos[d] = static_cast<CCTK_REAL const *>(coords_list[d])[n];
+        } else {
+          pos[d] = coords[dim*n + d];
+        }
       }
-      
+
       // Find the component that this grid point belongs to
-      rlev.at(n) = -1;
-      home.at(n) = -1;
-      if (all(pos>=lower.at(m) && pos<=upper.at(m))) {
-        for (int rl=maxrl-1; rl>=minrl; --rl) {
-          
-          ivect const fact = maxspacereflevelfact / spacereffacts.at(rl) * ipow(mgfact, mglevel);
-          ivect const ipos = ivect(floor((pos - lower.at(m)) / (delta.at(m) * fact) + 0.5)) * fact;
-          assert (all(ipos % vhh.at(m)->bases().at(ml).at(rl).stride() == 0));
-          
+      int rl = -1, c = -1;
+      if (all (pos >= lower.at(m) and pos <= upper.at(m))) {
+        for (rl = maxrl-1; rl >= minrl; --rl) {
+
+          ivect const fact = maxspacereflevelfact / spacereffacts.at(rl) *
+                             ipow(mgfact, mglevel);
+          ivect const ipos = ivect(floor((pos - lower.at(m)) / (delta.at(m) *
+                             fact) + 0.5)) * fact;
+          assert (all (ipos % vhh.at(m)->bases().at(ml).at(rl).stride() == 0));
+
           // TODO: use something faster than a linear search
-          for (int c=0; c<vhh.at(m)->components(rl); ++c) {
+          for (c = 0; c < vhh.at(m)->components(rl); ++c) {
             if (vhh.at(m)->extents().at(ml).at(rl).at(c).contains(ipos)) {
-              rlev.at(n) = rl;
-              home.at(n) = c;
               goto found;
             }
           }
         }
       }
-      CCTK_VWarn (3, __LINE__, __FILE__, CCTK_THORNSTRING,
-                  "Interpolation point #%d at [%g,%g,%g] is not on any grid patch",
-                  n, pos[0], pos[1], pos[2]);
+      // Point could not be mapped onto any processor's domain.
       // Map the point (arbitrarily) to the first component of the
       // coarsest grid
-      rlev.at(n) = minrl;
-      home.at(n) = 0;
-    found:
-      assert (rlev.at(n)>=minrl && rlev.at(n)<maxrl);
-      assert (home.at(n)>=0 && home.at(n)<vhh.at(m)->components(rlev.at(n)));
-      ++ homecnts.at(ind_rc(m, rlev.at(n), home.at(n)));
-      
-    } // for n
-  }
-  
-  
-  
-  static void
-  create_coordinate_patches (cGH const * const cgh,
-                             int const nprocs,
-                             int const minrl,
-                             int const maxrl,
-                             int const maxncomps,
-                             vector<int> const & allhomecnts,
-                             vector<data<CCTK_INT> *> & allmaps,
-                             vector<data<CCTK_REAL> *> & allcoords)
-  {
-    // Create coordinate patches
-    for (int p=0; p<nprocs; ++p) {
-      for (int rl=minrl; rl<maxrl; ++rl) {
-        for (int m=0; m<maps; ++m) {
-          for (int c=0; c<vhh.at(m)->components(rl); ++c) {
-            {
-              ivect lo (0);
-              ivect up (1);
-              up[0] = allhomecnts.at(ind_prc(p,m,rl,c));
-              up[1] = 1;
-              ivect str (1);
-              ibbox extent (lo, up-str, str);
-              allmaps.at(ind_prc(p,m,rl,c)) = new data<CCTK_INT>;
-              allmaps.at(ind_prc(p,m,rl,c))->allocate (extent, p);
-            }
-            {
-              ivect lo (0);
-              ivect up (1);
-              up[0] = allhomecnts.at(ind_prc(p,m,rl,c));
-              up[1] = dim;
-              ivect str (1);
-              ibbox extent (lo, up-str, str);
-              allcoords.at(ind_prc(p,m,rl,c)) = new data<CCTK_REAL>;
-              allcoords.at(ind_prc(p,m,rl,c))->allocate (extent, p);
-            }
-          }
-        }
-      }
-    }
-  }
-  
-  
-  
-  static void
-  fill_local_coordinate_patches (cGH const * const cgh,
-                                 int const nprocs,
-                                 int const minrl,
-                                 int const maxrl,
-                                 int const maxncomps,
-                                 CCTK_INT const N_interp_points,
-                                 vector<int> const & rlev,
-                                 vector<CCTK_INT> const & source_map,
-                                 vector<int> const & home,
-                                 vector<int> const & homecnts,
-                                 int const myproc,
-                                 vector<int> const & allhomecnts,
-                                 vector<data<CCTK_INT> *> & allmaps,
-                                 vector<data<CCTK_REAL> *> & allcoords,
-                                 CCTK_POINTER_TO_CONST const interp_coords [])
-  {
-    // Fill in local coordinate patches
-    vector<int> tmpcnts ((maxrl-minrl) * maps * maxncomps);
-    for (int n=0; n<N_interp_points; ++n) {
-      int const rl = rlev.at(n);
-      int const m = source_map.at(n);
-      int const c = home.at(n);
-      assert (rl>=minrl && rl<maxrl);
-      assert (m>=0 && m<maps);
-      assert (c>=0 && c<vhh.at(m)->components(rl));
-      assert (tmpcnts.at(ind_rc(m,rl,c)) >= 0);
-      assert (tmpcnts.at(ind_rc(m,rl,c)) < homecnts.at(ind_rc(m,rl,c)));
-      assert (dim==3);
-      (*allmaps.at(ind_prc(myproc,m,rl,c)))[ivect(tmpcnts.at(ind_rc(m,rl,c)),0,0)]
-        = source_map.at(n);
-      for (int d=0; d<dim; ++d) {
-        (*allcoords.at(ind_prc(myproc,m,rl,c)))[ivect(tmpcnts.at(ind_rc(m,rl,c)),d,0)]
-          = static_cast<CCTK_REAL const *>(interp_coords[d])[n];
-      }
-      ++ tmpcnts.at(c + maxncomps * (m + maps * (rl-minrl)));
-    }
-    for (int rl=minrl; rl<maxrl; ++rl) {
-      for (int m=0; m<maps; ++m) {
-        for (int c=0; c<vhh.at(m)->components(rl); ++c) {
-          assert (tmpcnts.at(ind_rc(m,rl,c)) == homecnts.at(ind_rc(m,rl,c)));
-        }
+      if (map_onto_processors) {
+        CCTK_VWarn (CCTK_WARN_PICKY, __LINE__, __FILE__, CCTK_THORNSTRING,
+                    "Interpolation point #%d at [%g,%g,%g] is not on "
+                    "any grid patch", n, pos[0], pos[1], pos[2]);
+        rl = minrl;
+        c = 0;
+      } else {
+        assert (0 and "point is not mapped");
       }
-    }
-  }
-  
-  
-  
-  static void
-  transfer_coordinate_patches (cGH const * const cgh,
-                               int const nprocs,
-                               int const minrl,
-                               int const maxrl,
-                               int const maxncomps,
-                               vector<data<CCTK_INT> *> & allmaps,
-                               vector<data<CCTK_REAL> *> & allcoords)
-  {
-    // Transfer coordinate patches
-    for (comm_state state; !state.done(); state.step()) {
-      for (int p=0; p<nprocs; ++p) {
-        for (int rl=minrl; rl<maxrl; ++rl) {
-          for (int m=0; m<maps; ++m) {
-            for (int c=0; c<vhh.at(m)->components(rl); ++c) {
-              allmaps.at(ind_prc(p,m,rl,c))->change_processor
-                (state, vhh.at(m)->processors().at(rl).at(c));
-              allcoords.at(ind_prc(p,m,rl,c))->change_processor
-                (state, vhh.at(m)->processors().at(rl).at(c));
-            }
-          }
-        }
+    found:
+      assert (rl >= minrl and rl < maxrl);
+      assert (c >= 0 and c < vhh.at(m)->components(rl));
+
+      if (map_onto_processors) {
+        procs[n] = vhh.at(m)->proc(rl, c);
+        ++ N_points_to[procs[n]];
       }
-    }
+      rlev[n] = rl;
+      home[n] = c;
+      ++ homecnts.at(component_idx (procs[n], m, rl, c));
+
+    } // for n
   }
-  
-  
-  
+
+
   static void
-  create_output_patches (cGH const * const cgh,
-                         int const nprocs,
-                         int const minrl,
-                         int const maxrl,
-                         int const maxncomps,
-                         CCTK_INT const N_output_arrays,
-                         vector<int> const & allhomecnts,
-                         vector<data<CCTK_REAL> *> & allcoords,
-                         vector<data<CCTK_REAL> *> & alloutputs,
-                         vector<data<CCTK_INT> *> & allstatuses)
-  {
-    for (int p=0; p<nprocs; ++p) {
-      for (int rl=minrl; rl<maxrl; ++rl) {
-        for (int m=0; m<maps; ++m) {
-          for (int c=0; c<vhh.at(m)->components(rl); ++c) {
-            ivect const lo (0);
-            ivect up (1);
-            up[0] = allhomecnts.at(ind_prc(p,m,rl,c));
-            up[1] = N_output_arrays;
-            ivect const str (1);
-            ibbox const extent (lo, up-str, str);
-            alloutputs.at(ind_prc(p,m,rl,c)) = new data<CCTK_REAL>;
-            alloutputs.at(ind_prc(p,m,rl,c))->allocate
-              (extent, vhh.at(m)->processors().at(rl).at(c));
-            
-            ivect const slo (0);
-            ivect sup (1);
-            sup[0] = allhomecnts.at(ind_prc(p,m,rl,c));
-            ivect const sstr (1);
-            ibbox const sextent (lo, up-str, str);
-            allstatuses.at(ind_prc(p,m,rl,c)) = new data<CCTK_INT>;
-            allstatuses.at(ind_prc(p,m,rl,c))->allocate
-              (sextent, vhh.at(m)->processors().at(rl).at(c));
-          }
-        }
-      }
-    }
-  }
-  
-  
-  
-  static int
-  do_local_interpolation (cGH const * const cgh,
+  interpolate_components (cGH const* const cctkGH,
                           int const minrl,
                           int const maxrl,
                           int const maxncomps,
                           bool const want_global_mode,
                           int const prolongation_order_time,
                           int const N_dims,
-                          vector<data<CCTK_REAL> *> & allcoords,
-                          vector<data<CCTK_REAL> *> & alloutputs,
-                          vector<data<CCTK_INT> *> & allstatuses,
-                          vector<CCTK_INT> & time_deriv_order,
-                          vector<int> & allhomecnts,
+                          vector<int> & homecnts,
+                          vector<CCTK_REAL*>& coords,
+                          vector<CCTK_REAL*>& outputs,
+                          CCTK_INT* const per_proc_statuses,
+                          CCTK_INT* const per_proc_retvals,
+                          vector<CCTK_INT>& operand_indices,
+                          vector<CCTK_INT>& time_deriv_order,
+                          bool const have_time_derivs,
                           CCTK_INT const local_interp_handle,
                           CCTK_INT const param_table_handle,
                           CCTK_REAL const current_time,
                           int const interp_coords_type_code,
-                          bool const have_time_derivs,
                           int const N_input_arrays,
                           int const N_output_arrays,
                           CCTK_INT const output_array_type_codes [],
                           CCTK_INT const input_array_variable_indices [])
   {
-    // Do the local interpolation
-    int ierr;
-    int overall_ierr = 0;
-    
-    BEGIN_GLOBAL_MODE(cgh) {
+    // Find out about the number of time levels for interpolation
+    // for all input arrays
+    vector<CCTK_INT> interp_num_time_levels (N_input_arrays, 0);
+    for (int n = 0; n < N_input_arrays; n++) {
+      int const vi = input_array_variable_indices[n];
+      if (vi >= 0) {
+        int const gi = CCTK_GroupIndexFromVarI (vi);
+        assert (gi >= 0 and gi < CCTK_NumGroups ());
+        int const table = CCTK_GroupTagsTableI (gi);
+        assert (table >= 0);
+
+        Util_TableGetInt (table, &interp_num_time_levels[n],
+                          "InterpNumTimelevels");
+      }
+    }
+
+    BEGIN_GLOBAL_MODE(cctkGH) {
       for (int rl=minrl; rl<maxrl; ++rl) {
-        enter_level_mode (const_cast<cGH*>(cgh), rl);
-        
+        enter_level_mode (const_cast<cGH*>(cctkGH), rl);
+
         // Number of necessary time levels
-        CCTK_REAL const level_time = cgh->cctk_time / cgh->cctk_delta_time;
+        CCTK_REAL const level_time = cctkGH->cctk_time / cctkGH->cctk_delta_time;
         bool const need_time_interp
           = (have_time_derivs
-             || (fabs(current_time - level_time)
+             or (fabs(current_time - level_time)
                  > 1e-12 * (fabs(level_time) + fabs(current_time)
-                            + fabs(cgh->cctk_delta_time))));
+                            + fabs(cctkGH->cctk_delta_time))));
         assert (! (! want_global_mode
-                   && ! have_time_derivs
-                   && need_time_interp));
+                   and ! have_time_derivs
+                   and need_time_interp));
         int const num_tl = need_time_interp ? prolongation_order_time + 1 : 1;
-        
-        BEGIN_MAP_LOOP(cgh, CCTK_GF) {
-          BEGIN_LOCAL_COMPONENT_LOOP(cgh, CCTK_GF) {
-            
-            ierr = interpolate_within_component
-              (cgh,
-               minrl, maxrl, maxncomps,
-               N_dims,
-               allcoords, alloutputs, allstatuses,
-               time_deriv_order,
-               allhomecnts,
-               local_interp_handle, param_table_handle,
-               current_time,
-               interp_coords_type_code,
-               num_tl,
-               need_time_interp,
-               N_input_arrays, N_output_arrays,
-               output_array_type_codes,
-               input_array_variable_indices);
-            overall_ierr = min(overall_ierr, ierr);
-            
+
+        BEGIN_MAP_LOOP(cctkGH, CCTK_GF) {
+          BEGIN_LOCAL_COMPONENT_LOOP(cctkGH, CCTK_GF) {
+
+            for (int p = 0; p < dist::size(); p++) {
+              int const idx = component_idx (p, Carpet::map,reflevel,component);
+              if (homecnts[idx] > 0) {
+                interpolate_single_component
+                  (cctkGH,
+                   minrl, maxrl, maxncomps, N_dims,
+                   homecnts[idx], coords[idx], outputs[idx],
+                   per_proc_statuses[p], per_proc_retvals[p],
+                   operand_indices, time_deriv_order, interp_num_time_levels,
+                   local_interp_handle, param_table_handle,
+                   num_tl, need_time_interp, current_time,
+                   N_input_arrays, N_output_arrays,
+                   interp_coords_type_code, output_array_type_codes,
+                   input_array_variable_indices);
+              }
+            }
+
           } END_LOCAL_COMPONENT_LOOP;
         } END_MAP_LOOP;
-        
-        leave_level_mode (const_cast<cGH*>(cgh));
+
+        leave_level_mode (const_cast<cGH*>(cctkGH));
       } // for rl
-      
+
     } END_GLOBAL_MODE;
-    
-    return overall_ierr;
-  }
-  
-  
-  
-  static void
-  transfer_output_patches_back (cGH const * const cgh,
-                                int const nprocs,
-                                int const minrl,
-                                int const maxrl,
-                                int const maxncomps,
-                                vector<data<CCTK_REAL> *> & alloutputs,
-                                vector<data<CCTK_INT> *> & allstatuses)
-  {
-    // Transfer output patches back
-    for (comm_state state; !state.done(); state.step()) {
-      for (int p=0; p<nprocs; ++p) {
-        for (int rl=minrl; rl<maxrl; ++rl) {
-          for (int m=0; m<maps; ++m) {
-            for (int c=0; c<vhh.at(m)->components(rl); ++c) {
-              alloutputs.at(ind_prc(p,m,rl,c))->change_processor (state, p);
-              allstatuses.at(ind_prc(p,m,rl,c))->change_processor (state, p);
-            }
-          }
-        }
-      }
-    }
   }
-  
-  
-  
-  static void
-  read_out_local_output_patches (cGH const * const cgh,
-                                 int const nprocs,
-                                 int const minrl,
-                                 int const maxrl,
-                                 int const maxncomps,
-                                 int const myproc,
-                                 CCTK_INT const N_interp_points,
-                                 vector<int> const & rlev,
-                                 vector<CCTK_INT> const & source_map,
-                                 vector<int> const & home,
-                                 CCTK_INT const N_output_arrays,
-                                 CCTK_INT const interp_coords_type_code,
-                                 vector<data<CCTK_REAL> *> const & alloutputs,
-                                 CCTK_POINTER const output_arrays [],
-                                 vector<data<CCTK_INT> *> const & allstatuses,
-                                 vector<int> const & allhomecnts,
-                                 vector<int> const & homecnts,
-                                 CCTK_INT const param_table_handle)
-  {
-    // Read out local output patches
-    vector<int> tmpcnts ((maxrl-minrl) * maps * maxncomps);
-    CCTK_INT local_interpolator_status = 0;
-    for (int n=0; n<N_interp_points; ++n) {
-      int const rl = rlev.at(n);
-      int const m = source_map.at(n);
-      int const c = home.at(n);
-      for (int j=0; j<N_output_arrays; ++j) {
-        assert (interp_coords_type_code == CCTK_VARIABLE_REAL);
-        assert (alloutputs.at(ind_prc(myproc,m,rl,c))->proc() == dist::rank());
-        assert (output_arrays);
-        assert (output_arrays[j]);
-        static_cast<CCTK_REAL *>(output_arrays[j])[n]
-          = (*alloutputs.at(ind_prc(myproc,m,rl,c)))[ivect(tmpcnts.at(ind_rc(m,rl,c)),j,0)];
-      }
-      local_interpolator_status
-        = min (local_interpolator_status,
-               (*allstatuses.at(ind_prc(myproc,m,rl,c)))[ivect(tmpcnts.at(ind_rc(m,rl,c)),0,0)]);
-      ++ tmpcnts.at(ind_rc(m,rl,c));
-    }
-    for (int rl=minrl; rl<maxrl; ++rl) {
-      for (int m=0; m<maps; ++m) {
-        for (int c=0; c<vhh.at(m)->components(rl); ++c) {
-          assert (tmpcnts.at(ind_rc(m,rl,c)) == homecnts.at(ind_rc(m,rl,c)));
-        }
-      }
-    }
-    int ierr = Util_TableSetInt
-      (param_table_handle,
-       local_interpolator_status, "local_interpolator_status");
-    assert (ierr >= 0);
-  }
-  
- 
+
+  ///////////////////////////////////////////////////////////////////////////
+
+
   /** A list of time values corresoponding to the last few timelevels
    * on the given patch.
    *
@@ -1030,15 +900,15 @@ namespace CarpetInterp {
     }
   };
 
-  /** Represents interpolstion coefficients, or weights, for a given
+  /** Represents interpolation coefficients, or weights, for a given
    *  derivative order and set  of time levels.
-   *  
+   *
    * */
   class InterpolationWeights : private vector<CCTK_REAL>
   {
   public:
-    InterpolationWeights (CCTK_INT deriv_order, 
-		    const InterpolationTimes & times, CCTK_REAL current_time )
+    InterpolationWeights (CCTK_INT deriv_order,
+                    const InterpolationTimes & times, CCTK_REAL current_time )
       : vector<CCTK_REAL> (times.num_timelevels () )
     {
       CCTK_INT num_timelevels = times.num_timelevels ();
@@ -1059,7 +929,7 @@ namespace CarpetInterp {
           assert (0);
         }
         break;
-                
+
       case 1:
         switch (num_timelevels) {
         case 2:
@@ -1072,7 +942,7 @@ namespace CarpetInterp {
           assert (0);
         }
         break;
-        
+
       case 2:
         switch (num_timelevels) {
         case 3:
@@ -1082,7 +952,7 @@ namespace CarpetInterp {
         assert (0);
         }
         break;
-        
+
       default:
         assert (0);
       } // switch time_deriv_order
@@ -1141,248 +1011,182 @@ namespace CarpetInterp {
       at(2) = 2 / ((t[2] - t[0]) * (t[2] - t[1]));
     }
   };
-  
-  
-  static int
-  interpolate_within_component (cGH const * const cgh,
+
+
+  static void
+  interpolate_single_component (cGH const* const cctkGH,
                                 int const minrl,
                                 int const maxrl,
                                 int const maxncomps,
                                 int const N_dims,
-                                vector<data<CCTK_REAL> *> & allcoords,
-                                vector<data<CCTK_REAL> *> & alloutputs,
-                                vector<data<CCTK_INT> *> & allstatuses,
-                                vector<CCTK_INT> & time_deriv_order,
-                                vector<int> & allhomecnts,
+                                int const npoints,
+                                CCTK_REAL const* const coords,
+                                CCTK_REAL* const outputs,
+                                CCTK_INT& overall_status,
+                                CCTK_INT& overall_retval,
+                                vector<CCTK_INT>& operand_indices,
+                                vector<CCTK_INT>& time_deriv_order,
+                                vector<CCTK_INT>& interp_num_time_levels,
                                 CCTK_INT const local_interp_handle,
                                 CCTK_INT const param_table_handle,
-                                CCTK_REAL const current_time,
-                                int const interp_coords_type_code,
                                 int const num_tl,
                                 bool const need_time_interp,
+                                CCTK_REAL const current_time,
                                 int const N_input_arrays,
                                 int const N_output_arrays,
+                                int const interp_coords_type_code,
                                 CCTK_INT const output_array_type_codes [],
                                 CCTK_INT const input_array_variable_indices [])
   {
-    int ierr;
-    int overall_ierr = 0;
-    int const nprocs = CCTK_nProcs (cgh);
-    
     // Find out about the local geometry
     ivect lsh;
     rvect coord_origin, coord_delta;
     for (int d=0; d<dim; ++d) {
-      lsh[d] = cgh->cctk_lsh[d];
-      coord_delta[d] = cgh->cctk_delta_space[d] / cgh->cctk_levfac[d];
-      coord_origin[d] = cgh->cctk_origin_space[d] + (cgh->cctk_lbnd[d] + 1.0 * cgh->cctk_levoff[d] / cgh->cctk_levoffdenom[d]) * coord_delta[d];
+      lsh[d] = cctkGH->cctk_lsh[d];
+      coord_delta[d] = cctkGH->cctk_delta_space[d] / cctkGH->cctk_levfac[d];
+      coord_origin[d] = cctkGH->cctk_origin_space[d] + (cctkGH->cctk_lbnd[d] + 1.0 * cctkGH->cctk_levoff[d] / cctkGH->cctk_levoffdenom[d]) * coord_delta[d];
     }
-    
-    
-    
+
     // Find out about the grid functions
     vector<CCTK_INT> input_array_type_codes(N_input_arrays);
     vector<const void *> input_arrays(N_input_arrays);
     for (int n=0; n<N_input_arrays; ++n) {
       if (input_array_variable_indices[n] == -1) {
-        
+
         // Ignore this entry
         input_array_type_codes.at(n) = -1;
-        
+
       } else {
-        
+
         int const vi = input_array_variable_indices[n];
-        assert (vi>=0 && vi<CCTK_NumVars());
-        
+        assert (vi>=0 and vi<CCTK_NumVars());
+
         int const gi = CCTK_GroupIndexFromVarI (vi);
-        assert (gi>=0 && gi<CCTK_NumGroups());
-        
+        assert (gi>=0 and gi<CCTK_NumGroups());
+
         cGroup group;
-        ierr = CCTK_GroupData (gi, &group);
+        int ierr = CCTK_GroupData (gi, &group);
         assert (!ierr);
         assert (group.grouptype == CCTK_GF);
         assert (group.dim == dim);
         assert (group.disttype == CCTK_DISTRIB_DEFAULT);
         assert (group.stagtype == 0); // not staggered
-        
+
         input_array_type_codes.at(n) = group.vartype;
-        
+
       }
     } // for input arrays
-    
-    
-    
-    // Work on the data from all processors
-    for (int p=0; p<nprocs; ++p) {
-      assert (allcoords.at(ind_prc(p,Carpet::map,reflevel,component))->proc() == dist::rank());
-      assert (allhomecnts.at(ind_prc(p,Carpet::map,reflevel,component)) == allcoords.at(ind_prc(p,Carpet::map,reflevel,component))->shape()[0]);
-      assert (allhomecnts.at(ind_prc(p,Carpet::map,reflevel,component)) == alloutputs.at(ind_prc(p,Carpet::map,reflevel,component))->shape()[0]);
-      
-      int const npoints = allhomecnts.at(ind_prc(p,Carpet::map,reflevel,component));
-      
-      // Do the processor-local interpolation
-      const void * tmp_interp_coords[dim];
-      for (int d=0; d<dim; ++d) {
-        tmp_interp_coords[d] = &(*allcoords.at(ind_prc(p,Carpet::map,reflevel,component)))[ivect(0,d,0)];
-      }
-      
-      
-      
-      vector<vector<void *> > tmp_output_arrays (num_tl);
-      vector<void *> tmp_status_array (num_tl);
-      
-      for (int tl=0; tl<num_tl; ++tl) {
-        
-        for (int n=0; n<N_input_arrays; ++n) {
-          
-          int const vi = input_array_variable_indices[n];
-          assert (vi>=0 && vi<CCTK_NumVars());
-          
-          int const gi = CCTK_GroupIndexFromVarI (vi);
-          assert (gi>=0 && gi<CCTK_NumGroups());
-          
-          int const table = CCTK_GroupTagsTableI (gi);
-          assert (table>=0);
-          
-          int my_num_tl;
-          CCTK_INT interp_num_time_levels;
-          int const ilen = Util_TableGetInt
-            (table, &interp_num_time_levels, "InterpNumTimelevels");
-          if (ilen == UTIL_ERROR_TABLE_NO_SUCH_KEY) {
-            my_num_tl = num_tl;
-          } else if (ilen >= 0) {
-            my_num_tl = interp_num_time_levels;
-            assert (my_num_tl>0 && my_num_tl<=num_tl);
-          } else {
-            assert (0);
-          }
-          
-          if (vi == -1) {
-            input_arrays.at(n) = NULL;
-          } else {
-            // Do a dummy interpolation from a later timelevel
-            // if the desired timelevel does not exist
-            int const my_tl = tl >= my_num_tl ? 0 : tl;
-            assert (vi>=0 && vi<CCTK_NumVars());
+
+
+
+    // Do the processor-local interpolation
+    void const* tmp_coords[dim];
+    for (int d = 0; d < dim; ++d) {
+      tmp_coords[d] = coords + d*npoints;
+    }
+
+    vector<vector<void *> > tmp_output_arrays (num_tl);
+
+    for (int tl=0; tl<num_tl; ++tl) {
+
+      for (int n=0; n<N_input_arrays; ++n) {
+
+        int const vi = input_array_variable_indices[n];
+        assert (vi >= 0 and vi < CCTK_NumVars());
+
+        if (vi == -1) {
+          input_arrays.at(n) = NULL;
+        } else {
+          int const interp_num_tl = interp_num_time_levels[n] > 0 ?
+                                    interp_num_time_levels[n] : num_tl;
+
+          // Do a dummy interpolation from a later timelevel
+          // if the desired timelevel does not exist
+          int const my_tl = tl >= interp_num_tl ? 0 : tl;
 #if 0
-            input_arrays.at(n) = CCTK_VarDataPtrI (cgh, my_tl, vi);
+          input_arrays.at(n) = CCTK_VarDataPtrI (cctkGH, my_tl, vi);
 #else
-            int const vi0 = CCTK_FirstVarIndexI (gi);
-            input_arrays.at(n)
-              = ((*arrdata.at(gi).at(Carpet::map).data.at(vi-vi0))
-                 (my_tl, reflevel, component, mglevel)->storage());
+          int const gi = CCTK_GroupIndexFromVarI (vi);
+          int const vi0 = CCTK_FirstVarIndexI (gi);
+          input_arrays.at(n)
+            = ((*arrdata.at(gi).at(Carpet::map).data.at(vi-vi0))
+               (my_tl, reflevel, component, mglevel)->storage());
 #endif
-          }
-        } // for input arrays
-        
-        tmp_output_arrays.at(tl).resize (N_output_arrays);
-        for (int j=0; j<N_output_arrays; ++j) {
-          assert (output_array_type_codes[j] == CCTK_VARIABLE_REAL);
-          if (need_time_interp) {
-            tmp_output_arrays.at(tl).at(j) = new CCTK_REAL [npoints];
-          } else {
-            tmp_output_arrays.at(tl).at(j) = &(*alloutputs.at(ind_prc(p,Carpet::map,reflevel,component)))[ivect(0,j,0)];
-          }
         }
-        tmp_status_array.at(tl) = &(*allstatuses.at(ind_prc(p,Carpet::map,reflevel,component)))[ivect(0,0,0)];
-        
-        ierr = Util_TableSetPointer
-          (param_table_handle, tmp_status_array.at(tl), "per_point_status");
-        assert (ierr >= 0);
-        
-        ierr = CCTK_InterpLocalUniform
-          (N_dims, local_interp_handle, param_table_handle,
-           &coord_origin[0], &coord_delta[0],
-           npoints,
-           interp_coords_type_code, tmp_interp_coords,
-           N_input_arrays, &lsh[0],
-           &input_array_type_codes[0], &input_arrays[0],
-           N_output_arrays,
-           output_array_type_codes, &tmp_output_arrays.at(tl)[0]);
-        if (ierr) {
-          CCTK_VWarn (4, __LINE__, __FILE__, CCTK_THORNSTRING,
-                      "The local interpolator returned the error code %d", ierr);
+      } // for input arrays
+
+      tmp_output_arrays.at(tl).resize (N_output_arrays);
+      for (int j=0; j<N_output_arrays; ++j) {
+        assert (output_array_type_codes[j] == CCTK_VARIABLE_REAL);
+        if (need_time_interp) {
+          tmp_output_arrays.at(tl).at(j) = new CCTK_REAL [npoints];
+        } else {
+          tmp_output_arrays.at(tl).at(j) = outputs + j*npoints;
         }
-        overall_ierr = min(overall_ierr, ierr);
-        
-        ierr = Util_TableDeleteKey
-          (param_table_handle, "per_point_status");
-        assert (! ierr);
-        
-      } // for tl
-      
-      // Interpolate in time, if necessary
-      if (need_time_interp) {
-        
-        for (int j=0; j<N_output_arrays; ++j) {
-          
-          // Find output variable indices
-          vector<CCTK_INT> operand_indices (N_output_arrays);
-          ierr = Util_TableGetIntArray
-            (param_table_handle, N_output_arrays,
-             &operand_indices.front(), "operand_indices");
-          if (ierr == UTIL_ERROR_TABLE_NO_SUCH_KEY) {
-            assert (N_output_arrays == N_input_arrays);
-            for (int m=0; m<N_output_arrays; ++m) {
-              operand_indices.at(m) = m;
-            }
-          } else {
-            assert (ierr == N_output_arrays);
-          }
-          
-          // Find input array for this output array
-          int const n = operand_indices.at(j);
-          CCTK_INT const deriv_order = time_deriv_order.at(j);
-          
-          int const vi = input_array_variable_indices[n];
-          assert (vi>=0 && vi<CCTK_NumVars());
-          
-          int const gi = CCTK_GroupIndexFromVarI (vi);
-          assert (gi>=0 && gi<CCTK_NumGroups());
-          
-          int const table = CCTK_GroupTagsTableI (gi);
-          assert (table>=0);
-          
-          int my_num_tl;
-          CCTK_INT interp_num_time_levels;
-          int const ilen = Util_TableGetInt
-            (table, &interp_num_time_levels, "InterpNumTimelevels");
-          if (ilen == UTIL_ERROR_TABLE_NO_SUCH_KEY) {
-            my_num_tl = num_tl;
-          } else if (ilen >= 0) {
-            my_num_tl = interp_num_time_levels;
-            assert (my_num_tl>0 && my_num_tl<=num_tl);
-          } else {
-            assert (0);
-          }
-          
-          const InterpolationTimes times (my_num_tl);
-          const InterpolationWeights tfacs (deriv_order, times, current_time);
-
-          // Interpolate
-          assert (output_array_type_codes[j] == CCTK_VARIABLE_REAL);
-          for (int k=0; k<npoints; ++k) {
-            CCTK_REAL & dest = (*alloutputs.at(ind_prc(p,Carpet::map,reflevel,component)))[ivect(k,j,0)];
-            dest = 0;
-            for (int tl=0; tl<my_num_tl; ++tl) {
-              CCTK_REAL const src = ((CCTK_REAL const *)tmp_output_arrays.at(tl).at(j))[k];
-              dest += tfacs[tl] * src;
-            }
-          }
-          
-        } // for j
-        
-        for (int tl=0; tl<num_tl; ++tl) {
-          for (int j=0; j<N_output_arrays; ++j) {
-            delete [] (CCTK_REAL *)tmp_output_arrays.at(tl).at(j);
+      }
+
+      vector<CCTK_INT> per_point_status (npoints);
+      int ierr = Util_TableSetPointer
+        (param_table_handle, &per_point_status.front(), "per_point_status");
+      assert (ierr >= 0);
+
+      int retval = CCTK_InterpLocalUniform
+        (N_dims, local_interp_handle, param_table_handle,
+         &coord_origin[0], &coord_delta[0],
+         npoints,
+         interp_coords_type_code, tmp_coords,
+         N_input_arrays, &lsh[0],
+         &input_array_type_codes[0], &input_arrays[0],
+         N_output_arrays,
+         output_array_type_codes, &tmp_output_arrays.at(tl)[0]);
+      if (retval) {
+        CCTK_VWarn (CCTK_WARN_DEBUG, __LINE__, __FILE__, CCTK_THORNSTRING,
+                    "The local interpolator returned the error code %d",retval);
+      }
+
+      overall_retval = min (overall_retval, retval);
+      for (int n = 0; n < per_point_status.size(); n++) {
+        overall_status = min (overall_status, per_point_status[n]);
+      }
+      ierr = Util_TableDeleteKey (param_table_handle, "per_point_status");
+      assert (! ierr);
+
+    } // for tl
+
+    // Interpolate in time, if necessary
+    if (need_time_interp) {
+
+      for (int j=0; j<N_output_arrays; ++j) {
+
+        // Find input array for this output array
+        int const n = operand_indices.at(j);
+        CCTK_INT const deriv_order = time_deriv_order.at(j);
+
+        int const interp_num_tl = interp_num_time_levels.at(n) > 0 ?
+                                  interp_num_time_levels.at(n) : num_tl;
+        const InterpolationTimes times (interp_num_tl);
+        const InterpolationWeights tfacs (deriv_order, times, current_time);
+
+        // Interpolate
+        assert (output_array_type_codes[j] == CCTK_VARIABLE_REAL);
+        for (int k=0; k<npoints; ++k) {
+          CCTK_REAL & dest = outputs[k + j*npoints];
+          dest = 0;
+          for (int tl = 0; tl < interp_num_tl; ++tl) {
+            CCTK_REAL const src = ((CCTK_REAL const *)tmp_output_arrays.at(tl).at(j))[k];
+            dest += tfacs[tl] * src;
           }
         }
-        
-      } // if need_time_interp
-      
-    } // for processors
-    
-    return overall_ierr;
+
+      } // for j
+
+      for (int tl=0; tl<num_tl; ++tl) {
+        for (int j=0; j<N_output_arrays; ++j) {
+          delete [] (CCTK_REAL *)tmp_output_arrays.at(tl).at(j);
+        }
+      }
+
+    } // if need_time_interp
   }
-  
+
 } // namespace CarpetInterp
author	Thomas Radke <tradke@aei.mpg.de>	2005-08-26 11:51:00 +0000
committer	Thomas Radke <tradke@aei.mpg.de>	2005-08-26 11:51:00 +0000
commit	ed0dc00bca80c9a29ee27208cc5ce36ff2709b92 (patch)
tree	deefeb13cdeaa9c6a60103b042f3480ee8e60160 /Carpet/CarpetInterp
parent	d21cbeb047f2816c252845fd3b2c1d497a684626 (diff)