\documentclass{article}

% Use the Cactus ThornGuide style file
% (Automatically used from Cactus distribution, if you have a 
%  thorn without the Cactus Flesh download this from the Cactus
%  homepage at www.cactuscode.org)
\usepackage{../../../../doc/latex/cactus}

\begin{document}

\title{IOFlexIO}
\author{Paul Walker}
\date{$ $Date$ $}

\maketitle

% Do not delete next line
% START CACTUS THORNGUIDE

\begin{abstract}
Thorn {\bf IOFlexIO} provides I/O methods to output variables in IEEEIO
file format. It also implements checkpointing/recovery functionality.
\end{abstract}


%
\section{Purpose}
%
Thorn {\bf IOFlexIO} uses John Shalf's FlexIO library (see {\tt
http://bach.ncsa.uiuc.edu/FlexIO/} for details) to output any type of grid
variables (grid scalars, grid functions, and arrays of arbitrary dimension)
in the IEEEIO file format.\\

The thorn registers two I/O methods with the flesh's I/O interface at startup:
%
\begin{itemize}
  \item method {\tt IOFlexIO} outputs all types of grid variables with
    arbitrary dimensions
  \item method {\tt IOFlexIO\_2D} outputs two-dimensional slices (xy-, xz-,
    and yz-slice) of three-dimensional grid functions and arrays
\end{itemize}

Data is written into files named {\tt "<varname>.ieee"} (for method
{\tt IOFlexIO}) and {\tt "<varname>\_2d\_<plane>.ieee"} (for method
{\tt IOFlexIO\_2D}).
Such datafiles can be used for further postprocessing (e.g. visualization)
or fed back into Cactus via the filereader capabilities of thorn IOUtil.\\[3ex]


\section{{\bf IOFlexIO} Parameters}

Parameters to control the {\tt IOFlexIO} I/O method are:
\begin{itemize}
  \item {\tt IOFlexIO::out\_every} (steerable)\\
        How often to do periodic {\tt IOFlexIO} output. If this parameter is set
        in the parameter file, it will override the setting of the shared
        {\tt IO::out\_every} parameter. The output frequency can also be set
        for individual variables using the {\tt out\_every} option in an option
        string appended to the {\tt IOFlexIO::out\_vars} parameter.
  \item {\tt IOFlexIO::out\_vars} (steerable)\\
        The list of variables to output using the {\bf IOFlexIO} I/O method.
        The variables must be given by their fully qualified variable or group
        name. The special keyword {\it all} requests {\tt IOFlexIO} output for
        all variables. Multiple names must be separated by whitespaces.\\
        An option string can be appended in curly braces to a group/variable
        name. Supported options are {\tt out\_every} (to set the output
        frequency for individual variables) and hyperslab options (see section
        \ref{IOFlexIO_output_hyperslabs} for details).
  \item {\tt IOFlexIO::out\_dir}\\
        The directory in which to place the {\tt IOFlexIO} output files.
        If the directory doesn't exist at startup it will be created.\\
        If this parameter is set to an empty string {\tt IOFlexIO} output will go
        to the standard output directory as specified in {\tt IO::out\_dir}.
\end{itemize}
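
As a minimal sketch of how these parameters fit together, the following
parameter file fragment requests {\tt IOFlexIO} output every 10 iterations for
one variable and every 2 iterations for another. The variable names, output
frequencies, and directory name are assumptions chosen for illustration only.

\begin{verbatim}
  # illustrative values only
  IOFlexIO::out_every = 10
  IOFlexIO::out_vars  = "Grid::x Wavetoy::phi{ out_every = 2 }"
  IOFlexIO::out_dir   = "flexio_data"
\end{verbatim}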


\section{Serial versus Parallel Output}

According to the output mode parameter settings ({\tt IO::out\_mode,
IO::out\_unchunked, IO::out\_proc\_every}) of thorn {\bf IOUtil}, thorn {\bf IOFlexIO}
will output distributed data either
\begin{itemize}
  \item in serial into a single unchunked file
\begin{verbatim}
  IO::out_mode      = "onefile"
  IO::out_unchunked = "yes"
\end{verbatim}
  \item in parallel, that is, into separate files containing chunks of the
        individual processors' patches of the distributed array
\begin{verbatim}
  IO::out_mode      = "proc | np"
\end{verbatim}
\end{itemize}
The default is to output data in parallel, in order to get maximum I/O
performance. If needed, you can recombine the resulting chunked datafiles
into a single unchunked file using the recombiner utility program.
See section \ref{IOFlexIO_utility_programs} for information on how to build the
recombiner program.

If you have many variables to recombine, you can use the following
Bourne shell commands to process them all at once.
This assumes that the chunked output files for each variable are located in a
subdirectory {\tt <varname>\_<vardim>d/}.
The recombined output file {\tt <varname>\_<vardim>d.ieee} is then placed into
the current working directory:

\begin{verbatim}
  for var in *_3d ;                                                          \
    do                                                                       \
    {                                                                        \
      if [ ! -r $var.ieee ] ; then                                           \
        ieee_recombiner $var/$var.file_0.ieee $var.ieee;                     \
      fi;                                                                    \
    };                                                                       \
    done
\end{verbatim}
\vspace*{3ex}


\section{Output of Hyperslab Data}
\label{IOFlexIO_output_hyperslabs}

By default, thorn {\bf IOFlexIO} outputs multidimensional Cactus variables with
their full contents resulting in maximum data output. This can be changed for
individual variables by specifying a hyperslab as a subset of the data within
the N-dimensional volume. Such a subset (called a {\it hyperslab}) is generally
defined as an orthogonal region into the multidimensional dataset, with an
origin (lower left corner of the hyperslab), direction vectors (defining the
number of hyperslab dimensions and spanning the hyperslab within the
N-dimensional grid), an extent (the length of the hyperslab in each of its
dimensions), and an optional downsampling factor.

Hyperslab parameters can be set for individual variables using an option string
appended to the variables' full names in the {\tt IOFlexIO::out\_vars} parameter.

Here is an example which outputs two 3D grid functions {\tt Grid::x} and {\tt
Wavetoy::phi}. While the first is output with its full contents at every
5th iteration (overriding the {\tt IOFlexIO::out\_every} parameter for this
variable), a two-dimensional hyperslab is defined for the second grid function.
This hyperslab defines a subset to output, starting at an offset of 4 grid
points into the grid in each direction, spanning the yz-plane, with an extent
of 10 and 20 grid points in the y- and z-direction respectively. For this
hyperslab, only every other grid point will be output.

\begin{verbatim}
  IOFlexIO::out_every = 1
  IOFlexIO::out_vars  = "Grid::x{ out_every = 5 }
                         Wavetoy::phi{ origin     = {4 4 4}
                                       direction  = {0 1 0
                                                     0 0 1}
                                       extent     = {10 20}
                                       downsample = {2 2}   }"
\end{verbatim}

The hyperslab parameters which can be set in an option string are:
\begin{itemize}
  \item{\tt origin}\\
    This specifies the origin of the hyperslab. It must be given as an array
    of integer values with $N$ elements. Each value specifies the offset in
    grid points in this dimension into the N-dimensional volume of the grid
    variable.\\
    If the origin for a hyperslab is not given, it will default to 0.
  \item{\tt direction}\\
    The direction vectors specify both the directions in which the hyperslab
    should be spanned (each vector defines one direction of the hyperslab)
    and its dimensionality ($=$ the total number of direction vectors).
    The direction vectors must be given as a concatenated array of integer
    values. The direction vectors must be linearly independent, and none of
    them may be a null vector.\\
    If the direction vectors for a hyperslab are not given, the hyperslab
    dimensions will default to $N$, and its directions are parallel to the
    underlying grid.
  \item{\tt extent}\\
    This specifies the extent of the hyperslab in each of its dimensions as
    a number of grid points. It must be given as an array of integer values
    with $M$ elements ($M$ being the number of hyperslab dimensions).\\
    If the extent for a hyperslab is not given, it will default to the grid
    variable's extent. Note that if the origin is set to
    a non-zero value, you should also set the hyperslab extent; otherwise
    the default extent could exceed the variable's grid extent.
  \item{\tt downsample}\\
    To select only every so many grid points from the hyperslab you can set
    the downsample option. It must be given as an array of integer values
    with $M$ elements ($M$ being the number of hyperslab dimensions).\\
    If the downsample option is not given, it will default to the settings
    of the general downsampling parameters {\tt IO::out\_downsample\_[xyz]} as
    defined by thorn {\bf IOUtil}.
\end{itemize}
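
To make the roles of these options concrete, the selection can be written as
an index mapping. This is only a sketch, under the assumption that the extent
is counted in grid points before downsampling; the symbols are introduced here
for illustration and are not part of the thorn's interface. With origin $o$,
direction vectors $d_1, \ldots, d_M$, extents $e_m$, and downsampling factors
$s_m$, the hyperslab element with index $(k_1, \ldots, k_M)$ would be taken
from the grid point
\[
  p(k_1, \ldots, k_M) = o + \sum_{m=1}^{M} k_m \, s_m \, d_m ,
  \qquad 0 \le k_m \, s_m < e_m .
\]
For the example above ($o = (4, 4, 4)$, directions along $y$ and $z$,
$e = (10, 20)$, $s = (2, 2)$) this would select the grid points
$(4,\; 4 + 2 k_1,\; 4 + 2 k_2)$ with $k_1 = 0, \ldots, 4$ and
$k_2 = 0, \ldots, 9$.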


\section{Checkpointing \& Recovery}

Thorn {\bf IOFlexIO} can also be used for creating IEEEIO checkpoint files and
recovering from such files later on.

Checkpoint routines are scheduled at several timebins so that you can save
the current state of your simulation after the initial data phase,
during evolution, or at termination. Checkpointing for thorn {\bf IOFlexIO}
is enabled by setting the parameter {\tt IOFlexIO::checkpoint = "yes"}.

A recovery routine is registered with thorn IOUtil in order to restart
a new simulation from a given {\bf IOFlexIO} checkpoint.
The very same recovery mechanism is used to implement a filereader
functionality to feed back data into Cactus.

Checkpointing and recovery are controlled by the corresponding
checkpoint/recovery parameters of thorn IOUtil (for a description of these
parameters please refer to the documentation of thorn IOUtil). The parameter {\tt
  IO::checkpoint\_every\_walltime\_hours} is not (yet) supported.
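
A parameter file fragment for periodic checkpointing and a later restart might
look like the following sketch. Apart from {\tt IOFlexIO::checkpoint}, the
parameter names are assumed from thorn IOUtil and should be checked against
its documentation; all values are illustrative.

\begin{verbatim}
  # enable IOFlexIO checkpointing (see above)
  IOFlexIO::checkpoint = "yes"

  # assumed IOUtil parameters: checkpoint every 100 iterations
  IO::checkpoint_every = 100
  IO::checkpoint_dir   = "checkpoints"

  # assumed IOUtil parameters: restart from the latest checkpoint found
  IO::recover          = "auto"
  IO::recover_dir      = "checkpoints"
\end{verbatim}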


\section{Importing External Data into Cactus with {\bf IOFlexIO}}

In order to import external data into Cactus (e.g. to initialize some variable)
you first need to convert this data into an IEEEIO datafile which then can be
processed by the registered recovery routine of thorn {\bf IOFlexIO}.\\

The following description explains the IEEEIO file layout of an unchunked
datafile which thorn {\bf IOFlexIO} expects in order to restore Cactus variables
from it properly. There is also a well-documented example C program provided
({\tt IOFlexIO/doc/CreateIOFlexIOdatafile.c}) which illustrates how to create
a datafile with IEEEIO file layout. This working example can be used as a
template for building your own data converter program.\\

\begin{enumerate}
  \item Actual data is stored as multidimensional datasets in an IEEEIO file.

  \item The type of your data as well as its dimensions are already
        carried by the dataset itself as metainformation. But this is not
        enough for {\bf IOFlexIO} to safely match it against a specific Cactus
        variable.
        For that reason, the variable's name, its groupname, its grouptype, the
        timelevel to restore, and the
        total number of timelevels must be attached to every dataset
        as attribute information.

  \item Finally, the recovery routine needs to know how the datafile to
        recover from was created:
        \begin{itemize}
          \item Does the file contain chunked or unchunked data?
          \item How many processors were used to produce the data?
          \item How many I/O processors were used to write the data?
          \item What Cactus version is this datafile compatible with?
        \end{itemize}
        Such information is attached as attributes to the very first dataset
        in the file. Since we assume unchunked data here
        the processor information isn't relevant -- unchunked data can
        be fed back into a Cactus simulation running on an arbitrary
        number of processors.\\
        The Cactus version ID must be present to indicate that grid variables
        with multiple timelevels should be recovered following the new
        timelevel scheme (as introduced in Cactus beta 10).
\end{enumerate}

The example C program goes through all of these steps and creates a datafile
{\tt x.ieee} in IEEEIO file layout which contains a single dataset named
{\tt "grid::x"}, with groupname {\tt "grid::coordinates"}, grouptype {\tt
CCTK\_GF} (thus identifying the variable as a grid function), the timelevel
to restore set to 0, and the total number of timelevels set to 1.\\
The global attributes are set to
{\tt "unchunked" $=$ "yes", nprocs $=$ 1,} and {\tt ioproc\_every $=$ 1}.\\

Once you've built and run the program you can easily verify that it worked
properly with
\begin{verbatim}
  ioinfo -showattrdata x.ieee
\end{verbatim}
which lists all objects in the datafile along with their values.
Since the single dataset in it only contains zeros
it would probably not make much sense to feed this datafile into Cactus for
initializing your x coordinate grid function :-)
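
For a datafile with meaningful contents, the filereader of thorn IOUtil could
then be pointed at it with a fragment like the following sketch. The parameter
names are assumed from thorn IOUtil and should be verified against its
documentation, in particular whether the file name is expected with or without
its extension.

\begin{verbatim}
  # assumed IOUtil filereader parameters, illustrative values
  IO::filereader_ID_dir   = "."
  IO::filereader_ID_files = "x"
  IO::filereader_ID_vars  = "grid::x"
\end{verbatim}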
%
%
\section{Utility programs provided by {\bf IOFlexIO}}
%
\label{IOFlexIO_utility_programs}

Thorn {\bf IOFlexIO} provides the following utility programs:
%
\begin{itemize}
  \item {\tt ieee\_recombiner}\\
    Recombines chunked IEEEIO datafile(s) into a single unchunked IEEEIO datafile.
  \item {\tt ieee\_merge}\\
    Merges the contents of its input files into a single output file.\\
    This might be useful for {\bf IOFlexIO} datafiles created by different runs.
  \item {\tt ieee\_extract}\\
    Extracts a hyperslab from all datasets of the input file.\\
    You can select a hyperslab by specifying an origin and an extent (e.g.
    128x128x128+32+32+32 selects a 128-cubed hyperslab with origin (32, 32, 32)).
  \item {\tt ieee\_convert\_from\_cactus3}\\
    Converts Cactus 3 IEEEIO datafiles into Cactus 4 format.\\
    It also takes a textfile as input with mapping information for variable
    names.
  \item {\tt ioinfo}\\
    Displays the contents of an IEEEIO datafile (number of datasets stored,
    datatype, rank, dimensions, and number of attributes for each dataset).
\end{itemize}
%
All utility programs are located in the {\tt src/util/} subdirectory of thorn
{\bf IOFlexIO}. To build the utilities just do a

\begin{verbatim}
  make <configuration>-utils
\end{verbatim}

in the Cactus toplevel directory. The executables will then be placed in the
{\tt exe/<configuration>/} subdirectory.

All utility programs are self-explanatory: if any of them is called without
arguments, it will print a short usage message.

% Do not delete next line
% END CACTUS THORNGUIDE

\end{document}