\documentclass{article}

% Use the Cactus ThornGuide style file
% (Automatically used from Cactus distribution, if you have a 
%  thorn without the Cactus Flesh download this from the Cactus
%  homepage at www.cactuscode.org)
\usepackage{../../../../doc/latex/cactus}

\begin{document}

\title{IOStreamedHDF5}
\author{Thomas Radke}
\date{$ $Date$ $}

\maketitle

% Do not delete next line
% START CACTUS THORNGUIDE

\begin{abstract}
Thorn {\bf IOStreamedHDF5} provides an I/O method to stream variables in HDF5
file format via live sockets to any connected clients.
It also implements checkpointing/recovery functionality using HDF5.
\end{abstract}
%
%
\section{Purpose}
%
Thorn {\bf IOStreamedHDF5} uses the standard I/O library HDF5 (Hierarchical Data Format
version 5, see {\tt http://hdf.ncsa.uiuc.edu/whatishdf5.html} for details)
to output any type of CCTK grid variables (grid scalars, grid functions, and
grid arrays of arbitrary dimension) in the HDF5 file format.\\

Output is done by invoking the {\tt IOStreamedHDF5} I/O method which thorn {\bf IOStreamedHDF5}
registers with the flesh's I/O interface at startup.\\

Data is streamed as serialized HDF5 files over sockets to any connected client.
Such data files can be used by appropriate programs for further postprocessing
(e.g.\ remote visualization).

Data is always written unchunked by processor 0, i.e.\ the chunks of a distributed
grid function or array are collected from all other processors and streamed
out as a single dataset. Parallel streaming from multiple processors is not
yet supported.


\section{{\bf IOStreamedHDF5} Parameters}

Parameters to control the {\tt IOStreamedHDF5} I/O method are:
\begin{itemize}
  \item {\tt IOStreamedHDF5::out\_every} (steerable)\\
        How often to do periodic {\tt IOStreamedHDF5} output. If this parameter
        is set in the parameter file, it will override the setting of the shared
        {\tt IO::out\_every} parameter. The output frequency can also be set
        for individual variables using the {\tt out\_every} option in an option
        string appended to the {\tt IOStreamedHDF5::out\_vars} parameter.
  \item {\tt IOStreamedHDF5::out\_vars} (steerable)\\
        The list of variables to output using the {\bf IOStreamedHDF5} I/O method.
        The variables must be given by their fully qualified variable or group
        name. The special keyword {\it all} requests {\tt IOStreamedHDF5} output for
        all variables. Multiple names must be separated by whitespace.\\
        An option string can be appended in curly braces to a group/variable
        name. Supported options are {\tt out\_every} (to set the output
        frequency for individual variables) and hyperslab options (see section
        \ref{IOStreamedHDF5_output_hyperslabs} for details).
\end{itemize}
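
As a minimal sketch of how these parameters interact, the following parameter
file fragment requests output of two variables at different frequencies. The
variable names are the ones used in the example of section
\ref{IOStreamedHDF5_output_hyperslabs}; the iteration counts are arbitrary
illustrative values.

\begin{verbatim}
  # fallback output frequency shared by all I/O methods
  IO::out_every             = 10
  # overrides IO::out_every for the IOStreamedHDF5 I/O method
  IOStreamedHDF5::out_every = 2
  # Grid::x is output every 2nd iteration, Wavetoy::phi only every 20th
  IOStreamedHDF5::out_vars  = "Grid::x Wavetoy::phi{ out_every = 20 }"
\end{verbatim}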


\section{Output of Hyperslab Data}
\label{IOStreamedHDF5_output_hyperslabs}

By default, thorn {\bf IOStreamedHDF5} outputs multidimensional Cactus variables
in full, which produces the maximum amount of output data. This can be changed for
individual variables by specifying a hyperslab as a subset of the data within
the N-dimensional volume. Such a subset (called a {\it hyperslab}) is generally
defined as an orthogonal region within the multidimensional dataset, with an
origin (lower left corner of the hyperslab), direction vectors (defining the
number of hyperslab dimensions and spanning the hyperslab within the
N-dimensional grid), an extent (the length of the hyperslab in each of its
dimensions), and an optional downsampling factor.
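
One way to read this definition, offered as an interpretation rather than as a
statement of the implementation (the symbols are introduced here for
illustration only): with origin $\vec{o}$, direction vectors
$\vec{d}_1, \dots, \vec{d}_M$ and downsampling factors $s_1, \dots, s_M$, the
hyperslab point with indices $(k_1, \dots, k_M)$ corresponds to the grid point
\[
  \vec{p}(k_1, \dots, k_M) = \vec{o} + \sum_{j=1}^{M} k_j \, s_j \, \vec{d}_j ,
  \qquad k_j = 0, 1, 2, \dots
\]
where the upper limit of each $k_j$ is determined by the extent. For instance,
with $\vec{o} = (4, 4, 4)$, $\vec{d}_1 = (0, 1, 0)$, $\vec{d}_2 = (0, 0, 1)$ and
$s_1 = s_2 = 2$ this gives $\vec{p} = (4,\, 4 + 2 k_1,\, 4 + 2 k_2)$, i.e.\ every
other grid point of a yz-plane located at $x$-index 4.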

Hyperslab parameters can be set for individual variables using an option string
appended to the variables' full names in the {\tt IOStreamedHDF5::out\_vars} parameter.

Here is an example which outputs two 3D grid functions, {\tt Grid::x} and {\tt
Wavetoy::phi}. The first is output with its full contents at every
5th iteration (overriding the {\tt IOStreamedHDF5::out\_every} parameter for this
variable), while a two-dimensional hyperslab is defined for the second grid function.
This hyperslab defines a subvolume to output, starting at an offset of 4 grid
points into the grid in each direction, spanning the yz-plane, with an extent of
10 and 20 grid points in the y- and z-direction, respectively. For this
hyperslab, only every other grid point will be output.

\begin{verbatim}
  IOStreamedHDF5::out_every = 1
  IOStreamedHDF5::out_vars  = "Grid::x{ out_every = 5 }
                               Wavetoy::phi{ origin     = {4 4 4}
                                             direction  = {0 1 0
                                                           0 0 1}
                                             extent     = {10 20}
                                             downsample = {2 2}   }"
\end{verbatim}

The hyperslab parameters which can be set in an option string are:
\begin{itemize}
  \item{\tt origin}\\
    This specifies the origin of the hyperslab. It must be given as an array
    of integer values with $N$ elements. Each value specifies the offset in
    grid points in this dimension into the N-dimensional volume of the grid
    variable.\\
    If the origin for a hyperslab is not given, it will default to 0.
  \item{\tt direction}\\
    The direction vectors specify both the directions in which the hyperslab
    should be spanned (each vector defines one direction of the hyperslab)
    and its dimensionality ($=$ the total number of direction vectors).
    The direction vectors must be given as a concatenated array of integer
    values. The direction vectors must be linearly independent, and none of
    them may be a null vector.\\
    If the direction vectors for a hyperslab are not given, the hyperslab
    dimensionality will default to $N$, with its directions parallel to the
    underlying grid.
  \item{\tt extent}\\
    This specifies the extent of the hyperslab in each of its dimensions as
    a number of grid points. It must be given as an array of integer values
    with $M$ elements ($M$ being the number of hyperslab dimensions).\\
    If the extent for a hyperslab is not given, it will default to the grid
    variable's extent. Note that if the origin is set to
    a non-zero value, you should also set the hyperslab extent, since otherwise
    the default extent could exceed the variable's grid extent.
  \item{\tt downsample}\\
    To select only every $n$-th grid point of the hyperslab, set
    the downsample option. It must be given as an array of integer values
    with $M$ elements ($M$ being the number of hyperslab dimensions).\\
    If the downsample option is not given, it will default to the settings
    of the general downsampling parameters {\tt IO::downsample\_[xyz]} as
    defined by thorn {\bf IOUtil}.
\end{itemize}
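
As a further illustration, a one-dimensional hyperslab (a line of grid points
along the x-direction) could be requested as sketched below. The origin and
extent are hypothetical values which assume a grid of at least 16 points in x
and 9 points in y and z. The {\tt downsample} option is omitted so that the
{\tt IO::downsample\_[xyz]} defaults apply.

\begin{verbatim}
  IOStreamedHDF5::out_vars = "Wavetoy::phi{ origin    = {0 8 8}
                                            direction = {1 0 0}
                                            extent    = {16}   }"
\end{verbatim}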


\section{Checkpointing \& Recovery}

Thorn {\bf IOStreamedHDF5} can also be used to create HDF5 checkpoints and
stream them to another Cactus simulation which recovers from such a checkpoint
at the same time.

Checkpoint routines are scheduled at several timebins so that you can save
the current state of your simulation after the initial data phase,
during evolution, or at termination. Checkpointing for thorn {\bf IOStreamedHDF5}
is enabled by setting the parameter {\tt IOStreamedHDF5::checkpoint = "yes"}.

A recovery routine is registered with thorn {\bf IOUtil} in order to restart
a new simulation from a given HDF5 checkpoint.

Checkpointing and recovery are controlled by the corresponding checkpoint/recovery
parameters of thorn {\bf IOUtil} (for a description of these parameters please
refer to the documentation of thorn {\bf IOUtil}).
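
A parameter file fragment for the checkpointing side might look as follows.
Only {\tt IOStreamedHDF5::checkpoint} is taken from the text above; the
{\tt IO::} parameter names and values are assumptions about thorn {\bf IOUtil}
and should be checked against its documentation.

\begin{verbatim}
  # enable checkpointing via the IOStreamedHDF5 thorn
  IOStreamedHDF5::checkpoint  = "yes"
  # assumed IOUtil parameters: checkpoint every 100 iterations
  # and once more at termination
  IO::checkpoint_every        = 100
  IO::checkpoint_on_terminate = "yes"
\end{verbatim}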


\section{Building A Cactus Configuration with {\bf IOStreamedHDF5}}
%
The Cactus distribution does not contain the HDF5 header files and library which
are used by thorn {\bf IOStreamedHDF5}. You therefore need to configure HDF5 as
an external software package via:
%
\begin{verbatim}
  make <configuration>-config HDF5=yes
                             [HDF5_DIR=<path to HDF5 package>]
\end{verbatim}
%
The configuration script will look in some default places for an installed
HDF5 package. If nothing is found this way, you can explicitly specify it with
the {\tt HDF5\_DIR} configure variable.

Note that thorn {\bf IOStreamedHDF5} uses the {\tt Stream Virtual File Driver}
of the HDF5 library as its low-level driver. This driver is not built into
an HDF5 configuration by default. The configure script of {\bf IOStreamedHDF5}
will warn you if your HDF5 configuration does not contain this driver and stop
the configuration process. Building an HDF5 library with the {\tt Stream} driver
is straightforward: configure it with the {\tt --enable-stream-vfd} option
and build/install as usual.
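
The complete sequence might look like the following sketch. The installation
path is a placeholder, and {\tt --prefix} is the usual autoconf option for
choosing it. The commands are run first in the HDF5 source directory and then
in the Cactus directory.

\begin{verbatim}
  # in the HDF5 source directory
  ./configure --enable-stream-vfd --prefix=<path to HDF5 package>
  make
  make install

  # in the Cactus directory
  make <configuration>-config HDF5=yes HDF5_DIR=<path to HDF5 package>
\end{verbatim}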

Thorn {\bf IOStreamedHDF5} inherits from {\bf IOUtil} and {\bf IOHDF5Util},
so you need to include these thorns in your thorn list to build a configuration
with {\bf IOStreamedHDF5}.
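
In the parameter file the thorns must then also be activated, for instance as
in the sketch below. The list shown is an assumption and has to be extended by
the driver and application thorns of the actual simulation.

\begin{verbatim}
  ActiveThorns = "IOUtil IOHDF5Util IOStreamedHDF5"
\end{verbatim}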

% Do not delete next line
% END CACTUS THORNGUIDE

\end{document}