| Commit message | Author | Age |
|
|
|
| |
Also add test parameter file.
|
|
|
|
| |
Introduce parameter CarpetLib::interpolate_from_buffer_zones that allows disabling interpolation from buffer zones.
|
|
|
|
| |
this is so that Carpet can see them when checking the ghost sizes
|
| |
|
|
|
|
|
|
|
|
|
| |
* rename controlling parameter to use_higher_order_restriction
* introduce parameter restriction_order_space to control which operator
is used (currently orders 1 and 3 are supported)
* include some comments on what the operator does
* change the way the restrictable region is computed in dh.cc/regrid to
be based on exterior.shrink(stencil_width) rather than the interior
|
| |
|
| |
|
| |
|
|
|
|
| |
Use IPM's timing regions in CarpetLib's timers (if built with IPM).
|
|
|
|
|
|
|
|
|
| |
Ignore-this: 1a389f0dd3f40a0c0edb3fdabd6e7d40
Padding grid variables means that e.g. a component of size 32x32x32 is
allocated as 33x33x33 instead, but only 32x32x32 of this storage is
used. This can improve cache performance considerably. This requires
corresponding changes to the cGH entries.
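A minimal sketch of how indexing into such a padded allocation could look. The names `PaddedBox`, `allocated_size`, and `index` are hypothetical and not part of CarpetLib, and the padding of one point per dimension is assumed only from the 32x32x32 versus 33x33x33 example above; the actual padding rule may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not CarpetLib's API: a 3D component whose storage
// is padded by `pad` points in each dimension.
struct PaddedBox {
  std::array<std::size_t, 3> used;    // logical extent, e.g. 32x32x32
  std::array<std::size_t, 3> padded;  // allocated extent, e.g. 33x33x33

  PaddedBox(std::array<std::size_t, 3> shape, std::size_t pad)
      : used(shape),
        padded{shape[0] + pad, shape[1] + pad, shape[2] + pad} {}

  // Number of elements that have to be allocated.
  std::size_t allocated_size() const {
    return padded[0] * padded[1] * padded[2];
  }

  // Linear index of point (i,j,k). The first index varies fastest, and
  // the strides are taken from the padded extent, not the used extent.
  std::size_t index(std::size_t i, std::size_t j, std::size_t k) const {
    assert(i < used[0] && j < used[1] && k < used[2]);
    return i + padded[0] * (j + padded[1] * k);
  }
};
```

Any code that computes strides from the logical extent would address the wrong element, which is presumably why the cGH entries have to change as well.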
|
|
|
|
| |
Ignore-this: 309b4dd613f4af2b84aa5d6743fdb6b3
|
|
|
|
|
|
|
| |
combine_recompose recomposes all grid functions at once. This increases
memory usage, but combines the communications and may thus also increase
the speed. The default behaviour is unchanged, recomposing all grid
functions sequentially.
|
|
|
|
|
|
| |
Correct a memory leak and simplify the code in the commstate class by using
C++ datatypes instead of new and delete.
Add many assert statements to catch potential problems.
|
|
|
|
|
|
|
|
|
|
| |
Use the timing routines from FFTW library. These contain
platform-specific code for many different platforms.
Remove parameter timestat_timer, since the timer is now chosen
automatically.
darcs-hash:20080114150519-dae7b-d979aa53a1470335b3ace353e862eef13670958d.gz
|
|
|
|
|
|
|
| |
Set the number of OpenMP threads via an external function call instead
of via a CarpetLib parameter.
darcs-hash:20080114150439-dae7b-a6a6a629162ca195411852823e1ece0a2071d771.gz
|
|
|
|
|
|
|
|
|
|
|
| |
Add #pragma omp statements for loops in reduction and prolongation
operators. Change loop control variables to signed types.
Add functions to determine the number of active threads.
Add a parameter to set the number of threads if desired.
darcs-hash:20070821185237-dae7b-56827b72a69b5fa1b3d1316379a0f155696b4cb2.gz
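A hedged sketch of what such a loop might look like. The function `restrict_by_injection` is a hypothetical 1D example, not one of CarpetLib's operators. The loop variable is a signed `int`, which older OpenMP versions required for parallel loops; without OpenMP support the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 1D vertex-centred restriction by injection:
// every second fine grid point is copied to the coarse grid.
std::vector<double> restrict_by_injection(const std::vector<double>& fine) {
  assert(!fine.empty());
  const int ncoarse = static_cast<int>((fine.size() + 1) / 2);
  std::vector<double> coarse(ncoarse);
  // Each iteration writes a distinct element, so the loop is safe to
  // distribute over threads.
#pragma omp parallel for
  for (int i = 0; i < ncoarse; ++i) {
    coarse[i] = fine[2 * i];
  }
  return coarse;
}
```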
|
|
|
|
| |
darcs-hash:20070419013528-dae7b-09f603af91ed987e74f3cb67d417b9745e4070b6.gz
|
|
|
|
|
|
|
|
|
|
|
| |
Remove some parameters which are not necessary:
CarpetLib::print_timestats
CarpetLib::timestat_disable
Allow the value -1 as well as 0 to disable output for timers and
memory statistics.
darcs-hash:20070312160854-dae7b-6c60bf0c64a5cac03da97595bb30bb2b47568165.gz
|
|
|
|
| |
darcs-hash:20070306011837-dae7b-600bdffcf60d6dff9386180543c3dc7ecf9f4028.gz
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce the number of bbox operations while setting up the
communication schedule. Cache some results. Introduce timers
throughout.
Introduce a parameter CarpetLib::check_bboxes, defaulting to "yes",
which can be used to disable the self-checks.
darcs-hash:20070304210744-dae7b-1a2756dc0aa2f30b2f1311a9475c2a35513f2cfc.gz
|
|
|
|
| |
darcs-hash:20070204212619-dae7b-f68201db57954ecac2ed7d77dcac9c43f267af64.gz
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement a second timer based on Intel's rdtsc instruction, which is
much faster and much more accurate than MPI_Wtime.
Place the timer classes into the CarpetLib namespace.
Create a TimerSet class. Make the Timer class automatically register
all timers with a singleton object, removing all global variables.
darcs-hash:20070203211128-dae7b-42765e79446eda6a2337ba22cd390869055c555a.gz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reorganise prolongation and restriction operators. This is a major
implementation change.
Most operators are now written as C++ templates instead of as Fortran
77 code. This simplifies the code, since C++ routines can be called
more easily, and they also have access to CarpetLib's high-level data
structures.
Previously, the operators combined temporal and spatial interpolation.
Now, time interpolation and space interpolation are handled
separately. This may be less efficient, but simplifies the code
significantly, since there are now N+M instead of N*M routines, for N
time interpolation and M space interpolation methods.
Remove the minmod prolongation operator, which was previously
disabled.
Add support for cell centering, using a method described by Simon
Hern, and suggested for Carpet by Ian Hawke.
darcs-hash:20070112205812-dae7b-5329795aa698e7bbc3671b1504134885dd830238.gz
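The N+M argument above can be illustrated with a small 1D sketch, assuming linear interpolation in both time and space. The function names and the `Field` alias are hypothetical and do not correspond to CarpetLib's actual operators.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Field = std::vector<double>;

// One of N time interpolation methods: linear in time between two levels.
Field interpolate_time_linear(const Field& u0, double t0,
                              const Field& u1, double t1, double t) {
  assert(u0.size() == u1.size() && t0 != t1);
  const double w1 = (t - t0) / (t1 - t0);
  const double w0 = 1.0 - w1;
  Field out(u0.size());
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = w0 * u0[i] + w1 * u1[i];
  }
  return out;
}

// One of M space interpolation methods: vertex-centred linear
// prolongation by a factor of two.
Field prolongate_space_linear(const Field& coarse) {
  assert(!coarse.empty());
  Field fine(2 * coarse.size() - 1);
  for (std::size_t i = 0; i < fine.size(); ++i) {
    fine[i] = (i % 2 == 0)
                  ? coarse[i / 2]
                  : 0.5 * (coarse[i / 2] + coarse[i / 2 + 1]);
  }
  return fine;
}

// The two steps are applied one after the other, so any time method can
// be paired with any space method without writing a combined routine.
Field prolongate(const Field& u0, double t0, const Field& u1, double t1,
                 double t) {
  return prolongate_space_linear(interpolate_time_linear(u0, t0, u1, t1, t));
}
```

The intermediate field returned by the time interpolation is the extra cost that the commit message alludes to when it says the separation may be less efficient.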
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new parameters:
BOOLEAN interleave_communications:
Try to interleave communications with each other; each processor
begins to communicate with its 'right neighbour' in rank, instead of
with the root processor
BOOLEAN vary_tags:
Use different tags for each communication
BOOLEAN barrier_between_stages:
Add a barrier between the communication stages (slows down, but may
make timing numbers easier to interpret)
BOOLEAN combine_sends:
Send data together and in order of processor ranks
BOOLEAN reduce_mpi_waitall:
Call MPI_Waitall only for requests that are not null
BOOLEAN use_mpi_send:
Use MPI_Send instead of MPI_Isend
BOOLEAN use_mpi_ssend:
Use MPI_Ssend instead of MPI_Isend
darcs-hash:20061206165333-dae7b-8ba40bd19fb1733336e60cb7e6bfa0ebfe0d546d.gz
|
|
|
|
|
|
|
|
| |
Write the CarpetLib timer output to files instead of to screen; the
output is lengthy, difficult to interpret, and output from all
processors is needed.
darcs-hash:20060911025609-dae7b-c1d812ae44dfdb3f8e8daae09f06a8ed3476e73f.gz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split recompose functions into two stages, regrid and recompose. The
first stage, regrid, changes the grid structure in the gh and dh
classes. The second stage, recompose, changes the values of the
actual grid functions, i.e., changes the gf<T> and data<T> objects.
The second stage has to be called individually for every refinement
level.
This is necessary since the boundary conditions need to be applied
after recomposing one refinement level, before the next fine
refinement level can be recomposed.
darcs-hash:20060904230433-dae7b-3ba1982460f57b34da11a6fbb6b4b524dc5b348f.gz
|
|
|
|
|
|
|
|
|
|
| |
Add timers for the new communication infrastructure.
Enhance the timers to also track the minimum and maximum time spent.
Add a parameter to output timing information to files.
darcs-hash:20060731152618-dae7b-1d049b2b37397610c14648078fd0ee92f252ca2a.gz
|
|
|
|
| |
darcs-hash:20060731152547-dae7b-c56f2acaf72d2d21d2e4bdee00691866568af6fc.gz
|
|
|
|
|
|
|
|
|
| |
Poison newly allocated memory if desired. This is potentially more
thorough than Carpet's poisoning, since it is applied to all allocated
memory. It is not applied after time level cycling, though, so it
cannot replace Carpet's poisoning.
darcs-hash:20060731152325-dae7b-d039ee958161690c9430e70a8051d400273b819e.gz
|
|
|
|
| |
darcs-hash:20060731151302-dae7b-555810bfd094f8acb70b3dc90c2775ab84a20ef7.gz
|
|
|
|
| |
darcs-hash:20060508154323-dae7b-30f14d75440c10774cd9f386bcfaca77fe3e704d.gz
|
|
|
|
| |
darcs-hash:20060508154248-dae7b-a4d6c2b35d559cf0e61adae9f580905fa7771235.gz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a parameter "omit_prolongation_points_when_restricting" that
controls whether to restrict to points that are used for boundary
prolongation. The default is "no", restoring Carpet's behaviour from
before the patch "CarpetLib: Do not restrict to points that are used
for boundary prolongation".
When set to yes, then there is still a logic error in the code on
multiple processors, leading to different restriction regions on
different numbers of processors.
darcs-hash:20060503235940-dae7b-4c124e68e4c2519c0f97d416e0a7fa3489c1441d.gz
|
|
|
|
| |
darcs-hash:20060208233203-dae7b-c3837264ceeca33579afa2bfcb45c8d10803ac0e.gz
|
|
|
|
|
|
|
| |
Add new parameter CarpetLib::memstat_file. If set, then memory
statistics are periodically written to this file.
darcs-hash:20051119201538-dae7b-88c8b8cd5b9d2643d1be6e682f2aa32e7a00ef2d.gz
|
|
|
|
|
|
|
|
| |
For each refinement level that is to be recomposed, check whether it
has the same structure as before, and if so, do nothing. This is
controlled by a new flag CarpetLib::fast_recomposing.
darcs-hash:20050811120347-891bb-f937c21ddeac7d909cae41d487e9fd74a5ce8cc8.gz
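A minimal sketch of the check described above, under the assumption that a level's structure can be compared component by component. `Region`, `Level`, and `recompose_level` are hypothetical names used for illustration only.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one component of a refinement level.
struct Region {
  std::array<int, 3> lower;
  std::array<int, 3> upper;
  int processor;
};

bool operator==(const Region& a, const Region& b) {
  return a.lower == b.lower && a.upper == b.upper &&
         a.processor == b.processor;
}

using Level = std::vector<Region>;

// Returns true if the level was actually recomposed, false if the
// requested structure was identical and the work was skipped.
bool recompose_level(Level& current, const Level& requested,
                     bool fast_recomposing) {
  if (fast_recomposing && current == requested) {
    return false;  // same structure as before: do nothing
  }
  current = requested;  // stand-in for reallocating and copying data
  assert(current == requested);
  return true;
}
```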
|
|
|
|
|
|
|
| |
These default settings are now believed to be sensible and safe for
everybody.
darcs-hash:20050808132929-891bb-48231878e0a5ea02312823f4b96cad1c79fdba9f.gz
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new parameter print_memstats_every. When non-zero, output
the current and total allocated amount of memory (per process).
Introduce a new parameter max_allowed_memory_MB. When more than that
amount should be allocated on the current processor, abort the run.
Only memory for grid variables counts; memory for administrative
overhead is ignored.
darcs-hash:20050727201851-891bb-c1ff9fc30ff949d576d500fbf70ad7fb5084836a.gz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
single components
The default communication scheme in Carpet (which does an individual send/recv
operation for each component) comes with two parameters for fine tuning:
CarpetLib::use_lightweight_buffers
CarpetLib::combine_recv_send
the defaults of which are set to use a well-tested but also slower
communication pattern (as it turned out during benchmark runs).
This patch cleans up the implementation of this communication scheme so that the
fastest communication pattern (combined posting of send/recv; use of lightweight
buffers) is now always used. The above parameters therefore became obsolete
and shouldn't be used anymore in parfiles.
darcs-hash:20050526114253-776a0-780933a1539a260d74da8b92522fa2f48c714964.gz
|
|
|
|
|
|
|
|
| |
Using CarpetLib::use_waitall = "yes" seems to improve Carpet performance
both for the standard and for the collective buffers communication scheme.
So I made it the default.
darcs-hash:20050411173355-776a0-a1046bde7c4ccb4eebc00765b4264701b012c8d8.gz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code to minimise the number of outstanding communication requests is
superseded by the collective buffers communication code. Therefore the
corresponding parameter has been deactivated (but not removed in order to keep
backwards compatibility with older checkpoints).
It is marked as deprecated in the param.ccl file and should not be used anymore
(use CarpetLib::use_collective_communication_buffers instead).
A level-2 warning to that effect is printed at startup if the parameter is
still set in a user's parfile.
darcs-hash:20050411155524-776a0-ed9919869cc1f2821ab8b2fa23b4abea203b72ed.gz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Collective buffers are used to gather all components' data on a processor
before it gets sent off to other processors in one go. This reduces the
number of outstanding MPI communications down to O(N-1) and thus improves
overall efficiency as benchmarks show.
Each processor allocates a pair of single send/recv buffers to communicate
with all other processors. For this the class (actually, the struct) comm_state
was extended by 3 more states:
state_get_buffer_sizes: accumulates the sizes for the send/recv buffers
state_fill_send_buffers: gathers all the data into the send buffers
state_empty_recv_buffers: copies the data from the recv buffer back into
the processor's components
Send/recv buffers are exchanged during state_fill_send_buffers and
state_empty_recv_buffers. The constructor for a comm_state struct now takes
an argument <datatype> which denotes the CCTK datatype to use for the
attached collective buffers. If a negative value is passed here then it falls
back to using the old send/recv/wait communication scheme. The datatype
argument has a default value of -1 to maintain backwards compatibility to
existing code (which therefore will keep using the old scheme).
The new communication scheme is chosen by setting the parameter
CarpetLib::use_collective_communication_buffers to "yes". It defaults to "no"
meaning that the old send/recv/wait scheme is still used.
So far all the comm_state objects in the higher-level routines in thorn Carpet
(restriction/prolongation, regridding, synchronization) have been enabled to
use collective communication buffers.
Other thorns (CarpetInterp, CarpetIO*, CarpetSlab) will follow in separate
commits.
darcs-hash:20050330152811-776a0-51f426887fea099d1a67b42bd79e4f786979ba91.gz
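The three states named above can be sketched as a small state machine. This is a single-process illustration under stated assumptions: the actual MPI exchange is replaced by a plain buffer copy, only one send/recv buffer pair is modelled, and `CommState`, `transfer`, and `exchange_all` are hypothetical names rather than the interface of the real comm_state struct.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class State {
  get_buffer_sizes,
  fill_send_buffers,
  empty_recv_buffers,
  done
};

struct CommState {
  State state = State::get_buffer_sizes;
  std::size_t size = 0;  // accumulated during get_buffer_sizes
  std::vector<double> sendbuf;
  std::vector<double> recvbuf;
  std::size_t recv_pos = 0;

  // Advance to the next state once all components have been visited.
  void step() {
    switch (state) {
      case State::get_buffer_sizes:
        sendbuf.reserve(size);
        state = State::fill_send_buffers;
        break;
      case State::fill_send_buffers:
        assert(sendbuf.size() == size);
        recvbuf = sendbuf;  // stand-in for one collective send/recv
        state = State::empty_recv_buffers;
        break;
      case State::empty_recv_buffers:
        state = State::done;
        break;
      case State::done:
        break;
    }
  }
};

// Called once per component in every state.
void transfer(CommState& cs, const std::vector<double>& src,
              std::vector<double>& dst) {
  switch (cs.state) {
    case State::get_buffer_sizes:
      cs.size += src.size();
      break;
    case State::fill_send_buffers:
      cs.sendbuf.insert(cs.sendbuf.end(), src.begin(), src.end());
      break;
    case State::empty_recv_buffers: {
      const auto first =
          cs.recvbuf.begin() + static_cast<std::ptrdiff_t>(cs.recv_pos);
      dst.assign(first, first + static_cast<std::ptrdiff_t>(src.size()));
      cs.recv_pos += src.size();
      break;
    }
    case State::done:
      break;
  }
}

// Driver: loop over states, and over all components within each state.
void exchange_all(const std::vector<std::vector<double>>& srcs,
                  std::vector<std::vector<double>>& dsts) {
  assert(srcs.size() == dsts.size());
  for (CommState cs; cs.state != State::done; cs.step()) {
    for (std::size_t c = 0; c < srcs.size(); ++c) {
      transfer(cs, srcs[c], dsts[c]);
    }
  }
}
```

Because every component is visited once per state, the data of all components travels in a single buffer rather than in one message per component.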
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch greatly reduces the number of outstanding MPI_Isend/MPI_Irecv
communication requests by moving the loop over comm_states (recv,send,wait)
from the outermost to the innermost.
This resolves problems with certain MPI implementations (specifically LAM,
MPICH-NCSA, and Mvapich over Infiniband) which potentially resulted in some
communication buffer overflow and caused the Cactus application to abort or
hang forever.
Preliminary benchmarks with BSSN_MoL show that the patch does not have a
negative impact on myrinet clusters (measured to 64 processors).
It even improves the Carpet performance on GigE clusters (measured up to 16
processors).
The order of the communication loops is controlled by the boolean parameter
CarpetRegrid::minimise_outstanding_communications
which defaults to "no" (preserving the old behaviour).
darcs-hash:20050311160040-3fd61-04d40ac79ef218252f9364a8d18796e9b270d295.gz
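The effect of the loop order on the number of outstanding requests can be illustrated with a small counting model. This sketch does not call MPI; it assumes one receive and one send request per component, and all names in it are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>

enum class Stage { recv, send, wait };

// Counts requests that have been posted but not yet waited for.
struct RequestCounter {
  std::size_t outstanding = 0;
  std::size_t peak = 0;

  void apply(Stage s) {
    if (s == Stage::wait) {
      assert(outstanding >= 2);
      outstanding -= 2;  // completes one component's recv and send
    } else {
      ++outstanding;     // posts one non-blocking request
      peak = std::max(peak, outstanding);
    }
  }
};

// Old order: the loop over stages is outermost.
std::size_t peak_stages_outermost(std::size_t ncomponents) {
  RequestCounter rc;
  for (Stage s : {Stage::recv, Stage::send, Stage::wait}) {
    for (std::size_t c = 0; c < ncomponents; ++c) rc.apply(s);
  }
  return rc.peak;
}

// New order: the loop over stages is innermost.
std::size_t peak_stages_innermost(std::size_t ncomponents) {
  RequestCounter rc;
  for (std::size_t c = 0; c < ncomponents; ++c) {
    for (Stage s : {Stage::recv, Stage::send, Stage::wait}) rc.apply(s);
  }
  return rc.peak;
}
```

In this model the peak grows linearly with the number of components in the old order and stays constant in the new one.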
|
|
|
|
|
|
|
|
|
|
|
| |
Lightweight communication buffers use essentially only a vector<T>
instead of a data<T> to transfer data between processors. This should
reduce the computational overhead.
Set the parameter "use_lightweight_buffers" to use this feature. This
feature is completely untested.
darcs-hash:20050102173524-891bb-6a3999cbd63e367c8520c175c8078374d294eaa8.gz
|
|
|
|
| |
darcs-hash:20050101162121-891bb-ac9d070faecc19f91b4b57389d3507bfc6c6e5ee.gz
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Collect more timing statistics in the data class. Print these
statistics to stdout when the Cactus parameter print_timestats is set.
Create a timer class "timestat". This is a timer that can be started,
stopped, and it prints the total time as well as some statistics.
For memory allocation statistics, count the number of objects as well
as the number of bytes.
darcs-hash:20041230212136-891bb-c14edfa7d539ae9b135eee76afadaad51fd0b098.gz
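A rough sketch of a timer with this behaviour, assuming `std::chrono::steady_clock` as the time source. The class name `TimeStat` and its members are illustrative and do not reproduce the interface of CarpetLib's timestat class.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <ostream>

class TimeStat {
  using Clock = std::chrono::steady_clock;

 public:
  void start() {
    assert(!running_);
    running_ = true;
    begin_ = Clock::now();
  }

  void stop() {
    assert(running_);
    running_ = false;
    record(std::chrono::duration<double>(Clock::now() - begin_).count());
  }

  // Add one measured interval, in seconds, to the statistics.
  void record(double seconds) {
    total_ += seconds;
    ++count_;
  }

  double total() const { return total_; }
  std::size_t count() const { return count_; }
  double mean() const { return count_ == 0 ? 0.0 : total_ / count_; }

  // Print the total time together with some simple statistics.
  void print(std::ostream& os) const {
    os << "total: " << total_ << " s, count: " << count_
       << ", mean: " << mean() << " s\n";
  }

 private:
  Clock::time_point begin_;
  bool running_ = false;
  double total_ = 0.0;
  std::size_t count_ = 0;
};
```

A caller would bracket the region of interest with `start()` and `stop()` and call `print(std::cout)` at the end of the run.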
|
|
|
|
|
|
|
|
|
| |
Introduce a parameter to post MPI_Irecv and MPI_Isend at the
same time.
Use two queues instead of one vector to store the MPI_Requests.
darcs-hash:20041208222541-891bb-c7c8994a0c41b6cfb37f6dc023bc1172238f3619.gz
|
|
|
|
| |
darcs-hash:20041208222503-891bb-ac6173fd6d238be4a3325e839ff83d84032f7184.gz
|
|
|
|
|
|
|
|
| |
Add a parameter CarpetLib::use_waitall that switches from using a
series of MPI_Wait statements to using a single MPI_Waitall
statement. This might improve performance on many processors.
darcs-hash:20041124235118-891bb-034efea054db236a187022b1858e4574da867fa3.gz
|
|
|
|
|
|
| |
Replace all CVS header tags with the standard "$Header:$".
darcs-hash:20040918132147-891bb-dea889bdd94a479ec412d14d08e9efca63e5c24d.gz
|