Local:Running Fluidity in parallel

From AMCGMedia

Fluidity is parallelized using MPI and standard domain decomposition techniques. This page assumes you have access to a cluster (or supercomputer) which already has MPI and queuing software installed.


New Options Parallel

Generating a Mesh

Meshes can be generated using any program that outputs triangle meshes, or outputs mesh formats that can be converted to triangle. In particular, there are conversion programs for:

  • Gmsh - gmsh2triangle in tools
  • GiD - gid2triangle, part of fltools
  • GEM - gem2triangle, part of fltools in the legacy branch

All surfaces on which boundary conditions are applied should have appropriate boundary IDs, and if multiple regions are used then mesh regions should be assigned appropriate region IDs. See respective mesh generator pages for instructions on how to do this.

Making Fluidity

To be able to use fldecomp, run the following inside your fluidity folder:

make fltools

The fldecomp binary will then be created in the bin/ directory.

Decomposing the Mesh

To decompose the triangle mesh, run:

 fldecomp -m triangle -n [PARTS] [BASENAME]

where PARTS is the number of partitions and BASENAME is the triangle mesh base name (excluding extensions). The "-m triangle" option instructs fldecomp to perform a triangle-to-triangle decomposition. This will create PARTS partition triangle meshes together with PARTS .halo files.
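Before submitting a job it can be worth checking that the decomposition produced one .halo file per partition. The following is a minimal sketch, not part of Fluidity: check_partitions is a hypothetical helper, and it assumes the partition files are named BASENAME_<rank>.halo with ranks running from 0 to PARTS-1.

```shell
# Hypothetical helper (not shipped with Fluidity): verify that one .halo
# file exists for every partition.
# Assumption: files are named BASENAME_<rank>.halo, ranks 0..PARTS-1.
check_partitions() {
    base=$1
    parts=$2
    rank=0
    while [ "$rank" -lt "$parts" ]; do
        if [ ! -f "${base}_${rank}.halo" ]; then
            echo "missing ${base}_${rank}.halo" >&2
            return 1
        fi
        rank=$((rank + 1))
    done
    echo "found $parts halo files for $base"
}

# Example (after running fldecomp with 16 partitions on "annulus"):
#   check_partitions annulus 16
```

The function returns a non-zero status at the first missing file, so it can be used as a guard in a job script.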

Parallel Specific Options

In the options file, select "triangle" under /geometry/mesh/from_file/format for the from_file mesh. For the mesh filename, enter the triangle mesh base name excluding all file and process number extensions.


  • Remember to select parallel compatible preconditioners in prognostic field solver options. eisenstat is not suitable for parallel simulations.
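A quick way to catch an unsuitable preconditioner before launching is to search the options file for it. This is only a sketch: warn_if_eisenstat is a hypothetical helper, and it performs a plain text search for the word "eisenstat" rather than parsing the options file, so it may also match comments or unused entries.

```shell
# Hypothetical helper: warn if an options file mentions the eisenstat
# preconditioner anywhere. This is a plain, case-insensitive text search;
# it does not interpret the structure of the options file.
warn_if_eisenstat() {
    options_file=$1
    if grep -qi "eisenstat" "$options_file"; then
        echo "WARNING: $options_file mentions eisenstat, which is not suitable for parallel runs" >&2
        return 1
    fi
    return 0
}

# Example:
#   warn_if_eisenstat tank.flml && echo "no eisenstat found"
```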

Launching Fluidity

To launch a new options parallel simulation, add "[OPTIONS FILE]" to the Fluidity command line, e.g.:

 mpiexec fluidity -v2 -l [OPTIONS FILE]

To run in a batch job on cx1, use something like the following PBS script:

 # Job name
 #PBS -N backward_step
 # Time required in hh:mm:ss
 #PBS -l walltime=48:00:00
 # Resource requirements
 # Always try to specify exactly what we need and the PBS scheduler
 # will make sure to get your job running as quick as possible. If
 # you ask for too much you could be waiting a while for sufficient
 # resources to become available. Experiment!
 #PBS -l select=2:ncpus=4
 # Files to contain standard output and standard error
 ##PBS -o stdout
 ##PBS -e stderr
 echo Working directory is $PBS_O_WORKDIR 
 rm -f stdout* stderr* core*
 module load intel-suite
 module load mpi
 module load vtk
 module load cgns
 module load petsc/2.3.3-p1-amcg
 module load python/2.4-fake
 # This will put the location of the temporary directory into a temporary file
 # in case you need to check its progress 
 mpiexec $PWD/fluidity -v2 -l $PWD/$PROJECT

This will run on 8 processors (2 * 4 from the line PBS -l select=2:ncpus=4).
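The processor count arithmetic can be checked with a small shell function. This is a sketch only: pbs_total_cpus is a hypothetical helper that multiplies the select and ncpus values found in a resource string, and it assumes the string contains exactly one select= and one ncpus= entry.

```shell
# Hypothetical helper: compute the total CPU count requested by a PBS
# resource string of the form select=N:ncpus=M[:other=...].
pbs_total_cpus() {
    spec=$1
    chunks=$(printf '%s\n' "$spec" | sed -n 's/.*select=\([0-9][0-9]*\).*/\1/p')
    ncpus=$(printf '%s\n' "$spec" | sed -n 's/.*ncpus=\([0-9][0-9]*\).*/\1/p')
    if [ -z "$chunks" ] || [ -z "$ncpus" ]; then
        echo "could not parse: $spec" >&2
        return 1
    fi
    echo $((chunks * ncpus))
}

pbs_total_cpus "select=2:ncpus=4"   # prints 8
```

The result should equal the number of partitions passed to fldecomp with -n.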

Visualising Data

The output from a parallel run is a set of .vtu and .pvtu files. A .vtu file is output for each processor and each timestep, e.g. backward_facing_step_3d_191_0.vtu is the .vtu file for step 191 from processor 0. A .pvtu file is generated for each timestep, e.g. backward_facing_step_3d_191.pvtu is for timestep 191.
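Given this naming scheme, the number of per-processor pieces written for a timestep can be counted from the shell, for example to confirm that every process wrote its output. The function below is a hypothetical helper, sketched on the assumption that all files sit in one directory and follow the BASENAME_STEP_PROCESSOR.vtu pattern described above.

```shell
# Hypothetical helper: count the per-processor .vtu files for one timestep.
# Assumption: files are named BASENAME_STEP_PROCESSOR.vtu in one directory.
count_vtu_pieces() {
    base=$1
    step=$2
    count=0
    for f in "${base}_${step}_"*.vtu; do
        # The glob stays literal when nothing matches, so test for existence.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Example:
#   count_vtu_pieces backward_facing_step_3d 191
```

For a run on 8 processors the count for each completed timestep should be 8.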

The best way to view the output is using paraview. Simply open the .pvtu file.

On cx1, you will need to load the paraview module: module load paraview/3.4.0

Limitations / Known Issues

  • Only one from_file mesh may be specified.

GEM Options Parallel

The first thing to do is run flgem/gem as usual. Second, you need to partition the output into subdomains, which will be the input files for parallel Fluidity. You should usually do this interactively, as it is serial and you generally only have to do it once, whereas you may want to rerun the parallel problem multiple times. It's usually not too expensive anyway. If you have a really large input, you can always do this elsewhere on a machine which has sufficient memory and scp the result back to the cluster you want to run on. So, for example:

flgem annulus.gem
fldecomp -n 16 annulus

This runs flgem on the project annulus and partitions it into 16 subdomains. To run this in parallel you must first modify your batch queue script to request 16 cores. If you are using PBS on the Imperial College Cluster you could do this using:

#PBS -l select=4:ncpus=4:mem=4950mb:icib=true

This also selects Infiniband on the IC cluster. If you're at IC you should use this option when running in parallel, otherwise you might get compute nodes which only have an Ethernet interconnect (i.e. it's going to be slow). Of course, it doesn't matter if your parallel problem is small enough to sit inside a single SMP node. Finally, add a line to actually run Fluidity in parallel:

mpiexec ./dfluidity annulus
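The number of cores requested from PBS should match the number of partitions given to fldecomp. As a sketch, the hypothetical helper below builds the select=...:ncpus=... part of the request from a partition count and a number of CPUs per node, rounding the node count up. Site-specific entries such as mem or icib are deliberately left out and would need to be appended by hand.

```shell
# Hypothetical helper: build the select/ncpus part of a PBS resource request
# for a given number of partitions and CPUs per node (node count rounded up).
pbs_select_for() {
    parts=$1
    ncpus=$2
    nodes=$(( (parts + ncpus - 1) / ncpus ))
    echo "select=${nodes}:ncpus=${ncpus}"
}

pbs_select_for 16 4   # prints select=4:ncpus=4
```

If the partition count is not a multiple of the CPUs per node, the rounded-up request contains more cores than partitions, so in that case it is simpler to pick a partition count that divides evenly.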

When you want to visualize dump files you have two options. Either use:

fl2vtu annulus 1

which will create a parallel VTK file which you can visualize using something like:

mayavi -d annulus_1.pvtu -m SurfaceMap

(note the .pvtu). Alternatively, you can merge the partitions to form a single file:

fl2vtu -m annulus 1
mayavi -d annulus_1.vtu -m SurfaceMap

Be warned, though: this is not a good idea if your dump files start getting very big, in which case you should consider using Paraview for parallel visualization.

Running in parallel on Ubuntu - OpenMPI

Example 1 - Straight run

gormo@rex:~$ cat host_file
mpirun -np 4 --hostfile host_file $PWD/dfluidity tank.flml

Example 2 - running inside gdb

xhost +rex
gormo@rex:~$ echo $DISPLAY
mpirun -np 4 -x DISPLAY=:0.0 xterm -e gdb $PWD/dfluidity-debug

This page was last modified on 7 November 2012, at 22:32.