LAPLACE_RCC
Run FreeFem++ on the FSU RCC HPC Cluster


http://people.sc.fsu.edu/~jburkardt/freefem++/laplace_rcc/laplace_rcc.html


LAPLACE_RCC, FreeFem++ scripts which solve the Laplace equation on the Florida State University (FSU) Research Computing Center (RCC) High Performance Computing (HPC) cluster, with interactive/batch and serial/parallel options.

We assume you already know how to run FreeFem++, that you have a FreeFem++ input script, perhaps called "myprog.edp", and that you have an account on the FSU RCC HPC cluster.

You will need to use sftp to log into hpc-login.rcc.fsu.edu and transfer your file "myprog.edp" from your local machine to the HPC file system, using commands like:

        sftp hpc-login.rcc.fsu.edu
        put myprog.edp
        quit
      
(Learn about other commands, like "get", "lcd", and "cd", which allow you to retrieve files and to change your local and remote directories!)
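
For example, a session that retrieves results might look something like this (a sketch; the names "project", "results", and "myprog.out" are placeholders):

        sftp hpc-login.rcc.fsu.edu
        cd project
        lcd results
        get myprog.out
        quit

Here "cd project" changes the remote directory, "lcd results" changes the local directory, and "get myprog.out" copies a remote file back to your local machine.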

You will need to use ssh to log into hpc-login.rcc.fsu.edu, and because FreeFem++ typically opens a graphics window during execution, you should probably include the "-Y" option, which enables trusted X11 forwarding so the graphics can be displayed on your local screen.

        ssh -Y hpc-login.rcc.fsu.edu
        (type commands for your interactive session here)
        exit
      

Interactive Serial Mode

The HPC login node is shared by many users, and is not intended for intensive computational work. However, it is reasonable to try out small, short-running jobs on the login node before setting up a big job to run on the cluster. The simplest case is to use FreeFem++ just as you would on your own desktop machine. Here we run FreeFem++ on the input file "myprog.edp":

        /opt/hpc/gnu/bin/FreeFem++ myprog.edp
      
As usual, FreeFem++ will probably open a graphics window, which you can page through by hitting RETURN and close by hitting ESCAPE.
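
If you logged in without the "-Y" option, or simply do not want the plots, FreeFem++ accepts a "-nw" flag that suppresses the graphics windows (a small convenience; check the FreeFem++ documentation for the options your version supports):

        /opt/hpc/gnu/bin/FreeFem++ -nw myprog.edp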

Interactive Parallel Mode

The RCC also provides a version of FreeFem++ that has been compiled with the OpenMPI library, for parallel execution. Please note that the location and name of this executable are different from those of the serial version! You can try a small example, using a small number of processes, on the login node. You must use the module command to load the appropriate OpenMPI libraries and executables before running FreeFem++; here we use 2 processes:

        module load gnu-openmpi
        mpirun -np 2 /opt/hpc/gnu/openmpi/bin/FreeFem++-mpi myprog.edp
      
As usual, FreeFem++ will probably open a graphics window, which you can page through by hitting RETURN and close by hitting ESCAPE.
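
If you are not sure which modules are available, or which are currently loaded, the standard module commands will tell you:

        module avail gnu-openmpi      # list available modules whose names match "gnu-openmpi"
        module list                   # list the modules currently loaded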

Batch Serial Mode

First, note that there is little reason to run FreeFem++ in serial mode on the FSU RCC HPC cluster: it won't run any faster than it does on your own machine. However, we will consider how to do it, because it gives you an introduction to batch file usage without the extra complications associated with parallel programming.

Essentially, we need to create a shell script that begins with some commands for the SLURM queue manager, followed by the commands we would type in the interactive serial mode. The script might look like this:

#!/bin/bash
#
#SBATCH -N 1          # Request access to 1 node
#SBATCH -c 1          # Request one core on that node
#SBATCH -J myprog     # Call this job "myprog"
#SBATCH -p backfill   # Run the job in the backfill queue
#SBATCH -t 00:05:00   # Use no more than 5 minutes
#
/opt/hpc/gnu/bin/FreeFem++ myprog.edp
      

Assuming these commands are stored in a file called myprog.sh, you submit the job to the queue with the command

        sbatch myprog.sh
      
The SLURM scheduler immediately assigns your job an ID number and reports it:
        Submitted batch job 909856
      
This number, 909856, identifies the job, and is useful for tracking or cancelling it. In particular, output from your program will be returned to you in a file called slurm-909856.out (although, if you are picky, you can ask SLURM to use a different naming convention).
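
For example (a sketch; "%j" is SLURM's placeholder for the job ID), adding an output directive to the batch script changes the name of that file:

        #SBATCH -o myprog_%j.out      # write output to myprog_909856.out instead of slurm-909856.out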

Note that the output file may appear in your directory before the job is finished, containing the output "so far". In some cases, you might want your program to print out a final "End of execution!" message just so you can know that it has completed normally.
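
If you want to watch the output as it is produced, you can, for instance, follow the file with tail, using the job ID reported at submission:

        tail -f slurm-909856.out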

After you have submitted the job, and before it is complete, there are some useful SLURM commands for monitoring and controlling it.
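
Two of the most useful are "squeue", which reports on jobs in the queue, and "scancel", which removes a job; for instance, using the job ID from the example above:

        squeue -u $USER       # list all of your jobs that are queued or running
        squeue -j 909856      # check the status of this particular job
        scancel 909856        # cancel the job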

Batch Parallel Mode

Presumably, the reason you are using the cluster is to run FreeFem++ in parallel. The system commands for doing this are essentially the same as we saw for interactive parallel mode; some of the SLURM commands, however, differ from the batch serial case, because now we need to ask for multiple cores. The simplest way to do this is to specify the number of cores (here, we ask for 2) with the directive "#SBATCH -n 2". We have dropped the "#SBATCH -N 1" switch, because we do not insist that all the cores be on the same node, and we have dropped the "#SBATCH -c 1" switch, which specified how many cores we wanted on that single node. The "#SBATCH -n 2" directive finds us two cores somewhere in the system.

#!/bin/bash
#
#SBATCH -n 2          # Ask for 2 cores (tasks), matching the "-np 2" in the mpirun command below
#SBATCH -J myprog
#SBATCH -p backfill
#SBATCH -t 00:05:00
#
module load gnu-openmpi
#
mpirun -np 2 /opt/hpc/gnu/openmpi/bin/FreeFem++-mpi myprog.edp
      
For a particular problem, you should find that increasing the number of cores decreases the run time. The decrease should be roughly linear for a while; that is, asking for twice as many cores ought to halve the running time. However, you don't want to ask for too many cores, for several reasons. First, if you can stay on one node and use all of its cores, that is probably the most efficient arrangement (to do this, you will want to use the "#SBATCH -N 1" switch, and you may need to learn about the "#SBATCH -C" constraint switch as well). Second, if you ask for too many cores, you will see the improvement gradually diminish, and eventually the running time can even begin to increase. Finally, it is easier for the SLURM scheduler to find available resources for your job if your request is relatively small.
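
As a sketch of the single-node approach (the core count here is an assumption; check the RCC documentation for the actual number of cores per node), such a script might begin like this:

#!/bin/bash
#
#SBATCH -N 1          # Ask for one node
#SBATCH -n 16         # Assume 16 cores per node; adjust to match the actual hardware
#SBATCH -J myprog
#SBATCH -p backfill
#SBATCH -t 00:05:00
#
# (Optionally, "#SBATCH -C <feature>" constrains the job to nodes of a particular type;
#  the available feature names are site-specific.)
#
module load gnu-openmpi
#
mpirun -np 16 /opt/hpc/gnu/openmpi/bin/FreeFem++-mpi myprog.edp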

Licensing:

The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.

LAPLACE.EDP defines the problem.

LAPLACE_LOCAL_SERIAL runs the example "locally" (that is, on the RCC login node) in serial mode.

LAPLACE_LOCAL_MPI runs the example "locally" (that is, on the RCC login node) in parallel under OpenMPI.

LAPLACE_RCC_SERIAL runs the example in batch mode (that is, on the RCC compute nodes) in serial mode.

LAPLACE_RCC_MPI runs the example in batch mode (that is, on the RCC compute nodes) in parallel under OpenMPI.



Last revised on 12 November 2015.