Using OpenMP on the FSU RCC HPC Cluster

OPENMP_RCC is a C program which uses OpenMP parallel programming directives, and is to be run on the Florida State University (FSU) Research Computing Center (RCC) High Performance Computing (HPC) Cluster, under the SLURM queue manager.

We will assume that you have a working OpenMP program and an account on the FSU RCC system, that you have used SFTP to transfer a copy of your program source code (perhaps called "myprog.c") to the HPC system, and that you have used SSH to log in to the interactive login node.

The login node has 24 cores. You are not supposed to use the login node for computation, since it is a shared resource where many users may be logged in, doing editing or compiling or other tasks. However, it is reasonable to run a small version of your program with a few cores as a check. To do this, you can interactively type commands like this (assuming that your program does not need a huge amount of memory or time!):

        gcc -fopenmp myprog.c
        mv a.out myprog
        export OMP_NUM_THREADS=1
        ./myprog
        export OMP_NUM_THREADS=2
        ./myprog

Let's assume your program compiles, and that it runs substantially faster with 2 threads. That means you are ready to run the full version of your program, which may need a lot more memory and time. But in that case, the program must be sent to a computational node, and your commands must be put into a command file, along with some information for the SLURM queue manager.

Here is an example of a SLURM script:

        #!/bin/bash
        #SBATCH -N 1            <-- Ask for 1 node.  OpenMP can only run on 1 node.
        #SBATCH -c 8            <-- Ask for 8 cores on that node, because we want 8 threads.
        #SBATCH -J myprog       <-- Name the job.
        #SBATCH -p backfill     <-- Use the "backfill" queue.
        #SBATCH -t 00:15:00     <-- Time limit of 15 minutes.
        gcc -fopenmp -o myprog myprog.c
        export OMP_NUM_THREADS=8
        srun ./myprog           <-- "srun" must be used to run your program.
Note that the srun command is required in order to run your program.

Because an OpenMP program uses shared memory, all threads must be on the same node. Thus, the number of cores you can access is limited by the number that are available on that node.

On the FSU RCC HPC cluster, there are several kinds of nodes. INTEL nodes have 16 cores, "old" AMD nodes have 8 cores, and recent AMD nodes have 48 cores. You can use the switch "#SBATCH -C YEAR2010" to specify that you want a 48-core AMD node (more, but slower, cores) or "#SBATCH -C intel" to specify that you want a 16-core INTEL node (fewer, but faster, cores).
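For example, to target an Intel node and use all of its cores, the relevant lines of your SLURM script might look like this (using the constraint names given above; check the current RCC documentation, since node types and constraint names change over time):

```shell
#SBATCH -C intel        <-- Ask for a 16-core INTEL node.
#SBATCH -c 16           <-- Then it makes sense to ask for up to 16 cores.
export OMP_NUM_THREADS=16
```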

You can compile your program interactively, in which case you can remove the compile instruction from the batch job.

To submit the job to the queue, use the sbatch command, giving it the name of your script file (here assumed to be "myprog.sh"):

        sbatch myprog.sh

You will see an immediate response like
        Submitted batch job 909856
The number 909856 is an identifier for this job. In particular, output from your program will be returned to you in a file called slurm-909856.out (although if you are picky you can ask SLURM to use a different naming convention).
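For instance, the standard way to change the output naming convention is the "-o" (or "--output") option in your SLURM script, where "%j" expands to the job identifier:

```shell
#SBATCH -o myprog_%j.out    <-- Output goes to, e.g., myprog_909856.out instead of slurm-909856.out.
```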

Note that the output file may appear in your directory before the job is finished, containing the output "so far". In some cases, you might want your program to print out a final "End of execution!" message just so you can know that it has completed normally.

After you have submitted the job, and before it is complete, there are some useful SLURM commands, for example:

        squeue -u $USER         <-- List the status of your queued and running jobs.
        scancel 909856          <-- Cancel the job with the given identifier.

The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.


OPENMP_RCC is available in a C version and a C++ version and a FORTRAN90 version.

Related Data and Programs:

DIJKSTRA_OPENMP, a C program which uses OpenMP to parallelize a simple example of Dijkstra's minimum distance algorithm for graphs.

FFT_OPENMP, a C program which demonstrates the computation of a Fast Fourier Transform in parallel, using OpenMP.

HEATED_PLATE_OPENMP, a C program which solves the steady (time independent) heat equation in a 2D rectangular region, using OpenMP to run in parallel.

HELLO_OPENMP, a C program which prints out "Hello, world!" using the OpenMP parallel programming environment.

MANDELBROT_OPENMP, a C program which generates an ASCII Portable Pixel Map (PPM) image of the Mandelbrot fractal set, using OpenMP for parallel execution.

MD_OPENMP, a C program which carries out a molecular dynamics simulation using OpenMP.

MULTITASK_OPENMP, a C program which demonstrates how to "multitask", that is, to execute several unrelated and distinct tasks simultaneously, using OpenMP for parallel execution.

MXM_OPENMP, a C program which computes a dense matrix product C=A*B, using OpenMP for parallel execution.

OPENMP, C programs which illustrate the use of the OpenMP application program interface for carrying out parallel computations in a shared memory environment.

POISSON_OPENMP, a C program which computes an approximate solution to the Poisson equation in a rectangle, using the Jacobi iteration to solve the linear system, and OpenMP to carry out the Jacobi iteration in parallel.

PRIME_OPENMP, a C program which counts the number of primes between 1 and N, using OpenMP for parallel execution.

QUAD_OPENMP, a C program which approximates an integral using a quadrature rule, and carries out the computation in parallel using OpenMP.

RANDOM_OPENMP, a C program which illustrates how a parallel program using OpenMP can generate multiple distinct streams of random numbers.

SATISFY_OPENMP, a C program which demonstrates, for a particular circuit, an exhaustive search for solutions of the circuit satisfiability problem, using OpenMP for parallel execution.

SCHEDULE_OPENMP, a C program which demonstrates the default, static, and dynamic methods of "scheduling" loop iterations in OpenMP to avoid work imbalance.

SGEFA_OPENMP, a C program which reimplements the SGEFA/SGESL linear algebra routines from LINPACK for use with OpenMP.

ZIGGURAT_OPENMP, a C program which demonstrates how the ZIGGURAT library can be used to generate random numbers in an OpenMP parallel program.

Source Code:

Examples and Tests:

HEATED_PLATE_LOCAL runs the program interactively.

HEATED_PLATE_RCC runs the program in batch mode on the FSU RCC HPC cluster.


Last revised on 16 November 2015.