schedule_openmp, a Fortran90 code which demonstrates the use of default, static and dynamic scheduling of loop iterations in OpenMP.
By default, when OpenMP executes a parallel loop of N iterations using T threads, it assigns the first chunk of N/T iterations to thread 0, the second chunk of N/T iterations to thread 1, and so on.
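As a rough illustration, the following minimal Fortran90 sketch (a standalone example, not part of the schedule_openmp source itself) runs a 16-iteration loop under the default schedule and prints which thread handled each iteration; the function omp_get_thread_num and the omp_lib module are standard OpenMP features, but the program and variable names here are invented for the example:

    program default_demo

      use omp_lib

      implicit none

      integer :: i

    !$omp parallel do
      do i = 1, 16
    !
    !  Output from different threads may interleave; the point is which
    !  thread number is reported for each iteration.
    !
        write ( *, '(a,i2,a,i2)' ) '  iteration ', i, ' ran on thread ', omp_get_thread_num ( )
      end do
    !$omp end parallel do

    end program default_demo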
Sometimes this simple default scheduling of the iterations is not ideal. It may be that the iterations of higher index represent more work; in that case, the lower-numbered threads will finish early and have nothing to do.
The static schedule clause modifies the iteration assignment procedure by essentially "dealing out" the iterations. The clause schedule(static,5), for instance, indicates that the N iterations are to be dealt out in groups of 5, until all are assigned. This schedule might divide up the work more evenly.
In more complicated situations, where the work per iteration can vary dramatically, the dynamic schedule clause lets the user parcel out only a small number of iterations at a time; when a thread finishes its batch, it is given another. The clause schedule(dynamic,7), for instance, assigns 7 iterations to each thread initially, and then hands out the remaining work, 7 iterations at a time, to whichever threads finish what they have already been assigned.
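A minimal sketch of the two clauses just described, again separate from the schedule_openmp source itself, might look like this; the chunk sizes 5 and 7 simply echo the examples above, and the program name is invented:

    program chunk_demo

      use omp_lib

      implicit none

      integer :: i
    !
    !  Deal the iterations out round-robin, 5 at a time.
    !
    !$omp parallel do schedule ( static, 5 )
      do i = 1, 16
        write ( *, '(a,i2,a,i2)' ) '  static  5: iteration ', i, ' on thread ', omp_get_thread_num ( )
      end do
    !$omp end parallel do
    !
    !  Hand out 7 iterations at a time, to whichever thread asks next.
    !
    !$omp parallel do schedule ( dynamic, 7 )
      do i = 1, 16
        write ( *, '(a,i2,a,i2)' ) '  dynamic 7: iteration ', i, ' on thread ', omp_get_thread_num ( )
      end do
    !$omp end parallel do

    end program chunk_demo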
For simplicity, we assume that we have a loop of 16 iterations, which has been parallelized by OpenMP, and that we are about to execute that loop using 2 threads.
In default scheduling, thread 0 is assigned iterations 1 through 8, and thread 1 is assigned iterations 9 through 16.

In static scheduling, using a "chunksize" of 4, the iterations are dealt out in groups of 4: thread 0 is assigned iterations 1-4 and 9-12, while thread 1 is assigned iterations 5-8 and 13-16.

In dynamic scheduling, using a "chunksize" of 3, each thread starts with a group of 3 iterations; the remaining iterations are handed out, 3 at a time, to whichever thread finishes its current group first, so the final assignment can vary from run to run.
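The static case above can be checked directly. The following hypothetical sketch requests 2 threads, records which thread executes each of the 16 iterations under schedule(static,4), and prints the resulting assignment, which should match the pattern described above; the program and array names are assumptions for this example:

    program verify_static4

      use omp_lib

      implicit none

      integer, parameter :: n = 16

      integer :: i
      integer :: owner(n)
    !
    !  Request 2 threads, then record which thread executes each iteration.
    !
      call omp_set_num_threads ( 2 )

    !$omp parallel do schedule ( static, 4 )
      do i = 1, n
        owner(i) = omp_get_thread_num ( )
      end do
    !$omp end parallel do
    !
    !  Print the assignment in iteration order.
    !
      do i = 1, n
        write ( *, '(a,i2,a,i1)' ) '  iteration ', i, ' was assigned to thread ', owner(i)
      end do

    end program verify_static4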
In the BASH shell, the program could be run with 2 threads using the commands:
    export OMP_NUM_THREADS=2
    ./schedule_openmp
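The executable itself must first be built with OpenMP support enabled. Assuming the source file is named schedule_openmp.f90 and the gfortran compiler is used (both assumptions; adjust for your compiler), a build command might look like:

    gfortran -fopenmp -o schedule_openmp schedule_openmp.f90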
The information on this web page is distributed under the MIT license.
schedule_openmp is available in a C version and a C++ version and a Fortran90 version.
openmp_test, a Fortran90 code which uses the OpenMP application program interface for carrying out parallel computations in a shared memory environment.