An Introduction to MPI


This document is intended as a true introduction. In other words, it is a short presentation of enough information to allow you to decide whether to pursue an interest in MPI.

MPI enables parallel programming: MPI is a system for setting up a computation to be carried out by many cooperating processors. The reason for involving many processors is simple: to get the computation done faster. It is not unusual to run MPI programs using 10, or 100, or 1,000 processors. When running on 1,000 processors, it's reasonable to hope for a speedup on the order of 500; the ideal speedup would be 1,000, of course. Actual speedups depend on the type of problem being solved, the properties of the algorithm used, and the skill of the programmer.

MPI works on distributed memory systems: for efficiency, MPI is usually run on clusters of many computers of the same type, with high speed connections. MPI can also be run (with less reliable speedup) on collections of several types of computers with relatively slow connections. The important thing is that MPI assumes a distributed memory, that is, that each processor has a separate memory, and that the data for the problem has been divided up among these processors. This means that, from time to time, one processor may need to "ask" another processor for a copy of certain data or results. (If your system uses shared memory, you'll want to look at OpenMP instead.)
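
To make this concrete, here is a minimal sketch of one processor "asking" another for a piece of data, using the standard MPI send and receive calls. It assumes the program is run with at least two processes; the particular value being transferred is purely illustrative.

    /* Sketch: processor 0 sends one value to processor 1.
       Assumes the program is started on at least two processes. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
      int rank;
      double value;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (rank == 0) {
        value = 3.14159;   /* data that only processor 0 holds */
        MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
      } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Processor 1 received %f from processor 0\n", value);
      }

      MPI_Finalize();
      return 0;
    }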

MPI is implemented as a library of functions: A C or FORTRAN programmer doesn't have to learn a new language to use MPI; if a sequential program already exists, much of that program can be used for an MPI version. An MPI program compiles just like a regular program. At link time, the MPI library must be accessed. However, standard debuggers such as GDB are awkward to use on a parallel run, since each of the cooperating processes must be examined separately.
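
As an illustration, here is about the smallest possible MPI program in C, together with one plausible way to build it. The "mpicc" wrapper compiler shown in the comment is commonly provided by MPI installations, but the exact build command depends on your local system.

    /* hello.c: a minimal MPI program.  A typical build, assuming the common
       "mpicc" wrapper is installed, might be
           mpicc hello.c -o hello
       which compiles as usual and links in the MPI library. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
      MPI_Init(&argc, &argv);        /* every MPI program begins with this call */
      printf("Hello from MPI.\n");
      MPI_Finalize();                /* ... and ends with this one */
      return 0;
    }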

One program is executed on all processors: To set up a parallel computation, the user writes a single program. The executable version of this program is copied to all the cooperating processors, and started on them simultaneously. Because each processor is assigned a unique identifier, it is possible to have the processors do different work, even though they are running the same program. For instance, processor 0 is often "put in charge" of the computation.
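
The sketch below shows how a single program can behave differently on each processor. Every copy calls MPI_Comm_rank to learn its own identifier and MPI_Comm_size to learn how many processors are cooperating; the messages printed here are purely illustrative.

    /* Sketch: every processor runs this same program, but each learns its own
       identifier (its "rank"), so the work can be divided. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
      int rank, size;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this processor's identifier */
      MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processors */

      if (rank == 0) {
        printf("Processor 0 is in charge of %d processors.\n", size);
      } else {
        printf("Processor %d awaits instructions.\n", rank);
      }

      MPI_Finalize();
      return 0;
    }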

Data is "scattered" across processors: All the memory of the processors is available to the programmer. This is one way to solve memory-intensive problems that would exhaust the memory resources of a single processor. On the other hand, data may reside on one processor and be needed by a different one. MPI functions exist to handle such data transfers; however, data transfer is relatively slow and communication bandwidth is limited, so a careful programmer avoids excessive communication.
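
One common way to divide data is the MPI_Scatter call, sketched below: processor 0 holds the full array, and each processor (including processor 0) receives its own chunk to work on. The array size and chunk size used here are arbitrary choices for illustration.

    /* Sketch: processor 0 scatters an array so each processor gets its own chunk. */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
      int rank, size;
      double *full = NULL;
      double part[100];          /* each processor's chunk (size is illustrative) */

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      if (rank == 0) {
        full = malloc(100 * size * sizeof(double));  /* only processor 0 holds it all */
        for (int i = 0; i < 100 * size; i++) full[i] = (double) i;
      }

      /* Each processor receives 100 entries of the full array. */
      MPI_Scatter(full, 100, MPI_DOUBLE, part, 100, MPI_DOUBLE, 0, MPI_COMM_WORLD);

      /* ... each processor now works only on its local "part" ... */

      if (rank == 0) free(full);
      MPI_Finalize();
      return 0;
    }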

Converting an Existing Code Takes Work: to take advantage of the multiple processors available, the programmer must rethink the sequential algorithm. Even in simple cases, it is usually necessary to explicitly divide up the problem, assign one "chunk" of the data or work to each processor, and gather the results together at the end. Some problems cannot easily be divided this way, and some existing programs strongly resist being recast in this form. Moreover, it is usually not possible to convert a program to MPI one step at a time. Until the whole program is converted, it can't be tried out.
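
The typical divide-and-gather pattern looks something like the sketch below, which sums the integers from 1 to n. Each processor adds up its own share of the terms, and MPI_Reduce combines the partial sums on processor 0. The problem is deliberately trivial; the point is the structure.

    /* Sketch of the usual conversion pattern: divide the index range, compute a
       partial result on each processor, then combine the results on processor 0. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
      int rank, size, n = 1000000;
      double part = 0.0, total = 0.0;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      /* Each processor sums only its own "chunk" of the index range. */
      for (int i = rank + 1; i <= n; i += size) {
        part = part + (double) i;
      }

      /* Combine the partial sums; only processor 0 receives the total. */
      MPI_Reduce(&part, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

      if (rank == 0) {
        printf("Sum of 1 through %d is %.0f\n", n, total);
      }

      MPI_Finalize();
      return 0;
    }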

OpenMP is an Alternative: Another attractive approach to parallel programming is called OpenMP. This system is used when a shared memory machine is available, that is, when several processors share a single physical memory (for example, cores on the same chip), or when a number of processors can address the same "logical" memory (the compiler effectively lumps all the memory together, and the hardware allows the processors to effectively share each other's memory). An existing sequential program can be turned into an OpenMP program simply by inserting special directives, which in FORTRAN take the form of comment statements. These directives make a particular loop execute in parallel, and classify the loop variables as "shared" or "private". A program can be converted to OpenMP in stages, one loop at a time.
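
For comparison, here is a sketch of a loop parallelized with OpenMP in C, where the directive is written as a #pragma rather than a comment. The dot-product computation is illustrative; the single directive marks the loop for parallel execution, keeps the loop index private to each thread, and combines the partial sums through the reduction clause. A compiler that does not understand OpenMP simply ignores the directive and runs the loop sequentially.

    /* Sketch of an OpenMP parallel loop in C; in FORTRAN the directives appear
       as special comments.  Build with OpenMP enabled (e.g. gcc -fopenmp). */
    #include <stdio.h>

    int main(void)
    {
      int n = 1000;
      double x[1000], y[1000], total = 0.0;

      for (int i = 0; i < n; i++) {
        x[i] = i;
        y[i] = 2 * i;
      }

      /* The directive splits the loop iterations among the available threads.
         The loop index is private to each thread, x, y, and n are shared, and
         the reduction clause combines the per-thread partial sums of "total". */
      #pragma omp parallel for reduction(+:total)
      for (int i = 0; i < n; i++) {
        total = total + x[i] * y[i];
      }

      printf("Dot product = %f\n", total);
      return 0;
    }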




Last revised on 27 November 2007