MOAB
A Batch Job Scheduler for Clusters


MOAB is a batch job scheduler for clusters. MOAB allows a user to set up a batch file describing how a program is to be executed in parallel. Once the batch file is submitted, it goes into a queue, and MOAB waits until the desired number of processors is available. At that time, MOAB allows the job to begin execution. On job completion, MOAB gathers the program output and returns it to the user. Until the job is complete, MOAB allows the user to query its current status.

A user typically logs into a special login node of the cluster, which is intended only for editing, file management, job submission, and other small interactive tasks.

Suppose the user wishes to run a parallel program on several processors of the cluster. To do so, the user must create an executable version of the program, write a suitable MOAB batch job script that describes the job limits and lists the commands to be executed, and then submit the script to MOAB for processing.

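The job script can be thought of as consisting of two parts: a set of directives to the MOAB scheduler, on lines beginning with "#MOAB", which specify items such as the time limit, the number of processes, and the queue; and the sequence of ordinary UNIX commands that carry out the job.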

The user has several separate issues when preparing a first job script:


Local Installation:

Users of the FSU HPC system must apply for an account by going to the web page: http://www.hpc.fsu.edu, and choosing, under the "Your HPC Account" menu on the side, the topic "Apply for an Account". Accounts are available as general access accounts (anyone) and owner-based accounts (requiring authorization from the "owner").

Any FSU faculty member can get a general access account; any researcher can also get a general access account if they have an FSU faculty sponsor.

Some researchers support the HPC system, and in return have priority access to components of the system - they are, in essence, "owners" of part of the system. You can get an account on an owner-based component of the HPC at the discretion of the owner.

Once you have applied for an account, and it has been approved, you can access the system, using an ID (which may be the same as your FSU ID, or not) and a password associated with the HPC system. To log into the system, use ssh and the address of the component to which you have been assigned. For instance, I log in using the command:

        ssh sc.hpc.fsu.edu
      
or
        ssh -Y sc.hpc.fsu.edu
      
to enable X window graphics, which are needed, for instance, if you want to work interactively with MATLAB.

To transfer files between a local system and the HPC, you need the sftp command. I do this in a second window, so that I have interactive access in my ssh window, while file transfers occur in the sftp window. The command that makes the connection for me is:

        sftp sc.hpc.fsu.edu
      
and I put files from my local system to the HPC by a command like
        put fred.txt
      
and get files from the HPC back to the local system by
        get jeff.txt
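
When the same files are moved repeatedly, the put and get commands above can be collected in a batch file and passed to sftp with its -b option. The following is a minimal sketch, not a site-specific recipe: the file name transfer.batch and the RUN_TRANSFER switch are hypothetical, and batch mode assumes that non-interactive (key-based) authentication has already been set up for the account.

```shell
#!/bin/sh
# Collect the transfer commands in a batch file (hypothetical name).
cat > transfer.batch <<'EOF'
put fred.txt
get jeff.txt
EOF

# Only contact the HPC when explicitly asked to. RUN_TRANSFER is a
# hypothetical switch that keeps this sketch harmless by default.
if [ "${RUN_TRANSFER:-no}" = "yes" ]; then
  sftp -b transfer.batch sc.hpc.fsu.edu
else
  echo "Would run: sftp -b transfer.batch sc.hpc.fsu.edu"
fi
```

Setting RUN_TRANSFER=yes before running the script performs the actual transfer; otherwise the script only writes the batch file and reports the command it would have used.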
      

The main reason for using the HPC is to be able to compute in batch mode - one or many jobs, submitted to a queue, to run "eventually". You can log out after you submit jobs, and log in later at your convenience to collect the output from completed jobs. Parallel programs can be run on multiple processors this way. MATLAB programs, whether parallel or sequential, can also be submitted to the batch queue.

The commands that make a job run in the batch queue form a MOAB script. The first part of the script contains commands to the MOAB job scheduler. These commands begin with the string "#MOAB" and specify the maximum time limit, the number of processes, the particular queue you will use, and so on. While general users might use the "classroom" queue, I access the queue "gunzburg_q" associated with my research group.

After the MOAB commands comes a sequence of commands that you might imagine typing in interactively; that is, these might be the normal sequence of UNIX commands you would issue to run a particular job. For details, refer to information elsewhere on MOAB.
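
The two-part layout described above might look like the following sketch. The "#MOAB" prefix and the "classroom" queue come from the text; the option letters (-N, -q, -l, -j) and their values are assumed to follow the PBS convention and should be checked against the local documentation, and the job name "fred" is purely illustrative.

```shell
#!/bin/bash
#
# Part 1: directives to the scheduler. The shell itself treats these
# lines as comments. The options are ASSUMED to follow PBS conventions:
#   -N  job name
#   -q  queue to use
#   -l  resource limits (processors, maximum wall clock time)
#   -j  oe = merge standard output and standard error
#
#MOAB -N fred
#MOAB -q classroom
#MOAB -l nodes=1:ppn=4
#MOAB -l walltime=00:05:00
#MOAB -j oe
#
# Part 2: the ordinary UNIX commands that make up the job.
#
# PBS_O_WORKDIR, when the batch system sets it, names the directory from
# which the job was submitted; otherwise stay in the current directory.
job_dir="${PBS_O_WORKDIR:-$PWD}"
cd "$job_dir" || exit 1

echo "Job running on $(uname -n) in $job_dir"
date
```

Because the directives are comments as far as the shell is concerned, the same file can be run directly for a quick check before it is submitted to the queue.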

Briefly, if you have a MOAB script file called fred.sh you can submit it to the queue by a command like

        msub fred.sh
      
You can check to see the status of your job by the command
        qstat
      
and when the job is completed, you should find an output file in your HPC directory containing the output, or the error messages that explain why you didn't actually get any output.
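
The submit-and-check cycle above can be wrapped in a short script. This is only a sketch: it assumes that msub writes an identifier for the new job to standard output, which the text does not state, and it does nothing useful on a machine where msub is not installed.

```shell
#!/bin/sh
# Name of the job script to submit, as in the example above.
script="fred.sh"

if ! command -v msub >/dev/null 2>&1; then
  echo "msub not found; run this on the cluster login node."
  status="skipped"
elif job_id=$(msub "$script"); then
  # Assumption: msub prints an identifier for the newly queued job.
  echo "Submitted $script; scheduler replied: $job_id"
  # Show the current state of the queue.
  qstat
  status="submitted"
else
  echo "msub reported an error for $script" >&2
  status="failed"
fi
```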

Licensing:

The computer code and data files made available on this web page are distributed under the GNU LGPL license.

Related Data and Programs:

MATLAB_COMMANDLINE, MATLAB programs which illustrate how MATLAB can be run from the UNIX command line, that is, not with the usual MATLAB command window.

MATLAB_COMPILER, MATLAB programs which illustrate the use of the MATLAB compiler, which allows you to run a MATLAB application outside the MATLAB environment.

MATLAB_PARALLEL, examples which illustrate local parallel programming on a single computer with MATLAB's Parallel Computing Toolbox.

MPI, C programs which illustrate the use of the MPI application program interface for carrying out parallel computations in a distributed memory environment.

OPENMP, FORTRAN90 programs which illustrate the use of the OpenMP application program interface for carrying out parallel computations in a shared memory environment.

Source Code:

ENVIRON is a batch job script that simply queries the values of certain environment variables, in particular PBS_O_WORKDIR, which can be useful when trying to set up a program to run under the batch system.
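
A script of this kind might contain commands like the following. This is an illustrative sketch, not the ENVIRON script itself: only PBS_O_WORKDIR is named in the text, while PBS_JOBID and PBS_NODEFILE are assumed here from the usual PBS naming convention, and the helper function show_var is hypothetical.

```shell
#!/bin/sh
# Print one environment variable, or a placeholder if it is not set.
show_var() {
  var_name="$1"
  if var_value=$(printenv "$var_name"); then
    echo "$var_name = $var_value"
  else
    echo "$var_name = (not set)"
  fi
}

# PBS_JOBID and PBS_NODEFILE are assumed names; adjust to the local system.
for v in PBS_O_WORKDIR PBS_JOBID PBS_NODEFILE; do
  show_var "$v"
done
```

Run interactively on the login node, the script will most likely report each variable as not set; the values only appear when the script runs under the batch system.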

HELLO is a batch job script that works with MOAB to compile and run a program. The script also "cleans up" after itself, that is, it discards the executable program once the job is complete.

HELLO_OPENMP illustrates the compilation and execution of a program that includes OpenMP directives.

HELLO_MPI illustrates the compilation and execution of a program that includes calls to the MPI library.

POWER_TABLE shows how a MATLAB program can be run through MOAB. We prepare a file of input commands and invoke MATLAB on the command line.
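
The idea of feeding MATLAB a prepared input file can be sketched as follows. The file names and the MATLAB commands are hypothetical stand-ins, not the contents of POWER_TABLE; in a real job these lines would form the second part of a MOAB script.

```shell
#!/bin/sh
# Prepare a file of MATLAB input commands (hypothetical name and contents).
cat > power_table_input.m <<'EOF'
n = ( 1 : 5 )';
disp ( [ n, n.^2, n.^3 ] )
exit
EOF

if command -v matlab >/dev/null 2>&1; then
  # -nodisplay and -nosplash suppress the graphical desktop and the
  # splash screen, so that MATLAB can run without a window system.
  matlab -nodisplay -nosplash < power_table_input.m > power_table_output.txt
else
  echo "matlab not found; this step needs MATLAB on the search path."
fi
```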



Last revised on 19 April 2013.