pbs_psc, examples which illustrate the use of the Portable Batch System (PBS), a scheduler which controls the submission and execution of jobs on the Pittsburgh Supercomputing Center (PSC) computer clusters.
A user typically logs into a special login node of the cluster, which is intended only for editing, file management, job submission, and other small interactive tasks.
The user wishes to run a parallel program on several processors of the cluster. To do so, the user must create an executable version of the program, write a suitable PBS batch job script that describes the job limits and lists the commands to be executed, and then submit the script for processing using the qsub command.
The job script can be thought of as consisting of two parts: a set of directives to the job scheduler, each beginning with the string "#PBS", followed by the ordinary UNIX commands to be executed.
The user has several separate issues to deal with when preparing a first job script. One such issue is that a batch job does not automatically start in the directory from which the script was submitted, so a script typically begins with the command
cd $PBS_O_WORKDIR
which changes to the exact directory from which you submitted the script. This is a very natural thing to do.
To log into the system, use ssh and the address of the component to which you have been assigned. For instance, you might log into the PSC Greenfield system by inserting your username in the following command:
ssh USERNAME@greenfield.psc.xsede.org
To transfer files, you need the sftp command. I do this in a second window, so that I have interactive access in my ssh window, while file transfers occur in the sftp window. The command that makes the connection for me is:
sftp USERNAME@greenfield.psc.xsede.org
and I put files from my local system to the PSC by a command like
put fred.txt
and get files from the PSC back to the local system by
get jeff.txt
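A complete transfer session, using the example file names above, might look like this:
sftp USERNAME@greenfield.psc.xsede.org
sftp> put fred.txt
sftp> get jeff.txt
sftp> quit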
The main reason for using a cluster is to be able to compute in batch mode - one or many jobs, submitted to a queue, to run "eventually". You can log out after you submit jobs, and log in later at your convenience to collect the output from completed jobs. Parallel programs can be run on multiple processors this way. Matlab programs, whether parallel or sequential, can also be submitted to the batch queue.
The commands that make a job run in the batch queue form a PBS script. The first part of the script contains commands to the job scheduler. These commands begin with the string "#PBS" and specify the maximum time limit, the number of processes, the particular queue you will use, and so on. Most batch jobs should go into the "batch" queue.
The PBS command "#PBS -l nodes=1:ppn=15" specifies a limit of just 1 node, and all 15 cores on that node. On the PSC Greenfield machine, the number of cores should always be 15; in fact, your job won't run with any other choice.
After the PBS commands come a sequence of commands that you might imagine typing in interactively; that is, these might be the normal sequence of UNIX commands you would issue to run a particular job.
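Putting the two parts together, a minimal job script for Greenfield might look something like the following sketch. The job name, time limit, and the final compile-and-run commands are only placeholders for illustration; your own script may also need extra setup commands, such as loading a software module.
#!/bin/bash
#  Give the job a name (placeholder).
#PBS -N hello_job
#  Request 1 node and all 15 of its cores, as required on Greenfield.
#PBS -l nodes=1:ppn=15
#  Ask for at most 10 minutes of run time (adjust for your job).
#PBS -l walltime=00:10:00
#  Send the job to the "batch" queue.
#PBS -q batch
#  Combine standard output and standard error into one file.
#PBS -j oe
#
#  Move to the directory from which the script was submitted.
cd $PBS_O_WORKDIR
#
#  The ordinary UNIX commands for the job go here; compiling and
#  running a small C program is just an example.
gcc -o hello hello.c
./hello > hello_output.txt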
Briefly, if you have a PBS script file called fred.sh you can submit it to the queue by a command like
qsub fred.sh
You can check to see the status of your job by the command
qstat
or
qstat -u USERNAME
When the job is completed, you should find an output file in your directory containing the output, or the error messages that explain why you didn't actually get any output.
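For example, if qsub reports a job id whose numeric part is 123456 (a number made up for illustration), then on most PBS systems the job's output and error messages will eventually appear in the submission directory in files named something like
fred.sh.o123456
fred.sh.e123456
unless the script gave the job a different name or joined the two streams into one file.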
The information on this web page is distributed under the MIT license.
cplex_slurm_arc, examples which use the SLURM job scheduler to submit a CPLEX job to Virginia Tech's Advanced Research Computing (ARC) computer cluster.
mpi, a C code which uses the MPI application program interface for carrying out parallel computations in a distributed memory environment.
openmp, a Fortran90 code which uses the OpenMP application program interface for carrying out parallel computations in a shared memory environment.
slurm_h2p, examples which demonstrate the use of the SLURM batch job scheduler for the h2p computer cluster, as administered by the Center for Research Computing (CRC) at the University of Pittsburgh.
slurm_rcc, examples which use SLURM, a job scheduler for batch execution of jobs on the FSU Research Computing Center (RCC) computer cluster.
ENVIRON is a batch job script that simply queries the values of certain environment variables, in particular PBS_O_WORKDIR, which can be useful when trying to set up a program to run under the batch system.
FENICS_BVP_01 is a batch job script that invokes the FENICS program with the input file bvp_01.py, and an auxiliary program timestamp.py.
FENICS_POISSON is a batch job script that invokes the FENICS program with the input file poisson.py, and an auxiliary program timestamp.py.
FREEFEM_MPI is a batch job script that runs the parallel version of FreeFem++ with the input file schwarz_mpi.edp.
FREEFEM_SEQUENTIAL is a batch job script that runs the sequential version of FreeFem++ with the input file membrane.edp.
HELLO is a batch job script that compiles and runs a "hello" program.
HELLO_MPI illustrates the compilation and execution of a program that includes MPI directives.
HELLO_OPENMP illustrates the compilation and execution of a program that includes OpenMP directives.
MATLAB_PARALLEL runs MATLAB with a parfor command for parallel execution.
POWER_TABLE shows how a MATLAB program can be run through the batch system. We prepare a file of input commands, and invoke MATLAB on the command line, as sketched below.
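As an illustration of that approach, the command portion of such a PBS script might look something like the following sketch; the input and output file names are placeholders, and the exact way MATLAB is made available (for instance, by loading a module first) depends on the cluster.
#  Move to the directory from which the script was submitted.
cd $PBS_O_WORKDIR
#  Run MATLAB with no display, reading commands from an input file
#  and capturing everything it prints in an output file.
matlab -nodisplay -nosplash < power_table.m > power_table_output.txt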