Portable Batch System (PBS) is the name of computer software that performs job scheduling. Its primary task is to allocate computational tasks, i.e., batch jobs, among the available computing resources.
You can now get this tutorial with HPC playgrounds from the “Scientific Computing Essentials” course available at the Scientific Programming School.
The following versions of PBS are currently available:
Before the emergence of clusters, the Unix-based Network Queuing System (NQS) from NASA Ames Research Center was a commonly used batch-queuing system. With the emergence of parallel distributed systems, NQS began to show its limitations. Consequently, Ames led an effort to develop requirements and specifications for a newer, cluster-compatible system. With NASA’s funding, this effort resulted in PBS in the early 1990s. In 2003, PBS was acquired by Altair Engineering and is now marketed as PBS Pro by Altair Grid Technologies, a subsidiary of Altair Engineering.
However, PBS Pro is now an open source project and has become a part of the OpenHPC software stack. It is readily available for download from pbspro.org along with its full source code.
PBS commands (from the Learn to Use HPC and Supercomputers course)
PBS lets you perform three actions:
When you are new to PBS, the place to start is the qsub command, which submits jobs to your HPC system. The only jobs that qsub accepts are scripts, so you’ll need to package your tasks appropriately. Here is a simple example script (myjob.pbs):
#!/bin/sh
# Job name
#PBS -N demo
# Standard output and error output files
#PBS -o demo.txt
#PBS -e demo.txt
# Queue name
#PBS -q workq
# Requested memory size
#PBS -l mem=1024mb

mpiexec -machinefile /etc/myhosts -np 4 /home/user/area
The first line specifies the shell used to interpret the script, while the following lines starting with #PBS are directives that are passed to PBS. The first names the job, the next two specify where output and error output go, the next to last identifies the queue that is used, and the last lists a resource that will be needed, in this case 1024 MB of memory. The blank line separates the directives from the job itself; PBS stops looking for directives at the first executable line. The lines that follow the blank line are the actual job (details on the MPI commands are discussed later). Once you have created the batch script for your job, the qsub command is used to submit the job:
$ qsub myjob.pbs
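As a rough sketch of what usually follows a submission: qsub prints the identifier of the new job on standard output, and that identifier can be passed to qstat to query the job or to qdel to remove it. The helper name submit_and_report is hypothetical (it is not part of PBS), and the format of the job identifier depends on your site.

```shell
# Sketch only: submit_and_report is a hypothetical helper, not a PBS command.
# It relies on the standard PBS client commands qsub and qstat.

submit_and_report() {
    local script="$1"
    local jobid

    # qsub prints the identifier of the newly queued job on standard output.
    jobid=$(qsub "$script") || {
        echo "submission of $script failed" >&2
        return 1
    }
    echo "submitted: $jobid"

    # Ask the scheduler for the current state of the job (queued, running, ...).
    qstat "$jobid"
}

# Usage on a system where PBS is installed:
#   submit_and_report myjob.pbs
#   qdel <job-id>    # remove the job if it is no longer wanted
```

Capturing the identifier at submission time avoids having to search the full qstat listing for your job later.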
PBS Script Example (from the Learn to Use HPC and Supercomputers course)
The newer PBS (PBS Pro), however, introduces the concept of resource chunking through a select statement. Here is another example of a job submission script:
#PBS -q workq
#PBS -l select=2:ncpus=16:mpiprocs=16
#PBS -l walltime=<hh:mm:ss>
#PBS -o <output-file>
#PBS -e <error-file>

module load intel
module load mpi/intel.4.2.0
mpirun <exec> <inputfiles>
Here the line -l select=2:ncpus=16:mpiprocs=16 specifies the processors required for the MPI job: select is the number of nodes (or chunks of resources) required, ncpus is the number of CPUs required per chunk, and mpiprocs is the number of MPI processes to run per chunk on the CPUs selected (normally ncpus would equal mpiprocs). This example therefore requests 2 × 16 = 32 CPUs and runs 32 MPI processes.
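To make the arithmetic concrete, the sketch below computes the totals implied by a select statement: total CPUs = select × ncpus and total MPI processes = select × mpiprocs. The function name select_totals is hypothetical, the defaults of 1 for missing values are an assumption, and the parser handles only the simple single-chunk form used above, not the full PBS resource syntax.

```shell
# Sketch only: select_totals is a hypothetical helper, not a PBS command.
# The body runs in a subshell, so the IFS and globbing changes do not leak.
select_totals() (
    # Assumed defaults when a value is not given in the specification.
    chunks=1; ncpus=1; mpiprocs=1

    IFS=':'   # split the specification on colons
    set -f    # disable filename expansion while splitting
    for field in $1; do
        case "$field" in
            ncpus=*)    ncpus="${field#ncpus=}" ;;
            mpiprocs=*) mpiprocs="${field#mpiprocs=}" ;;
            *=*)        ;;                    # ignore other resources
            *)          chunks="$field" ;;    # leading chunk count
        esac
    done

    echo "cpus=$((chunks * ncpus)) mpi_processes=$((chunks * mpiprocs))"
)

select_totals "2:ncpus=16:mpiprocs=16"   # prints: cpus=32 mpi_processes=32
```

The value passed to mpirun or mpiexec as the process count would normally match the mpi_processes total.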
The ompthreads resource can be used to specify the number of OpenMP threads the job uses. For example: #PBS -l select=2:ncpus=16:mpiprocs=8:ompthreads=2
This automatically sets the environment variable OMP_NUM_THREADS=2. When using OpenMP together with MPI, ompthreads multiplied by mpiprocs should equal ncpus if you want all MPI tasks to each have the same number of threads (here 8 × 2 = 16).
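The rule of thumb above can be checked before submitting. The following is a minimal sketch, assuming all three values are given per chunk; check_hybrid_layout is a hypothetical helper name, not something PBS provides.

```shell
# Sketch only: check_hybrid_layout is a hypothetical helper, not a PBS command.
# Arguments (all per chunk): ncpus mpiprocs ompthreads
check_hybrid_layout() {
    local ncpus="$1" mpiprocs="$2" ompthreads="$3"
    local used=$((mpiprocs * ompthreads))

    if [ "$used" -eq "$ncpus" ]; then
        echo "ok: $mpiprocs MPI processes x $ompthreads threads = $ncpus CPUs per chunk"
    else
        echo "mismatch: $mpiprocs x $ompthreads = $used, but ncpus=$ncpus" >&2
        return 1
    fi
}

# Values from select=2:ncpus=16:mpiprocs=8:ompthreads=2
check_hybrid_layout 16 8 2
```

A call such as check_hybrid_layout 16 8 4 returns a non-zero status, because 8 processes with 4 threads each would need 32 CPUs per chunk.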