PBS at TPAC
The Portable Batch System (PBS) used at TPAC varies between compute systems. Vortex and Katabatic use the Maui/Torque package, while Eddy and kunanyi use open-source PBS Pro.
PBS Pro User Documentation
Maui/Torque Documentation
The following is intended to be generic information that can be applied to both PBS types.
Quick Syntax guide
qstat | Standard queue status command supplied by PBS. See “man qstat” for details of options. Some common uses are: |
qdel jobid | Delete your unwanted jobs from the queues. The jobid is returned by qsub at job submission time, and is also displayed in the nqstat output. |
qsub |
Submit jobs to the queues. The simplest use of the qsub command is typified by the following example (note that there is a carriage-return after ./a.out ):
or
where
You submit this script for execution by PBS using the command:
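As an illustration only, the sketch below writes a minimal job script and then submits it. The file name jobscript.sh, the job name and the walltime value are assumptions chosen for the example, and ./a.out stands for your own program. The submission step is guarded so that the snippet only prints a message on a machine without PBS.

```shell
#!/bin/bash
# Write a minimal job script. All names and resource values are illustrative.
cat > jobscript.sh <<'EOF'
#!/bin/bash
#PBS -N example_job
#PBS -l walltime=00:10:00
# Change to the directory the job was submitted from before running anything.
cd "$PBS_O_WORKDIR"
./a.out
EOF

# Submit only where qsub is available (for example on a cluster login node).
if command -v qsub >/dev/null 2>&1; then
    qsub jobscript.sh    # prints the jobid on success
else
    echo "qsub not found; wrote jobscript.sh without submitting it"
fi
```

Keep the jobid that qsub prints; it is what qstat and qdel expect as an argument.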
You may need to enter data into the program and may be used to doing this interactively when prompted by the program. There are two ways of doing this in batch jobs. Suppose, for example, that the program requires the numbers 1000 then 50 to be entered when prompted. You can either create a file called, say, input containing these values
then run the program as
or the data can be included in the batch job script as follows:
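A minimal sketch of both approaches in one job script follows. The function my_program is a hypothetical stand-in for your own executable (for example ./a.out), used here so the script can also be tried outside PBS. The job name, walltime and result file names are likewise assumptions.

```shell
#!/bin/bash
#PBS -N input_example
#PBS -l walltime=00:10:00
# Run from the submission directory (or the current one when tried outside PBS).
cd "${PBS_O_WORKDIR:-$PWD}"

# Hypothetical stand-in for the real program: reads two prompted values.
my_program() {
    read -r first
    read -r second
    echo "first=$first second=$second"
}

# Option 1: put the values in a file called input and redirect it to the program.
printf '1000\n50\n' > input
my_program < input > result_from_file.txt

# Option 2: embed the values in the job script itself with a here-document.
my_program > result_from_heredoc.txt <<EOF
1000
50
EOF
```

Both result files should contain the same line, since the program sees identical standard input in each case.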
Notice that the PBS directives are all at the start of the script, that there are no blank lines between them, and that there are no other non-PBS commands until after all the PBS directives.
qsub options of note:
qps jobid | show the processes of a running job |
qls jobid | list the files in a job’s jobfs directory |
qcat jobid | show a running job’s stdout, stderr or script |
qcp jobid | copy a file from a running job’s jobfs directory |
The man pages for these commands on the system detail the various options you will probably need to use.
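The core commands are often used together. The following sketch assumes a job script named jobscript.sh in the current directory (a hypothetical name). It submits a job, checks on it, and then deletes it. It is guarded so that it only prints a message where PBS or the script is missing.

```shell
#!/bin/bash
if command -v qsub >/dev/null 2>&1 && [ -f jobscript.sh ]; then
    jobid=$(qsub jobscript.sh)    # qsub prints the jobid on success
    echo "submitted ${jobid}"
    qstat -u "$USER"              # list your own jobs
    qstat -f "$jobid"             # full status of this one job
    qdel "$jobid"                 # delete the job if it is no longer wanted
else
    echo "qsub or jobscript.sh not found; nothing submitted"
fi
```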
An SSH client is required to connect to all HPC systems. For Windows users the PuTTY client is a good free client, but there are many others. Mac OS X users can use the built-in Terminal app.
Your account will have been enabled only for the HPC system appropriate for your project(s).
Access to Eddy and kunanyi is via jumpbox.tpac.org.au. Once connected to jumpbox please read the “Message of the Day” (MOTD) for cluster status and instructions for how to connect to each cluster.
Requests to TPAC helpdesk can be submitted online via the TPAC Jira Portal or via email to helpdesk@tpac.org.au. Through the Jira portal you will be able to track the progress of your request.
How to get online help
To find a command or library routine that performs a required function, try searching by keyword e.g.
man -k keyword
or
apropos keyword
Use “man command_name” to find details on how to use a Unix command or library e.g.
man cat
man ls
If no man page is found, try “info command_name” e.g.
info module
Manipulating files and directories
- ls
List contents of current directory
- cd
Change directory
- rm
Remove file or directory
- mkdir
Make a new directory
Use the “man” command for more information on the above.
A few notes on Unix directory names.
A Unix full path name is constructed of the directory and subdirectory names separated by slashes “/”, e.g. /u/jsmith/work/file1. When you first log in you will be in your “home” directory; at TPAC this is usually /u/username. In most shells this can also be referenced as ~username.
For example if your username is asmith then “cd ~asmith/work” will take you to the “work” directory in your home directory.
All Unix full path names start with “/” (there are no drive or volume names as in Windows). Hence any filename starting with “/” is a full path name.
A filename that does not start with “/” is interpreted relative to the “current working directory”; if it contains one or more slashes it refers to a file in a subdirectory of that directory. The current working directory may also be referenced as dot “.”, i.e. ./subdirectory/file.
The parent of the “current working directory” may be referenced as dot-dot “..”. For example, if you have two directories in your home directory, work1 and work2, and you cd to work1, you can then change to work2 by typing the command “cd ../work2”.
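The path rules above can be tried safely with a short sketch. The directory name path_demo and the file name file1 are made up for the example. The directory changes run in a subshell so your own shell stays where it was.

```shell
#!/bin/bash
# Create two sibling directories under a scratch directory (hypothetical names).
mkdir -p path_demo/work1 path_demo/work2

# Run the directory changes in a subshell so the calling shell does not move.
(
    cd path_demo/work1      # relative path: resolved from the current directory
    pwd                     # prints a full path name, which starts with "/"
    cd ../work2             # ".." is the parent, so this moves to the sibling
    touch ./file1           # "./" is the current working directory, now work2
    pwd
)

ls path_demo/work2          # lists file1
```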
On each of the compute systems you can run “module avail”, which lists the installed software and versions. Additional software or versions can be requested via the TPAC helpdesk or by email to helpdesk@tpac.org.au.
If the application you are running on a node produces output on STDOUT it is important to ensure that this output is captured to a file in your home directory. If it isn’t redirected it will be captured to a file within the node’s local storage which has limited space. If the local storage file system fills up it may cause the job that is running to terminate early or produce inconsistent data.
It is recommended that you use the -e and -o options when running qsub. To ensure these are not forgotten it would be best to create a shell script to start your job as follows:
qsub_job.sh:
#!/bin/bash
#PBS -e errors.txt
#PBS -o output.txt
cmds
Alternatively the two output streams can be joined together. The following sends the error stream to the output stream:
qsub_job.sh:
#!/bin/bash
#PBS -j oe
#PBS -o output.txt
cmds
It is also possible on kunanyi to see the output and error streams while the job is still running using jobtail.sh and jobcat.sh.
Use “qstat -f” to provide information about the current status of the job.
In your job startup script you should redirect STDOUT to a file in your home directory. This may give you information about what the job is doing depending on your application.
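One possible way to do this is sketched below. It assumes the job is submitted from a directory under your home directory, so that a relative log file name ends up there. The function my_application is a hypothetical stand-in for your own program, and the job name, walltime and log file name are illustrative.

```shell
#!/bin/bash
#PBS -N redirect_example
#PBS -l walltime=00:10:00
# Run from the submission directory (or the current one when tried outside PBS).
cd "${PBS_O_WORKDIR:-$PWD}"

# Hypothetical stand-in for your application; it writes to STDOUT and STDERR.
my_application() {
    echo "step 1 complete"
    echo "warning: example message" >&2
}

# One log per job, named after the jobid ("manual" when run outside PBS).
logfile="myjob_${PBS_JOBID:-manual}.log"

# Send STDOUT to the log file, and STDERR to the same place.
my_application > "$logfile" 2>&1
```

While the job runs, the log file can be inspected from a login node with standard tools such as tail.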
#PBS -l mem=600mb
Because of the unique architecture of Eddy, CPUs and memory are linked. If you specify both, it is possible to end up with a job that can’t be run and will remain queued until deleted. When submitting jobs on Eddy, specify memory or CPUs but not both.
On kunanyi each node has 128GB of RAM and 28 CPUs. If you are using whole nodes there is no need to specify memory. If you are using a portion of a node, specifying memory accurately will allow the scheduler to find a suitable node for your jobs more easily.
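As a rough starting point for a partial-node request on kunanyi, memory can be scaled with the share of CPUs requested. This is only an assumption about how your job behaves; measure its real memory use where you can. The sketch below uses the node figures quoted above, a hypothetical request of 7 CPUs, and the PBS Pro select syntax for the resulting directive.

```shell
#!/bin/bash
node_mem_gb=128    # RAM per kunanyi node, from the text above
node_cpus=28       # CPUs per kunanyi node, from the text above
ncpus=7            # hypothetical number of CPUs your job needs

# Proportional share of node memory, using integer arithmetic.
mem_gb=$(( node_mem_gb * ncpus / node_cpus ))

# Print a directive that could be pasted into a job script.
echo "#PBS -l select=1:ncpus=${ncpus}:mem=${mem_gb}gb"
# prints: #PBS -l select=1:ncpus=7:mem=32gb
```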
When specifying an array request to qsub, please be aware of the following. The version of PBS running on kunanyi has replaced the “-t” option with “-J”. This change has not been reflected in the qsub man pages, which still refer to “-t”, but qsub will complain with invalid option ‘-t’ if it is used.
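A minimal sketch of an array job script using “-J” is shown below. The index range, job name, walltime and input file naming are assumptions for illustration. PBS Pro sets PBS_ARRAY_INDEX for each subjob; the script falls back to 1 so it can be tried outside PBS. On Torque systems the equivalent option is “-t” and the variable is PBS_ARRAYID.

```shell
#!/bin/bash
#PBS -N array_example
#PBS -J 1-10
#PBS -l walltime=00:10:00
# Run from the submission directory (or the current one when tried outside PBS).
cd "${PBS_O_WORKDIR:-$PWD}"

# Index of this subjob; defaults to 1 for a dry run outside PBS.
index="${PBS_ARRAY_INDEX:-1}"

# Hypothetical naming scheme: one input file per subjob.
input_file="input_${index}.dat"
echo "subjob ${index} would process ${input_file}"
```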