Submitting a job and checking its status
The most important command is qsub, which submits your job script to the queue.
qsub also accepts many options, so in principle you could skip the script and run a single long command with all the options you need. This is possible, but not recommended.
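As an illustration of script-based submission, the sketch below captures the identifier that qsub prints when a job is accepted and strips the server suffix from it. The file name jobscript.sh and the helper job_number are hypothetical, and the identifier is assumed to have the form number, dot, server name (for example 3597252.hpc-fe1).

```shell
# Strip the server suffix from a full job identifier,
# e.g. "3597252.hpc-fe1" -> "3597252".
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Usage on a login node (jobscript.sh is a hypothetical file name):
#   jobid=$(qsub jobscript.sh)
#   echo "Submitted job $(job_number "$jobid")"
```

Keeping the full identifier in a variable is usually enough, since the other commands accept it as is; the short form is mainly convenient for log messages and file names.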
To see the available options, read the man page (man qsub).
Some of the options are implementation-dependent, so not all of them may be available in the DTU installation. Once you submit the job, you can still track it. The job is given a job-id, which is shown in the output of the qstat command:
qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
3597252.hpc-fe1           ...-COOETC_R9-16 s012345         3401:00: R hpc
3640759.hpc-fe1           xterm-linux      s012345                0 R app
You can see all your running jobs (User column), with their job-id (first column), the name you assigned them (-N option of PBS), the current runtime (Time Use column), the status (column S), and the name of the queue. Some common status letters are:
Q job is queued, eligible to run or routed
R job is running
C job is completed after having run
H job is held
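The status column lends itself to simple filtering. The following is a minimal sketch, assuming the default qstat layout shown above (job-id in the first column, status letter in the second-to-last column). The helper names jobs_in_state and describe_state are hypothetical and not part of PBS.

```shell
# Print the ids of all jobs in a given state.
# Reads default `qstat` output on standard input; header and
# separator lines are skipped because they do not start with a digit.
jobs_in_state() {
    awk -v state="$1" '$1 ~ /^[0-9]/ && $(NF-1) == state { print $1 }'
}

# Translate a status letter into a short description,
# covering only the four letters listed above.
describe_state() {
    case "$1" in
        Q) echo "queued" ;;
        R) echo "running" ;;
        C) echo "completed" ;;
        H) echo "held" ;;
        *) echo "unknown" ;;
    esac
}

# Usage on a login node:
#   qstat | jobs_in_state Q
```

Reading the state from the second-to-last column rather than a fixed position keeps the filter working even if one of the middle columns happens to be empty.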
qstat can extract much more information about your job. For a list of the possible options, see its man page (man qstat).
If your job is queued (Q) and you want an idea of when it will run, pick its job-id from the qstat output and type showstart followed by that job-id.
This shows the estimated start and end time of your job's execution. It is only an estimate, based on the resources your job requested and the current schedule of the queue.
To get more information about one of your jobs, use the checkjob command with its job-id, optionally adding the -v option for verbose output.
It is sometimes necessary to remove a job from a queue. This can be done at any stage, i.e. while the job is still waiting to run (state Q) or while it is running (state R). Just get the job-id with qstat and pass it to qdel.
It can be useful to have an overview of the status of a whole queue. This can be done with the showq command.
Additional useful commands
A short overview of the current system load can be obtained with the command classstat (spelled with three “s”). It can be given a queue name as an argument:
classstat hpc
queue      total   used  avail
----------------------------------------
hpc         1348   1248    100
It gives an overview of the capacity and load of the clusters, showing the total, used, and available number of cores.
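The three numbers can be turned into a utilization figure with a short filter. This is only a sketch, assuming the four-column layout shown above (queue, total, used, avail); the helper name queue_usage is hypothetical.

```shell
# Print "<queue> <percent used>" for every data line of classstat output
# read from standard input. Header and separator lines are skipped because
# their second field is not a number; empty queues (total 0) are skipped too.
queue_usage() {
    awk 'NF == 4 && $2 ~ /^[0-9]+$/ && $2 > 0 {
        printf "%s %.1f%%\n", $1, 100 * $3 / $2
    }'
}

# Usage on a login node:
#   classstat hpc | queue_usage
```

For the sample output above, 1248 of 1348 cores in use corresponds to about 92.6% utilization.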
To get an idea of the system load at node level, use nodestat, optionally followed by the name of a specific queue, e.g.:
nodestat hpc
Name         State     Procs    Load
...
n-62-12-9    Running    3:8     5.08
n-62-12-10   Busy       0:8     8.10
n-62-12-11   Running    2:8     5.99
...
n-62-23-7    Busy       0:20   20.07
n-62-23-8    Busy       0:20   20.00
n-62-23-13   Busy       0:20   20.08
n-62-23-16   Busy       0:20   20.23
n-62-23-1    Idle      20:20    0.11
This is a long output. It shows the node name, the state of the node, the pair available-cores:total-cores, and the load.
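Because the output is long, it can help to reduce it to the nodes that still have free cores. The sketch below assumes the four-column layout shown above, with the third column in the form available-cores:total-cores; the helper name free_nodes is hypothetical.

```shell
# List the nodes that have at least one free core, together with the
# number of free cores. Reads nodestat output on standard input; lines
# whose third field is not of the form "N:M" (header, "...") are ignored.
free_nodes() {
    awk 'NF >= 4 && $3 ~ /^[0-9]+:[0-9]+$/ {
        split($3, procs, ":")
        if (procs[1] + 0 > 0) print $1, procs[1]
    }'
}

# Usage on a login node:
#   nodestat hpc | free_nodes
```

Applied to the sample above, this would keep only n-62-12-9, n-62-12-11 and n-62-23-1, the three nodes that are not fully occupied.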