Power Cluster
Power is a Linux cluster system running CentOS release 5.5. The cluster consists of a single head node (power) and 25 compute nodes, each with 8 to 12 Xeon cores and either 16GB or 36GB of memory. Users belonging to the netgroup 'power' can log in and run their batch jobs on it. The Faculty Computer Coordinators can change a user's netgroup from general to power.
Users' jobs are executed on the compute nodes (c0-0 - c0-25) under the control of a queuing system (PBS). Users log on to the head node, power, via ssh (their home directory is mounted there from the CC filer, the same as on the other CC servers) and submit their jobs to the batch system.
Start by logging on to the head node:
ssh power -l username
- Create a batch job script, for example a file named script, that contains the following lines:
- Submit the script to one of the existing queues, for example:
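The script lines themselves are not reproduced on this page. As a minimal sketch only: a PBS job script is an ordinary shell script, optionally with `#PBS` directive comments near the top. The job name, the walltime value and the commands below are illustrative assumptions, not this cluster's recommended settings.

```shell
#!/bin/sh
# Hypothetical example; the job name and walltime are assumptions.
#PBS -N example_job
#PBS -l walltime=00:10:00

# PBS sets PBS_O_WORKDIR to the directory where qsub was invoked.
# Fall back to the current directory so the script also runs by hand.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# Record which node ran the job (uname -n prints the host name).
JOB_HOST=$(uname -n)
echo "Job started on ${JOB_HOST}"
```

Replace the last two lines with the commands your job actually needs to run.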
qsub -q medium script
- To see the status of your own jobs, execute:
qstat -u your-username
- To see the status of the jobs of all users, execute:
- To see the current available queues and their cputime and memory limits, execute:
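The commands for the two items above are missing from this page. As an assumption based on standard PBS/Torque usage (not confirmed for this cluster), `qstat` with no arguments lists the jobs of all users, and `qstat -q` prints one line per queue with its memory and CPU-time limits. The helper `run_pbs` is a hypothetical wrapper that only makes the sketch safe to run on a machine without the PBS client tools.

```shell
# Run a PBS client command if it is installed; otherwise say so.
# The helper name run_pbs is hypothetical, used only for this sketch.
run_pbs() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "$1 not found: run this on the head node"
    fi
}

run_pbs qstat       # all jobs of all users (assumed default behaviour)
run_pbs qstat -q    # queue list with limits (Torque-style output)
```

On the head node itself the wrapper is unnecessary; typing `qstat` or `qstat -q` directly is enough.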
More detailed information on the limits of any queue can be viewed with:
qmgr -c "list queue queuename"
qmgr -c "list queue medium"
The default queue limits are enforced unless other values (up to the queue maximum) are specified on the 'qsub' command line, for example:
qsub -q hugemem -lpmem=2000mb,pvmem=3000mb script-name
- The standard output and standard error files will be written at the end of the execution to files in your home directory: script.o#n and script.e#n (where #n is the job number given to your job by the batch queueing system).
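As a small illustration of this naming rule, the sketch below builds the two file names for one job. The job number 12345 is a made-up value; use the number that qsub reported for your job.

```shell
# Hypothetical values: script name as submitted, job number from qsub.
job_name="script"
job_number="12345"

# Output files as described above: <name>.o<number> and <name>.e<number>.
out_file="$HOME/${job_name}.o${job_number}"
err_file="$HOME/${job_name}.e${job_number}"

echo "stdout: $out_file"
echo "stderr: $err_file"
```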
- Interactive sessions (line mode) are also available via the 'qsub -I' command, which runs an interactive job under the PBS queuing system.
- Interactive sessions with X:
qsub -I -X -q 'queuename'
However, running matlab in an interactive session with -X slows matlab down. The workaround (necessary only for matlab) is to replace -X by -v DISPLAY=....
qsub -I -v DISPLAY=your.machine.name:0
xhost +power (on your.machine.name)
- To delete a job from the queue, use:
qdel job_id
- Parallel jobs can be executed on the cluster, using up to 8 cores for a job. For example, jobs compiled with mpich can be submitted with the following command:
qsub -l nodes=2:ppn=8 -q parallel your-script-filename
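A job script for such a submission might look like the sketch below. It assumes an MPICH-style `mpirun` that accepts `-np` and `-machinefile`, and a hypothetical executable `./my_mpi_program`; neither is confirmed for this cluster. PBS lists the allocated cores, one per line, in the file named by `$PBS_NODEFILE`.

```shell
# Write a hypothetical MPI job script to mpi_job.sh. The quoted 'EOF'
# keeps the $ variables unexpanded, so they are evaluated when the job runs.
cat > mpi_job.sh <<'EOF'
#!/bin/sh
cd "$PBS_O_WORKDIR" || exit 1
# One line per allocated core is listed in $PBS_NODEFILE.
NPROCS=$(wc -l < "$PBS_NODEFILE")
# Assumption: MPICH-style mpirun; ./my_mpi_program is a placeholder.
mpirun -np "$NPROCS" -machinefile "$PBS_NODEFILE" ./my_mpi_program
EOF

# Submit it on the head node with the command shown above:
#   qsub -l nodes=2:ppn=8 -q parallel mpi_job.sh
```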
- Multithreaded matlab jobs can be submitted with the following command:
qsub -l nodes=1:ppn=8 -q medium your-matlab-script-name
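The file submitted here is a shell script that starts MATLAB, not a `.m` file. A minimal sketch, assuming `matlab` is on the PATH of the compute nodes and that a hypothetical MATLAB program `my_analysis.m` sits in the directory the job was submitted from:

```shell
# Write a hypothetical MATLAB job script to matlab_job.sh.
cat > matlab_job.sh <<'EOF'
#!/bin/sh
cd "$PBS_O_WORKDIR" || exit 1
# -nodisplay and -nosplash start MATLAB without a GUI; -r runs the given
# statements. my_analysis is a placeholder for your own .m file.
matlab -nodisplay -nosplash -r "my_analysis; exit"
EOF

# Submit it on the head node with the command shown above:
#   qsub -l nodes=1:ppn=8 -q medium matlab_job.sh
```

The trailing `exit` makes MATLAB quit when the program finishes, so the batch job ends instead of waiting at the MATLAB prompt.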