Difference between revisions of "QUonG HowTo"

From APEWiki

Revision as of 11:13, 13 November 2013

Available Resources on QUonG:

  • For hardware:
    • 16 nodes (SuperMicro X8DTG-D)
    • each node is dual socket, with one 4-core Intel Xeon E5620 2.40GHz per socket
    • each node has 2 NVIDIA Tesla M2075 GPUs
    • each node has 1 InfiniBand card - Mellanox ConnectX (MT26428) PCIe Gen2 on a x4 PCIe slot
  • For software:
    • each node runs CentOS 6.4 (diskless, booted from the network) with kernel 2.6.32-358.2.1.el6.x86_64 (x86_64 arch)
    • GNU C/C++/Fortran compiler is version 4.4.7
    • OpenMPI 1.5.4 (standard package within CentOS 6.4) - enable it with 'module load openmpi.x86_64'
    • OpenMPI 1.7 - install path is /usr/local/ompi-trunk
    • MVAPICH2-1.8 - install path is /usr/local/mvapich2-1.8
    • MVAPICH2-1.9a2 - install path is /usr/local/mvapich2-1.9a2
    • NVIDIA driver version 295.41 with CUDA 4.2 SDK (install path is /usr/local/cuda)
    • SLURM batch job manager

Using QUonG

Available SLURM queues on QUonG

  • 4 nodes (q000-03) in the debug queue (30 min. run time limit)
  • 8 nodes (q004-11) in the run queue (4 hrs. run time limit)
  • remaining 4 nodes (q012-15) in the run queue under a SLURM reservation; using them requires permission for the '--reservation=apenet_development' option - ask us if you need access

To run on QUonG:

salloc

salloc allocates a number of nodes from a queue and drops you into an interactive shell, from which you can launch programs with 'mpirun' or 'srun' (see below). Exit the shell to relinquish the resources.

Example: allocate 2 nodes (-N option) from the 'debug' queue (-p option), run 'hostname' on them, then exit

[user@quong ~]$ salloc -N 2 -p debug
salloc: Granted job allocation 656482
[user@quong ~]$ mpirun hostname
q002.qng
q002.qng
q003.qng
q003.qng
[user@quong ~]$ exit
exit
salloc: Relinquishing job allocation 656482

srun

srun launches an executable or a script on the first available nodes of the requested queue. If you are not already inside a 'salloc' shell, it allocates the nodes first.
THIS IS THE ONLY WORKING WAY WITH MVAPICH2-1.9a2!

Example: run 'hostname' as 4 processes (-n option) on 2 nodes from the 'run' queue

[user@quong ~]$ srun -N 2 -n 4 -p run hostname
q010.qng
q010.qng
q011.qng
q011.qng

sbatch

sbatch submits a script (and only a script!) into a queue asking for a number of nodes.

Example: submit a script that runs 'hostname' with 'mpirun' on 3 nodes from the 'run' queue in the 'apenet_development' reservation (--reservation option)

[user@quong ~]$ cat test.sh
#!/bin/bash
mpirun hostname
[user@quong ~]$ sbatch -p run --reservation=apenet_development -N 3 ./test.sh
Submitted batch job 656517

Unless otherwise specified, stdout is redirected to a file named 'slurm-$jobid.out'

[user@quong ~]$ cat slurm-656517.out
q012.qng
q012.qng
q014.qng
q014.qng
q013.qng
q013.qng
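
The same request can also be written inside the script itself using '#SBATCH' directive lines, so the options do not have to be repeated on the command line. The following is a minimal sketch, not a tested QUonG recipe: the file name 'test_directives.sh' and the output file name 'hostname-%j.out' are assumptions made for illustration ('%j' is replaced by the job id).

```shell
#!/bin/bash
# Sketch: create a batch script that carries its own sbatch options.
# The file name test_directives.sh is hypothetical.
cat > test_directives.sh <<'EOF'
#!/bin/bash
#SBATCH -p run
#SBATCH -N 3
#SBATCH --reservation=apenet_development
#SBATCH -o hostname-%j.out
mpirun hostname
EOF
chmod +x test_directives.sh

# On QUonG the script would then be submitted with:
#   sbatch ./test_directives.sh
```

Options given on the sbatch command line take precedence over the '#SBATCH' lines, so the script can still be redirected to another queue at submission time.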

sinfo

sinfo lists currently available queues and their status.

Example output:

 PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up      30:00      2  alloc q[000-001]
debug*       up      30:00      2   idle q[002-003]
run          up    4:00:00      4  maint q[012-015]
run          up    4:00:00      6  alloc q[004-009]
run          up    4:00:00      2   idle q[010-011]
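
To decide which queue to submit to, it can help to reduce the sinfo table to the number of idle nodes per queue. The sketch below assumes the default column layout shown above (NODES in the fourth column, STATE in the fifth); the function name 'idle_nodes' is hypothetical. It reads the table from standard input, so it is fed here with the example output rather than a live 'sinfo' call.

```shell
#!/bin/bash
# Sketch: sum the idle nodes per partition from sinfo-style output on stdin.
# Assumes the default sinfo columns: PARTITION AVAIL TIMELIMIT NODES STATE NODELIST.
idle_nodes() {
    awk 'NR > 1 && $5 == "idle" {
             sub(/\*$/, "", $1)      # drop the "*" marking the default partition
             idle[$1] += $4
         }
         END { for (p in idle) print p, idle[p] }' | sort
}

# Feed the function with the example output shown above.
idle_nodes <<'EOF'
 PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up      30:00      2  alloc q[000-001]
debug*       up      30:00      2   idle q[002-003]
run          up    4:00:00      4  maint q[012-015]
run          up    4:00:00      6  alloc q[004-009]
run          up    4:00:00      2   idle q[010-011]
EOF
```

On the cluster the function would be used as 'sinfo | idle_nodes'.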

squeue

squeue lists currently queued jobs in the available queues.

Example output:

 JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
656479     debug  job7.sh    delia   R       4:32      1 q001
656480     debug  job8.sh    delia   R       4:24      1 q000
656473       run  job1.sh    delia   R       5:07      1 q008
656474       run  job2.sh    delia   R       5:04      1 q009
656475       run  job3.sh    delia   R       4:57      1 q004
656476       run  job4.sh    delia   R       4:44      1 q005
656477       run  job5.sh    delia   R       4:42      1 q006
656478       run  job6.sh    delia   R       4:36      1 q007
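
The squeue table can be summarised in the same way, for instance to see how many jobs are currently running in each queue. This is only a sketch under the assumption that the default columns shown above are in use (PARTITION in the second column, ST in the fifth); the function name 'running_jobs' is hypothetical.

```shell
#!/bin/bash
# Sketch: count running jobs (state "R") per partition from squeue-style output on stdin.
# Assumes the default squeue columns: JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON).
running_jobs() {
    awk 'NR > 1 && $5 == "R" { n[$2]++ }
         END { for (p in n) print p, n[p] }' | sort
}

# Feed the function with the example output shown above.
running_jobs <<'EOF'
 JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
656479     debug  job7.sh    delia   R       4:32      1 q001
656480     debug  job8.sh    delia   R       4:24      1 q000
656473       run  job1.sh    delia   R       5:07      1 q008
656474       run  job2.sh    delia   R       5:04      1 q009
656475       run  job3.sh    delia   R       4:57      1 q004
656476       run  job4.sh    delia   R       4:44      1 q005
656477       run  job5.sh    delia   R       4:42      1 q006
656478       run  job6.sh    delia   R       4:36      1 q007
EOF
```

On the cluster the function would be used as 'squeue | running_jobs'.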

Notes

  • As you can see from the examples, when launching with 'mpirun' the default for the '-n/-np' option is to run on all cores of all allocated nodes. The nodes are chosen