TIGER
How to Run Programs on Tiger
 

User programs

All programs running on the cluster must be compatible with the 64-bit system libraries so that they can be migrated among the nodes. This means that all programs should be compiled locally -- C, C++ and F95 compilers from the GCC 4.4.7 suite are available. Please feel free to contact Tiger's administrator if you lack any tools or libraries needed to compile and run your programs.
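
For example, a minimal compilation sketch (my_program.c is a hypothetical source file; dynamic linking is the gcc default, which also satisfies the first restriction listed below):
 user@tiger: ~ > gcc -O2 -o my_program my_program.c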

Migration of processes among the cluster nodes poses some further limitations:

  • the program has to be dynamically linked (see the check sketched below)
  • the program's and the linked libraries' binaries must not be changed during the run
  • the program should not rely on standard input and output (see below)
  • the program must not try to access files outside the /home directory
  • the program must run in a single thread
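
Whether a binary is dynamically linked can be verified, for instance, with the standard ldd utility (my_program is a hypothetical binary):
 user@tiger: ~ > ldd ./my_program
If ldd reports "not a dynamic executable", the program was linked statically and cannot be migrated.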

Program invocation

All computational jobs have to be registered with the clustering software Wimpy, which controls their placement on the cluster nodes. This is achieved by running programs via the utility mpirun:
 user@tiger: ~ > mpirun my_program arg1 arg2 ...
Jobs are automatically detached from the calling terminal; therefore, it is not necessary to append the & sign to run them in the background. There are several options for starting a program via mpirun, which should be chosen according to the type of the process. Please check the manual pages and the rules section for details.

When starting a large number of jobs at once, it is better to start them immediately one after another, which minimises the number of scheduler invocations. On the other hand, all jobs are initially started on one specific node, and it is important not to overload it. Currently, it is considered safe to start fewer than 100 jobs at once, provided their total memory consumption does not exceed 64 GB. A subsequent large batch of jobs should only be started after the previous one has been moved away from the starting node (see the Hardware page for its ID number). Typically, this should happen no later than 10 minutes after the jobs are started.
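
A sketch of starting such a batch from the shell (my_program and the parameter list are hypothetical); the jobs are submitted one after another without any delay between them:

 #!/bin/sh
 # start a batch of jobs, one mpirun invocation per parameter
 for p in 01 02 03 04; do
     mpirun ./my_program "$p"
 done
 # end of the script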

Don't start programs from the Midnight Commander command line. This may lead to failures when manipulating the program's file descriptors and, consequently, to its unexpected behavior.

Standard input & output

When invoked by mpirun, the program's standard input, output and error output (file descriptors 0, 1 and 2) are by default redirected to /dev/null, i.e. you will never see any output on the terminal, nor will you be able to interact with the program via the keyboard. All standard file descriptors can be redirected to regular files via options of the mpirun command. Standard output can then be monitored with tail -f, e.g.:
 user@tiger: ~ > mpirun --stdout=my_output my_program
 user@tiger: ~ > tail -f my_output

Short-lived programs

Registering a new job invokes the scheduler, which places jobs on individual nodes. This usually leads to increased migration of processes. The same happens after a job finishes. Instead of, e.g., running many individual jobs with different arguments, users should, if possible, run one job with an internal loop (see the sketch below). A limited number of short-lived jobs (no more than ten simultaneously) can be run directly on the headnode Sirrah, i.e. without use of the mpirun utility.
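
A sketch of the preferred pattern: a single registered process that iterates over its work items internally (my_program and its handling of multiple arguments are hypothetical):
 user@tiger: ~ > mpirun ./my_program a b c
Here only one process is registered with Wimpy, instead of three separate short-lived jobs for a, b and c.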

Scripts

Never run a wrapping script via mpirun! The following example is strictly forbidden. In this way, only script.sh, which finishes immediately, would be registered with the system, while the CPU-time consuming tasks would remain hidden from Wimpy and could not be distributed across the cluster:
 
 #!/bin/sh
 # incorrect script that invokes several programs
 ./my_program a &
 ./my_program b &
 ./my_program c &
 # end of the script

 user@tiger: ~ > mpirun ./script.sh

The correct way to do this (notice that all programs will be started immediately):
 
 #!/bin/sh
 # correct script that invokes several programs
 mpirun ./my_program a
 mpirun ./my_program b
 mpirun ./my_program c
 # end of the script

 user@tiger: ~ > ./script.sh

It is evident from the correct variant that wrapping several jobs in a shell script does not solve the issue of short-lived jobs described in the previous section. (Each process is still started with its own PID.)

Further control of running processes

In contrast to transparent systems (e.g. MOSIX), processes migrated from the headnode disappear from its userspace. Hence, it is necessary to use special utilities for their management (see the Utilities page for details): a list of all processes running on the cluster can be obtained via mpilist; they can be signalled (e.g. killed) with mpikill or reniced with mpirenice. If these utilities respond with "connect to server failed", it indicates that the Wimpy server daemon has died. Please don't start new processes in that case.
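
A sketch of a typical session (the PID 12345 and the exact argument syntax are assumptions; consult the Utilities page for the actual options):
 user@tiger: ~ > mpilist
 user@tiger: ~ > mpikill 12345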
Last updated: 20.12.2014 (L.)