Mpirun on Rocks clusters is used to launch jobs that are linked with the Ethernet device for MPICH.
You must run HPL as a regular user (that is, not root). If you don't have a user account on the cluster, create one for yourself, and propagate the information to the compute nodes with:
# useradd username
# rocks sync users
For example, to interactively launch the benchmark "High-Performance Linpack" (HPL) on two processors:
Create a file in your home directory named machines, and put two entries in it, such as:
compute-0-0
compute-0-1
Download the two-processor HPL configuration file and save it as HPL.dat in your home directory.
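The downloaded file already contains a complete set of HPL parameters; the entries that tie it to two processors are the process-grid lines. In a typical two-processor HPL.dat they look something like this (the exact values in the file you download may differ):

1            # of process grids (P x Q)
1            Ps
2            Qs

The product P x Q must match the number of processes you pass to mpirun with -np.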
Now launch the job from the frontend:
$ ssh-agent $SHELL
$ ssh-add
$ /opt/mpich/gnu/bin/mpirun -nolocal -np 2 -machinefile machines /opt/hpl/mpich-hpl/bin/xhpl
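To scale the run beyond two processors, add more node names to the machines file, raise the -np count, and keep P x Q in HPL.dat equal to the new process count. For example, a hypothetical four-process run (assuming nodes compute-0-2 and compute-0-3 exist and HPL.dat has been edited so that P=2 and Q=2) would be launched the same way:

$ /opt/mpich/gnu/bin/mpirun -nolocal -np 4 -machinefile machines /opt/hpl/mpich-hpl/bin/xhpl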
mpirun from OpenMPI is located at /opt/openmpi/bin/mpirun on a Rocks frontend. To run the Linpack benchmark interactively with this version of MPI, follow the procedure below.
Download the two-processor HPL configuration file and save it as HPL.dat in your home directory.
Create a file in your home directory named machines, and put two entries in it, such as:
compute-0-0
compute-0-1
Now launch the job from the frontend:
$ ssh-agent $SHELL
$ ssh-add
$ /opt/openmpi/bin/mpirun -np 2 -machinefile machines /opt/hpl/openmpi-hpl/bin/xhpl
Cluster-Fork runs a command on the compute nodes of your cluster.
Often we want to execute parallel jobs consisting of standard UNIX commands. By "parallel" we mean the same command runs on multiple nodes of the cluster. We use these simple parallel jobs to move files, run small tests, and to perform various administrative tasks.
Rocks provides a simple tool for this purpose called cluster-fork. For example, to list all your processes on the compute nodes of the cluster:
$ cluster-fork ps -U$USER
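As another illustration of the administrative tasks mentioned above, you could check free disk space on every node (the path /state/partition1 is only an example; substitute whichever filesystem you care about):

$ cluster-fork df -h /state/partition1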
By default, cluster-fork uses a simple series of ssh connections to launch the task serially on every compute node in the cluster. Cluster-fork is smart enough to ignore dead nodes. Usually the job is "blocking": cluster-fork waits for the job to complete on one node before moving to the next. By using the --bg flag you can instruct cluster-fork to start the jobs in the background. This corresponds to the "-f" ssh flag.
$ cluster-fork --bg hostname
Often you will want to name the nodes your job is started on. This can be done with an SQL statement or with a special shorthand for specifying nodes.
The first method of naming nodes uses the SQL database on the frontend. We need an SQL statement that returns a column of node names. For example, to run a command on the compute nodes in rack 1 of your cluster, execute:
$ cluster-fork --query="select name from nodes where name like 'compute-1-%'" [cmd]
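Here [cmd] is a placeholder for the command you want to run. For instance, the same query might be used to check the uptime of every node in rack 1:

$ cluster-fork --query="select name from nodes where name like 'compute-1-%'" uptime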
The next method requires us to name each node explicitly. When launching a job on many nodes of a large cluster, this quickly becomes cumbersome, so we provide a special shorthand to help with the task. This shorthand, borrowed from the MPD job launcher, allows us to specify large ranges of nodes quickly and concisely.
The shorthand is based on similarly-named nodes and uses the --nodes option. To specify the node range compute-0-0 compute-0-1 compute-0-2, we write --nodes=compute-0-%d:0-2. This scheme works best when the names share a common prefix and the varying parts of the names are numeric. Rocks compute nodes are named with such a convention.
Other shorthand examples:
Discontinuous ranges:
compute-0-%d:0,2-3 --> compute-0-0 compute-0-2 compute-0-3
Multiple elements:
compute-0-%d:0-1 compute-1-%d:0-1 --> compute-0-0 compute-0-1 compute-1-0 compute-1-1
Factoring out duplicates:
2*compute-0-%d:0-1 compute-0-%d:2-2 --> compute-0-0 compute-0-0 compute-0-1 compute-0-1 compute-0-2
$ cluster-fork --nodes="compute-2-%d:0-32 compute-3-%d:0-32" ps -U$USER
The previous example lists the processes for the current user on the 66 nodes in racks two and three.