Using mpirun To Construct An RSL Script For You



Use this method when you want to launch a single executable file. This implies a set of one or more binary-compatible machines that all share the same filesystem (i.e., they can all access the executable file).

Using mpirun to construct an RSL script for you requires a machines file. The mpirun command determines which machines file to use as follows:

    1. If a -machinefile <machinefilename> argument is specified on the mpirun command, it uses that; otherwise,
    2. it looks for a file named machines in the directory in which you typed mpirun; and finally,
    3. it looks for /usr/local/mpich/bin/machines, where /usr/local/mpich is the mpich installation directory.
If it cannot find a machines file in any of those places, mpirun fails.
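The search order above can be sketched in a few lines of Python. This is only an illustration of the three steps, not mpirun's actual implementation: the function name `find_machines_file` and its parameters are hypothetical, and the sketch assumes that an explicit -machinefile argument is taken as given with no fallback to the other locations.

```python
import os


def find_machines_file(machinefile_arg=None, cwd=".", install_dir="/usr/local/mpich"):
    """Return the machines file path chosen by the three-step search order.

    machinefile_arg stands for the value of -machinefile (None if absent),
    cwd for the directory in which mpirun was typed, and install_dir for
    the mpich installation directory. All three names are hypothetical.
    """
    if machinefile_arg is not None:
        # Step 1: an explicit -machinefile argument wins (assumed: no fallback).
        return machinefile_arg
    candidates = [
        os.path.join(cwd, "machines"),                 # Step 2: current directory
        os.path.join(install_dir, "bin", "machines"),  # Step 3: installation directory
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    # None of the three places yielded a file: this is where mpirun fails.
    raise FileNotFoundError("no machines file found")
```

Calling `find_machines_file()` with no arguments mirrors typing mpirun without -machinefile in the current directory.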

The machines file lists the computers on which you wish to run your application. Computers are listed by naming the Globus "service" on each machine. For most applications the default service can be used, which requires specifying only the fully qualified domain name. Consult your local Globus administrator or the Globus web site www.globus.org for more information regarding special Globus services.

For example, consider the following pair of binary-compatible machines, {m1,m2}.utech.edu, that have access to the same filesystem. Here is what a machines file that uses default Globus services might look like:


"m1.utech.edu" 10 
"m2.utech.edu" 5 
The number appearing at the end of each line is optional (default=1). It specifies the maximum number of nodes that can be created in a single RSL subjob on each machine. mpirun satisfies the -np specification by "wrapping around" the machines file. For example, using the machines file above, mpirun -np 8 creates an RSL request with a single subjob with 8 nodes on m1.utech.edu; mpirun -np 12 creates two subjobs, the first with 10 nodes on m1.utech.edu and the second with 2 nodes on m2.utech.edu; and mpirun -np 17 creates three subjobs, with 10 nodes on m1.utech.edu, 5 nodes on m2.utech.edu, and a third and final subjob with 2 nodes on m1.utech.edu again. Note that messages between subjobs are always communicated over TCP, even if the two subjobs are on the same machine.
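The wrap-around rule can be made concrete with a short Python sketch. It is a model of the behavior described above, not mpirun's own code; the helper names `parse_machines` and `plan_subjobs` are hypothetical, and the parser assumes each line holds a quoted contact string followed by an optional count.

```python
import shlex


def parse_machines(text):
    """Parse machines-file text into a list of (contact, max_nodes) pairs."""
    machines = []
    for line in text.splitlines():
        fields = shlex.split(line)  # strips the double quotes around the contact
        if not fields:
            continue  # skip blank lines
        max_nodes = int(fields[1]) if len(fields) > 1 else 1  # count defaults to 1
        machines.append((fields[0], max_nodes))
    return machines


def plan_subjobs(machines, nprocs):
    """Split nprocs nodes into subjobs by wrapping around the machines list."""
    if not machines or any(limit < 1 for _, limit in machines):
        raise ValueError("need at least one machine, each with a positive count")
    subjobs = []
    remaining = nprocs
    i = 0
    while remaining > 0:
        contact, limit = machines[i % len(machines)]  # wrap back to the first line
        count = min(limit, remaining)
        subjobs.append((contact, count))
        remaining -= count
        i += 1
    return subjobs


machines = parse_machines('"m1.utech.edu" 10\n"m2.utech.edu" 5\n')
for nprocs in (8, 12, 17):
    print(nprocs, plan_subjobs(machines, nprocs))
# 8 [('m1.utech.edu', 8)]
# 12 [('m1.utech.edu', 10), ('m2.utech.edu', 2)]
# 17 [('m1.utech.edu', 10), ('m2.utech.edu', 5), ('m1.utech.edu', 2)]
```

The three printed plans correspond to the -np 8, -np 12, and -np 17 cases discussed above.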





Using mpirun By Supplying Your Own RSL Script



You would use mpirun with your own RSL script if you were submitting to a set of machines that cannot run or access the same executable file (e.g., machines that are not binary compatible or do not share a filesystem). In this situation, you must currently use a Resource Specification Language (RSL) request to specify the executable filename for each machine. This technique is very flexible but rather complex; work is currently underway to simplify the manner in which these issues are addressed.

The easiest way to learn how to write your own RSL request is to study the one generated for you by mpirun. Consider the earlier example, in which we wanted to run an application on a cluster of workstations. Recall that our machines file looked like this:


"m1.utech.edu" 10 
"m2.utech.edu" 5 
To view the RSL request generated in this situation, without actually launching the program, we type the following mpirun command:

% mpirun -dumprsl -np 12 myapp 123 456

which produces the following output:


+ 
( &(resourceManagerContact="m1.utech.edu")  
   (count=10) 
   (jobtype=mpi) 
   (label="subjob 0") 
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0)) 
   (arguments=" 123 456") 
   (directory=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus) 
   (executable=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus/myapp) 
) 
( &(resourceManagerContact="m2.utech.edu")  
   (count=2) 
   (jobtype=mpi) 
   (label="subjob 1") 
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 1)) 
   (arguments=" 123 456") 
   (directory=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus) 
   (executable=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus/myapp) 
) 
Note that (jobtype=mpi) may appear only in those subjobs whose machines have vendor-supplied implementations of MPI. Additional environment variables may be added as in the example below:


+ 
( &(resourceManagerContact="m1.utech.edu")  
   (count=10) 
   (jobtype=mpi) 
   (label="subjob 0") 
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0) 
                (MY_ENV 246)) 
   (arguments=" 123 456") 
   (directory=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus) 
   (executable=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus/myapp) 
) 
( &(resourceManagerContact="m2.utech.edu")  
   (count=2) 
   (jobtype=mpi) 
   (label="subjob 1") 
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 1)) 
   (arguments=" 123 456") 
   (directory=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus) 
   (executable=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus/myapp) 
) 
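For machines that need different executables, it can help to generate the request text rather than edit it by hand. The following Python sketch renders a request in the same layout as the examples above, using only the attributes that appear in them. The function names, the directory and executable paths, and the choice to omit (jobtype=mpi) for the second machine are all hypothetical, chosen only to illustrate per-subjob differences.

```python
def render_subjob(index, contact, count, executable, directory,
                  arguments="", extra_env=None, jobtype_mpi=False):
    """Render one subjob in the layout of the mpirun -dumprsl output."""
    # GLOBUS_DUROC_SUBJOB_INDEX always comes first, then any extra variables.
    env = [("GLOBUS_DUROC_SUBJOB_INDEX", index)] + list((extra_env or {}).items())
    env_text = " ".join(f"({name} {value})" for name, value in env)
    lines = [f'( &(resourceManagerContact="{contact}")', f"   (count={count})"]
    if jobtype_mpi:
        # Only for machines with a vendor-supplied MPI implementation.
        lines.append("   (jobtype=mpi)")
    lines += [
        f'   (label="subjob {index}")',
        f"   (environment={env_text})",
        f'   (arguments="{arguments}")',
        f"   (directory={directory})",
        f"   (executable={executable})",
        ")",
    ]
    return "\n".join(lines)


def render_rsl(subjobs):
    """Join the subjobs into one multi-request, introduced by a '+' line."""
    parts = [render_subjob(i, **subjob) for i, subjob in enumerate(subjobs)]
    return "\n".join(["+"] + parts)


# Hypothetical paths: each machine gets its own build of myapp.
rsl_text = render_rsl([
    dict(contact="m1.utech.edu", count=10, jobtype_mpi=True,
         directory="/home/user/irix", executable="/home/user/irix/myapp",
         arguments=" 123 456", extra_env={"MY_ENV": 246}),
    dict(contact="m2.utech.edu", count=2,
         directory="/home/user/linux", executable="/home/user/linux/myapp",
         arguments=" 123 456"),
])
print(rsl_text)
```

Writing `rsl_text` to a file gives a starting point that can be checked against the output of mpirun -dumprsl before it is submitted.
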
After editing your own RSL file, you may submit it directly to mpirun as follows:

% mpirun -globusrsl <myrslrequestfile>

Note that when supplying your own RSL it should be the only argument you specify to mpirun.

RSL is a flexible language capable of doing much more than has been presented here. For example, it can be used to stage executables and to set environment variables on remote computers before starting execution. A full description of the language can be found at http://www.globus.org.


