Workstation clusters and the ch_p4 device



Most massively parallel processors (MPPs) provide a way to start a program on a requested number of processors; mpirun makes use of the appropriate command whenever possible. In contrast, workstation clusters require that each process in a parallel job be started individually, though programs to help start these processes exist (see Section Using the Secure Server below). Because workstation clusters are not already organized as an MPP, additional information is required to make use of them. mpich should be installed with a list of participating workstations in the file machines.<arch> in the directory /usr/local/mpich/share. This file is used by mpirun to choose processors to run on (using heterogeneous clusters is discussed in Section The P4 Procgroup File). The rest of this section discusses some of the details of this process, and how you can check for problems. Also see Section In Case of Trouble in the full manual, particularly the discussion of common problems.





Checking your machines list



Use the script tstmachines in /usr/local/mpich/sbin to ensure that you can use all of the machines that you have listed. This script performs an rsh and a short directory listing; this tests both that you have access to the node and that a program in the current directory is visible on the remote node. If there are any problems, they will be listed. These problems must be fixed before proceeding.

The only argument to tstmachines is the name of the architecture; this is the same name as the extension on the machines file. For example,

    /usr/local/mpich/sbin/tstmachines sun4 
tests that a program in the current directory can be executed by all of the machines in the sun4 machines list. This program is silent if all is well; if you want to see what it is doing, use the -v (for verbose) argument:
    /usr/local/mpich/sbin/tstmachines -v sun4 
The output from this command might look like
Trying true on host1.uoffoo.edu ... 
Trying true on host2.uoffoo.edu ... 
Trying ls on host1.uoffoo.edu ...  
Trying ls on host2.uoffoo.edu ... 
Trying user program on host1.uoffoo.edu ... 
Trying user program on host2.uoffoo.edu ... 
If tstmachines finds a problem, it will suggest possible reasons and solutions. In brief, there are three tests:
    1. Can processes be started on remote machines? tstmachines attempts to run the shell command true on each machine in the machines file by using the remote shell command. Note that the ch_p4 device does not require a remote shell command and can use alternative methods (see Sections Using the Secure Server and Using the Secure Shell).


    2. Is the current working directory available to all machines? This attempts to ls a file that tstmachines creates by running ls using the remote shell command. Note that ch_p4 does not require that all processors have access to the same file system (see Section The P4 Procgroup File), but the mpirun command does require this.


    3. Can user programs be run on remote systems? This checks that shared libraries and other components have been properly installed on all machines.
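
The first two tests can be pictured as a list of remote commands, one per host. The following Python fragment is only a sketch of that idea, not the tstmachines script itself: the function name check_commands and the probe file name are hypothetical, and it assumes the remote shell accepts the usual "rsh host command" form.

```python
import os

def check_commands(hosts, rsh="rsh", probe_name="probe_file", cwd=None):
    """Build the remote commands for tests 1 and 2 (illustrative only).

    probe_name is a hypothetical file assumed to exist in the current
    working directory; tstmachines creates its own file for this purpose.
    """
    cwd = os.getcwd() if cwd is None else cwd
    probe = os.path.join(cwd, probe_name)
    commands = []
    # Test 1: can a process be started on each remote machine?
    for host in hosts:
        commands.append(("true", host, [rsh, host, "true"]))
    # Test 2: is the current working directory visible on each machine?
    for host in hosts:
        commands.append(("ls", host, [rsh, host, "ls", probe]))
    return commands

hosts = ["host1.uoffoo.edu", "host2.uoffoo.edu"]
commands = check_commands(hosts, cwd="/home/me/work")
```

Each entry could then be passed to subprocess.run, treating a nonzero exit status as a failed test for that host.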





Changing the Remote Shell Program



You can change the remote shell command that the ch_p4 device uses to start the remote processes with the environment variable P4_RSHCOMMAND. For example, if the default remote shell program is rsh but you wish to use the secure shell ssh, you can do

    setenv P4_RSHCOMMAND ssh 
    mpirun -np 4 a.out 
This only works for different remote shell commands that accept the same command line arguments. If you are having trouble using the remote shell commands, consider using either the secure shell or the ch_p4mpd device.





Using the Secure Shell



Section Configuring with ssh explains how to set up your environment so that the ch_p4 device will use the secure shell ssh instead of rsh. This is useful on networks where, for security reasons, the use of rsh is discouraged or disallowed.





Using the Secure Server



Because each workstation in a cluster (usually) requires that a new user log into it, and because this process can be very time-consuming, mpich provides a program that may be used to speed this process. This is the secure server, the program serv_p4 in the directory /usr/local/mpich/bin. The script chp4_servs in the same directory may be used to start serv_p4 on those workstations on which you can run programs with rsh. You can also start the server by hand and allow it to run in the background; this is appropriate on machines that do not accept rsh connections but on which you have accounts.

Before you start this server, check to see if the secure server has been installed for general use; if so, the same server can be used by everyone. In this mode, root access is required to install the server. If the server has not been installed, then you can install it for your own use without needing any special privileges with

    chp4_servs -port=1234 
This starts the secure server on all of the machines listed in the file /usr/local/mpich/share/machines.<arch>.

The port number, provided with the option -port=, must be different from any other port in use on the workstations.

To make use of the secure server for the ch_p4 device, add the following definitions to your environment:

    setenv MPI_USEP4SSPORT yes 
    setenv MPI_P4SSPORT 1234 
The value of MPI_P4SSPORT must be the port with which you started the secure server. When these environment variables are set, mpirun attempts to use the secure server to start programs that use the ch_p4 device. (The command line argument -p4ssport to mpirun may be used instead of these environment variables; mpirun -help will give you more information.)
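
If you launch mpirun from a script rather than from a csh session, the same two variables can be placed in the environment of the child process. The Python sketch below assumes only the variable names given above; the helper name secure_server_env is hypothetical, and the mpirun invocation is shown as a comment because it depends on your installation.

```python
import os

def secure_server_env(port, base=None):
    """Return a copy of the environment with the secure-server settings added."""
    env = dict(os.environ if base is None else base)
    env["MPI_USEP4SSPORT"] = "yes"
    # Must match the port given to chp4_servs -port=...
    env["MPI_P4SSPORT"] = str(port)
    return env

env = secure_server_env(1234, base={})

# Hypothetical use, assuming mpirun is on your PATH:
#   import subprocess
#   subprocess.run(["mpirun", "-np", "4", "a.out"], env=secure_server_env(1234))
```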





SMP Clusters



When using a cluster of symmetric multiprocessors (SMPs) (with the ch_p4 device configured with -comm=shared), you can control the number of processes that communicate with shared memory on each SMP node. First, you need to modify the machines file (see Section Workstation clusters and the ch_p4 device ) to indicate the number of processes that should be started on each host. Normally this number should be no greater than the number of processors; on SMPs with large numbers of processors, the number should be one less than the number of processors in order to leave one processor for the operating system. The format is simple: each line of the machines file specifies a hostname, optionally followed by a colon (:) and the number of processes to allow. For example, the file containing the lines

mercury 
venus 
earth 
mars:2 
jupiter:15 
specifies three single processor machines (mercury, venus, and earth), a 2 processor machine (mars), and a 15 processor machine (jupiter).
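
The format is simple enough to read with a few lines of code. The following Python sketch parses the example above into (hostname, process count) pairs; the function name is hypothetical, and skipping blank lines and lines beginning with # is an assumption rather than something stated in this section.

```python
def parse_machines(text):
    """Return a list of (hostname, max_processes) pairs from machines-file text."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines carry no host.
        if not line or line.startswith("#"):
            continue
        host, sep, count = line.partition(":")
        # A host with no ":n" suffix allows a single process.
        entries.append((host.strip(), int(count) if sep else 1))
    return entries

SAMPLE = """\
mercury
venus
earth
mars:2
jupiter:15
"""

machines = parse_machines(SAMPLE)
```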

By default, mpirun will use at most the number of processors specified in the machines list for each node, up to 16 processes on each machine. If the environment variable MPI_MAX_CLUSTER_SIZE is set to a positive integer value, mpirun will use up to that many processes, sharing memory for communication, on a host. For example, if MPI_MAX_CLUSTER_SIZE had the value 4, then mpirun -np 9 with the above machines file would create one process on each of mercury, venus, and earth, 2 on mars (because the machines file specifies that mars may have 2 processes sharing memory), and 4 on jupiter (because jupiter may have 15 processes and only 4 more are needed). If 10 processes were needed, mpirun would start over from the beginning of the machines file, creating an additional process on mercury; the value of MPI_MAX_CLUSTER_SIZE prevents mpirun from starting a fifth process sharing memory on jupiter.
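
The placement rule in this example can be modelled in a few lines. The Python sketch below is not taken from mpirun; it is a simplified model of the behavior described above, with a hypothetical function name, and it reproduces the two cases in the text (9 and 10 processes with a cluster size of 4).

```python
def allocate(machines, nprocs, max_cluster_size=16):
    """Model of the placement described in the text (not mpirun's own code).

    machines: list of (hostname, limit) pairs in machines-file order.
    Returns a list of (hostname, count) groups; a host appears again if the
    list has to be walked more than once.
    """
    if max_cluster_size < 1 or not machines or any(limit < 1 for _, limit in machines):
        raise ValueError("need at least one host and positive limits")
    placements = []
    remaining = nprocs
    while remaining > 0:
        for host, limit in machines:
            if remaining == 0:
                break
            # Never exceed the host's own limit or the cluster-size cap.
            count = min(limit, max_cluster_size, remaining)
            placements.append((host, count))
            remaining -= count
    return placements

MACHINES = [("mercury", 1), ("venus", 1), ("earth", 1), ("mars", 2), ("jupiter", 15)]

nine = allocate(MACHINES, 9, max_cluster_size=4)
ten = allocate(MACHINES, 10, max_cluster_size=4)
```

Here nine ends with 4 processes on jupiter, and ten contains the same five groups followed by one more process on mercury.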





Heterogeneous networks and the ch_p4 device



A heterogeneous network of workstations is one in which the machines connected by the network have different architectures and/or operating systems. For example, a network may contain 3 Sun SPARC (sun4) workstations and 3 SGI IRIX workstations, all of which communicate via the TCP/IP protocol. The mpirun command may be told to use all of these by using multiple -arch and -np arguments. For example, to run a program on 3 sun4s and 2 SGI IRIX workstations, use

    mpirun -arch sun4 -np 3 -arch IRIX -np 2 program.%a 
The special program name program.%a allows you to specify the different executables for the program, since a Sun executable won't run on an SGI workstation and vice versa. The %a is replaced with the architecture name; in this example, program.sun4 runs on the Suns and program.IRIX runs on the SGI IRIX workstations. You can also put the programs into different directories; for example,
    mpirun -arch sun4 -np 3 -arch IRIX -np 2 /tmp/%a/program 
It is important to specify the architecture with -arch before specifying the number of processors. Also, the first -arch argument must refer to the processor on which the job will be started. Specifically, if -nolocal is not specified, then the first -arch must refer to the processor from which mpirun is running.
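
To see which executable each architecture will look for, the %a substitution can be mimicked directly. This Python sketch only illustrates the textual replacement described above; the function names are hypothetical and nothing here is mpirun's own code.

```python
def expand_program(template, arch):
    """Replace every %a in the program name with the architecture name."""
    return template.replace("%a", arch)

def plan(template, groups):
    """groups: list of (arch, nprocs) in the order given on the command line."""
    return [(arch, nprocs, expand_program(template, arch)) for arch, nprocs in groups]

groups = [("sun4", 3), ("IRIX", 2)]
by_suffix = plan("program.%a", groups)
by_directory = plan("/tmp/%a/program", groups)
```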





The P4 Procgroup File



For even more control over how jobs get started, we need to look at how mpirun starts a parallel program on a workstation cluster. Each time mpirun runs, it constructs and uses a new file of machine names for just that run, using the machines file as input. (The new file is called PIyyyy, where yyyy is the process identifier.) If you specify -keep_pg on your mpirun invocation, you can use this information to see where mpirun ran your last few jobs. You can construct this file yourself and specify it as an argument to mpirun. To do this for ch_p4, use

    mpirun -p4pg pgfile myprog 
where pgfile is the name of the file. The file format is defined below.

This is necessary when you want closer control over the hosts you run on, or when mpirun cannot construct it automatically. Such is the case when

* You want to run different executables on different hosts (your program is not SPMD).
* You want to run on a network of shared-memory multiprocessors and need to specify the number of processes that will share memory on each machine.

The format of a ch_p4 procgroup file is a set of lines of the form
   <hostname>  <#procs>  <progname>  [<login>] 
An example of such a file, where the command is being issued from host sun1, might be
    sun1   0  /users/jones/myprog 
    sun2   1  /users/jones/myprog 
    sun3   1  /users/jones/myprog 
    hp1    1  /home/mbj/myprog    mbj 
The above file specifies four processes, one on each of three suns and one on another workstation where the user's account name is different. Note the 0 in the first line. It indicates that no processes are to be started on host sun1 other than the one started directly by the user's command.
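
A procgroup file like the one above is easy to generate from a script. The Python sketch below is one way to do it, assuming the line format given above; the function name and the tuple layout for remote hosts are hypothetical.

```python
def procgroup_lines(local_host, local_program, remote):
    """Build the lines of a ch_p4 procgroup file.

    remote: list of (hostname, nprocs, program, login) tuples, where login
    may be None when the account name is the same on that host.
    """
    # The local host gets 0: its only process is the one the user starts.
    lines = [f"{local_host} 0 {local_program}"]
    for host, nprocs, program, login in remote:
        fields = [host, str(nprocs), program]
        if login:
            fields.append(login)
        lines.append(" ".join(fields))
    return lines

lines = procgroup_lines(
    "sun1",
    "/users/jones/myprog",
    [
        ("sun2", 1, "/users/jones/myprog", None),
        ("sun3", 1, "/users/jones/myprog", None),
        ("hp1", 1, "/home/mbj/myprog", "mbj"),
    ],
)
pgfile_text = "\n".join(lines) + "\n"
```

Writing pgfile_text to a file gives something that can be passed to mpirun with -p4pg.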

You might want to run all the processes on your own machine, as a test. You can do this by repeating its name in the file:

    sun1 0 /users/jones/myprog 
    sun1 1 /users/jones/myprog 
    sun1 1 /users/jones/myprog 
This will run three processes on sun1, communicating via sockets.

To run on a shared-memory multiprocessor, with 10 processes, you would use a file like:

    sgimp  9  /u/me/prog 
Note that this is for 10 processes, one of them started by the user directly, and the other nine specified in this file. This requires that mpich was configured with the option -comm=shared; see the installation manual for more information.
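
Because the process started by the user is not counted in the file, the total number of processes is one more than the sum of the second column. The Python sketch below checks this against the three examples above; the function names are hypothetical.

```python
def parse_procgroup(text):
    """Return a list of (hostname, nprocs, program, login_or_None) tuples."""
    entries = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if len(fields) not in (3, 4):
            raise ValueError(f"unexpected procgroup line: {raw!r}")
        login = fields[3] if len(fields) == 4 else None
        entries.append((fields[0], int(fields[1]), fields[2], login))
    return entries

def total_processes(entries):
    """Processes listed in the file plus the one the user starts directly."""
    return 1 + sum(nprocs for _, nprocs, _, _ in entries)

four_hosts = parse_procgroup(
    "sun1 0 /users/jones/myprog\n"
    "sun2 1 /users/jones/myprog\n"
    "sun3 1 /users/jones/myprog\n"
    "hp1 1 /home/mbj/myprog mbj\n"
)
one_host = parse_procgroup("sun1 0 /p\nsun1 1 /p\nsun1 1 /p\n")
smp = parse_procgroup("sgimp 9 /u/me/prog\n")
```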

If you are logged into host gyrfalcon and want to start a job with one process on gyrfalcon and three processes on alaska, where the alaska processes communicate through shared memory, you would use

    local    0  /home/jbg/main 
    alaska   3  /afs/u/graphics     
It is not possible to provide different command line arguments to different MPI processes.





Using special or multiple interconnects



In some installations, certain hosts can be connected in multiple ways. For example, the "normal" Ethernet may be supplemented by a high-speed FDDI ring. Usually, alternate host names are used to identify the high-speed connection. All you need to do is put these alternate names in your machines.xxxx file. In this case, it is important not to use the form local 0 but to use the name of the local host. For example, if hosts host1 and host2 also have ATM connections named host1-atm and host2-atm respectively, the correct ch_p4 procgroup file to connect them (running the program /home/me/a.out) is

    host1-atm 0 /home/me/a.out 
    host2-atm 1 /home/me/a.out 





Using Shared Libraries with the ch_p4 device



As described at the end of Section Using Shared Libraries in the full manual, it is sometimes necessary to ensure that environment variables have been communicated to the remote machines before the program that makes use of shared libraries starts. The various remote shell commands (e.g., rsh and ssh) do not do this. Fortunately, the secure server (Section Using the Secure Server) does communicate the environment variables. This server is built and installed as part of the ch_p4 device, and can be installed on all machines in the machines file for the current architecture (assuming that there is a working remote shell command) with

    chp4_servs -port=1234 
The secure server propagates all environment variables to the remote process, and ensures that the environment in which that process (containing your MPI program) runs contains all environment variables that start with LD_ (just in case the system uses LD_SEARCH_PATH or some other name for finding shared libraries).





Setting the Working Directory for the p4 Device



By default, the working directory for processes running remotely with the ch_p4 device is the same as that of the executable. To specify a different working directory, use -p4wdir as follows:


     mpirun -np 4 myprog -p4wdir myrundir 


