How the Daemons Work



Once the daemons are started, they are connected in a ring.

A "console" process (mpirun, mpdtrace, mpdallexit, etc.) can connect to any mpd, which it does by using a Unix named socket set up in /tmp by the local mpd.
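The connection pattern described above can be sketched with Python's standard socket module. This is only an illustration under stated assumptions: the socket name mpd_console_demo, the "trace" request, and the "ack:" reply are hypothetical and do not reflect the real mpd socket name or wire protocol. The stand-in listener also lives in a temporary directory rather than /tmp so that the sketch cleans up after itself.

```python
import os
import socket
import tempfile
import threading


def serve_once(listener):
    """Stand-in for the local mpd: accept one console and acknowledge its request."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"ack:" + request)  # hypothetical reply format


def console_request(socket_path, message):
    """Stand-in for a console: connect to the named socket and send one request."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(socket_path)
        client.sendall(message)
        return client.recv(1024)


def demo():
    """Run one listener/console exchange over a Unix named socket."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Hypothetical socket name; the real mpd chooses its own name in /tmp.
        path = os.path.join(tmpdir, "mpd_console_demo")
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(path)
        listener.listen(1)
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        try:
            return console_request(path, b"trace")
        finally:
            worker.join()
            listener.close()


if __name__ == "__main__":
    print(demo())
```

Because the socket is a file system object, only processes on the same machine can reach it, which is why a console always talks to its local mpd first.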

If the console is an mpirun process, it requests that a number of processes be started, beginning at the machine given by -MPDLOC- as described above. The location defaults to the mpd next in the ring after the one contacted by the console. The following events then take place.

* The mpd's fork that number of manager processes (the executable is called mpdman and is located in the mpich/mpid/mpd directory). The managers are forked consecutively by the mpd's around the ring, wrapping around if necessary.
* The managers form themselves into a ring, and fork the application processes, called clients.
* The console disconnects from the mpd and reconnects to the first manager. stdin from mpirun is delivered to the client of manager 0.
* The managers intercept standard I/O from the clients, and deliver the command-line arguments and environment variables that were specified on the mpirun command line. The sockets carrying stdout and stderr form a tree with manager 0 at the root.
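The consecutive, wrap-around placement of managers described in the steps above can be sketched as follows. This is a minimal model, not the mpd implementation: the ring is represented as a plain list of host names, and the function names default_start and place_managers are hypothetical.

```python
def default_start(ring, contacted):
    """Return the mpd that follows `contacted` in the ring (the default location)."""
    return ring[(ring.index(contacted) + 1) % len(ring)]


def place_managers(ring, start, nprocs):
    """Assign nprocs managers to consecutive mpd's, wrapping around the ring.

    The position in the returned list is the manager's rank.
    """
    first = ring.index(start)
    return [ring[(first + rank) % len(ring)] for rank in range(nprocs)]


if __name__ == "__main__":
    ring = ["hostA", "hostB", "hostC"]       # hypothetical host names
    start = default_start(ring, "hostA")     # console contacted the mpd on hostA
    for rank, host in enumerate(place_managers(ring, start, 5)):
        print(rank, host)
```

With three mpd's and five requested processes, the placement starts at hostB and wraps around, so hostB and hostC each receive two managers and hostA receives one.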

At this point the situation looks something like Figure 2.


Figure 2: Mpds with console, managers, and clients

When the clients need to contact each other, they use the managers to find the appropriate process on the destination host. The mpirun process can be suspended, in which case it and the clients are suspended, but the mpd's and managers keep running so that they can resume the clients when mpirun is resumed. Killing the mpirun process kills the clients and managers.

The same ring of mpd's can be used to run multiple jobs from multiple consoles at the same time. Under ordinary circumstances, there still needs to be a separate ring of mpd's for each user. For security purposes, each user needs to have a .mpdpasswd file in the user's home directory, readable only by the user, containing a password. This file is read when the mpd is started. Only mpd's that know this password can enter a ring of existing mpd's.
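The "readable only by the user" requirement can be checked with a short script. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the exact contents expected in .mpdpasswd are not specified here (a placeholder string is used), and the demonstration writes to a temporary directory rather than to the real home directory.

```python
import os
import stat
import tempfile


def write_private_file(path, text):
    """Create (or overwrite) a file that only its owner can read and write."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed


def is_private_file(path):
    """Return True if path is a regular file with no group or other permissions."""
    mode = os.stat(path).st_mode
    return stat.S_ISREG(mode) and (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        # Stand-in for ~/.mpdpasswd; the placeholder text is not a real format.
        path = os.path.join(tmpdir, ".mpdpasswd")
        write_private_file(path, "example-password\n")
        print(is_private_file(path))
```

Running a check like is_private_file before starting an mpd is one way to catch a password file that has accidentally been left group- or world-readable.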

A new feature is the ability to configure the mpd system so that the daemons can be run as root. To do this, after configuring mpich you need to reconfigure in the mpid/mpd directory with --enable-root and remake. Then mpirun should be installed as a setuid program. Multiple users can use the same set of mpd's, which are run as root, although their mpirun, managers, and clients will be run as the user who invoked mpirun.


