Using the MPD System Daemons with the ch_p4mpd device



The new MPD system, together with its advantages in speed of startup and management of stdio, is described in detail in the companion document to this one, the User's Guide for MPICH. Here we briefly discuss the installation process.





Installation



To build mpich with the ch_p4mpd device, configure mpich with

    configure --with-device=ch_p4mpd -prefix=<installdir> <other options> 
It is particularly important to specify an install directory with the -prefix argument (unless you want to use the default installation directory, which is /usr/local), since the ch_p4mpd device must be installed before use.

If you intend to run the MPD daemons as root, then you must configure with --enable-root as well. Then it will be possible for multiple users to use the same set of MPD daemons to start jobs.
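
For example, a complete configure invocation might look like the following sketch; the install path is hypothetical, and --enable-root should be included only if the daemons will run as root:

    # hypothetical install path; omit --enable-root for per-user daemons
    configure --with-device=ch_p4mpd -prefix=/usr/local/mpich --enable-root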

After configuration, the usual

    make 
    make install 
will install mpich and the MPD executables in the <installdir>/bin directory, which should be added to your path.
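
For instance, assuming a Bourne-style shell, the install directory can be added to your path like this:

    # assumes sh/bash; replace <installdir> with the directory passed to -prefix
    export PATH=<installdir>/bin:$PATH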





Starting and Managing the MPD Daemons



Running MPI programs with the ch_p4mpd device assumes that the mpd daemon is running on each machine in your cluster. In this section we describe how to start and manage these daemons. The mpd and related executables are built when you build and install MPICH after configuring with

   --with-device=ch_p4mpd -prefix=<prefix directory> <other options> 
and are found in <prefix directory>/bin, which you should ensure is in your path. A set of MPD daemons can be started with the command
    mpichboot <file> <num> 
where file is the name of a file containing the host names of your cluster and num is the number of daemons you want to start. The startup script uses rsh to start the daemons, but they can be started in other ways if that is more convenient. The first daemon can be started with mpd -t; started this way, it prints the port on which it listens for new mpds to connect. Each subsequent mpd is then given that host and port to connect to. The mpichboot script automates this process, as in the sketch below. At any time you can see which mpds are running by using mpdtrace.
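
As a sketch, assuming a hosts file named hostsfile (one host name per line; the file name is illustrative) and a four-daemon ring:

    # automated startup, then a quick check that the ring is up
    mpichboot hostsfile 4
    mpdtrace

The daemons can also be started by hand; firsthost and <port> below are placeholders for the first daemon's host and the port it echoes:

    # on the first host: echo the listener port at startup
    mpd -t
    # on each additional host: connect to the first daemon and daemonize
    mpd -h firsthost -p <port> -b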

An mpd is identified by its host and a port. The following commands are used to manage the ring of mpds; a brief usage sketch follows the list:

mpdhelp
prints a short description of these commands
mpdcleanup
deletes Unix socket files /tmp/mpd.* if necessary.
mpdtrace
causes each mpd in the ring to respond with a message identifying itself and its neighbors.
mpdringtest count
sends a message around the ring count times and times it
mpdshutdown mpd_id
shuts down the specified mpd; mpd_id is specified as host_portnum.
mpdallexit
causes all mpds to exit gracefully.
mpdlistjobs
lists active jobs managed by mpds in the ring.
mpdkilljob job_id
aborts the specified job.
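
For illustration, a short management session might look like this; the mpd_id somehost_1234 is hypothetical:

    mpdtrace                    # each mpd reports itself and its neighbors
    mpdringtest 100             # time 100 trips around the ring
    mpdlistjobs                 # list active jobs managed by the ring
    mpdshutdown somehost_1234   # shut down one mpd, named as host_portnum
    mpdallexit                  # or bring the whole ring down gracefully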

Several options control the behavior of the daemons, allowing them to be run either by individual users or by root without conflicts. The current set of command-line options comprises the following:
-h <host to connect to>
-p <port to connect to>
-c
allow console (the default)
-n
don't allow console
-d <debug (0 or 1)>
-w <working directory>
-l <listener port>
-b
background; daemonize
-e
don't let this mpd start processes, unless root
-t
echo listener port at startup

The -n option allows multiple mpds to be run on a single host by disabling the console on the second and subsequent daemons.
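
For example, a second mpd on the same host might be started as follows; <host> and <port> stand for the first daemon's host name and listener port, and combining the options this way is an assumption:

    # first mpd on the host (console enabled by default); -t echoes its port
    mpd -t
    # second mpd on the same host: -n disables the console, -b daemonizes
    mpd -n -h <host> -p <port> -b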


