The MPI profiling interface provides a convenient way for you to add performance analysis tools to any MPI implementation. We demonstrate this mechanism in mpich, and give you a running start, by supplying three profiling libraries with the mpich distribution. MPE users may build and use these libraries with any MPI implementation.
The first profiling library is simple. The profiling version of each MPI_Xxx routine calls PMPI_Wtime (which delivers a time stamp) before and after each call to the corresponding PMPI_Xxx routine. Times are accumulated in each process and written out, one file per process, in the profiling version of MPI_Finalize. The files are then available for use in either a global or process-by-process report. This version does not take into account nested calls, which occur when MPI_Bcast, for instance, is implemented in terms of MPI_Send and MPI_Recv. The file mpe/src/trc_wrappers.c implements this interface, and the option -mpitrace to any of the compilation scripts (e.g., mpicc) will automatically include this library.
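To show the flavor of this approach, here is a minimal sketch of what one such wrapper might look like for MPI_Send. The accumulator send_time, the pre-MPI-3 (non-const) buffer argument, and the single-routine scope are our simplifications; the actual trc_wrappers.c library covers all MPI routines and writes its accumulated times out in the profiling version of MPI_Finalize.

#include "mpi.h"

static double send_time = 0.0;   /* time accumulated in MPI_Send on this process */

/* Profiling wrapper: the application's call to MPI_Send lands here,
   and the real work is done by PMPI_Send.  PMPI_Wtime brackets the
   call so the elapsed time can be accumulated. */
int MPI_Send( void *buf, int count, MPI_Datatype datatype,
              int dest, int tag, MPI_Comm comm )
{
    double starttime, endtime;
    int    err;

    starttime  = PMPI_Wtime();
    err        = PMPI_Send( buf, count, datatype, dest, tag, comm );
    endtime    = PMPI_Wtime();
    send_time += endtime - starttime;
    return err;
}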
The second profiling library is the MPE logging library, which generates logfiles: files of timestamped events (in CLOG format) or timestamped states (in SLOG format). During execution, calls to MPE_Log_event store events of certain types in memory, and these memory buffers are collected and merged in parallel during MPI_Finalize. MPI_Pcontrol can be used to suspend and resume logging during execution. (By default, logging is on. Invoking MPI_Pcontrol(0) turns logging off; MPI_Pcontrol(1) turns it back on again.) The calls to MPE_Log_event are made automatically for each MPI call. You can analyze the logfile produced at the end with a variety of tools; these are described in the sections Upshot and Nupshot and Jumpshot-2 and Jumpshot-3.
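The following is a minimal sketch of controlling the logging library from within a program: logging is suspended around a setup phase that is not of interest and resumed for the phase to be studied. The comments describing the setup and computation are placeholders, not MPE requirements.

#include "mpi.h"

int main( int argc, char *argv[] )
{
    MPI_Init( &argc, &argv );

    MPI_Pcontrol( 0 );   /* suspend logging; setup traffic will not appear in the logfile */
    /* ... setup code containing MPI calls we do not want logged ... */
    MPI_Pcontrol( 1 );   /* resume logging for the phase of interest */

    /* ... computation and communication to be logged ... */

    MPI_Finalize();      /* per-process buffers are merged and the logfile written here */
    return 0;
}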
In addition to using the predefined MPE logging libraries to log all MPI calls, MPE logging calls can be inserted into the user's MPI program to define and log states. These states are called user-defined states. States may be nested, allowing one to define a state describing a user routine that contains several MPI calls and to display both the user-defined state and the MPI operations contained within it. The routine MPE_Log_get_event_number should be used to obtain unique event numbers from the MPE system. The routines MPE_Describe_state and MPE_Log_event are then used to describe user-defined states. The following example illustrates the use of these routines.
int eventID_begin, eventID_end;
...
eventID_begin = MPE_Log_get_event_number();
eventID_end   = MPE_Log_get_event_number();
...
MPE_Describe_state( eventID_begin, eventID_end, "Amult", "bluegreen" );
...
MyAmult( Matrix m, Vector v )
{
    /* Log the start event along with the size of the matrix */
    MPE_Log_event( eventID_begin, m->n, (char *)0 );
    ... Amult code, including MPI calls ...
    MPE_Log_event( eventID_end, 0, (char *)0 );
}

The logfile generated by this code will show the MPI routines called within MyAmult enclosed by a containing bluegreen rectangle. The color used in the code is chosen from the file rgb.txt provided by the X server installation; for example, rgb.txt is located in /usr/X11R6/lib/X11 on Linux.
If the MPE logging library, liblmpe.a, is not linked with the user program, MPE_Init_log and MPE_Finish_log must be called before and after all other MPE calls. The sample programs cpilog.c and fpi.f, available in the MPE source directory contrib/test or the installed directory share/examples, illustrate the use of these MPE routines.
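As a minimal sketch of this case, a program not linked with liblmpe.a calls MPE_Init_log and MPE_Finish_log itself, as below. The state name "Work", the color, and the logfile name "mylog" are arbitrary choices for illustration, not requirements of MPE.

#include "mpi.h"
#include "mpe.h"

int main( int argc, char *argv[] )
{
    int ev_begin, ev_end;

    MPI_Init( &argc, &argv );
    MPE_Init_log();                            /* required when liblmpe.a is not linked */

    ev_begin = MPE_Log_get_event_number();
    ev_end   = MPE_Log_get_event_number();
    MPE_Describe_state( ev_begin, ev_end, "Work", "red" );

    MPE_Log_event( ev_begin, 0, (char *)0 );
    /* ... user computation, possibly containing MPI calls ... */
    MPE_Log_event( ev_end, 0, (char *)0 );

    MPE_Finish_log( "mylog" );                 /* merges buffers and writes the logfile */
    MPI_Finalize();
    return 0;
}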
The third library does a simple form of real-time program animation. The MPE graphics library contains routines that allow a set of processes to share an X display that is not associated with any one specific process. Our prototype uses this capability to draw arrows that represent message traffic as the program runs.
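As a rough sketch of how the shared display is used, each process might open the common window and draw into it as below. The window size, coordinates, and color are illustrative only; consult the MPE man pages for the exact argument lists of MPE_Open_graphics and the drawing routines.

#include "mpi.h"
#include "mpe.h"

#define WINDOW_SIZE 400

int main( int argc, char *argv[] )
{
    MPE_XGraph graph;
    int        rank;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    /* Open one X window shared by all processes in the communicator;
       a null display string means the DISPLAY environment variable is used. */
    MPE_Open_graphics( &graph, MPI_COMM_WORLD, (char *)0, -1, -1,
                       WINDOW_SIZE, WINDOW_SIZE, 0 );

    /* Each process draws a point whose position depends on its rank. */
    MPE_Draw_point( graph, 10 + 20 * rank, WINDOW_SIZE / 2, MPE_BLACK );
    MPE_Update( graph );

    MPE_Close_graphics( &graph );
    MPI_Finalize();
    return 0;
}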