Troubleshooting Shared Libraries



Shared libraries provide a useful and powerful tool for speeding linking and execution of programs. However, the implementation of shared libraries in most operating systems leaves much to be desired. This section covers some of the problems and workarounds for them.

Some (perhaps most) systems do not record where the shared library was found when the executable was linked. Instead, they depend on finding the shared library at run time in a default location (such as /lib), in a directory specified by an environment variable such as LD_LIBRARY_PATH, or in a directory given by a command-line argument such as -R or -rpath (more on this below). The mpich configure script tests for this and reports whether an executable built with shared libraries remembers the location of the libraries. It also attempts to use a compiler command-line argument to force the executable to remember the location of the shared library.
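One way to see whether a particular executable can find its shared libraries is to inspect the output of ldd, on systems that provide it. The following is a minimal sketch, not part of mpich: the helper name missing_libs is hypothetical, and it assumes that ldd marks unresolved libraries with the words "not found", which is not guaranteed on every system.

```shell
# Hypothetical helper: read ldd-style output on standard input and print
# the name of each library that the run-time loader could not resolve.
# Assumes unresolved entries contain the text "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Possible use, where cpi is the example program from this section:
#   ldd ./cpi | missing_libs
```

If the helper prints nothing, every library listed by ldd was resolved in the environment of the shell that ran it. Processes started by mpirun may see a different environment.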

If you need to set an environment variable to indicate where the mpich shared libraries are, you must ensure that both the process from which you run mpirun and any processes that mpirun starts get the environment variable. The easiest way to do this is to set the environment variable in your .cshrc file (for csh or tcsh users) or .profile file (for sh and ksh users).
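For sh and ksh users, the lines added to .profile might look like the following sketch. The variable name MPICH_SHLIB_DIR is only an illustration, and the directory is the example path used in this section; substitute the location of your own installation.

```shell
# Directory holding the mpich shared libraries (example path; adjust it).
# MPICH_SHLIB_DIR is a hypothetical name used only in this sketch.
MPICH_SHLIB_DIR=/usr/local/mpich/lib/shared

# Append the directory to LD_LIBRARY_PATH. The ${VAR:+...} form avoids a
# stray leading ':' when LD_LIBRARY_PATH is unset or empty.
LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$MPICH_SHLIB_DIR"
export LD_LIBRARY_PATH
```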

However, setting the environment variable in your startup scripts can cause problems if you use several different systems. For example, you may have a single .cshrc file that you use on both an SGI (IRIX) system and a Solaris system. You do not want LD_LIBRARY_PATH to point the SGI at the Solaris version of the mpich shared libraries. Instead, you would like to set the environment variable just before running mpirun:


    setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/usr/local/mpich/lib/shared 
    mpirun -np 4 cpi

Unfortunately, this won't always work. Depending on the method that mpirun and mpich use to start the processes, the environment variable may not be passed to the new processes. The program will then fail with a message like:


    ld.so.1: /home/me/cpi: fatal: libpmpich.so.1.0: open failed: No such file or directory
    Killed

Some devices support starting new processes with the current environment; check the documentation for mpirun (see the section "Running programs with mpirun").
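Where the device does not pass the environment along, one workaround to consider is a small wrapper that sets LD_LIBRARY_PATH itself and then runs the real program, so that each process sets up its own environment. This is only a sketch under assumptions: the function name run_with_mpich_libs is hypothetical, the directory is the example path from this section, and whether mpirun will start a script in place of the executable depends on the device.

```shell
# Hypothetical wrapper: run the given command with the mpich shared
# library directory (example path; adjust it) added to LD_LIBRARY_PATH.
run_with_mpich_libs() {
    (
        # The subshell keeps the change from leaking into the caller.
        LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/usr/local/mpich/lib/shared"
        export LD_LIBRARY_PATH
        # Replace the subshell with the real program, keeping its arguments.
        exec "$@"
    )
}
```

To use this with mpirun, the function would be saved in an executable script whose last line is run_with_mpich_libs "$@", and that script would be given to mpirun with the real program as its first argument.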

An alternative to using LD_LIBRARY_PATH and the secure server is to add an option to the link command that provides the path to use in searching for shared libraries. Unfortunately, the option that you would like is "append this directory to the search path" (such as you get with -L). Instead, many compilers provide only "replace the search path with this path." For example, some compilers allow -Rpath:path:...:path to specify a replacement path. Thus, if both mpich and the user provide library search paths with -R, one of the search paths will be lost. Eventually, mpicc and the other compilation scripts may check for -R options and create a unified version, but they currently do not do this. You can, however, provide a complete search path yourself if your compiler supports an option such as -R.
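Building the complete search path by hand is easy to get wrong when several directories are involved. The sketch below joins directories into a single colon-separated path suitable for one -R option. The function name join_search_path and the directory /home/me/lib are hypothetical, and the exact spelling of the option depends on your compiler.

```shell
# Hypothetical helper: join all arguments into one colon-separated path.
join_search_path() {
    joined=""
    for dir in "$@"; do
        # Add ':' only between entries, never at the start.
        joined="${joined:+$joined:}$dir"
    done
    printf '%s\n' "$joined"
}

# Possible use with a compiler that accepts -Rpath:path:...:path:
#   mpicc -o cpi cpi.c \
#       -R"$(join_search_path /usr/local/mpich/lib/shared /home/me/lib)"
```

Because the option replaces the search path rather than appending to it, the list should include every directory the executable needs, not only the mpich one.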

The preceding may sound like a lot of effort, and in some ways it is. For large clusters, however, the effort will be worth it: programs will start faster and more reliably, because there is less network and file system traffic.


