The Most Common Problems

This section describes some of the most common problems encountered when building and using mpich. See also the Frequently Asked Questions section, which covers some additional problems.

Connection Refused.
This problem may be caused by Internet security settings on your system that restrict the number or frequency of interprocess connection operations. Check with your system administrator. Linux users (depending on the Linux distribution) should try running the following commands:
   iptables --list 
   ipchains --list 
Look for rate limits, restrictions on source or destination ports, or limits on SYN packets (the TCP packets used in establishing connections). If you find such limits, study your security documentation and decide how you want to modify the security settings. We normally recommend that a cluster be placed behind a firewall rather than having each cluster node limit the use of TCP.
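
Firewall listings can be long, so it may help to filter them for the kinds of rules mentioned above. The following is a minimal sketch, not a complete audit: the function name find_suspect_rules is hypothetical, and the search patterns are assumptions about how rate limits, SYN matches, and port restrictions are usually spelled in a rule listing.

```shell
# Filter a firewall rule listing (read from standard input) for lines that
# mention rate limits, SYN matching, or source/destination port restrictions.
# The patterns are guesses at common spellings; adjust them for your system.
find_suspect_rules() {
    grep -i -E 'limit|syn|[ds]pts?:'
}
```

A typical use, which normally requires root privileges, would be `iptables --list | find_suspect_rules`. No output means none of the patterns matched, not that the firewall is necessarily unrestricted.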

Also check the file /etc/inetd.conf to ensure that it allows more processes per minute for rsh. See the FAQ entry (Appendix Frequently Asked Questions ) on ``poll: protocol failure during circuit creation''.
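
To see how the rsh service is currently configured, you can print the relevant field of its inetd entry. This sketch assumes the classic inetd.conf layout, in which rsh appears under the service name "shell" and the fourth field is wait or nowait, optionally followed by a suffix such as ".200" that some inetd implementations interpret as the number of server instances allowed per minute. The function name rsh_wait_field is hypothetical; consult the inetd.conf manual page on your system to confirm the format.

```shell
# Print the wait/nowait field (4th column) of each active "shell" (rsh)
# entry in an inetd.conf-style file. Commented-out entries are skipped
# because their first field is not exactly "shell".
rsh_wait_field() {
    awk '$1 == "shell" { print $4 }' "${1:-/etc/inetd.conf}"
}
```

Running `rsh_wait_field` with no argument reads /etc/inetd.conf. If the result is a bare "nowait", the entry is using whatever default limit your inetd applies.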

Missing symbols when linking.
The most common source of missing symbols is a failure of the mpich configure step to determine how to pass command line arguments to Fortran. Check the output of the configure step for any error messages or warnings about building the Fortran libraries. If you do not require Fortran, reconfigure mpich using the configure option --disable-f77 and remake mpich. If you need Fortran and cannot figure out how to make mpich work with Fortran, send a bug report to [email protected].
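
If the configure output has scrolled away, one approach is to save it to a file and search that file for Fortran-related complaints. The sketch below is only illustrative: the function name fortran_configure_issues and the file name configure.log are hypothetical, and the keywords are guesses, since the exact wording of configure's messages varies between versions.

```shell
# Print lines from a saved configure log that mention Fortran together with
# a word suggesting a problem. Both keyword lists are assumptions.
fortran_configure_issues() {
    grep -i -E 'fortran|f77' "$1" | grep -i -E 'warning|error|cannot|fail'
}
```

For example, run `./configure 2>&1 | tee configure.log` and then `fortran_configure_issues configure.log`. If Fortran is not needed, rerunning `./configure --disable-f77` followed by `make` corresponds to the reconfigure-and-remake step described above.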

Another common problem with programs that mix Fortran and C is missing libraries. The mpich configure script attempts to determine the libraries that are necessary when linking C with Fortran, but may miss some. There are additional suggestions for this problem in Section Problems compiling or linking Fortran programs.

Any message that mentions SIGSEGV is referring to a ``segmentation violation'' during program execution. This is usually due to an error in the user's program, such as an array overwrite or use of an uninitialized variable in referencing storage.
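
When a program is run directly from the shell and prints no message at all, its exit status can still reveal a segmentation violation. The following sketch relies on the POSIX shell convention that a command killed by a signal yields an exit status greater than 128, and on `kill -l` to translate that status back into a signal name. The function name died_of_segv is hypothetical, and the check may not work for programs started through a launcher that does not pass the exit status along.

```shell
# Run a command and succeed only if it was terminated by SIGSEGV.
# An exit status above 128 indicates death by signal; `kill -l STATUS`
# maps such a status back to the signal name without hard-coding a number.
died_of_segv() {
    "$@"
    status=$?
    [ "$status" -gt 128 ] && [ "$(kill -l "$status" 2>/dev/null)" = "SEGV" ]
}
```

For example, `died_of_segv ./a.out && echo "segmentation violation"` runs a program named a.out (a placeholder) and reports only that one kind of failure.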
