This section describes some of the most common problems encountered
when building and using mpich. See also Section Frequently Asked
Questions, which covers some additional problems.
-
Connection Refused.
-
This problem may be caused by Internet security settings on your
system that restrict the number and frequency of interprocess
connection operations. Check with your systems administrator.
Linux users (depending on the Linux distribution) should try running
the following commands:
iptables --list
ipchains --list
Look for any limits, restrictions on source or destination ports, or
limits on SYN packets (the TCP packets used to establish
connections). If you find such limits, study your security
documentation and decide how you want to modify the security
settings. We normally recommend that a cluster be placed behind a
firewall rather than having each cluster node limit the use of TCP.
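The output of these commands can be long, so it may help to filter it
down to the lines worth reading first. The following is a minimal
sketch, not part of mpich: the function name flag_limit_rules and the
search pattern are assumptions chosen for illustration, and the
pattern may miss rules or match unrelated lines (for example a host
name containing "syn").

```shell
#!/bin/sh
# Sketch: keep only rule lines that mention a rate limit, SYN matching,
# or a source/destination port match. Rule text is read from stdin, so
# the same filter can be fed by iptables or ipchains.
flag_limit_rules() {
    # -i: ignore case; -E: extended regular expressions.
    # [ds]pts?: matches the dpt:/spt:/dpts:/spts: port fields.
    grep -i -E 'limit|syn|[ds]pts?:'
}

# Usage (listing rules needs root on most systems):
#   iptables --list | flag_limit_rules
#   ipchains --list | flag_limit_rules
```

An empty result only means that nothing matched the pattern; it does
not prove that no restriction is in place.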
Also check the file /etc/inetd.conf to ensure that it allows
more processes per minute for rsh. See the FAQ entry
(Appendix Frequently Asked Questions) on ``poll: protocol failure
during circuit creation''.
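To see how rsh is currently configured, you can print its entry from
the inetd configuration file. This is a sketch under assumptions: the
function name show_rsh_entry is hypothetical, and it relies on the
rsh service being listed under its usual service name "shell". Some
inetd implementations accept a rate suffix on the wait field (for
example nowait.200); whether yours does is something to verify in its
manual page.

```shell
#!/bin/sh
# Sketch: print the active (uncommented) rsh entry from an inetd
# configuration file. The file defaults to /etc/inetd.conf.
show_rsh_entry() {
    conf="${1:-/etc/inetd.conf}"
    # rsh is conventionally registered as the "shell" service;
    # commented-out lines begin with '#' and are not matched.
    grep -E '^[[:space:]]*shell[[:space:]]' "$conf"
}

# Usage:
#   show_rsh_entry                  # checks /etc/inetd.conf
#   show_rsh_entry /path/to/file    # checks another file
```

If nothing is printed, rsh may be disabled or managed by a different
service manager on your system.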
-
Missing symbols when linking.
-
The most common source of missing symbols is a failure of the
mpich configure step to determine how to pass command line arguments
to Fortran. Check the output of the configure step for any error
messages or warnings about building the Fortran libraries. If you do
not require Fortran, reconfigure mpich using the configure option
--disable-f77 and remake mpich. If you need Fortran and
cannot figure out how to make mpich work with Fortran, send a bug
report to [email protected].
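One way to check the configure output is to save it to a file and
search it. The sketch below assumes the output was saved as
configure.log (a file name chosen here, not one mpich requires), and
the function name fortran_problems is hypothetical. The exact wording
of configure messages varies, so treat the pattern as a starting
point.

```shell
#!/bin/sh
# Sketch: list lines of a saved configure log that mention Fortran
# together with a warning or an error, prefixed by line number.
fortran_problems() {
    log="${1:-configure.log}"
    grep -n -i -E 'fortran|f77' "$log" | grep -i -E 'warning|error'
}

# Usage, from the top of the mpich source tree:
#   ./configure 2>&1 | tee configure.log
#   fortran_problems configure.log
#
# If Fortran is not required, reconfigure and rebuild without it:
#   ./configure --disable-f77
#   make
```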
Another common problem with programs that mix Fortran and C is missing
libraries. The mpich configure script attempts to determine the
libraries that are necessary when linking C with Fortran, but may miss
some. There are additional suggestions for this problem in
Section Problems compiling or linking Fortran programs.
-
SIGSEGV.
-
Any message that mentions SIGSEGV refers to a
``segmentation violation'' during program execution. This is usually
due to an error in the user's program, such as an array overwrite or
use of an uninitialized variable in referencing storage.