For maximum performance and scalability, the Ganglia gmond daemons on compute nodes in the cluster are run in "deaf" mode. While compute nodes report their own Ganglia data to the frontend, they do not listen for information from their peers. This reduces the resource footprint of compute nodes.
Running the compute node monitors in deaf mode means they cannot be queried for cluster state. This may be a problem if your parallel jobs use Ganglia data for performance analysis or fault tolerance purposes. If you would like to re-enable Ganglia's full functionality on your compute nodes, follow the instructions below.
Ganglia daemons were switched to deaf mode by default starting in the Matterhorn Rocks release 3.1.0.
Add a new XML node file called replace-ganglia-client.xml (see section "3.2. Customizing Configuration of Compute Nodes" in the Base Roll Documentation for details on how to create a replacement XML node file).
Put the following contents in the new file:
<?xml version="1.0" standalone="no"?>

<kickstart>

<description>
UCB's Ganglia Monitor system for client nodes in the cluster.
</description>

<post>
/sbin/chkconfig --add gmetad
</post>

</kickstart>
Reinstall your compute nodes. They will now have access to the full monitoring tree. This procedure places the compute nodes on the same monitoring level as the frontend.
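After the reinstall, one way to check that a compute node can be queried for cluster state is to read the XML dump that gmond writes to any client connecting to its TCP port. The following is a minimal sketch, not part of the Rocks tooling. It assumes gmond listens on its usual default port, 8649, and the helper names fetch_gmond_xml and hosts_in_dump are hypothetical, introduced here only for illustration.

```python
import socket
import xml.etree.ElementTree as ET

# Assumption: gmond uses its usual default TCP port. Check the gmond
# configuration on your nodes if the connection is refused.
GMOND_PORT = 8649


def fetch_gmond_xml(host, port=GMOND_PORT, timeout=5.0):
    """Connect to gmond on `host` and return the raw XML dump as bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # the peer closed the connection: dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_dump(xml_bytes):
    """Return the sorted NAME attributes of every HOST element in a dump."""
    root = ET.fromstring(xml_bytes)
    return sorted(host.get("NAME") for host in root.iter("HOST"))
```

For example, hosts_in_dump(fetch_gmond_xml("compute-0-0")) returns the hosts that node knows about, where compute-0-0 is a placeholder for one of your own compute node names. A node with full functionality restored should report its peers, not only itself.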