Cluster-Fork runs a command on compute nodes of your cluster.
Often we want to execute parallel jobs consisting of standard UNIX commands. By "parallel" we mean the same command runs on multiple nodes of the cluster. We use these simple parallel jobs to move files, to run small tests, and to perform various administrative tasks.
Rocks provides a simple tool for this purpose called cluster-fork. For example, to list all your processes on the compute nodes of the cluster:
$ cluster-fork ps -U$USER
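The command can be anything installed on the compute nodes. As another small sketch (df here is just a stand-in for whatever administrative check you need), you could look at root filesystem usage across the whole cluster:

$ cluster-fork df -h /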
By default, cluster-fork uses a simple series of ssh connections to launch the task serially on every compute node in the cluster. Cluster-fork is smart enough to ignore dead nodes. The job is normally "blocking": cluster-fork waits for the command to finish on one node before moving on to the next. By using the --bg flag you can instruct cluster-fork to start the jobs in the background instead. This corresponds to the "-f" ssh flag.
$ cluster-fork --bg hostname
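Background mode is most useful for commands that take a while to finish. As a sketch (the file and destination directory below are only placeholders), you could start a copy on every node without waiting for each one in turn:

$ cluster-fork --bg cp /tmp/datafile /state/partition1/

Since the jobs run in the background, cluster-fork returns before they complete.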
Often you will want to name the particular nodes your job runs on. This can be done either with an SQL statement or with a special shorthand for node names.
The first method of naming nodes uses the SQL database on the frontend. We need an SQL statement that returns a column of node names. For example, to run a command on the compute nodes in the first rack of your cluster, execute:
$ cluster-fork --query="select name from nodes where name like 'compute-1-%'" [cmd]
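Any SELECT that returns a column of node names will work. As a sketch that sticks to the nodes.name column shown above (other columns in the Rocks database can differ between versions), you could target the first two racks at once:

$ cluster-fork --query="select name from nodes where name like 'compute-1-%' or name like 'compute-2-%'" [cmd]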
The second method requires us to name each node explicitly. When launching a job on many nodes of a large cluster this quickly becomes cumbersome, so we provide a special shorthand to help with this task. This shorthand, borrowed from the MPD job launcher, allows us to specify large ranges of nodes quickly and concisely.
The shorthand is based on similarly named nodes and uses the --nodes option. To specify the nodes compute-0-0, compute-0-1, and compute-0-2, we write --nodes=compute-0-%d:0-2. This scheme works best when the names share a common prefix and differ only in a numeric field; Rocks compute nodes are named with exactly such a convention.
Other shorthand examples:
Discontinuous ranges:
compute-0-%d:0,2-3 --> compute-0-0 compute-0-2 compute-0-3
Multiple elements:
compute-0-%d:0-1 compute-1-%d:0-1 --> compute-0-0 compute-0-1 compute-1-0 compute-1-1
Factoring out duplicates:
2*compute-0-%d:0-1 compute-0-%d:2-2 --> compute-0-0 compute-0-0 compute-0-1 compute-0-1 compute-0-2
$ cluster-fork --nodes="compute-2-%d:0-32 compute-3-%d:0-32" ps -U$USER
The previous example lists the current user's processes on the 66 nodes (numbered 0 through 32) in racks two and three.
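The other shorthand forms combine with --nodes in the same way. For instance, to run a command on the first few nodes of rack zero while skipping compute-0-1 (uptime here is just a stand-in for any command):

$ cluster-fork --nodes=compute-0-%d:0,2-3 uptime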