3. Slon daemons

The programs that actually perform Slony-I replication are the slon daemons.

You need to run one slon instance for each node in a Slony-I cluster, whether you consider that node a "master" or a "slave". On Windows™ when running as a service things are slightly different. One slon service is installed, and a seperate configuration file registered for each node to be serviced by that machine. The main service then manages the individual slons itself. Since a MOVE SET or FAILOVER can switch the roles of nodes, slon needs to be able to function for both providers and subscribers. It is not essential that these daemons run on any particular host, but there are some principles worth considering:

Warning

Do not run a slon that is responsible to service a particular node across a WAN link if at all possible. Any problems with that connection can kill the connection whilst leaving "zombied" database connections on the node that (typically) will not die off for around two hours. This prevents starting up another slon, as described in the FAQ under multiple slon connections.

Historically, slon processes have been fairly fragile, dying if they encounter just about any significant error. This behaviour mandated running some form of "watchdog" which would watch to make sure that if one slon fell over, it would be replaced by another.

There are two "watchdog" scripts currently available in the Slony-I source tree:

The slon_watchdog2 script is probably usually the preferable thing to run. It was at one point not preferable to run it whilst subscribing a very large replication set where it is expected to take many hours to do the initial COPY SET. The problem that came up in that case was that it figured that since it hasn't done a SYNC in 2 hours, something was broken requiring restarting slon, thereby restarting the COPY SET event. More recently, the script has been changed to detect COPY SET in progress.

In Slony-I version 1.2, the structure of the slon has been revised fairly substantially to make it much less fragile. The main process should only die off if you expressly signal it asking it to be killed.

A new approach is available in the Section 19.3 script which uses slon configuration files and which may be invoked as part of your system startup process.