Cluster Reliability in an Unreliable Network Environment

CloverETL Server instances must cooperate with each other to form a cluster. If the connection between nodes does not work at all, or if it is not configured, the cluster cannot work properly. This chapter describes how cluster nodes behave in an environment where the connection between nodes is unreliable.

Nodes use three channels to exchange status information or data:
  1. synchronous calls (via HTTP/HTTPS)

    Typically, NodeA requests some operation on NodeB, e.g. job execution. HTTP/HTTPS is also used for streaming data between workers of a parallel execution.

  2. asynchronous messaging (TCP connection on port 7800 by default)

    Typically heartbeats or events, e.g. job started or finished.

  3. shared database – each node must be able to create a DB connection

    Shared configuration data, execution history, etc.
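
When diagnosing an unreliable network, it can help to verify from one node that each of the three channels is reachable at the TCP level. The following is a minimal sketch, not part of CloverETL itself: the function names, the HTTP port, and the database host and port are hypothetical parameters that must be replaced with the values of the actual deployment. Only the default messaging port 7800 comes from the description above. The check confirms only that a TCP connection can be opened; it does not prove that the server or the database accepts requests.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_cluster_channels(peer_host, http_port, db_host, db_port,
                           messaging_port=7800, timeout=3.0):
    """Probe the three channels a node needs, as seen from the local machine.

    peer_host / http_port : address of another cluster node (assumed values)
    db_host / db_port     : address of the shared database (assumed values)
    messaging_port        : asynchronous messaging port, 7800 by default
    """
    return {
        "synchronous calls (HTTP/HTTPS)": tcp_reachable(peer_host, http_port, timeout),
        "asynchronous messaging": tcp_reachable(peer_host, messaging_port, timeout),
        "shared database": tcp_reachable(db_host, db_port, timeout),
    }
```

For example, `check_cluster_channels("nodeB.example.com", 8080, "db.example.com", 5432)` returns a dictionary with one boolean per channel; the host names and ports in this call are placeholders only.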

The following scenarios are described below one by one; however, they may occur together: