Chapter 30. Components

Components (nodes) are the most important graph elements. They all serve to process data. Most of them have ports through which they can receive data and/or send the processed data out. Most components work only when edges are connected to these ports. Each edge in a graph connected to a port must have metadata assigned to it. Metadata describes the structure of data flowing through the edge from one component to another.
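The relationship between components, ports, edges and metadata can be pictured with a small model. The following is only an illustrative Python sketch, not CloverETL's own API: the class names `Metadata` and `Edge`, the helper `edges_missing_metadata` and the component IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metadata:
    """Hypothetical description of the record structure on an edge."""
    name: str
    fields: dict  # field name -> field type, e.g. {"id": "integer"}


@dataclass
class Edge:
    """Hypothetical edge from an output port to an input port."""
    source: str        # ID of the sending component
    source_port: int   # output port number on the sender
    target: str        # ID of the receiving component
    target_port: int   # input port number on the receiver
    metadata: Optional[Metadata] = None


def edges_missing_metadata(edges):
    """Return the edges that have no metadata assigned yet."""
    return [edge for edge in edges if edge.metadata is None]


customer = Metadata("customer", {"id": "integer", "name": "string"})
edges = [
    Edge("READER", 0, "SORT", 0, customer),
    Edge("SORT", 0, "WRITER", 0),  # metadata not assigned yet
]
unassigned = edges_missing_metadata(edges)
```

In this sketch, `unassigned` holds the single edge leading to `WRITER`, which mirrors the rule that every edge connected to a port needs metadata before data can flow through it.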

You can configure the properties of any graph component in the following way:

Groups of Components

All components can be divided into several groups:

Readers

Readers are usually the initial nodes of a graph. Readers read data from input files (either local or remote), receive it from a connected input port, read it from a dictionary or generate data.

Writers

Writers are the terminal nodes of a graph. Writers receive data through their input port(s) and write it to files (either local or remote), send it out through a connected output port, send e-mails, write data to a dictionary or discard the received data.

Transformers

Transformers are intermediate nodes of a graph. They receive data and can copy it to all output ports; deduplicate, filter or sort it; concatenate, gather or merge data received through many input ports and send it out through a single output port; distribute records among many connected output ports; intersect data received through two input ports; aggregate data to get new information; or transform data in a more complicated way.
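To make a few of these operations concrete, here is a minimal Python sketch of filtering, deduplicating and sorting records. It assumes records are plain dictionaries; the function names are hypothetical and do not correspond to actual CloverETL components.

```python
def filter_records(records, predicate):
    """Keep only the records for which predicate(record) is true."""
    return [record for record in records if predicate(record)]


def dedup_records(records, key):
    """Keep the first record seen for each value of the key field."""
    seen = set()
    unique = []
    for record in records:
        value = record[key]
        if value not in seen:
            seen.add(value)
            unique.append(record)
    return unique


def sort_records(records, key):
    """Return the records ordered by the key field (ascending)."""
    return sorted(records, key=lambda record: record[key])


records = [
    {"id": 3, "city": "Prague"},
    {"id": 1, "city": "Brno"},
    {"id": 3, "city": "Prague"},
    {"id": 2, "city": ""},
]
with_city = filter_records(records, lambda r: r["city"] != "")
unique = dedup_records(with_city, "id")
ordered = sort_records(unique, "id")
```

Chaining the three functions resembles placing three transformers one after another in a graph, each receiving the output of the previous one.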

Joiners

Joiners are also intermediate nodes of a graph. Joiners receive data from two or more sources, join them according to a specified key, and send the joined data out through the output ports.
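Joining two sources on a key can be sketched as follows. This is an illustrative inner join over lists of dictionaries, written under the assumption that both inputs share a key field. The function name `inner_join` and the sample data are hypothetical, and real joiners may offer other join types and strategies.

```python
from collections import defaultdict


def inner_join(left, right, key):
    """Join two record lists on a shared key field (inner join)."""
    # Index the right-hand records by key value for fast lookup.
    index = defaultdict(list)
    for record in right:
        index[record[key]].append(record)

    joined = []
    for record in left:
        # Records without a matching key on the right are dropped.
        for match in index.get(record[key], []):
            # Fields from the left record win on name clashes.
            joined.append({**match, **record})
    return joined


orders = [
    {"customer_id": 1, "item": "book"},
    {"customer_id": 2, "item": "pen"},
    {"customer_id": 9, "item": "lamp"},
]
customers = [
    {"customer_id": 1, "name": "Alice"},
    {"customer_id": 2, "name": "Bob"},
]
joined = inner_join(orders, customers, "customer_id")
```

Here the order with `customer_id` 9 has no matching customer, so it does not appear in `joined`.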

Job Control

Job Control is a group of components focused on execution and monitoring of various job types. These components allow running ETL graphs, jobflows and any interpreted scripts. Graphs and jobflows can be monitored and optionally aborted.

Tip

If you cannot see this component category, navigate to Window → Preferences → CloverETL → Components in Palette and tick both checkboxes next to Job Control.

File Operations

File Operations are components suitable for handling files on the file system - either local or remote (via FTP). They can also access files in Clover Server sandboxes.

Tip

If you cannot see this component category, navigate to Window → Preferences → CloverETL → Components in Palette and tick both checkboxes next to File Operations.

Cluster Components

Cluster Components (data partitioning components) serve to distribute data records among the nodes of a Cluster of CloverETL Server instances, or to gather these records together.

Graphs with Cluster Components run in parallel in a Cluster.
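The idea of distributing records among nodes and gathering them again can be sketched in a few lines of Python. This is a conceptual illustration only: it uses in-memory lists in place of Cluster nodes, assumes a simple round-robin distribution, and the function names are hypothetical.

```python
def partition_round_robin(records, node_count):
    """Distribute records among node_count partitions in turn."""
    partitions = [[] for _ in range(node_count)]
    for position, record in enumerate(records):
        partitions[position % node_count].append(record)
    return partitions


def gather(partitions):
    """Collect the records of all partitions into a single list."""
    return [record for partition in partitions for record in partition]


records = [1, 2, 3, 4, 5]
partitions = partition_round_robin(records, 2)
gathered = gather(partitions)
```

Note that in this sketch gathering returns the records partition by partition, so the original order is not preserved unless the records are sorted again afterwards.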

Data Quality

Data Quality is a group of components that perform various tasks related to the quality of your data: determining information about the data, finding and fixing problems, etc.

Other

Other is a heterogeneous group of components. Some of them execute system, Java or DB commands, run CloverETL graphs or send HTTP requests to a server. Others read from or write to lookup tables, check the key of some data and replace it with another one, check the sort order of a sequence or slow down the processing of data flowing through the component.

Subgraphs

A subgraph is a special type of graph that can be used as a component in another graph. Subgraphs belong to the Job Control components.

Deprecated

Deprecated components should no longer be used, and they are not described here.

Component Properties

Some properties are common to all or most components.

Other properties are common to each of the groups:

For information about individual components see Part VIII, Component Reference.