Components (nodes) are the most important graph elements. They all serve to process data. Most of them have ports through which they can receive data and/or send the processed data out. Most components work only when edges are connected to these ports. Each edge in a graph connected to a port must have metadata assigned to it. Metadata describes the structure of data flowing through the edge from one component to another.
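The relationship between components, ports, edges and metadata can be sketched as a small data model. This is only an illustration: the class and function names (`Metadata`, `Component`, `Edge`, `edges_missing_metadata`) are hypothetical and are not part of the CloverETL API. It shows the rule stated above, that every edge connected to a port needs metadata assigned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metadata:
    """Describes the structure of the records flowing through an edge."""
    name: str
    fields: dict[str, str]  # field name -> type name (illustrative only)


@dataclass
class Component:
    """A graph node; ports are identified by integer index in this sketch."""
    component_id: str


@dataclass
class Edge:
    """Connects an output port of one component to an input port of another."""
    source: Component
    source_port: int
    target: Component
    target_port: int
    metadata: Optional[Metadata] = None  # must be assigned before the graph is valid


def edges_missing_metadata(edges: list[Edge]) -> list[Edge]:
    """Return the edges that have no metadata assigned."""
    return [e for e in edges if e.metadata is None]


# Hypothetical two-component graph: one edge has metadata, one does not.
reader = Component("READER")
writer = Component("WRITER")
customer = Metadata("customer", {"id": "integer", "name": "string"})
edges = [
    Edge(reader, 0, writer, 0, customer),
    Edge(reader, 1, writer, 1),  # metadata not assigned yet
]
problems = edges_missing_metadata(edges)
```

In this sketch `problems` contains only the second edge, which is the one that would still need metadata.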
You can configure the properties of any graph component in any of the following ways:
You can double-click the component in the Graph Editor.
You can mark (click) the component and/or its item in the Outline pane and edit the items in the Properties tab.
You can select the component item in the Outline pane and press Enter.
You can also open the context menu by right-clicking the component in the Graph Editor and/or in the Outline pane. Then you can select the item from the context menu and edit the items in the Edit component wizard.
All components can be divided into several groups:
Readers are usually the initial nodes of a graph. Readers read data from input files (either local or remote), receive it from a connected input port, read it from a dictionary or generate data.
Writers are the terminal nodes of a graph. Writers receive data through their input port(s) and write it to files (either local or remote), send it out through a connected output port, send e-mails, write data to a dictionary or discard the received data.
Transformers are intermediate nodes of a graph. They receive data and can copy it to all output ports; deduplicate, filter or sort it; concatenate, gather or merge data received through many ports and send it out through a single output port; distribute records among many connected output ports; intersect data received through two input ports; aggregate data to get new information; or transform data in a more complicated way.
Joiners are also intermediate nodes of a graph. Joiners receive data from two or more sources, join them according to a specified key, and send the joined data out through the output ports.
Job Control is a group of components focused on execution and monitoring of various job types. These components allow running ETL graphs, jobflows and any interpreted scripts. Graphs and jobflows can be monitored and optionally aborted.
Tip: If you cannot see this component category, navigate to → → → and tick both checkboxes next to Job Control.
File Operations are components suitable for handling files on the file system - either local or remote (via FTP). They can also access files in Clover Server sandboxes.
Tip: If you cannot see this component category, navigate to → → → and tick both checkboxes next to File Operations.
The Data Partitioning components serve to distribute data records among various nodes of a Cluster of CloverETL Server instances or to gather these records together.
Graphs with Cluster Components run in parallel in a Cluster.
Data Quality is a group of components performing various tasks related to the quality of your data, such as determining information about the data and finding and fixing problems.
The Others group is a heterogeneous group of components. They can perform different tasks - execute system, Java or DB commands; run CloverETL graphs or send HTTP requests to a server. Other components of this group can read from or write to lookup tables, check the key of some data and replace it with another one, check the sort order of a sequence or slow down processing of data flowing through the component.
Subgraph is a special type of graph that can be used as a component in another graph. Subgraph belongs to Job Control components.
Deprecated components should no longer be used, and they are not described here.
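To make the roles of the main groups more concrete, the following sketch models a reader, a transformer, a joiner, a partitioning step and a writer as plain Python functions. It is a conceptual illustration only: all function names and the sample records are hypothetical, and real CloverETL components are configured in the Graph Editor rather than written this way.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator

Record = dict  # one data record: field name -> value


def reader(rows: Iterable[Record]) -> Iterator[Record]:
    """Reader role: an initial node that produces records."""
    yield from rows


def filter_and_sort(records: Iterable[Record],
                    keep: Callable[[Record], bool],
                    sort_key: str) -> list[Record]:
    """Transformer role: filter records, then sort them by one field."""
    return sorted((r for r in records if keep(r)), key=lambda r: r[sort_key])


def join_by_key(left: Iterable[Record], right: Iterable[Record],
                key: str) -> Iterator[Record]:
    """Joiner role: inner join of two inputs on a specified key."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    for l in left:
        for r in index.get(l[key], []):
            yield {**l, **r}


def partition_round_robin(records: Iterable[Record], parts: int) -> list[list[Record]]:
    """Partitioning role: distribute records among `parts` buckets."""
    buckets = [[] for _ in range(parts)]
    for i, r in enumerate(records):
        buckets[i % parts].append(r)
    return buckets


def gather(buckets: Iterable[list[Record]]) -> list[Record]:
    """Gathering role: collect partitioned records back together."""
    return [r for bucket in buckets for r in bucket]


def writer(records: Iterable[Record]) -> list[Record]:
    """Writer role: a terminal node; here it just collects records in a list."""
    return list(records)


# Hypothetical sample data.
customers = [{"id": 2, "name": "Bo"}, {"id": 1, "name": "Al"}, {"id": 3, "name": "Cy"}]
orders = [{"id": 1, "total": 10}, {"id": 3, "total": 5}, {"id": 3, "total": 7}]

kept = filter_and_sort(reader(customers), keep=lambda r: r["id"] != 2, sort_key="id")
joined = list(join_by_key(kept, reader(orders), key="id"))
parts = partition_round_robin(joined, 2)
result = writer(gather(parts))
```

The round-robin strategy is just one simple way to distribute records; it was chosen here because it is deterministic and easy to follow, not because it reflects how the Cluster components partition data.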
Some properties are common to all or most components.
Other properties are common to each of the groups:
For information about individual components see Part IX, Component Reference.