Auto-propagated Metadata

Introduction
Sources of Auto-Propagated Metadata
Explicitly Propagated Metadata
Priorities of Metadata
Jobflow

Introduction

In many cases, CloverETL can determine the metadata on edges automatically through metadata propagation, a process that spreads metadata through the graph according to a set of rules. The metadata to propagate is taken from sources such as edges with manually assigned metadata and components that can inject metadata into the graph.

Figure 32.5. Metadata propagation: metadata is propagated from the first edge on the left side to all connected edges.
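
The idea can be illustrated with a minimal sketch. It is a simplified model, not CloverETL's actual implementation: the function name propagate, the edge names e1 to e3 and the links mapping are all hypothetical. Each edge is linked to the neighbouring edges it can exchange metadata with, and metadata spreads breadth-first from the edges where it was assigned manually.

```python
from collections import deque


def propagate(assigned, links):
    """Spread metadata from manually assigned edges to connected edges.

    assigned: edge name -> metadata name (manually assigned metadata)
    links:    edge name -> neighbouring edges reachable through components
              that pass metadata along (hypothetical model)
    """
    resolved = dict(assigned)
    queue = deque(assigned)
    while queue:
        edge = queue.popleft()
        for neighbour in links.get(edge, []):
            # Never overwrite metadata that is already known on an edge.
            if neighbour not in resolved:
                resolved[neighbour] = resolved[edge]
                queue.append(neighbour)
    return resolved


# Three edges in a row; only the leftmost one has metadata assigned.
links = {"e1": ["e2"], "e2": ["e1", "e3"], "e3": ["e2"]}
result = propagate({"e1": "orders"}, links)
print(result)  # {'e1': 'orders', 'e2': 'orders', 'e3': 'orders'}
```

In this model the metadata assigned to the first edge reaches every connected edge, which mirrors the situation shown in the figure.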


You do not have to assign metadata manually to each edge in a graph, because metadata is propagated by default. You can override the metadata on an edge by manually selecting propagated metadata or by assigning user-defined metadata.

Figure 32.6. Changing auto-propagated metadata to user-defined.


Figure 32.7. Changing user-defined metadata to auto-propagated.


Principles of Propagation

Metadata propagation depends on the graph layout, on the metadata propagation priorities of particular components, and on manually assigned metadata. Metadata is propagated through directly or indirectly connected components and edges. To propagate metadata to an edge in a separate part of the graph, user action is needed.

Components may give different priorities to metadata propagated from each side, or may propagate metadata in one direction only (e.g. Reformat).

Figure 32.8. Different priorities of metadata propagation


At least some metadata must be known: either assigned by the user or propagated from a template on a component port.
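
These principles can be sketched in a small model. It is illustrative only and does not reflect CloverETL internals: the function propagate_directed, the edge names and the direction labels "both" and "forward" are hypothetical, and "forward" is assumed here to mean input to output only. The sketch shows that a one-way component blocks propagation in the opposite direction and that edges in a separate part of the graph stay unresolved until some metadata is known there.

```python
def propagate_directed(assigned, components):
    """Propagate metadata through components with a propagation direction.

    assigned:   edge name -> metadata name (known metadata)
    components: list of (input_edge, output_edge, direction), where
                direction is "both" or "forward" (hypothetical labels)
    """
    resolved = dict(assigned)
    changed = True
    while changed:
        changed = False
        for in_edge, out_edge, direction in components:
            # Input to output is allowed for every component in this model.
            if in_edge in resolved and out_edge not in resolved:
                resolved[out_edge] = resolved[in_edge]
                changed = True
            # Output to input is allowed only for two-way components.
            if direction == "both" and out_edge in resolved and in_edge not in resolved:
                resolved[in_edge] = resolved[out_edge]
                changed = True
    return resolved


components = [
    ("a", "b", "both"),     # a pass-through component between edges a and b
    ("b", "c", "forward"),  # a one-way component between edges b and c
    ("x", "y", "both"),     # a separate part of the graph
]
all_edges = {"a", "b", "c", "x", "y"}

from_left = propagate_directed({"a": "orders"}, components)
print(sorted(from_left))               # ['a', 'b', 'c']
print(sorted(all_edges - set(from_left)))  # ['x', 'y']

from_right = propagate_directed({"c": "orders"}, components)
print(sorted(from_right))              # ['c']
```

Assigning metadata on edge a resolves a, b and c, while x and y remain without metadata. Assigning it on edge c resolves nothing else, because the one-way component does not pass metadata backwards in this model.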

Sources of Auto-Propagated Metadata

Component
Edge

Component

Some components have metadata templates assigned to their ports. The metadata from templates propagates from the component to the connected edge.

For example, metadata for error records is auto-propagated on the second output port of SpreadsheetDataReader. ListFiles is another component with metadata assigned to a port. The Subgraph component can propagate metadata from itself as well.

Figure 32.9. Metadata propagated from the component


Figure 32.10. Metadata propagated from the component II.


Figure 32.11. Metadata propagated from the component, metadata template is defined within the component.


Edge

Some components (e.g. SimpleCopy) propagate metadata from input to output ports. Thus metadata can be auto-propagated to an edge from a different edge, even one that is several components away.

Figure 32.12. Metadata propagated from another edge


Figure 32.13. Metadata propagated from a distant edge


Figure 32.14. Advanced metadata propagation - DataIntersection


Metadata can be propagated from left to right or from right to left. Some components can propagate metadata between ports on the same side of the component by way of a port on the other side. Components that do not change the metadata structure (e.g. Filter, SimpleCopy) usually propagate metadata from both sides.

Component-specific details of metadata propagation can be found in the reference for each component.

Figure 32.15. Overview of directions of metadata propagation


Explicitly Propagated Metadata

An edge can be explicitly assigned the metadata of another edge of the graph. The two edges do not have to be connected through any other components or edges. The user has to define the edge from which metadata is propagated.

Figure 32.16. Metadata propagated from an unconnected distant edge


Let us explain the figure above: the metadata orders assigned to the edge between FlatFileReader and Filter are propagated through Filter.

We need the same metadata to read records using CloverDataReader as CloverDataWriter uses, therefore we define that the edge between CloverDataReader and ExtSort takes metadata (see the green symbol) from the edge (the blue symbol) between Filter and CloverDataWriter.
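
The example from the figure can be modelled in a short sketch. This is a simplified illustration, not the way CloverETL stores or resolves metadata: the function resolve_references and the edge labels are hypothetical. An explicit reference makes the target edge take whatever metadata its source edge has, even though the two edges are not connected.

```python
def resolve_references(resolved, references):
    """Apply explicit edge-to-edge metadata references.

    resolved:   edge name -> metadata name (already known metadata)
    references: target edge -> source edge to take metadata from
    """
    result = dict(resolved)
    pending = dict(references)
    progress = True
    # Repeat so that a reference to another referencing edge also resolves.
    while pending and progress:
        progress = False
        for target, source in list(pending.items()):
            if source in result:
                result[target] = result[source]
                del pending[target]
                progress = True
    return result


# Metadata "orders" is assigned before Filter and propagated through it.
known = {
    "FlatFileReader->Filter": "orders",
    "Filter->CloverDataWriter": "orders",
}
# The edge after CloverDataReader takes metadata from the edge before CloverDataWriter.
references = {"CloverDataReader->ExtSort": "Filter->CloverDataWriter"}

result = resolve_references(known, references)
print(result["CloverDataReader->ExtSort"])  # orders
```

A reference whose source edge has no metadata stays unresolved in this model, so the target edge simply does not appear in the result.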

Assigning Explicitly Propagated Metadata

Right-click the edge you need to assign metadata to and choose Select Metadata > Select Metadata from Edge.

A message informs you that the selection tool has been activated.

The cursor changes and the graph editor pane is dimmed.

Click the edge you need to take metadata from.

The metadata has been propagated. The blue symbol denotes the source edge of metadata, the green one denotes the target edge.

Priorities of Metadata

Auto-propagated metadata has lower priority than explicitly defined metadata. You are free to override the metadata assigned to an edge with different metadata. Auto-propagated metadata can be overridden in the same way as new metadata is assigned to an edge: either by drag and drop from the outline, or by right-clicking the edge and choosing Select Metadata or New metadata.
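
The priority rule can be stated as a minimal sketch. The function effective_metadata and the metadata names are hypothetical and only model the rule described above: explicitly defined metadata wins, and auto-propagated metadata is used only when nothing was defined on the edge.

```python
def effective_metadata(user_defined, auto_propagated):
    """Return the metadata an edge ends up with in this simplified model.

    user_defined:    metadata explicitly assigned to the edge, or None
    auto_propagated: metadata that reached the edge by propagation, or None
    """
    if user_defined is not None:
        # Explicitly defined metadata has the higher priority.
        return user_defined
    return auto_propagated


print(effective_metadata(None, "orders"))         # orders
print(effective_metadata("orders_v2", "orders"))  # orders_v2
print(effective_metadata(None, None))             # None
```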

Jobflow

Auto-propagated metadata also works with jobflow components.

Figure 32.17. Metadata propagated from another edge


COMPATIBILITY NOTICE

Auto-propagated metadata is available since CloverETL 4.0.0.

In CloverETL Designer 3.5.x and earlier, assigned metadata needs to be propagated manually through a component to other edges.

To propagate metadata, open the context menu by right-clicking the edge and select the Propagate metadata item. The metadata is propagated until it reaches a component in which metadata can be changed (for example Reformat or Joiners).
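
The stopping rule of this manual propagation can be sketched as follows. The sketch is a simplified model under stated assumptions and not the Designer's actual logic: the function propagate_manually, the edge names and the CAN_CHANGE_METADATA set are hypothetical, and the set lists only Reformat although joiners would belong to it as well.

```python
# Components in which metadata can be changed (hypothetical, incomplete set).
CAN_CHANGE_METADATA = {"Reformat"}


def propagate_manually(metadata, downstream):
    """Push metadata along a chain until a metadata-changing component.

    metadata:   metadata name on the edge where propagation starts
    downstream: ordered list of (component, outgoing_edge) pairs that
                follow the starting edge
    """
    assigned = {}
    for component, out_edge in downstream:
        if component in CAN_CHANGE_METADATA:
            # The component may change the record structure, so stop here.
            break
        assigned[out_edge] = metadata
    return assigned


downstream = [
    ("Filter", "e2"),
    ("SimpleCopy", "e3"),
    ("Reformat", "e4"),
    ("Filter", "e5"),
]
print(propagate_manually("orders", downstream))  # {'e2': 'orders', 'e3': 'orders'}
```

In this model the edges after Reformat receive no metadata, so they would need metadata assigned separately.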