LookupTables in CloverETL Cluster Environment

To understand how lookup tables work in a cluster environment, it is necessary to understand how clustered graphs are processed: how a clustered graph is split into several separate graphs and distributed among the cluster nodes. These details are described in the CloverETL Server documentation, in the chapter "Parallel Data Processing". In short, a clustered graph is executed in several instances according to a transformation plan; let's call these instances worker graphs. The transformation plan is the result of a transformation analysis, which takes into consideration component allocation, usage of partitioned sandboxes and occurrences of clustered components. The plan determines how many instances of the graph will be executed and on which cluster nodes. Moreover, it determines how each worker graph is modified for the clustered run, i.e. which components actually run in a particular worker and which are removed.
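
The following plain Java sketch is only an illustration of this idea, not CloverETL code: the class, node and component names are hypothetical, and the point is merely that a transformation plan maps graph instances to cluster nodes and records which components each worker keeps or drops.

    import java.util.List;

    /**
     * Illustrative model only -- not a CloverETL API. All names are hypothetical;
     * the sketch just visualizes what a transformation plan decides for a clustered run.
     */
    public class TransformationPlanSketch {

        /** One worker graph: a graph instance planned for a particular cluster node. */
        record WorkerGraph(String clusterNode, List<String> componentsKept, List<String> componentsRemoved) { }

        public static void main(String[] args) {
            // The plan says: run two instances (workers), each on its own node,
            // and tells each worker which components it actually executes.
            List<WorkerGraph> plan = List.of(
                    new WorkerGraph("node1",
                            List.of("DataReader0", "Reformat0"),
                            List.of("DataWriter0")),
                    new WorkerGraph("node2",
                            List.of("Reformat0", "DataWriter0"),
                            List.of("DataReader0")));

            for (WorkerGraph worker : plan) {
                System.out.println("Worker on " + worker.clusterNode()
                        + " runs " + worker.componentsKept()
                        + ", removed " + worker.componentsRemoved());
            }
        }
    }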

The CloverETL Server cluster environment does not provide any special support for lookup tables. Each clustered graph instance creates its own set of lookup tables, and these lookup table instances do not cooperate with each other. For example, when a SimpleLookupTable is used, each instance of the clustered graph has its own SimpleLookupTable instance, which loads data from the specified data file separately. The data file is therefore read by every worker graph, and each instance keeps its own separate set of cached records. DBLookupTable works seamlessly in a cluster environment; of course, the internal cache for database responses is managed by each worker graph separately.
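
As a rough illustration of this behavior (not CloverETL code; the file name, the key;value layout and all identifiers are assumptions), the following Java sketch models two worker graphs that each read the same data file and end up with two completely independent in-memory caches:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.HashMap;
    import java.util.Map;

    /**
     * Conceptual sketch only -- not CloverETL code. It models why a SimpleLookupTable
     * behaves as described above: every worker graph builds its own in-memory copy
     * of the lookup data from the same file.
     */
    public class PerWorkerLookupSketch {

        /** Hypothetical stand-in for one worker's private SimpleLookupTable instance. */
        static Map<String, String> loadLookup(Path dataFile) throws IOException {
            Map<String, String> cache = new HashMap<>();
            for (String line : Files.readAllLines(dataFile)) {    // the file is read again for every worker
                String[] fields = line.split(";", 2);             // assumed "key;value" layout
                if (fields.length == 2) {
                    cache.put(fields[0], fields[1]);
                }
            }
            return cache;
        }

        public static void main(String[] args) throws IOException {
            Path dataFile = Path.of("lookup-data.csv");           // hypothetical file

            // Two worker graphs -> two independent loads, two independent caches.
            Map<String, String> worker1Lookup = loadLookup(dataFile);
            Map<String, String> worker2Lookup = loadLookup(dataFile);

            System.out.println("Caches are separate objects: " + (worker1Lookup != worker2Lookup));
        }
    }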

Be careful when writing data records into a lookup table using the LookupTableReaderWriter component. It is essential to consider which worker performs the writing, since the lookup table update is applied only locally. Therefore, make sure the LookupTableReaderWriter component runs on all workers where the updated lookup data will be needed.
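
The following Java sketch (again purely illustrative, not CloverETL code) models this pitfall: a record written into one worker's lookup instance is not visible to any other worker.

    import java.util.HashMap;
    import java.util.Map;

    /**
     * Conceptual sketch only -- not CloverETL code. It models the pitfall described
     * above: a lookup-table write performed by one worker graph stays local to that
     * worker, so other workers never see the new record.
     */
    public class LocalLookupWriteSketch {

        public static void main(String[] args) {
            // Each worker graph owns an independent lookup instance (here: plain maps).
            Map<String, String> worker1Lookup = new HashMap<>();
            Map<String, String> worker2Lookup = new HashMap<>();

            // A LookupTableReaderWriter-like write happens only on worker 1.
            worker1Lookup.put("42", "new record written on worker 1");

            // Worker 2 performs a lookup on the same key and misses it.
            System.out.println("worker 1 sees the record: " + worker1Lookup.containsKey("42")); // true
            System.out.println("worker 2 sees the record: " + worker2Lookup.containsKey("42")); // false
        }
    }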