This lookup table is commercial and can only be used with the commercial license of CloverETL Designer.
This type of lookup table is designed for a large number of data records.
The data records are stored in files; only a few records are cached in main memory.
These files are in jdbm format (http://jdbm.sourceforge.net).
When you specify a file name, two files are created: one with the db extension and one with the lg extension.
A persistent lookup table can work in two modes: with key duplicates and without key duplicates. If you switch between the modes, you should delete and refill the lookup table.
With the Allow key duplicates property unchecked, the persistent lookup table does not allow storing multiple records with the same key value. You can choose whether to store the first one or the last one with the Replace checkbox.
This is the default option.
With the Allow key duplicates property enabled, you can store multiple records with the same key in the table. The Replace property is not used. Key duplicates in the persistent lookup table are available since 4.3.0.
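The two modes and the effect of the Replace checkbox can be illustrated with a minimal Python sketch. It is only a model of the behavior described above, not CloverETL code: the functions `fill_unique` and `fill_with_duplicates` and the sample records are hypothetical, and plain in-memory dictionaries stand in for the persistent table.

```python
from collections import defaultdict


def fill_unique(records, key, replace):
    """Model a table without key duplicates: one record per key.

    replace=True  -> the latest record with a given key is kept.
    replace=False -> the first record with a given key is kept.
    """
    table = {}
    for rec in records:
        k = rec[key]
        if replace or k not in table:
            table[k] = rec
    return table


def fill_with_duplicates(records, key):
    """Model a table with key duplicates: every record is kept."""
    table = defaultdict(list)
    for rec in records:
        table[rec[key]].append(rec)
    return dict(table)


# Hypothetical sample data: two records share the key 1.
records = [
    {"id": 1, "city": "Prague"},
    {"id": 1, "city": "Brno"},
    {"id": 2, "city": "Ostrava"},
]

first_wins = fill_unique(records, "id", replace=False)
last_wins = fill_unique(records, "id", replace=True)
all_kept = fill_with_duplicates(records, "id")
```

In this model, `first_wins[1]` holds the Prague record, `last_wins[1]` holds the Brno record, and `all_kept[1]` holds both.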
The persistent lookup table internally uses a B+Tree to store the records. Wherever a node is mentioned here, it means a node of the B+Tree.
In the first step of the wizard, choose Persistent lookup.
In the second step of the wizard, set up the required properties: give a Name to the lookup table, select the corresponding Metadata, specify the File where the data records of the lookup table will be stored, and specify the Key that should be used to look up data records from the table.
To overwrite old records with newer ones, check the Replace checkbox. With the checkbox checked, the latest record with a given key is stored. Otherwise, the first record with that key is stored.
You can disable transactions. Disabling transactions increases graph performance; however, it can cause data loss if manipulation with the table is interrupted.
Commit interval defines the number of records that are committed at once. When the limit or the end of a phase is reached, the records are committed to the lookup table.
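The commit behavior described above can be sketched as follows. This is a simplified model under stated assumptions, not the CloverETL or jdbm implementation: the class `BatchedWriter` and its methods are hypothetical, and two Python lists stand in for uncommitted and committed records.

```python
class BatchedWriter:
    """Hypothetical model of writes grouped by a commit interval."""

    def __init__(self, commit_interval):
        if commit_interval < 1:
            raise ValueError("commit_interval must be at least 1")
        self.commit_interval = commit_interval
        self.pending = []      # records written but not yet committed
        self.committed = []    # records safely stored in the table
        self.commit_count = 0

    def put(self, record):
        self.pending.append(record)
        # Commit as soon as the interval limit is reached.
        if len(self.pending) >= self.commit_interval:
            self.commit()

    def commit(self):
        if self.pending:
            self.committed.extend(self.pending)
            self.pending.clear()
            self.commit_count += 1

    def end_of_phase(self):
        # Remaining records are committed when the phase ends.
        self.commit()


writer = BatchedWriter(commit_interval=3)
for i in range(7):
    writer.put(i)

# One record is still uncommitted here; an interruption at this point
# would lose it in this model.
uncommitted_before_end = list(writer.pending)
writer.end_of_phase()
```

A larger interval means fewer commits for the same number of records, at the cost of more records being uncommitted at any given moment.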
Page size defines the number of entries (records) per node of the B+Tree (in the implementation).
Cache size specifies the maximum number of B+Tree nodes kept in the cache.
Allow key duplicates allows storing multiple records with the same key value.
Important: The Replace checkbox is ignored in lookup tables with key duplicates.
At the end, confirm the settings to finish the wizard.
Figure 34.14. Persistent Lookup Table Wizard
The performance of a persistent lookup table can be affected by the advanced parameters. These parameters configure the internal B+Tree implementation and the size of the caches.
To speed up reading, increase cache size.
To speed up writing, increase commit interval.
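Why a larger cache speeds up reading can be seen in a small sketch of a bounded node cache. The eviction policy is an assumption: the text above does not say how nodes are evicted, so least-recently-used eviction is used here purely for illustration. The class `NodeCache`, the `load_node` callback, and the access pattern are all hypothetical.

```python
from collections import OrderedDict


class NodeCache:
    """Hypothetical bounded cache of B+Tree nodes (LRU eviction assumed)."""

    def __init__(self, cache_size, load_node):
        self.cache_size = cache_size   # maximum number of cached nodes
        self.load_node = load_node     # stands in for reading a node from file
        self.nodes = OrderedDict()
        self.disk_reads = 0

    def get(self, node_id):
        if node_id in self.nodes:
            # Cache hit: mark the node as most recently used.
            self.nodes.move_to_end(node_id)
            return self.nodes[node_id]
        # Cache miss: the node has to be read from the file.
        self.disk_reads += 1
        node = self.load_node(node_id)
        self.nodes[node_id] = node
        if len(self.nodes) > self.cache_size:
            # Evict the least recently used node.
            self.nodes.popitem(last=False)
        return node


def count_reads(cache_size):
    cache = NodeCache(cache_size, load_node=lambda node_id: f"node-{node_id}")
    for node_id in [1, 2, 3, 1, 2, 3]:
        cache.get(node_id)
    return cache.disk_reads


reads_small_cache = count_reads(2)
reads_large_cache = count_reads(3)
```

With this access pattern, a cache of two nodes misses on every access, while a cache of three nodes reads each node from the file only once.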
Since 4.3.0, you can use Allow key duplicates to store duplicate key values in the table.