CloverDataWriter writes data to files in our internal binary Clover data format.
Component | Data output | Input ports | Output ports | Transformation | Transf. required | Java | CTL | Auto-propagated metadata |
---|---|---|---|---|---|---|---|---|
CloverDataWriter | Clover binary file | 1 | 0-1 |
Port type | Number | Required | Description | Metadata |
---|---|---|---|---|
Input | 0 | For received data records | Any | |
Output | 0 | For port writing. See Writing to Output Port. | byte or cbyte |
CloverDataWriter does not propagate metadata.
CloverDataWriter has no metadata template.
Input metadata can have any metadata type.
Output metadata of CloverDataWriter has one field.
The field has the data type byte or cbyte.
Attribute | Req | Description | Possible values |
---|---|---|---|
Basic | |||
File URL | yes | Attribute specifying where received data will be written (Clover data file, dictionary). See Supported File URL Formats for Writers. | |
Append | By default, new records overwrite the older ones. If set to true, new records are appended to the older records stored in the output file(s). | false (default) | true | |
Advanced | |||
Create directories | By default, non-existing directories are not created. If set to true, they are created. | false (default) | true | |
Compress level | Sets the compression level (0 - no compression, 1 - fastest compression, 9 - best compression). | 1 (default) | 0-9 | |
Number of skipped records | Number of records to be skipped. See Selecting Output Records. | 0-N | |
Max number of records | Maximum number of records to be written to the output file. See Selecting Output Records. | 0-N | |
Records per file | Limits the number of records written to one file. | 0-N | |
Exclude fields | Sequence of field names separated by semicolon that will not be written to the output. | any field(s), e.g. field1;field3 | |
Partition key | Sequence of field names, separated by semicolons, that defines how records are distributed into different output files. Records with the same Partition key are written to the same output file. Use the placeholder ($ or #) in the file name mask that matches the selected Partition file tag; see Partitioning Output into Different Output Files. | any field(s), e.g. field1;field3 | |
Partition lookup table | ID of the lookup table used to select the records that should be written to the output file(s). See Partitioning Output into Different Output Files for more information. | e.g. MyLookupTable001 | |
Partition file tag | By default, output files are numbered. If set to Key file tag, output files are named according to the values of Partition key or Partition output fields. See Partitioning Output into Different Output Files for more information. | Number file tag (default) | Key file tag | |
Partition output fields | Fields of Partition lookup table whose values serve to name output file(s). See Partitioning Output into Different Output Files for more information. | ||
Partition unassigned file name | Name of the file into which the unassigned records should be written if there are any. If not specified, data records whose key values are not contained in Partition lookup table are discarded. See Partitioning Output into Different Output Files for more information. | ||
Sorted input | When partitioning into multiple output files is enabled, all output files are open at once. With many output files (thousands), this can lead to an undesirable memory footprint. Moreover, Unix-based operating systems usually impose a strict limit on the number of files a process can have open simultaneously (1024). If you run into one of these limitations, sort the data by the partition key using one of the standard sorting components and set this attribute to true. The partitioning algorithm then does not need to keep all output files open; only the last one is open at any time. See Partitioning Output into Different Output Files for more information. | false (default) | true | |
Create empty files | If set to false, prevents the component from creating an empty output file when there are no input records. | true (default) | false | |
Deprecated | |||
Save metadata | This attribute is ignored since CloverETL version 4.0. | false (default) | true | |
Save index | This attribute is ignored since CloverETL version 4.0. | false (default) | true |
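The reasoning behind the Sorted input attribute can be illustrated with a small sketch. It is not the component's implementation, only a minimal Python model of the idea: when records arrive sorted by the partition key, the previous file can be closed as soon as the key changes. The function names, the `part_#.txt` mask and the sample records are hypothetical, and plain text files stand in for Clover binary files.

```python
import os
import tempfile


def write_sorted_partitions(records, key, out_dir, mask="part_#.txt"):
    """Write records that are already sorted by `key`, one open file at a time.

    Returns the largest number of files open simultaneously.
    """
    current_key = object()  # sentinel that compares unequal to any real key
    handle = None
    open_now = 0
    max_open = 0
    try:
        for record in records:
            if record[key] != current_key:
                # The key changed: close the previous file before opening the next.
                if handle is not None:
                    handle.close()
                    open_now -= 1
                current_key = record[key]
                path = os.path.join(out_dir, mask.replace("#", str(current_key)))
                handle = open(path, "a", encoding="utf-8")
                open_now += 1
                max_open = max(max_open, open_now)
            handle.write(repr(record) + "\n")
    finally:
        if handle is not None:
            handle.close()
    return max_open


def demo():
    """Run the sketch on three made-up records sorted by Mark."""
    records = [
        {"Mark": "A", "Name": "r1"},
        {"Mark": "A", "Name": "r2"},
        {"Mark": "B", "Name": "r3"},
    ]
    with tempfile.TemporaryDirectory() as out_dir:
        max_open = write_sorted_partitions(records, "Mark", out_dir)
        return max_open, sorted(os.listdir(out_dir))
```

With unsorted input, the same approach would reopen files repeatedly, which is why the sketch assumes the data has been sorted by the partition key beforehand.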
CloverDataWriter internally uses compression by default. Additional zipping is redundant. See the Compress level attribute.
CloverDataWriter can write maps and lists.
With this component, you can write data in this internal format that allows fast access to data. CloverDataWriter is faster than FlatFileWriter.
Write records to Clover file.
Set up the File URL attribute.
Attribute | Value |
---|---|
File URL | ${DATAOUT_DIR}/my-clover-file.cdf |
If the file exists, the data in the file is overwritten.
Append records of each graph run to an existing file my-clover-file.cdf.
Set up File URL and Append attributes.
Attribute | Value |
---|---|
File URL | ${DATAOUT_DIR}/my-clover-file.cdf |
Append | true |
Write data to the file my-clover-file.cdf in the directory cdrw.
The directory may not exist.
Use attributes File URL and Create directories.
Attribute | Value |
---|---|
File URL | ${DATAOUT_DIR}/cdrw/my-clover-file.cdf |
Create directories | true |
The first 10 records should be omitted. Write the rest of the records.
Use attributes File URL and Number of skipped records.
Attribute | Value |
---|---|
File URL | ${DATAOUT_DIR}/my-clover-file.cdf |
Number of skipped records | 10 |
Write at most 100 records.
Use attributes File URL and Max number of records.
Attribute | Value |
---|---|
File URL | ${DATAOUT_DIR}/my-clover-file.cdf |
Max number of records | 100 |
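The effect of the two preceding examples can be modelled in a few lines of Python. This is only an illustration of the record selection each example describes, not the component's code; the function names are hypothetical, and the sketch treats the two attributes separately without assuming how the component combines them.

```python
from itertools import islice


def skip_records(records, skipped):
    """Drop the first `skipped` records and yield the rest."""
    return islice(records, skipped, None)


def limit_records(records, max_records):
    """Yield at most `max_records` records."""
    return islice(records, max_records)


# Made-up input: 25 numbered records.
remaining = list(skip_records(range(1, 26), 10))
limited = list(limit_records(range(1, 251), 100))
```

Here `remaining` starts at record 11, and `limited` contains exactly 100 records.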
Metadata on the input edge of CloverDataWriter has the fields ID, Firstname, Surname and Salary.
Save a list containing Firstname and Surname to the Clover data file employees.cdf.
Use attributes File URL and Exclude fields.
Attribute | Value |
---|---|
File URL | ${DATAOUT_DIR}/employees.cdf |
Exclude fields | ID;Salary |
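As a rough model of what Exclude fields does to each record in this example, the following Python sketch drops the fields named in a semicolon-separated list. The function name and the sample record values are made up for illustration; the component itself works on Clover records, not on dictionaries.

```python
def exclude_fields(record, excluded):
    """Return a copy of `record` without the semicolon-separated `excluded` fields."""
    names = set(excluded.split(";"))
    return {name: value for name, value in record.items() if name not in names}


# Hypothetical employee record with the four fields from the example.
employee = {"ID": 1, "Firstname": "Ann", "Surname": "Lee", "Salary": 1000}
written = exclude_fields(employee, "ID;Salary")
```

Only Firstname and Surname remain in `written`.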
List of students contains fields Firstname, Lastname and Mark.
Categorize records into several files according to the mark.
The created files will have the names students_A.cdf, ... students_F.cdf.
Use attributes File URL, Partition key and Partition file tag.
Attribute | Value |
---|---|
File URL | ${DATAOUT_DIR}/students_#.cdf |
Partition key | Mark |
Partition file tag | Key file tag |
Note: Records of students without a mark will be saved into the file students_.cdf.
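The file naming in this example can be sketched in Python as follows. The sketch only models how the # placeholder in the file name mask is replaced by the key value, including the empty tag for a missing mark noted above; the function name and the student records are hypothetical, and records are collected in memory instead of being written to files.

```python
from collections import defaultdict


def partition_by_key(records, key, mask):
    """Group records under file names built by replacing '#' with the key value."""
    files = defaultdict(list)
    for record in records:
        value = record.get(key)
        tag = "" if value is None else str(value)  # a missing mark gives an empty tag
        files[mask.replace("#", tag)].append(record)
    return dict(files)


# Made-up student records.
students = [
    {"Firstname": "Ann", "Lastname": "Lee", "Mark": "A"},
    {"Firstname": "Bob", "Lastname": "Ray", "Mark": "F"},
    {"Firstname": "Cid", "Lastname": "Fox", "Mark": None},
]
by_file = partition_by_key(students, "Mark", "students_#.cdf")
```

The keys of `by_file` are students_A.cdf, students_F.cdf and students_.cdf.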
The input data contain the number of active customers for particular countries. The countries belong to different regions. Categorize the records into files according to region.
CZ|105 UK|651 US|827 ...
The input metadata contain the field CountryCode and a field with the number of active customers, but nothing in the record denotes the region directly. You have a list of country codes with their corresponding regions to be used for partitioning.
CZ|Europe UK|Europe US|America ...
Some country codes may not be present in the list; store records with such country codes in the separate file region_missing.cdf.
Use the attributes File URL, Partition key, Partition lookup table, Partition file tag, Partition output fields, Partition unassigned file name. You need a lookup table CountryCodeRegion too.
Attribute | Value |
---|---|
File URL | ${DATAOUT_DIR}/region_#.cdf |
Partition key | CountryCode |
Partition lookup table | CountryCodeRegion |
Partition file tag | Key file tag |
Partition output fields | Continent |
Partition unassigned file name | missing |
The files region_Europe.cdf, region_America.cdf, ... and region_missing.cdf will be created.
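The lookup-based partitioning in this example can be approximated with a short Python sketch. It assumes, as the resulting file names suggest, that the unassigned file name takes the place of the # placeholder, and it discards unmatched records when no unassigned name is given. The function name, the field name ActiveCustomers and the record with code XX are hypothetical; a plain dictionary stands in for the CountryCodeRegion lookup table.

```python
def partition_by_lookup(records, key, lookup, mask, unassigned=None):
    """Group records under file names derived from a lookup of the key value."""
    files = {}
    for record in records:
        tag = lookup.get(record[key])
        if tag is None:
            if unassigned is None:
                continue  # no unassigned file name: the record is discarded
            tag = unassigned  # assumption: the name replaces the placeholder
        files.setdefault(mask.replace("#", tag), []).append(record)
    return files


# Lookup data and the first three records come from the example; XX is made up.
country_region = {"CZ": "Europe", "UK": "Europe", "US": "America"}
customers = [
    {"CountryCode": "CZ", "ActiveCustomers": 105},
    {"CountryCode": "UK", "ActiveCustomers": 651},
    {"CountryCode": "US", "ActiveCustomers": 827},
    {"CountryCode": "XX", "ActiveCustomers": 1},
]
regions = partition_by_lookup(
    customers, "CountryCode", country_region, "region_#.cdf", unassigned="missing"
)
```

In this model, the CZ and UK records end up under region_Europe.cdf, the US record under region_America.cdf, and the XX record under region_missing.cdf.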
Since version 2.9 of CloverETL, CloverDataWriter also writes a header with the version number to output files. For this reason, CloverDataReader expects files in Clover binary format to contain such a header. CloverDataReader 2.9 cannot read files written by older versions of CloverETL, nor can these older versions read data written by CloverDataWriter 2.9.
The internal structure of the zip archive has changed; graphs relying on that structure will stop working. Graphs using a plain file URL without any internal entry specification are not affected.
zip:(${DATAIN_DIR}/customers.zip) - will work
zip:(${DATAIN_DIR}/customers.zip)#DATA/customers - won't work
Because the Clover format can use compression internally, adding another level of compression is redundant.
Values of parameters Save metadata and Save index are not used since CloverETL 4.0.
Since 4.4.0-M2, CloverDataWriter can write to an output port, but only to a byte or cbyte field.