Name Default Value Description
overwrite true The file can be overwritten
bufferSize 4096 The buffer size used by HDFS
replication 3 The HDFS replication factor
blockSize 67108864 The size of the HDFS blocks
fileType NORMAL_FILE The file type. It can also be SEQUENCE_FILE, MAP_FILE, ARRAY_FILE, or BLOOMMAP_FILE; see the Hadoop documentation.
fileSystemType HDFS It can be LOCAL for local filesystem
keyType NULL The type of the key for sequence or map files. See below.
valueType TEXT The type of the value for sequence or map files. See below.
splitStrategy (none) A string describing how to split the file based on different criteria. See below.
openedSuffix opened When a file is opened for reading or writing, it is renamed with this suffix so that it is not read during the writing phase.
readSuffix read Once a file has been read, it is renamed with this suffix so that it is not read again.
initialDelay 0 For the consumer, how long to wait (in milliseconds) before starting to scan the directory.
delay 0 The interval (in milliseconds) between directory scans.
pattern * The pattern used for scanning the directory.
chunkSize 4096 When reading a normal file, the file is split into chunks of this size, producing one message per chunk.
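
The options in the table are typically supplied as query parameters on the endpoint URI. The following is a minimal sketch of assembling such a URI, assuming the form hdfs://hostname:port/path?options. The class HdfsUriSketch, the method buildUri, and the host, port, and path values are hypothetical and are not part of any library; only the option names and values come from the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class HdfsUriSketch {

    // Hypothetical helper: appends the given options as a query string,
    // in insertion order. Values are NOT URL-encoded in this sketch.
    public static String buildUri(String host, int port, String path,
                                  Map<String, String> options) {
        String base = "hdfs://" + host + ":" + port + path;
        if (options.isEmpty()) {
            return base;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> option : options.entrySet()) {
            query.add(option.getKey() + "=" + option.getValue());
        }
        return base + "?" + query;
    }

    public static void main(String[] args) {
        // Option names and values taken from the table; host, port, and
        // path are placeholders.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("fileType", "SEQUENCE_FILE");
        options.put("keyType", "NULL");
        options.put("valueType", "TEXT");
        options.put("overwrite", "false");
        System.out.println(buildUri("localhost", 8020, "/tmp/out", options));
    }
}
```

Options that are left out of the URI keep the defaults listed in the table.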

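The chunkSize behavior (one message per chunk of a normal file) can be illustrated with plain Java. This is only a sketch of the idea, not the component's actual implementation; the class ChunkSketch and the method splitIntoChunks are hypothetical, and treating chunkSize as a byte count is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkSketch {

    // Hypothetical helper: reads the stream and returns its content as
    // consecutive chunks of at most chunkSize bytes. Each returned array
    // stands in for the body of one message.
    public static List<byte[]> splitIntoChunks(InputStream in, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        byte[] buffer = new byte[chunkSize];
        int filled = 0;
        try {
            int n;
            // A single read may return fewer bytes than requested, so keep
            // filling the buffer until it is full or the stream ends.
            while ((n = in.read(buffer, filled, chunkSize - filled)) != -1) {
                filled += n;
                if (filled == chunkSize) {
                    chunks.add(Arrays.copyOf(buffer, filled));
                    filled = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (filled > 0) {
            // Final, shorter chunk.
            chunks.add(Arrays.copyOf(buffer, filled));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] file = new byte[10000];
        List<byte[]> chunks = splitIntoChunks(new ByteArrayInputStream(file), 4096);
        for (byte[] chunk : chunks) {
            System.out.println("message with " + chunk.length + " bytes");
        }
    }
}
```

With the default chunkSize of 4096, a 10000-byte input yields three chunks in this sketch: two of 4096 bytes and a final one of 1808 bytes.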