Configuration Settings

Tachyon configuration parameters fall into four categories: Master, Worker, Common (Master and Worker), and User configurations. System properties are set in the environment configuration file, conf/tachyon-env.sh, where they should be defined as Java system properties inside the TACHYON_JAVA_OPTS definition. A template is provided with the zip: conf/tachyon-env.sh.template.

Additional Java VM options can be added to TACHYON_MASTER_JAVA_OPTS for Master and TACHYON_WORKER_JAVA_OPTS for Worker configuration. In the template file, TACHYON_JAVA_OPTS is included in both TACHYON_MASTER_JAVA_OPTS and TACHYON_WORKER_JAVA_OPTS.
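
As a rough illustration of the layout described above, a fragment of conf/tachyon-env.sh might look like the following. It assumes each Tachyon property is passed to the JVM with the standard -D system property flag. The values are the defaults from the tables on this page, and the real template may set more options than shown here.

```shell
# Sketch of a conf/tachyon-env.sh fragment; the values are illustrative.
# Each Tachyon property is passed to the JVM as a -D system property.
export TACHYON_JAVA_OPTS="-Dtachyon.home=/mnt/tachyon_default_home -Dtachyon.master.port=19998"

# Reuse the shared options for both roles, so Master and Worker
# see the same common settings.
export TACHYON_MASTER_JAVA_OPTS="$TACHYON_JAVA_OPTS"
export TACHYON_WORKER_JAVA_OPTS="$TACHYON_JAVA_OPTS"
```

Role-specific flags can then be appended to either of the last two variables without affecting the other.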

For example, to enable Java remote debugging on port 7001 in the Master, modify TACHYON_MASTER_JAVA_OPTS like this:

export TACHYON_MASTER_JAVA_OPTS="$TACHYON_JAVA_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=7001"

Common Configuration

The common configuration contains constants which specify paths and the log appender name.

Property Name | Default | Meaning
tachyon.home | "/mnt/tachyon_default_home" | Tachyon installation folder.
tachyon.underfs.address | $tachyon.home + "/underfs" | Tachyon folder in the underlayer file system.
tachyon.data.folder | $tachyon.underfs.address + "/tmp/tachyon/data" | Tachyon's data folder in the underlayer file system.
tachyon.workers.folder | $tachyon.underfs.address + "/tmp/tachyon/workers" | Tachyon's workers folder in the underlayer file system.
tachyon.usezookeeper | false | Whether to run the master in fault-tolerant mode using ZooKeeper.
tachyon.zookeeper.address | null | ZooKeeper address for master fault tolerance.
tachyon.zookeeper.election.path | "/election" | Election folder in ZooKeeper.
tachyon.zookeeper.leader.path | "/leader" | Leader folder in ZooKeeper.
tachyon.underfs.hdfs.impl | "org.apache.hadoop.hdfs.DistributedFileSystem" | The HDFS implementation class, if using HDFS as the under FS.
tachyon.max.columns | 1000 | Maximum number of columns allowed in a RawTable; must be set on both the client and server side.
tachyon.table.metadata.byte | 5242880 | Maximum number of bytes that can be stored as RawTable metadata; must be set on the server side.
fs.s3n.awsAccessKeyId | null | AWS access key ID, if using S3 as the under FS.
fs.s3n.awsSecretAccessKey | null | AWS secret access key, if using S3 as the under FS.
tachyon.underfs.glusterfs.mounts | null | GlusterFS volume mount points, e.g. /vol
tachyon.underfs.glusterfs.volumes | null | GlusterFS volume names, e.g. tachyon_vol
tachyon.underfs.glusterfs.mapred.system.dir | glusterfs:///mapred/system | Optionally specifies a subdirectory under GlusterFS for intermediary MapReduce data.
tachyon.underfs.hadoop.prefixes | hdfs:// s3:// s3n:// glusterfs:/// | Optionally specifies which prefixes should run through Apache Hadoop's implementation of UnderFileSystem. The delimiter is any whitespace and/or ','.
tachyon.master.retry | 29 | How many times to try to reconnect with the master.
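
To show how several of the common properties combine, here is a minimal sketch that turns on ZooKeeper-based master fault tolerance. The host name zk1.example.com and the host:port address format are assumptions made for illustration, and ZK_ADDRESS is a hypothetical helper variable. The two path values are simply the defaults listed in the table.

```shell
# Hypothetical ZooKeeper endpoint; replace with your own address.
# The host:port format is an assumption, not taken from this page.
ZK_ADDRESS="zk1.example.com:2181"

# Append the ZooKeeper settings to whatever common options already exist.
TACHYON_JAVA_OPTS="${TACHYON_JAVA_OPTS:-}
  -Dtachyon.usezookeeper=true
  -Dtachyon.zookeeper.address=$ZK_ADDRESS
  -Dtachyon.zookeeper.election.path=/election
  -Dtachyon.zookeeper.leader.path=/leader"
export TACHYON_JAVA_OPTS
```

Because these are common properties, placing them in TACHYON_JAVA_OPTS makes them visible to both the Master and the Worker when the role-specific variables include it.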

Master Configuration

The master configuration specifies information regarding the master node, such as address and port number.

Property Name | Default | Meaning
tachyon.master.journal.folder | $tachyon.home + "/journal/" | The folder in which to store the master journal log.
tachyon.master.hostname | localhost | The externally visible hostname of Tachyon's master.
tachyon.master.port | 19998 | The port Tachyon's master node runs on.
tachyon.master.web.port | 19999 | The port Tachyon's web interface runs on.
tachyon.master.whitelist | / | A comma-separated list of path prefixes that are cacheable. Tachyon tries to cache a cacheable file the first time it is read.
tachyon.master.web.threads | 1 | How many threads to use for the web server.
tachyon.master.keytab.file | | Kerberos keytab file for Tachyon master.
tachyon.master.principal | | Kerberos principal for Tachyon master.
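
A possible Master-only override might look like the sketch below. The host name master.example.com and the thread count of 4 are illustrative assumptions, and MASTER_HOST is a hypothetical helper variable. The two ports are the defaults from the table.

```shell
# Hypothetical externally visible host name for the master.
MASTER_HOST="master.example.com"

# Master-specific options, layered on top of the common options.
export TACHYON_MASTER_JAVA_OPTS="${TACHYON_JAVA_OPTS:-} \
-Dtachyon.master.hostname=$MASTER_HOST \
-Dtachyon.master.port=19998 \
-Dtachyon.master.web.port=19999 \
-Dtachyon.master.web.threads=4"
```

Keeping these flags in TACHYON_MASTER_JAVA_OPTS rather than TACHYON_JAVA_OPTS means they are not passed to Worker processes.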

Worker Configuration

The worker configuration specifies information regarding the worker nodes, such as address and port number.

Property Name | Default | Meaning
tachyon.worker.port | 29998 | The port Tachyon's worker node runs on.
tachyon.worker.data.port | 29999 | The port Tachyon's worker's data server runs on.
tachyon.worker.data.folder | /tachyonworker/ | The path, relative to each storage directory, used as the data folder for Tachyon's worker nodes.
tachyon.worker.memory.size | 128 MB | Memory capacity of each worker node.
tachyon.worker.hierarchystore.level.max | 1 | The maximum number of storage layers.
tachyon.worker.hierarchystore.level0.alias | MEM | The alias of the top storage layer.
tachyon.worker.hierarchystore.level0.dirs.path | /mnt/ramdisk/ | The storage directory path for the top storage layer. On Macs the value should be "/Volumes/".
tachyon.worker.hierarchystore.level0.dirs.quota | ${tachyon.worker.memory.size} | The capacity of the top storage layer.
tachyon.worker.allocate.strategy | MAX_FREE | The strategy a worker uses to allocate space among the storage directories in a given storage layer.
tachyon.worker.evict.strategy | LRU | The strategy a worker uses to evict block files when a storage layer runs out of space.
tachyon.worker.network.type | NETTY | Selects the networking stack to run the worker with. Valid options are NETTY and NIO.
tachyon.worker.network.netty.channel | EPOLL | Selects Netty's channel implementation. On Linux, epoll is used; valid options are NIO and EPOLL.
tachyon.worker.network.netty.boss.threads | 1 | How many threads to use for accepting new requests.
tachyon.worker.network.netty.worker.threads | 0 | How many threads to use for processing requests. Zero defaults to twice the number of CPU cores.
tachyon.worker.network.netty.file.transfer | MAPPED | When returning files to the user, selects how the data is transferred; valid options are MAPPED (uses Java MappedByteBuffer) and TRANSFER (uses Java FileChannel.transferTo).
tachyon.worker.network.netty.watermark.high | 32768 | Determines how many bytes can be in the write queue before the channel's isWritable is set to false.
tachyon.worker.network.netty.watermark.low | 8192 | Once the high watermark limit is reached, the queue must be flushed down to the low watermark before switching back to writable.
tachyon.worker.network.netty.backlog | 128 on Linux | How many requests can be queued up before new requests are rejected; this value is platform dependent.
tachyon.worker.network.netty.buffer.send | platform specific | Sets SO_SNDBUF for the socket; more details can be found in the socket man page.
tachyon.worker.network.netty.buffer.receive | platform specific | Sets SO_RCVBUF for the socket; more details can be found in the socket man page.
tachyon.worker.keytab.file | | Kerberos keytab file for Tachyon worker.
tachyon.worker.principal | | Kerberos principal for Tachyon worker.
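
The Netty-related worker options interact with each other, so a small sketch may help. The watermark values below are arbitrary examples, and WATERMARK_HIGH and WATERMARK_LOW are hypothetical helper variables. The warning reflects the table's description that the queue drains from the high watermark down to the low one, which only makes sense when the low value does not exceed the high value.

```shell
# Illustrative watermark sizes in bytes (double the listed defaults).
WATERMARK_HIGH=65536
WATERMARK_LOW=16384

# Warn early if the pair is inconsistent.
if [ "$WATERMARK_LOW" -gt "$WATERMARK_HIGH" ]; then
  echo "warning: low watermark exceeds high watermark" >&2
fi

# Worker-specific options, layered on top of the common options.
export TACHYON_WORKER_JAVA_OPTS="${TACHYON_JAVA_OPTS:-}
  -Dtachyon.worker.network.type=NETTY
  -Dtachyon.worker.network.netty.channel=NIO
  -Dtachyon.worker.network.netty.file.transfer=TRANSFER
  -Dtachyon.worker.network.netty.watermark.high=$WATERMARK_HIGH
  -Dtachyon.worker.network.netty.watermark.low=$WATERMARK_LOW"
```

The NIO channel is chosen here only because the table ties EPOLL to Linux; on a Linux host the default can be left alone.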

User Configuration

The user configuration specifies values regarding file system access.

Property Name | Default | Meaning
tachyon.user.failed.space.request.limits | 3 | The number of times to request space from the file system before aborting.
tachyon.user.file.writetype.default | CACHE_THROUGH | Default write type for Tachyon files in CLI copyFromLocal and the Hadoop-compatible interface. It can be any type in WriteType.
tachyon.user.quota.unit.bytes | 8 MB | The minimum number of bytes that a client requests from a worker at a time.
tachyon.user.file.buffer.bytes | 1 MB | The size of the file buffer to use for file system reads/writes.
tachyon.user.default.block.size.byte | 1 GB | Default block size for Tachyon files.
tachyon.user.remote.read.buffer.size.byte | 8 MB | The size of the file buffer used to read data from a remote Tachyon worker.
tachyon.worker.network.netty.process.threads | 16 | How many threads to use to process block requests.
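
User properties apply to client programs rather than to the Master or Worker daemons. One way to supply them, sketched below, is to collect them in a shell variable and pass that to the client JVM. TACHYON_USER_OPTS and BLOCK_SIZE_BYTES are hypothetical names, not variables Tachyon reads itself, and the assumption that the block size accepts a plain byte count is not confirmed by this page.

```shell
# 64 MB expressed in bytes; plain byte counts are an assumption here.
BLOCK_SIZE_BYTES=$((64 * 1024 * 1024))

# Hypothetical variable holding client-side options.
TACHYON_USER_OPTS="-Dtachyon.user.file.writetype.default=CACHE_THROUGH \
-Dtachyon.user.default.block.size.byte=$BLOCK_SIZE_BYTES \
-Dtachyon.user.failed.space.request.limits=3"

# A client launch would then pass the options to the JVM, for example:
#   java $TACHYON_USER_OPTS -cp <classpath> <main class>
echo "$TACHYON_USER_OPTS"
```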