Configuration Settings
Tachyon configuration parameters fall into four categories: Master, Worker, Common (Master and
Worker), and User configurations. The environment configuration file responsible for setting system
properties is conf/tachyon-env.sh. These properties should be set within the TACHYON_JAVA_OPTS
definition. A template is provided with the zip: conf/tachyon-env.sh.template.
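
As a minimal sketch of what such a definition might look like, the snippet below sets three properties from the tables on this page as Java system properties, using the JVM's standard -Dname=value syntax. The values shown are simply the documented defaults, chosen for illustration, and the exact layout of your own conf/tachyon-env.sh may differ from this.

```shell
# Sketch of a TACHYON_JAVA_OPTS definition for conf/tachyon-env.sh.
# Each -Dname=value pair sets one Java system property.
# The values are the documented defaults, used here only as an illustration.
export TACHYON_JAVA_OPTS="
  -Dtachyon.home=/mnt/tachyon_default_home
  -Dtachyon.master.hostname=localhost
  -Dtachyon.master.port=19998
"
```

Because the value is expanded unquoted when it is passed to the java command, the line breaks inside the quotes act as ordinary separators between options.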
Additional Java VM options can be added to TACHYON_MASTER_JAVA_OPTS for Master and
TACHYON_WORKER_JAVA_OPTS for Worker configuration. In the template file, TACHYON_JAVA_OPTS is
included in both TACHYON_MASTER_JAVA_OPTS and TACHYON_WORKER_JAVA_OPTS.
For example, if you would like to enable Java remote debugging at port 7001 in the Master, you can
modify TACHYON_MASTER_JAVA_OPTS like this:
export TACHYON_MASTER_JAVA_OPTS="$TACHYON_JAVA_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=7001"
Common Configuration
The common configuration contains constants which specify paths and the log appender name.
Property Name | Default | Meaning |
---|---|---|
tachyon.home | "/mnt/tachyon_default_home" | Tachyon installation folder. |
tachyon.underfs.address | $tachyon.home + "/underfs" | Tachyon folder in the under file system. |
tachyon.data.folder | $tachyon.underfs.address + "/tmp/tachyon/data" | Tachyon's data folder in the under file system. |
tachyon.workers.folder | $tachyon.underfs.address + "/tmp/tachyon/workers" | Tachyon's workers folder in the under file system. |
tachyon.usezookeeper | false | Whether to run the master in fault-tolerant mode using ZooKeeper. |
tachyon.zookeeper.address | null | ZooKeeper address for master fault tolerance. |
tachyon.zookeeper.election.path | "/election" | Election folder in ZooKeeper. |
tachyon.zookeeper.leader.path | "/leader" | Leader folder in ZooKeeper. |
tachyon.underfs.hdfs.impl | "org.apache.hadoop.hdfs.DistributedFileSystem" | The implementation class of the HDFS, if using it as the under FS. |
tachyon.max.columns | 1000 | Maximum number of columns allowed in RawTable; must be set on both the client and server side. |
tachyon.table.metadata.byte | 5242880 | Maximum number of bytes allowed to be stored as RawTable metadata; must be set on the server side. |
fs.s3n.awsAccessKeyId | null | AWS access key ID, if using S3 as the under FS. |
fs.s3n.awsSecretAccessKey | null | AWS secret access key, if using S3 as the under FS. |
tachyon.underfs.glusterfs.mounts | null | GlusterFS volume mount points, e.g. /vol |
tachyon.underfs.glusterfs.volumes | null | GlusterFS volume names, e.g. tachyon_vol |
tachyon.underfs.glusterfs.mapred.system.dir | glusterfs:///mapred/system | Optionally specify a subdirectory under GlusterFS for intermediary MapReduce data. |
tachyon.underfs.hadoop.prefixes | hdfs:// s3:// s3n:// glusterfs:/// | Optionally specify which prefixes should run through Apache Hadoop's implementation of UnderFileSystem. The delimiter is any whitespace and/or ',' |
tachyon.master.retry | 29 | How many times to try to reconnect with master. |
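
Several defaults in the table above are built from other properties. The following worked example spells out how they compose when nothing is overridden. The shell variable names are illustrative only and are not Tachyon settings.

```shell
# Worked example: how the derived default paths compose.
# Variable names are illustrative; only the path values come from the table.
tachyon_home="/mnt/tachyon_default_home"                 # tachyon.home
underfs_address="${tachyon_home}/underfs"                # tachyon.underfs.address
data_folder="${underfs_address}/tmp/tachyon/data"        # tachyon.data.folder
workers_folder="${underfs_address}/tmp/tachyon/workers"  # tachyon.workers.folder

echo "tachyon.underfs.address = ${underfs_address}"
echo "tachyon.data.folder     = ${data_folder}"
echo "tachyon.workers.folder  = ${workers_folder}"
```

With the default tachyon.home, the data folder therefore resolves to /mnt/tachyon_default_home/underfs/tmp/tachyon/data.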
Master Configuration
The master configuration specifies information regarding the master node, such as address and port number.
Property Name | Default | Meaning |
---|---|---|
tachyon.master.journal.folder | $tachyon.home + "/journal/" | The folder to store master journal log. |
tachyon.master.hostname | localhost | The externally visible hostname of Tachyon's master address. |
tachyon.master.port | 19998 | The port Tachyon's master node runs on. |
tachyon.master.web.port | 19999 | The port Tachyon's web interface runs on. |
tachyon.master.whitelist | / | The comma-separated list of path prefixes that are cacheable. Tachyon will try to cache a cacheable file when it is read for the first time. |
tachyon.master.web.threads | 1 | How many threads to use for the web server. |
tachyon.master.keytab.file | | Kerberos keytab file for Tachyon master. |
tachyon.master.principal | | Kerberos principal for Tachyon master. |
Worker Configuration
The worker configuration specifies information regarding the worker nodes, such as address and port number.
Property Name | Default | Meaning |
---|---|---|
tachyon.worker.port | 29998 | The port Tachyon's worker node runs on. |
tachyon.worker.data.port | 29999 | The port Tachyon's worker's data server runs on. |
tachyon.worker.data.folder | /tachyonworker/ | The path, relative to each storage directory, of the data folder for Tachyon's worker nodes. |
tachyon.worker.memory.size | 128 MB | Memory capacity of each worker node. |
tachyon.worker.hierarchystore.level.max | 1 | The max level of storage layers. |
tachyon.worker.hierarchystore.level0.alias | MEM | The alias of top storage layer. |
tachyon.worker.hierarchystore.level0.dirs.path | /mnt/ramdisk/ | The path of the storage directory for the top storage layer. Note that on Macs the value should be "/Volumes/". |
tachyon.worker.hierarchystore.level0.dirs.quota | ${tachyon.worker.memory.size} | The capacity of top storage layer. |
tachyon.worker.allocate.strategy | MAX_FREE | The strategy a worker uses to allocate space among the storage directories in a given storage layer. |
tachyon.worker.evict.strategy | LRU | The strategy a worker uses to evict block files when a storage layer runs out of space. |
tachyon.worker.network.type | NETTY | Selects networking stack to run the worker with. Valid options are NETTY and NIO. |
tachyon.worker.network.netty.channel | EPOLL | Selects Netty's channel implementation. On Linux, epoll is used; valid options are NIO and EPOLL. |
tachyon.worker.network.netty.boss.threads | 1 | How many threads to use for accepting new requests. |
tachyon.worker.network.netty.worker.threads | 0 | How many threads to use for processing requests. A value of 0 defaults to twice the number of CPU cores. |
tachyon.worker.network.netty.file.transfer | MAPPED | When returning files to the user, select how the data is transferred; valid options are MAPPED (uses Java MappedByteBuffer) and TRANSFER (uses Java FileChannel.transferTo). |
tachyon.worker.network.netty.watermark.high | 32768 | Determines how many bytes can be in the write queue before the channel's isWritable is set to false. |
tachyon.worker.network.netty.watermark.low | 8192 | Once the high watermark limit is reached, the queue must be flushed down to the low watermark before switching back to writable. |
tachyon.worker.network.netty.backlog | 128 on Linux | How many requests can be queued up before new requests are rejected; this value is platform dependent. |
tachyon.worker.network.netty.buffer.send | platform specific | Sets SO_SNDBUF for the socket; more details can be found in the socket man page. |
tachyon.worker.network.netty.buffer.receive | platform specific | Sets SO_RCVBUF for the socket; more details can be found in the socket man page. |
tachyon.worker.keytab.file | | Kerberos keytab file for Tachyon worker. |
tachyon.worker.principal | | Kerberos principal for Tachyon worker. |
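
The high and low watermark settings work as a pair. The sketch below illustrates the rule described in the table using the default values. It is a simplified model for explanation only, not code from Tachyon or Netty, and update_writable is a hypothetical helper name.

```shell
# Defaults taken from the table above.
high=32768   # tachyon.worker.network.netty.watermark.high
low=8192     # tachyon.worker.network.netty.watermark.low
writable=1   # 1 = channel accepts writes, 0 = writes are paused

# Hypothetical helper: takes the number of bytes currently queued and
# flips the writable flag using the two thresholds.
update_writable() {
  queued=$1
  if [ "$writable" -eq 1 ] && [ "$queued" -gt "$high" ]; then
    writable=0   # queue grew past the high watermark
  elif [ "$writable" -eq 0 ] && [ "$queued" -lt "$low" ]; then
    writable=1   # queue drained below the low watermark
  fi
}

for queued in 4096 40000 16384 4096; do
  update_writable "$queued"
  echo "queued=$queued writable=$writable"
done
```

Note that at 16384 queued bytes the channel stays unwritable even though the queue is back under the high watermark. The gap between the two thresholds keeps the channel from flipping state on every small change in queue size.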
User Configuration
The user configuration specifies values regarding file system access.
Property Name | Default | Meaning |
---|---|---|
tachyon.user.failed.space.request.limits | 3 | The number of times to request space from the file system before aborting. |
tachyon.user.file.writetype.default | CACHE_THROUGH | Default write type for Tachyon files in CLI copyFromLocal and the Hadoop-compatible interface. It can be any type in WriteType. |
tachyon.user.quota.unit.bytes | 8 MB | The minimum number of bytes that a client will request from a worker at a time. |
tachyon.user.file.buffer.bytes | 1 MB | The size of the file buffer to use for file system reads/writes. |
tachyon.user.default.block.size.byte | 1 GB | Default block size for Tachyon files. |
tachyon.user.remote.read.buffer.size.byte | 8 MB | The size of the file buffer to read data from remote Tachyon worker. |
tachyon.worker.network.netty.process.threads | 16 | How many threads to use to process block requests. |
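
To give a feel for the default block size, here is a small worked calculation. It assumes a file is split into fixed-size blocks with a possibly partial last block, and the 2.5 GB file size is a hypothetical input chosen for illustration.

```shell
# tachyon.user.default.block.size.byte default: 1 GB, expressed in bytes.
block_size=$((1024 * 1024 * 1024))
# Hypothetical file of 2.5 GB (2560 MB), expressed in bytes.
file_size=$((2560 * 1024 * 1024))

# Ceiling division: a partially filled last block still counts as a block.
blocks=$(( (file_size + block_size - 1) / block_size ))
echo "blocks=$blocks"
```

Under these assumptions the file occupies three blocks: two full 1 GB blocks and one partial block.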