.. _Configtoml:

The Config.toml File
====================

Rather than passing individual parameters when starting Driverless AI, admins can instead reference a config.toml file. This file includes all possible configuration options that would otherwise be specified in the ``nvidia-docker run`` command. Place this file in a folder on the container (for example, in /tmp), then set the desired environment variables and start Driverless AI using the following command:

::

  nvidia-docker run \
      --rm \
      -u `id -u`:`id -g` \
      -e DRIVERLESS_AI_CONFIG_FILE_PATH="/tmp/config.toml" \
      -v `pwd`/data:/data \
      -v `pwd`/log:/log \
      -v `pwd`/license:/license \
      -v `pwd`/tmp:/tmp \
      opsh2oai/h2oai-runtime
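Every setting in config.toml can also be overridden by a corresponding environment variable named ``DRIVERLESS_AI_*`` (for example, ``max_cores`` maps to ``DRIVERLESS_AI_MAX_CORES``, as noted in the header comments of the sample file below). As a minimal sketch, the command below caps each experiment at 10 CPU cores; the value ``10`` is only an illustration:

::

  nvidia-docker run \
      --rm \
      -u `id -u`:`id -g` \
      -e DRIVERLESS_AI_CONFIG_FILE_PATH="/tmp/config.toml" \
      -e DRIVERLESS_AI_MAX_CORES="10" \
      -v `pwd`/data:/data \
      -v `pwd`/log:/log \
      -v `pwd`/license:/license \
      -v `pwd`/tmp:/tmp \
      opsh2oai/h2oai-runtime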
Sample config.toml File
-----------------------

::

  # ----------------------------------------------------------------------------
  # DRIVERLESS AI CONFIGURATION FILE
  #
  # This file is authored in TOML (see https://github.com/toml-lang/toml)
  #
  # The variables in this file can be overridden by corresponding environment
  # variables named DRIVERLESS_AI_* (e.g. "max_cores" can be overridden by
  # the environment variable "DRIVERLESS_AI_MAX_CORES").
  # ----------------------------------------------------------------------------

  # IP address and port for the Driverless AI HTTP server.
  ip = "127.0.0.1"
  port = 12345

  # Maximum number of CPU cores to use per experiment. Set to <= 0 to use all cores.
  max_cores = 0

  # Number of GPUs to use per model training task. Set to -1 for all GPUs.
  # Currently, num_gpus != 1 disables GPU locking, so it is only recommended for
  # single experiments and single users.
  # Ignored if GPUs are disabled or no GPUs are present on the system.
  num_gpus = 1

  # Which gpu_id to start with.
  # If CUDA_VISIBLE_DEVICES=... is used to control GPUs, gpu_id_start=0 still refers
  # to the first device in that list.
  # E.g. if CUDA_VISIBLE_DEVICES="4,5", then gpu_id_start=0 refers to device #4.
  gpu_id_start = 0

  # Maximum number of workers for the Driverless AI server pool (only 1 needed currently)
  max_workers = 1

  # Minimum amount of disk space in GB needed to run experiments.
  # Experiments will fail if this limit is crossed.
  disk_limit_gb = 5

  # Minimum amount of system memory in GB needed to start experiments
  memory_limit_gb = 5

  # IP address and port of the process proxy.
  process_server_ip = "127.0.0.1"
  process_server_port = 8080

  # IP address and port of the H2O instance.
  h2o_ip = "127.0.0.1"
  h2o_port = 54321

  # Data directory. All application data and files related to datasets and experiments
  # are stored in this directory.
  data_directory = "./tmp"

  # Start the HTTP server in debug mode (DO NOT enable in production).
  debug = false

  # Whether to run a quick performance benchmark at the start of the application and of each experiment
  enable_benchmark = false

  # Minimum number of rows needed to run experiments (values lower than 100 might not work)
  min_num_rows = 100

  # Internal threshold for the number of rows that triggers certain statistical techniques to increase statistical fidelity
  statistical_threshold_num_rows_small = 10000

  # Internal threshold for the number of rows that triggers certain statistical techniques that can speed up modeling
  statistical_threshold_num_rows_large = 1000000

  # Maximum number of columns
  max_cols = 10000

  # Threshold of rows * columns below which GPUs are disabled, for speed purposes
  gpu_small_data_size = 100000

  # Maximum number of unique values allowed in the fold column
  max_fold_uniques = 100000

  # Maximum number of classes
  max_num_classes = 100

  # Minimum allowed seconds for a time column
  min_time_value = 5e8  # ~ > 1986

  # Minimum number of rows above which to try to detect time series
  min_rows_detected_time = 10000

  # Relative standard deviation of the hold-out score below which early stopping is triggered for accuracy ~5
  stop_early_rel_std = 1e-3

  # Variable importance below which a feature is dropped (with a possible replacement found that is better)
  # This also sets the overall scale for lower interpretability thresholds
  varimp_threshold_at_interpretability_10 = 0.05

  # Maximum number of GBM trees (early stopping usually chooses far fewer)
  max_ntrees = 2000

  # Authentication
  #  unvalidated : Accepts user id and password; does not validate the password
  #  none        : Does not ask for user id or password; authenticated as admin
  #  pam         : Accepts user id and password; validates the user against the operating system
  #  ldap        : Accepts user id and password; validates against an LDAP server (see the LDAP settings below)
  authentication_method = "unvalidated"

  # LDAP Settings
  ldap_server = ""
  ldap_port = ""
  ldap_dc = ""

  # Supported file formats (file name endings must match for files to show up in the file browser): a comma-separated list
  supported_file_types = "csv, tsv, txt, dat, tgz, zip, xz, xls, xlsx"

  # File System Support
  # Format: "file_system_1, file_system_2, file_system_3"
  # Allowed file systems:
  #  file : local file system/server file system
  #  hdfs : Hadoop file system; remember to configure the Hadoop core-site.xml and keytab below
  #  s3   : Amazon S3; optionally configure the secret and access key below
  enabled_file_systems = "file, hdfs, s3"

  # Configuration for an HDFS data source
  # Path of the HDFS core-site.xml
  core_site_xml_path = ""
  # Path of the principal keytab file
  key_tab_path = ""

  # HDFS connector
  # Specify the HDFS auth type; allowed options are:
  #  noauth              : No authentication needed
  #  principal           : Authenticate with HDFS with a principal user
  #  keytab              : Authenticate with a keytab (recommended)
  #  keytabimpersonation : Log in with impersonation using a keytab
  hdfs_auth_type = "noauth"

  # Kerberos app principal user (recommended)
  hdfs_app_principal_user = ""
  # Specify the user id of the current user here as user@realm
  hdfs_app_login_user = ""
  # hdfs_app_jvm_args = ""

  # AWS authentication settings
  #  True  : Authenticated connection
  #  False : Unverified connection
  aws_auth = "False"

  # S3 Connector credentials
  aws_access_key_id = ""
  aws_secret_access_key = ""
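As an illustration of how several of the HDFS connector options above fit together, the following sketch shows a keytab-based configuration; the paths and the principal are placeholders, not values shipped with Driverless AI:

::

  # Enable the HDFS connector alongside the local file system (example only)
  enabled_file_systems = "file, hdfs"

  # Placeholder paths -- substitute the locations used in your cluster
  core_site_xml_path = "/path/to/core-site.xml"
  key_tab_path = "/path/to/driverlessai.keytab"

  # Authenticate with the keytab (recommended); the principal below is a placeholder
  hdfs_auth_type = "keytab"
  hdfs_app_principal_user = "dai_user@EXAMPLE.COM"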